(Ubuntu x86_64) Segmentation Fault Running Q4_1_O Model #32

Closed
cryscan opened this issue Apr 18, 2023 · 12 comments

@cryscan

cryscan commented Apr 18, 2023

System: Ubuntu 20.04.6 LTS
GCC: 9.4.0
CPU: Intel(R) Xeon(R) Platinum 8358P

Issue:

$ python rwkv/chat_with_bot.py /path/to/models/Raven-14B-v9-Q4.bin 
Loading 20B tokenizer
System info: AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 | 
Loading RWKV model
Processing 92 prompt tokens, may take a while
Segmentation fault (core dumped)
@saharNooby
Collaborator

Can it be reproduced with the 169M model?

@BuilderGuy1

BuilderGuy1 commented Apr 18, 2023

I'm having the same issue on Apple Silicon. I tested the latest 14B and 7B models, each with all 3 quantization options. I also tried 3B v10 with Q4_1_O.

@dmahurin

dmahurin commented Apr 19, 2023

Same issue on Apple M2 with all revisions using Q4_1_O. Tried 3B and 169M.

@poisson-fish

Confirming this issue under Arch on WSL with RWKV-4-Raven-14B-v9-Eng99%-Other1%-20230412-ctx8192_q4_1_0.bin

@L-M-Sherlock
Contributor

I also encountered this problem on my M1 Mac. Tried 7B and 3B. It occurred before the "Processing 92 prompt tokens" message.

@saharNooby
Collaborator

Probably address misalignment. I'm working on it in #33
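
For readers unfamiliar with the term, here is a minimal sketch of what checking and correcting address alignment can look like. The 32-byte boundary and both helper names are assumptions chosen for illustration; they are not taken from rwkv.cpp or from the change in #33.

```python
# Illustrative only: ALIGNMENT and both helper names are hypothetical,
# not taken from rwkv.cpp or from the fix in #33.
ALIGNMENT = 32  # bytes; assumed boundary (aligned 256-bit AVX loads need this)


def is_aligned(address: int, alignment: int = ALIGNMENT) -> bool:
    """Return True if address is a multiple of alignment."""
    return address % alignment == 0


def align_up(address: int, alignment: int = ALIGNMENT) -> int:
    """Round address up to the next multiple of alignment.

    alignment must be a power of two for the bit mask to be valid.
    """
    return (address + alignment - 1) & ~(alignment - 1)


if __name__ == "__main__":
    offset = 100
    print(is_aligned(offset))  # False
    print(align_up(offset))    # 128
```

Code that performs aligned loads on a buffer starting at an unaligned address can crash, which would be consistent with the segfault reported here, though the thread does not confirm the exact cause.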

@saharNooby
Collaborator

Alignment fix merged. Please clone the repo from scratch and try again:

git clone --recursive https://github.com/saharNooby/rwkv.cpp.git

@L-M-Sherlock
Contributor

L-M-Sherlock commented Apr 20, 2023

Thanks! #33 solved my problem, but another bug appeared. The bot only repeats: > Bob: Hello, Bob..

python rwkv/convert_pytorch_to_ggml.py ./RWKV-4-Raven-3B-v9x-Eng49%-Chn50%-Other1%-20230417-ctx4096.pth ./rwkv.cpp-3B.bin float16
python rwkv/quantize.py ./rwkv.cpp-3B.bin ./rwkv.cpp-3B-Q4_1_0.bin 4
python rwkv/chat_with_bot.py ./rwkv.cpp-3B-Q4_1_0.bin

@saharNooby
Collaborator

saharNooby commented Apr 20, 2023

@BuilderGuy1 @poisson-fish @dmahurin If possible, can you also confirm that the segfault is fixed? (Please clone from scratch, or don't forget to update the git submodules.)

@L-M-Sherlock Thanks for the input. It looks like the default prompt is not a good fit for Raven; the related issue is #22.

@dmahurin

dmahurin commented Apr 20, 2023

Thanks @saharNooby. It works now on Apple M2 with 3B and 169M.

It also works with rwkv-4_raven-7b-v9 and rwkv-4_raven-14b-v9, though 14b is slow on M2.

@saharNooby
Collaborator

Thanks for testing it!

@poisson-fish

poisson-fish commented Apr 20, 2023

Sorry for the late reply, @saharNooby. The previous crash is fixed; however, I now get SIGSEGV:

python rwkv/chat_with_bot.py ./build/models/RWKV/RWKV-4-Raven-14B-v9-Eng99\%-Other1\%-20230412-ctx8192_q4_1_0.bin
Loading 20B tokenizer
System info: AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |
Loading RWKV model
Processing 92 prompt tokens, may take a while
~/Documents/Projects/cpp/llamapi/rwkv.cpp/rwkv/rwkv_cpp_model.py:100: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  state_out.storage().data_ptr(),
~/Documents/Projects/cpp/llamapi/rwkv.cpp/rwkv/rwkv_cpp_model.py:101: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly.  To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  logits_out.storage().data_ptr()
fish: Job 1, 'python rwkv/chat_with_bot.py ./…' terminated by signal SIGSEGV (Address boundary error)

I can open a new issue if necessary.

Edit: disregard; it required a submodule update and works now.
