mpt30b not showing any response. #5
Comments
I'm having the same problem. Processing goes to 100% for a few seconds but returns empty answers. RAM usage climbs to around 24 GB. I'm running the mpt-30b-chat.ggmlv0.q5_1.bin model instead of the default q4_0. PC: Ryzen 5900X with 32 GB RAM.
For cases like this I recommend Docker, because of the environment issues. I'm on Windows as well; here's how I run it. Use a container like so:
Clone the repo:
Follow the directions in the README for the rest: https://github.com/abacaj/mpt-30B-inference#setup.
Thank you. Great work by the way!
Likely has to do with the ctransformers library, since that is how the bindings work from Python -> ggml (though I'm not certain of it).
I have observed that when processing user queries, the CPU usage increases but I do not receive a response.
python3 inference.py
Issue fixed. Replace files with
I'm also facing this issue on Windows.
During inference, after the user input, the model waits for a few seconds but does not respond with anything; it just returns empty output. I'm running it on a Dell OptiPlex 7070 Micro with an Intel Core i7-9700T (8 cores) and 32 GB RAM.
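For anyone trying to narrow this down, it may help to check whether the model produces zero tokens or whether tokens are generated and then lost before printing. Below is a minimal sketch of such a check. The names `probe_generation`, `describe`, and `GenerationReport` are hypothetical helpers written for this thread and are not part of the repo or of ctransformers; `stream_fn` is assumed to be any callable that takes a prompt and yields text chunks.

```python
import time
from typing import Callable, Iterable, List, NamedTuple


class GenerationReport(NamedTuple):
    """Summary of one generation call (hypothetical helper type)."""
    text: str
    n_chunks: int
    seconds: float


def probe_generation(
    stream_fn: Callable[[str], Iterable[str]], prompt: str
) -> GenerationReport:
    """Run stream_fn(prompt), collect every yielded chunk, and time the call."""
    start = time.perf_counter()
    chunks: List[str] = list(stream_fn(prompt))
    elapsed = time.perf_counter() - start
    return GenerationReport("".join(chunks), len(chunks), elapsed)


def describe(report: GenerationReport) -> str:
    """Classify the result so an empty answer can be told apart from a blank one."""
    if report.n_chunks == 0:
        return "no tokens generated"
    if not report.text.strip():
        return "only whitespace generated"
    return "ok"
```

Assuming the model was loaded through ctransformers and supports streaming calls, something like `probe_generation(lambda p: llm(p, stream=True), "Hello")` could be used, where `llm` is the already loaded model object. A result of "no tokens generated" would point at the bindings or the model file rather than at the printing code in inference.py.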