Implement basic CLBlast support #110
Unzip the OpenCL SDK and CLBlast distribution into the repo root, then enable RWKV_CLBLAST and regenerate makefiles to pick them up. Currently builds and runs.
Somehow the path handling is different here which requires me to be quite a bit more annoying about it.
Hey @saharNooby macOS is failing again for another reason that isn't my fault, I'm starting to think GitHub is just cursed
Finally I realized how to push into PRs... It turns out I was trying to push into your I'll try various hacks here to get it MacOS build.
Well, that seems to have fixed it. I think the biggest problem we have right now is that we don't seem to be able to test these libraries on CI or offer them in GitHub releases. We should probably try to do something about that.
I'm not sure I understand. You talking about cuBLAS and CLBlast?
OMG LOL IT FIXED THAT ISSUE FOR WHICH SANITIZER WAS ENABLED
Yes, currently people can't get prebuilt binaries for either of those features, and they aren't tested in CI.
LOL
I'll add it into my backlog, seems easy enough to do.
I would really prefer to have CLBlast build documented. PR desc looks good enough, maybe format it a little and put it into But I will not block this PR because of this, I can write the doc later myself.
The PR's currently blocked anyway because I have only tested the small world models with the little sequence.c and confirmed the logits output is identical, but I have not tested any other models (in particular the larger raven models) and that probably needs to work before we merge this. I have no reason to believe that it doesn't, but I need to make sure.
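For anyone who wants to repeat the "logits output is identical" check on other models, here is a minimal sketch of how two logit vectors could be compared. It is not the check used in this PR: `max_abs_diff`, `logits_match`, and the `tolerance` parameter are hypothetical names, and the vectors are assumed to have already been dumped as plain lists of floats from a CPU run and a CLBlast run.

```python
import math


def max_abs_diff(reference, candidate):
    """Return the largest element-wise absolute difference between two logit vectors."""
    if len(reference) != len(candidate):
        raise ValueError("logit vectors differ in length")
    diffs = [abs(r - c) for r, c in zip(reference, candidate)]
    # A NaN anywhere means the comparison is meaningless, so surface it explicitly
    # instead of letting max() silently skip it.
    if any(math.isnan(d) for d in diffs):
        return math.nan
    return max(diffs, default=0.0)


def logits_match(reference, candidate, tolerance=0.0):
    """True when no element differs by more than `tolerance` (0.0 means bit-for-bit equal values)."""
    return max_abs_diff(reference, candidate) <= tolerance
```

The default `tolerance=0.0` mirrors the "identical" wording above; a small non-zero tolerance may be more realistic for larger models if the GPU path accumulates in a different order, but that is an assumption, not something measured here.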
Used code for CLBlast from PR RWKV#110
@Mathmagician8191 has done some testing with this i think and i'm not really capable of writing documentation on this right now (on account of dissociative identity disorder hehe) but the code seems functional at least
Most of the work was getting CMake to find it. Just enable `RWKV_CLBLAST` and then drop the OpenCL & CLBlast distributions into the repository root like so: the actual folders after unzipping, of course!!
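As a rough sketch of the "enable `RWKV_CLBLAST`" step, the snippet below assembles a CMake configure command with the option switched on. Only the option name comes from this PR; the helper `cmake_configure_command`, its parameters, and the choice to configure from the repository root are assumptions for illustration, and the names of the unzipped folders are deliberately not guessed.

```python
import shlex


def cmake_configure_command(source_dir=".", enable_clblast=True):
    """Build the CMake configure command as an argument list (hypothetical helper)."""
    flag = "ON" if enable_clblast else "OFF"
    # -D<name>=<value> is CMake's standard way to set a cache option at configure time.
    return ["cmake", source_dir, f"-DRWKV_CLBLAST={flag}"]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(cmake_configure_command()))
```

Re-running the configure step after dropping the distributions into the repository root is what regenerates the makefiles so the libraries get picked up.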
Marked as draft due to lack of testing—I unfortunately lost my bespoke chat script at some point and so can't really do my own experimentation immediately, but I do want to put this out there and have it available for others to see and test out for themselves.
Performance seems to be almost exactly on par with CUDA in my experience. So maybe this will be getting CUDA-like performance out of Intel and AMD GPUs - exciting :D
It took me about 2 hours and 30 minutes of real time to complete this pull request :)