Add wkv v5 custom operator #148

Merged
saharNooby merged 3 commits into master from add-wkv-v5-custom-operator on Nov 14, 2023

Conversation

saharNooby
Collaborator

Before the change (model is RWKV-5-World-3B-v2-OnlyForTest_86%25_trained-20231108-ctx4096-Q5_1.bin):

Will allocate 180 MB
CPU, 24 threads, sequence of 1: 89 ms per token

Will allocate 290 MB (sequence_length = 2)
CPU, 24 threads, sequence of 2: 59 ms per token

Will allocate 946 MB (sequence_length = 8)
CPU, 24 threads, sequence of 8: 41 ms per token

Will allocate 3568 MB (sequence_length = 32)
CPU, 24 threads, sequence of 32: 45 ms per token

Will allocate 7064 MB (sequence_length = 64)
CPU, 24 threads, sequence of 64: 49 ms per token

After the change:

Will allocate 102 MB
CPU, 24 threads, sequence of 1: 70 ms per token

Will allocate 112 MB (sequence_length = 2)
CPU, 24 threads, sequence of 2: 37 ms per token

Will allocate 170 MB (sequence_length = 8)
CPU, 24 threads, sequence of 8: 17 ms per token

Will allocate 399 MB (sequence_length = 32)
CPU, 24 threads, sequence of 32: 14 ms per token

Will allocate 706 MB (sequence_length = 64)
CPU, 24 threads, sequence of 64: 14 ms per token
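
To make the comparison easier to read, here is a small sketch that computes the per-sequence-length speedup and memory reduction from the figures above. It is not part of the PR: the `compare` helper and the tuple layout are illustrative only, and it assumes the first, unlabelled "Will allocate" line in each run belongs to the sequence of 1.

```python
# Figures copied from the benchmark output above:
# (sequence length, MB allocated, ms per token).
# Assumption: the unlabelled "Will allocate" line belongs to sequence length 1.
before = [(1, 180, 89), (2, 290, 59), (8, 946, 41), (32, 3568, 45), (64, 7064, 49)]
after = [(1, 102, 70), (2, 112, 37), (8, 170, 17), (32, 399, 14), (64, 706, 14)]


def compare(before, after):
    """Return (sequence length, speedup, memory ratio) for each benchmark pair."""
    rows = []
    for (seq_b, mem_b, ms_b), (seq_a, mem_a, ms_a) in zip(before, after):
        # Both runs must describe the same sequence length to be comparable.
        assert seq_b == seq_a
        rows.append((seq_b, round(ms_b / ms_a, 2), round(mem_b / mem_a, 2)))
    return rows


rows = compare(before, after)
for seq, speedup, mem_ratio in rows:
    print(f"sequence of {seq:>2}: {speedup:.2f}x faster, {mem_ratio:.2f}x less memory")
```

By these numbers the gain grows with sequence length: at a sequence of 64, time per token drops from 49 ms to 14 ms (3.5x) and the allocation drops from 7064 MB to 706 MB (about 10x).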

@saharNooby saharNooby merged commit 74f50ae into master Nov 14, 2023
24 checks passed
@saharNooby saharNooby deleted the add-wkv-v5-custom-operator branch November 14, 2023 15:06
@saharNooby saharNooby mentioned this pull request Apr 6, 2024