Basic RWKV6 support #171
Hi! Although I've stepped down as the maintainer of
Edit: I may be mistaken about performance issues, because The second point still stands -- there must be quality assurance.
Hey there! Thanks for the pull request for the long-awaited RWKV V6 support. I'll soon get to reviewing the code, it's on my schedule! Unfortunately I cannot read Chinese, so I've had to get some help translating.
Sorry, it's not convenient for me to reply at school.
I'll go find a small model to test
w = ggml_reshape_4d(ctx, w, 1, head_size, head_count, sequence_length);

// w = torch.exp(-torch.exp(w))
w = rwkv_exp(ctx,rwkv_1_minus_x(ctx,rwkv_exp(ctx, w)));
Does this call to rwkv_1_minus_x() belong here? The Python function isn't w = torch.exp(1 - torch.exp(w)), so shouldn't it be w = rwkv_exp(ctx, ggml_neg(ctx, rwkv_exp(ctx, w)))?
rwkv_1_minus_x is defined as 1 - x, ggml_neg is defined as -x. So the code looks correct to me -- indeed, we want exp(1 - exp(w)), not exp(-exp(w)).
This refers to the line here: https://github.com/BlinkDL/ChatRWKV/blob/8c7956743703afddd9bbb09ec5fbaf95e5b05227/RWKV_v6_demo.py#L187
w = torch.exp(-torch.exp(w.float()))
There's no subtraction operation, only negation
Oh, okay. I misread your comment, sorry; looks like you are right :)
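For anyone following the thread, here is a minimal numeric sketch of why the two expressions are not interchangeable. It uses plain Python `math` instead of torch or ggml so it runs anywhere. The function names are hypothetical, and it assumes, as stated above, that rwkv_1_minus_x(x) computes 1 - x.

```python
import math


def decay_reference(w: float) -> float:
    """Mirrors the referenced Python line: exp(-exp(w))."""
    return math.exp(-math.exp(w))


def decay_with_one_minus(w: float) -> float:
    """What the reviewed line would compute if rwkv_1_minus_x(x) is 1 - x: exp(1 - exp(w))."""
    return math.exp(1.0 - math.exp(w))


if __name__ == "__main__":
    for w in (-5.0, -1.0, 0.0, 1.0):
        ref = decay_reference(w)
        alt = decay_with_one_minus(w)
        # exp(1 - exp(w)) == e * exp(-exp(w)): the two differ by a constant factor of e.
        assert math.isclose(alt, math.e * ref, rel_tol=1e-12)
        print(f"w={w:+.1f}  exp(-exp(w))={ref:.6f}  exp(1-exp(w))={alt:.6f}")
```

The reference form always lies strictly between 0 and 1 for moderate inputs, which is what a decay factor needs. The 1 - x variant is larger by a factor of e and exceeds 1 whenever w < 0, so it would amplify the state instead of decaying it.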
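Basic RWKV6 support
Conversion, loading, and LoRA are working, but the computation graph has a problem: no matter what I try, I cannot get the code around line 348 of rwkv_graph.inc right. I have marked the spot with exclamation marks.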