
Basic support for RWKV6 #171

Open · wants to merge 10 commits into master

Conversation


@YuChuXi YuChuXi commented Apr 4, 2024

Basic support for RWKV6.
Conversion, loading, and LoRA work, but the computation graph has a problem: I can't get the code around rwkv_graph.inc:348 to work no matter what I try. I've marked the spot with exclamation points.

@saharNooby
Collaborator

saharNooby commented Apr 6, 2024

Hi!

Although I've stepped down as the maintainer of rwkv.cpp, I think my input will still be valuable:

  1. rwkv_att_v6 looks like a naive implementation of the attention (as in "creates and operates on full-blown matrices"). If that's true, then v6 models will be too slow and probably unusable at large context lengths. See the comparisons for v5. I recommend pulling an optimized implementation from somewhere and adapting it to rwkv.cpp, as was done with v5.
  2. There are no tests for v6 models, no Tiny RWKV trained, etc. I would not merge v6 support without proper tests added, because that is asking for something to be unknowingly broken.

Edit: I may be mistaken about performance issues, because rwkv_att_v6 actually calls rwkv_wkv_v5. In any case, I recommend doing latency/memory usage measurements and comparing them with v5 models.

The second point still stands -- there must be quality assurance.
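The latency measurements suggested above need nothing more than a small timing harness. A minimal sketch (the `eval_fn` callable is a placeholder for whatever evaluates one token through the model; it is not an actual rwkv.cpp API):

```python
import time
import statistics

def measure_latency(eval_fn, n_runs=10):
    # Run the evaluation several times and report the median, which is
    # more robust to one-off stalls than the mean.
    samples = []
    for _ in range(n_runs):
        t0 = time.perf_counter()
        eval_fn()
        samples.append(time.perf_counter() - t0)
    return statistics.median(samples)

# Usage: time a dummy workload standing in for model evaluation.
latency = measure_latency(lambda: sum(range(10_000)))
print(f"median latency: {latency:.6f}s")
```

Running the same harness against a v5 and a v6 model of comparable size, at several context lengths, would show whether the v6 graph has the feared performance cliff.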

@LaylBongers

Hey there! Thanks for the pull request for the long-awaited RWKV V6 support. I'll soon get to reviewing the code, it's on my schedule! Unfortunately I cannot read Chinese, so I've had to get some help translating.
As already mentioned, to make sure support is (and continues to be) functional, we need to include some tests.

@YuChuXi
Author

YuChuXi commented Apr 10, 2024

Sorry, it's not convenient for me to reply while at school.
This PR only supports the conversion and loading of RWKV6.
The file "rwkv_graph.inc" has an issue near lines 342-347 that I cannot solve.
No matter how I modify it, I encounter errors such as "GGML_ASSERT: /media/yuchuxi/YuZi/Project/Mozi/rwkv.cpp/ggml/src/ggml.c:6499: ggml_is_contiguous(a)".
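For background on that assertion (a sketch, not ggml's actual code): ggml_is_contiguous fails when a tensor view's strides no longer match a dense row-major layout, e.g. after ggml_permute or ggml_transpose, and materializing the view with ggml_cont before the failing op is the usual fix. A plain-Python illustration of the stride check:

```python
# Conceptual sketch only: a view is "contiguous" when its strides are the
# dense row-major strides for its shape; a transposed view is not.

def row_major_strides(shape):
    # Dense row-major strides, in elements: last dim has stride 1.
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return strides

def is_contiguous(shape, strides):
    return list(strides) == row_major_strides(shape)

print(is_contiguous((3, 4), [4, 1]))  # → True  (dense 3x4)
print(is_contiguous((4, 3), [1, 4]))  # → False (transposed view of it)
```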

@YuChuXi
Author

YuChuXi commented Apr 10, 2024

I'll go find a small model to test with.

w = ggml_reshape_4d(ctx, w, 1, head_size, head_count, sequence_length);

// w = torch.exp(-torch.exp(w))
w = rwkv_exp(ctx, rwkv_1_minus_x(ctx, rwkv_exp(ctx, w)));
Does this call to rwkv_1_minus_x() belong here? The Python function isn't w = torch.exp(1 - torch.exp(w)), so shouldn't it be w = rwkv_exp(ctx, ggml_neg(ctx, rwkv_exp(ctx, w)));?

Collaborator

rwkv_1_minus_x is defined as 1 - x, ggml_neg is defined as -x. So the code looks correct to me -- indeed, we want exp(1 - exp(w)), not exp(-exp(w)).


This refers to the line here: https://github.com/BlinkDL/ChatRWKV/blob/8c7956743703afddd9bbb09ec5fbaf95e5b05227/RWKV_v6_demo.py#L187

  • w = torch.exp(-torch.exp(w.float()))

There's no subtraction operation, only negation.

Collaborator

Oh, okay. I misread your comment, sorry; looks like you are right :)
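A quick numerical check (not part of the PR) confirms the two expressions are not equivalent: exp(1 - exp(w)) = e · exp(-exp(w)), so the rwkv_1_minus_x version would uniformly scale the decay by a factor of e relative to the reference torch.exp(-torch.exp(w)):

```python
import math

w = 0.5  # arbitrary sample value of the decay parameter
reference = math.exp(-math.exp(w))      # torch.exp(-torch.exp(w))
mistaken = math.exp(1.0 - math.exp(w))  # what exp(1 - exp(w)) would give
print(round(mistaken / reference, 6))   # → 2.718282, i.e. exactly e
```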
