Releases: RWKV/rwkv.cpp

master-1198892

29 Apr 12:44
1198892
Add support for Q5_0, Q5_1 and Q8_0 formats; remove Q4_1_O format (#44)

* Remove Q4_3 support

* Add Q5_0, Q5_1, Q8_0 support

* Add more clear message when loading Q4_3 model

* Remove Q4_1_O format

* Fix indentation in .gitmodules

* Simplify sanitizer matrix

master-06dac0f

29 Apr 16:37
06dac0f
Use main ggml repo (#45)

master-c736ef5

22 Apr 15:36
c736ef5
Improve chat_with_bot.py script (#39)

master-3587ff9

22 Apr 15:30
3587ff9
Sync ggml with upstream (#38)

* Sync ggml with upstream

* Remove file filters from Actions triggers

* Update ggml

* Add Q4_2 and Q4_3 support

* Improve output of perplexity measuring script

* Add tests for new formats

* Add token limit argument to perplexity measuring script

* Update README

* Update README

* Update ggml

* Use master branch of ggml

master-1be9fda

20 Apr 06:02
1be9fda
Add robust automatic testing (#33)

master-7b28076

18 Apr 12:54
Fix Q4_1_O optimization

master-2ef7ee0

18 Apr 05:51
Optimize Q4_1_O by moving outlier multiplication out of the dequantize+dot loop

master-0a8157d

17 Apr 15:19
0a8157d
Merge pull request #28 from saharNooby/ggml-to-submodule

Move ggml to submodule

master-84e0698

08 Apr 11:54
84e0698
Merge pull request #16 from saharNooby/outliers-preserving-quantization-PR

Add Q4_1_O quantization format that preserves outliers in weights and does dot in FP32
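
The idea named in this release title can be illustrated with a minimal sketch. It is not the rwkv.cpp implementation: the block size of 32, the choice of a single outlier per block, and the function names `quantize_block`, `dequantize_block` and `dot_dequantized` are all assumptions made for the example. The largest-magnitude weight in a block is stored unquantized, the remaining weights are mapped to 4-bit codes, and the dot product is computed on dequantized floating-point values.

```python
from typing import List, Tuple

BLOCK_SIZE = 32  # assumed block size, for illustration only

Block = Tuple[float, float, List[int], int, float]


def quantize_block(weights: List[float]) -> Block:
    """Quantize one block to 4-bit codes, keeping one outlier unquantized."""
    # Hypothetical outlier rule: the weight with the largest absolute value.
    outlier_index = max(range(len(weights)), key=lambda i: abs(weights[i]))
    outlier_value = weights[outlier_index]

    # Take the range from the remaining weights only, so the outlier
    # does not stretch the 16-level quantization grid.
    rest = [w for i, w in enumerate(weights) if i != outlier_index]
    lo, hi = min(rest), max(rest)
    delta = (hi - lo) / 15 if hi > lo else 1.0

    # Clamp to 0..15; the outlier's own code is ignored when dequantizing.
    codes = [min(15, max(0, round((w - lo) / delta))) for w in weights]
    return lo, delta, codes, outlier_index, outlier_value


def dequantize_block(block: Block) -> List[float]:
    """Rebuild approximate weights and restore the outlier exactly."""
    lo, delta, codes, outlier_index, outlier_value = block
    values = [lo + delta * c for c in codes]
    values[outlier_index] = outlier_value
    return values


def dot_dequantized(block: Block, activations: List[float]) -> float:
    """Dot product on dequantized floats (Python floats are 64-bit, not FP32)."""
    return sum(w * x for w, x in zip(dequantize_block(block), activations))


if __name__ == "__main__":
    weights = [0.01 * i for i in range(BLOCK_SIZE - 1)] + [5.0]
    block = quantize_block(weights)
    print("outlier index:", block[3])
    print("dot with ones:", dot_dequantized(block, [1.0] * BLOCK_SIZE))
```

Without the separate outlier, the single value 5.0 in this example would set the range for the whole block and the other 31 weights would collapse onto one or two codes.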

master-5d99741

08 Apr 15:38
5d99741
Merge pull request #18 from yorkzero831/master

Update github action to support linux and macos asset uploading