Releases: RWKV/rwkv.cpp

master-39ed572

23 Sep 13:20
39ed572
Various improvements (#131)

* Implement model head offloading

* Guess the tokenizer from n_vocab

* Make PyTorch optional for inference

* Add function to offload layers

* Add rwkv_eval_sequence_in_chunks
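The idea behind chunked sequence evaluation can be sketched as follows. This is a minimal illustration in Python, not the library's implementation: `eval_sequence_in_chunks`, `toy_eval_sequence`, and the `(tokens, state) -> (logits, state)` evaluator signature are hypothetical names chosen for this example. The sketch assumes that a long token sequence is split into fixed-size pieces and that the recurrent state is carried from one piece to the next.

```python
from typing import Callable, List, Optional, Sequence, Tuple

State = List[float]
Logits = List[float]

# Hypothetical evaluator signature: (tokens, state_in) -> (logits, state_out).
EvalFn = Callable[[Sequence[int], Optional[State]], Tuple[Logits, State]]


def eval_sequence_in_chunks(
    eval_sequence: EvalFn,
    tokens: Sequence[int],
    state: Optional[State] = None,
    chunk_size: int = 16,
) -> Tuple[Logits, State]:
    """Evaluate `tokens` in pieces of at most `chunk_size`, carrying the state."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if len(tokens) == 0:
        raise ValueError("tokens must not be empty")

    logits: Logits = []
    for start in range(0, len(tokens), chunk_size):
        chunk = tokens[start:start + chunk_size]
        # The state returned by one chunk is the input state of the next.
        logits, state = eval_sequence(chunk, state)
    # Only the logits of the final chunk are returned.
    return logits, state


def toy_eval_sequence(
    tokens: Sequence[int], state: Optional[State]
) -> Tuple[Logits, State]:
    """Stand-in model (assumption): the state is a running sum of tokens."""
    total = (state[0] if state is not None else 0.0) + float(sum(tokens))
    return [float(tokens[-1])], [total]


if __name__ == "__main__":
    logits, state = eval_sequence_in_chunks(
        toy_eval_sequence, list(range(10)), chunk_size=4
    )
    print(logits, state)
```

With the toy evaluator, chunked evaluation yields the same final state as a single call over the whole sequence, which is the property a real chunked evaluator is expected to preserve.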

master-0df970a

23 Sep 16:03
0df970a
Decrease memory padding for serial and sequential contexts (#132)

master-6caa45e

20 Sep 16:49
6caa45e
Python API restructurization & code style improvements (#130)

* Replace tabs with 4 spaces

* Refactor tests

* Rename Python scripts directory to "python"

* Create a separate package for the official Python API

* Move Python inference example to a separate file

* Add missing const

* Refactor extras

* Split rwkv.cpp into smaller files

* Clean up cpp code

* Rename rwkv package to rwkv_cpp

* Add missing type hints

* Rewrite automatic library lookup

* Add compatibility warning

* Fix MacOS build

master-8db73b1

19 Sep 14:46
8db73b1
Update ggml (#128)

* Fix quantize.py doc

* Add Q5 format compatibility test

* Update ggml

* Add documentation about limitations of sequence mode

* Fix most compiler warnings

* Clean up CMakeLists.txt

* Assert contiguity instead of assuming it

* Update README.md

* Fix warnings

* Try to fix compilation error

* Attempt to fix Ubuntu build

* Restore all build jobs

* Allow sequence lengths of up to 64 out of the box by forking ggml

master-d6c691e

09 Sep 07:14
d6c691e
add other language bindings (#126)

* add other language bindings

* Update README.md

---------

Co-authored-by: Alex <saharNooby@users.noreply.github.com>

master-2d3cdd7

20 Aug 06:31
2d3cdd7
only append to cpu string if not initialized (#125)

* only append to cpu string if not initialized

* Fix code style

---------

Co-authored-by: Alex <saharNooby@users.noreply.github.com>

master-84f34c5

21 Jul 13:37
84f34c5
Implement basic CLBlast support (#110)

* Get this thing building

Unzip the OpenCL SDK and CLBlast distribution into the repo root,
then enable RWKV_CLBLAST and regenerate makefiles to pick them up.

Currently builds and runs.

* Really offload tensors to OpenCL rather than cuBLAS

* Fix CLBlast builds in CMake release mode

Somehow the path handling is different here which requires me to
be quite a bit more annoying about it.

* Remove `brew update`

* Try building without sanitizer (maybe it would work this time?)

---------

Co-authored-by: saharNooby <saharnooby@protonmail.com>

master-f685aa4

19 Jul 09:36
f685aa4
Fix "'NoneType' object has no attribute 'cast'" error when model is freed (#117)

master-25ee75e

18 Jul 09:39
25ee75e
Expose n_vocab, n_embed, n_layer to the Python interface (#118)

master-84634c0

27 Jun 09:29
84634c0
Elide logits if the logits pointer parameter is NULL (#107)

* Completely skip calculation of logits if nobody cares

This speeds up sequence mode evaluations by up to 20% if you ingest
a large prompt and then only retrieve the logits at the very end.

Note that you must pass a NULL pointer to the logits parameter in
order to take advantage of this optimization.

* logits_out=NULL documentation
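The optimization described above can be illustrated with a small Python analogue. This is a sketch under stated assumptions, not the library's code: `ToyModel`, its methods, and the `want_logits` flag are hypothetical, with `want_logits=False` standing in for passing a NULL logits pointer. It shows the output head being skipped while a prompt is ingested and run only once at the end.

```python
from typing import List, Optional, Sequence, Tuple


class ToyModel:
    """Hypothetical stand-in for a recurrent model with a costly output head."""

    def __init__(self, n_vocab: int = 8) -> None:
        self.n_vocab = n_vocab
        self.head_calls = 0  # counts how often the output head is computed

    def _step(self, token: int, state: float) -> float:
        # Recurrent update: always required, because later tokens depend on it.
        return state * 0.5 + token

    def _head(self, state: float) -> List[float]:
        # Output projection: only needed when the caller wants logits.
        self.head_calls += 1
        return [state * i for i in range(self.n_vocab)]

    def eval_sequence(
        self,
        tokens: Sequence[int],
        state: float = 0.0,
        want_logits: bool = True,
    ) -> Tuple[Optional[List[float]], float]:
        for token in tokens:
            state = self._step(token, state)
        # want_logits=False plays the role of a NULL logits pointer.
        logits = self._head(state) if want_logits else None
        return logits, state


if __name__ == "__main__":
    model = ToyModel()
    # Ingest the prompt without asking for logits.
    _, state = model.eval_sequence([1, 2, 3], want_logits=False)
    # Request logits only for the final token.
    logits, state = model.eval_sequence([4], state, want_logits=True)
    print(model.head_calls, state)
```

In this toy, the state after the two calls matches a single call over all four tokens, while the head runs once instead of once per call.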