Releases · RWKV/rwkv.cpp

23 Sep 13:20

39ed572

master-39ed572

Various improvements (#131)

* Implement model head offloading

* Guess the tokenizer from n_vocab

* Make PyTorch optional for inference

* Add function to offload layers

* Add rwkv_eval_sequence_in_chunks
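
The idea behind chunked sequence evaluation can be illustrated with a minimal Python sketch. It does not call rwkv.cpp: `eval_sequence_in_chunks`, `toy_eval_chunk`, and the integer state are hypothetical stand-ins, and the real `rwkv_eval_sequence_in_chunks` signature may differ. The sketch assumes that the state produced by one chunk is fed into the next, so a long sequence can be processed in pieces of bounded size.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical evaluator type: takes a chunk of tokens and the incoming state,
# returns the new state and the logits for the last token of the chunk.
EvalChunk = Callable[[Sequence[int], int], Tuple[int, List[float]]]


def eval_sequence_in_chunks(
    tokens: Sequence[int],
    chunk_size: int,
    state: int,
    eval_chunk: EvalChunk,
) -> Tuple[int, List[float]]:
    """Evaluate tokens in pieces of at most chunk_size, threading the state through."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if len(tokens) == 0:
        raise ValueError("tokens must not be empty")
    logits: List[float] = []
    for start in range(0, len(tokens), chunk_size):
        chunk = tokens[start:start + chunk_size]
        # Each chunk starts from the state left behind by the previous one.
        state, logits = eval_chunk(chunk, state)
    return state, logits


def toy_eval_chunk(chunk: Sequence[int], state: int) -> Tuple[int, List[float]]:
    """Toy stand-in for a model: the state is a rolling hash of all tokens seen."""
    for token in chunk:
        state = (state * 31 + token) % 1_000_003
    return state, [float(state)]
```

With seven tokens and `chunk_size=3`, the loop evaluates chunks of 3, 3, and 1 tokens and ends in the same state as a single pass over all seven.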

23 Sep 16:03

master-0df970a

0df970a

Decrease memory padding for serial and sequential contexts (#132)

20 Sep 16:49

master-6caa45e

6caa45e

Python API restructurization & code style improvements (#130)

* Replace tabs with 4 spaces

* Refactor tests

* Rename Python scripts directory to "python"

* Create a separate package for the official Python API

* Move Python inference example to a separate file

* Add missing const

* Refactor extras

* Split rwkv.cpp into smaller files

* Clean up cpp code

* Rename rwkv package to rwkv_cpp

* Add missing type hints

* Rewrite automatic library lookup

* Add compatibility warning

* Fix MacOS build

* Fix MacOS build

19 Sep 14:46

master-8db73b1

8db73b1

Update ggml (#128)

* Fix quantize.py doc

* Add Q5 format compatibility test

* Update ggml

* Add documentation about limitations of sequence mode

* Fix most compiler warnings

* Clean up CMakeLists.txt

* Assert contiguity instead of assuming it

* Update README.md

* Fix warnings

* Try to fix compilation error

* Attempt to fix Ubuntu build

* Attempt to fix Ubuntu build

* Restore all build jobs

* Allow sequence lengths of up to 64 out of the box by forking ggml

09 Sep 07:14

master-d6c691e

d6c691e

add other language bindings (#126)

* add other language bindings

* Update README.md

---------

Co-authored-by: Alex <saharNooby@users.noreply.github.com>

20 Aug 06:31

master-2d3cdd7

2d3cdd7

only append to cpu string if not initialized (#125)

* only append to cpu string if not initialized

* Fix code style

---------

Co-authored-by: Alex <saharNooby@users.noreply.github.com>

21 Jul 13:37

master-84f34c5

84f34c5

Implement basic CLBlast support (#110)

* Get this thing building

Unzip the OpenCL SDK and CLBlast distribution into the repo root,
then enable RWKV_CLBLAST and regenerate makefiles to pick them up.

Currently builds and runs.

* Really offload tensors to OpenCL rather than cuBLAS

* Fix CLBlast builds in CMake release mode

Somehow the path handling is different here which requires me to
be quite a bit more annoying about it.

* Remove `brew update`

* Try building without sanitizer (maybe it would work this time?)

---------

Co-authored-by: saharNooby <saharnooby@protonmail.com>

19 Jul 09:36

master-f685aa4

f685aa4

Fix "'NoneType' object has no attribute 'cast'" error when model is freed (#117)

18 Jul 09:39

master-25ee75e

25ee75e

Expose n_vocab, n_embed, n_layer to the Python interface (#118)

27 Jun 09:29

master-84634c0

84634c0

Elide logits if the logits pointer parameter is NULL (#107)

* Completely skip calculation of logits if nobody cares

This speeds up sequence mode evaluations by up to 20% if you ingest
a large prompt and then only retrieve the logits at the very end.

Note that you must pass a NULL pointer to the logits parameter in
order to take advantage of this optimization.

* logits_out=NULL documentation
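
The optimization described above can be sketched with a toy Python model. This is not rwkv.cpp's API: `ToyModel`, its sizes, and its arithmetic are hypothetical, and `None` stands in for the `NULL` logits pointer. The point is only that the output projection is skipped whenever the caller passes no logits buffer.

```python
from typing import List, Optional


class ToyModel:
    """Hypothetical stand-in for a recurrent model; not the rwkv.cpp API."""

    def __init__(self, n_embed: int = 4, n_vocab: int = 6) -> None:
        self.n_embed = n_embed
        self.n_vocab = n_vocab
        self.head_calls = 0  # how many times the output projection ran

    def eval(
        self,
        token: int,
        state: List[float],
        logits_out: Optional[List[float]],
    ) -> List[float]:
        """Return the next state; fill logits_out only when it is not None."""
        new_state = [0.5 * s + ((token + i) % 3) for i, s in enumerate(state)]
        if logits_out is not None:
            # Project the state onto the vocabulary only when someone asked for it.
            self.head_calls += 1
            for v in range(self.n_vocab):
                logits_out[v] = sum(
                    x * ((v + i) % 2) for i, x in enumerate(new_state)
                )
        return new_state


def ingest_prompt(model: ToyModel, tokens: List[int]) -> List[float]:
    """Feed a prompt and request logits for the final token only."""
    state = [0.0] * model.n_embed
    logits = [0.0] * model.n_vocab
    last = len(tokens) - 1
    for pos, token in enumerate(tokens):
        state = model.eval(token, state, logits if pos == last else None)
    return logits
```

Calling `ingest_prompt` on a five-token prompt runs the projection once instead of five times, while the final logits match those from a loop that requests logits at every step.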
