Update README
saharNooby committed Apr 22, 2023
1 parent e7e4389 commit c82d92b
Showing 1 changed file with 7 additions and 3 deletions.
10 changes: 7 additions & 3 deletions README.md
@@ -89,9 +89,13 @@ python rwkv/quantize.py ~/Downloads/rwkv.cpp-169M.bin ~/Downloads/rwkv.cpp-169M-

Formats available:

-- `4`: `Q4_1_O`, OK quality, moderately fast (20% slower than `FP16`).
-- `3`: `Q4_1`, worst quality, fast (comparable to `FP16`).
-- `2`: `Q4_0`, poor quality, very fast.
+- `6`: `Q4_3`, OK quality, fast.
+- `5`: `Q4_2`, poor quality, fast.
+- `4`: `Q4_1_O`, best quality, slow (20% slower than `FP16`).
+- `3`: `Q4_1`, poor quality, very fast.
+- `2`: `Q4_0`, worst quality, very fast.
+
+If you use `rwkv.cpp` for anything serious (just having fun is serious enough), please test all available formats for perplexity and latency on a representative dataset, and decide correct trade-off for yourself.

### 4. Run the model

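The paragraph added in this commit recommends measuring perplexity and latency for each quantization format. A minimal sketch of such a comparison follows. It assumes you can supply, per format, a function that returns the natural-log probability the model assigned to each token of a test sequence. The names `score_tokens`, `benchmark`, and `compare_formats` are hypothetical and are not part of `rwkv.cpp`. The demo scorer is a stand-in, not a real model.

```python
import math
import time
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical scorer type: takes token ids, returns one natural-log
# probability per scored token. Wiring this to a real model is up to you.
Scorer = Callable[[Sequence[int]], List[float]]


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Perplexity is exp of the mean negative log-probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def benchmark(score_tokens: Scorer, tokens: Sequence[int]) -> Tuple[float, float]:
    """Return (perplexity, seconds per scored token) for one scorer."""
    start = time.perf_counter()
    log_probs = score_tokens(tokens)
    elapsed = time.perf_counter() - start
    return perplexity(log_probs), elapsed / len(log_probs)


def compare_formats(
    scorers: Dict[str, Scorer], tokens: Sequence[int]
) -> Dict[str, Tuple[float, float]]:
    """Run the same token sequence through every scorer, keyed by format name."""
    return {name: benchmark(fn, tokens) for name, fn in scorers.items()}


if __name__ == "__main__":
    # Stand-in scorer: pretends every token had probability 0.25.
    def uniform_scorer(tokens: Sequence[int]) -> List[float]:
        return [math.log(0.25)] * len(tokens)

    results = compare_formats({"uniform-demo": uniform_scorer}, [0, 1, 2, 3])
    for name, (ppl, sec_per_token) in results.items():
        print(f"{name}: perplexity={ppl:.3f}, {sec_per_token * 1e3:.4f} ms/token")
```

Lower perplexity means the quantized model predicts the text better. Running every format on the same representative token sequence keeps the numbers comparable.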