Releases: RWKV/rwkv.cpp
master-d26791b
Silence PyTorch warnings by using untyped storage (#72)
master-b61d94a
Flush output every token in generate_completions.py (#73)
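A minimal sketch of per-token flushing follows. The `stream_tokens` helper and its arguments are hypothetical and are not taken from generate_completions.py; the only thing it demonstrates is `print(..., flush=True)`, which pushes each token to the stream right away instead of leaving it in the buffer.

```python
import sys
from typing import Iterable, TextIO


def stream_tokens(tokens: Iterable[str], out: TextIO = sys.stdout) -> int:
    """Write each decoded token as soon as it is available.

    Hypothetical helper, not part of rwkv.cpp. Without flush=True a
    buffered stream may hold the text back until the buffer fills or
    the process exits.
    """
    count = 0
    for token in tokens:
        # end='' keeps tokens on one line; flush=True forces the write now.
        print(token, end='', file=out, flush=True)
        count += 1
    return count
```

Calling `stream_tokens(["Hel", "lo", "!"])` writes the three pieces to stdout one at a time and returns the number of tokens written.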
master-83983bb
Last-second move of values in the error enum (#71). I realized I had not left enough space for additional failure modes to be added in the future, and I should do this as soon as possible so that nothing gets built on the old constants.
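One way to leave room for future failure modes is to space the enum values apart. The sketch below is an assumption for illustration only: the names, the numeric values and the split into a "stage" byte and a "detail" byte are hypothetical and are not the constants used by rwkv.cpp.

```python
from enum import IntEnum
from typing import Tuple

# Hypothetical layout, not the library's real constants: the high byte
# names the failing stage, the low byte names the specific problem.
STAGE_SHIFT = 8
DETAIL_MASK = (1 << STAGE_SHIFT) - 1


class ErrorCode(IntEnum):
    NONE = 0
    # Stages are spaced 1 << STAGE_SHIFT apart, which leaves 255 free
    # detail values under each stage for failure modes added later.
    STAGE_ARGS = 1 << STAGE_SHIFT
    STAGE_FILE = 2 << STAGE_SHIFT
    STAGE_MODEL = 3 << STAGE_SHIFT
    # Specific failure details live in the low byte.
    DETAIL_ALLOC = 1
    DETAIL_FILE_OPEN = 2


def split_error(code: int) -> Tuple[int, int]:
    """Return (stage, detail) for a combined error value."""
    stage = (code >> STAGE_SHIFT) << STAGE_SHIFT
    detail = code & DETAIL_MASK
    return stage, detail
```

With this layout, new detail codes can be appended without renumbering the stages, so callers that compare against existing constants keep working.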
master-7cbfbc5
Switch to fstat64 (#70)
master-3ca9c7f
Move graph building into its own function (#69). A step towards #50 and towards loading models from memory, among other things.
master-9e2a0de
Add rwkv_set_print_errors and rwkv_get_last_error (#68)
* Add rwkv_set_print_errors and rwkv_get_last_error
  Fixes #63
  This allows retrieving errors from the library without having to pipe stderr. Also it was annoying that rwkv.cpp assumed control of the caller process by doing things like calling abort() when it shouldn't, so I also fixed that.
  The basic way this works is:
  1. By default, not much is different, except more errors are caught, and rwkv.cpp should never abort the process or throw a C++ exception.
  2. The difference comes when you call rwkv_set_print_errors (working title):
     1. errors will no longer be printed to stderr automatically
     2. errors will be assigned to a thread-local variable (during init/quantization) or a context-local variable (during eval)
     3. the last error can be retrieved using rwkv_get_last_error
  I also overhauled the assert macros so more error cases are handled:
  - the file is now closed if rwkv_init_from_file exits early
  - the ggml context is freed if rwkv_init_from_file exits early
  - if parameters cannot be found an error will be set about it
  I also made some optimizations:
  - just use fstat instead of opening the file twice
  - deduplicated some code / removed edge cases that do not exist
  - switched to ggml inplace operations where they exist
  test_tiny_rwkv.c seems to run perfectly fine. The Python scripts also.
  The built DLL is perfectly backwards compatible with existing API consumers like the python library, because it does not remove or change any functions, only adds some optional ones.
  The sad thing is that this will break every PR because the error handling in this library was terrible and needed to be totally redone. But I think it is worth it.
* Fix typo
  Co-authored-by: Alex <saharNooby@users.noreply.github.com>
* Visual Studio lied and _fileno is incorrect
* Fix trailing comma in assert macros
  This was an accident left over from something that didn't pan out, some compilers do not like when function arguments have a trailing comma.
* Include header file for fstat
* Remove uses of std::make_unique
* Fix width of format string argument on all platforms
* Use C free for smart pointers
* Revert "Use C free for smart pointers" and try nothrow
* Initialize cgraph to zero
* Fix ggml_cgraph initialization
* Zero-initialize allocations
---------
Co-authored-by: Alex <saharNooby@users.noreply.github.com>
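The error-reporting behaviour described in #68 can be illustrated with a small Python model. Everything in it is a hypothetical stand-in (`Context`, `report_error`, `load_model` and the error codes are invented for this sketch, and the real C functions may differ in signature and detail); it only shows the pattern of recording the last error in a thread-local or context-local slot instead of aborting, with printing to stderr as an optional extra.

```python
import sys
import threading
from typing import Optional

ERROR_NONE = 0
ERROR_FILE_OPEN = 1  # hypothetical code, for illustration only

# Thread-local slot used while no context exists yet (e.g. during loading).
_thread_state = threading.local()


class Context:
    """Hypothetical evaluation context that owns its own error slot."""

    def __init__(self) -> None:
        self.print_errors = True
        self.last_error = ERROR_NONE


def _slot(ctx: Optional[Context]):
    """Use the context-local slot if a context is given, else the thread-local one."""
    if ctx is not None:
        return ctx
    if not hasattr(_thread_state, "last_error"):
        _thread_state.print_errors = True
        _thread_state.last_error = ERROR_NONE
    return _thread_state


def set_print_errors(ctx: Optional[Context], enabled: bool) -> None:
    _slot(ctx).print_errors = enabled


def get_last_error(ctx: Optional[Context]) -> int:
    return _slot(ctx).last_error


def report_error(ctx: Optional[Context], code: int, message: str) -> None:
    """Record a failure without raising or aborting the process."""
    slot = _slot(ctx)
    slot.last_error = code
    if slot.print_errors:
        print(message, file=sys.stderr)


def load_model(path: str) -> Optional[Context]:
    """Return a context, or None on failure with the error recorded."""
    try:
        with open(path, "rb"):
            pass
    except OSError:
        report_error(None, ERROR_FILE_OPEN, f"could not open {path}")
        return None
    return Context()
```

A caller that wants to handle failures itself would call `set_print_errors(None, False)`, check whether `load_model(...)` returned `None`, and then read the reason with `get_last_error(None)`.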
master-1c363e6
Fix encoding issue when loading prompt data (#58)
* Fix encoding issue when loading prompt data
* Update chat_with_bot.py
  Fix code style
---------
Co-authored-by: Alex <saharNooby@users.noreply.github.com>
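The entry does not say which encoding was involved, so the following is only a sketch under the assumption that prompt files are stored as UTF-8. The `load_prompt_text` helper is hypothetical and is not code from chat_with_bot.py; it shows the general remedy of passing an explicit `encoding` to `open()` rather than relying on the platform default.

```python
def load_prompt_text(path: str) -> str:
    """Read a prompt file as UTF-8 regardless of the system locale.

    Hypothetical helper. Without encoding=..., open() falls back to the
    platform's locale encoding, which is not always UTF-8 (notably on
    Windows), so non-ASCII prompts can fail to decode or come out garbled.
    """
    with open(path, "r", encoding="utf-8") as f:
        return f.read()
```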
master-a3178b2
Various improvements (#52)
* Update ggml
* Add link to pre-quantized models in README
* Enable W4 for MSVC
* Fix warnings, clean up code
* Fix LoRA merge script
master-5eb8f09
Various improvements (#47)
* Update ggml
* Pack only rwkv.dll for Windows releases
  Test executables are no longer packed.
* Move test code into a separate file
* Remove redundant zeroing
* Refactor chat script
master-3621172
punish repetitions & break if END_OF_TEXT & decouple prompts from chat script (#37)
* punish repetitions & break if END_OF_TEXT
* decouple prompts from chat_with_bot.py
* improve code style
* Update rwkv/chat_with_bot.py
  Co-authored-by: Alex <saharNooby@users.noreply.github.com>
* Update rwkv/chat_with_bot.py
  Co-authored-by: Alex <saharNooby@users.noreply.github.com>
* add types
* JSON prompt
---------
Co-authored-by: Alex <saharNooby@users.noreply.github.com>
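The entry does not describe how repetitions are punished, so the sketch below uses one common approach as an assumption: subtract a penalty from the logit of each token in proportion to how often it has already been generated, and stop when the end-of-text token is chosen. The token id `END_OF_TEXT = 0`, the penalty value, greedy selection and the `next_logits` callback are all hypothetical and are not taken from chat_with_bot.py.

```python
from typing import Callable, Dict, List

END_OF_TEXT = 0  # assumed end-of-text token id, for illustration only


def penalize_repetitions(
    logits: List[float], counts: Dict[int, int], penalty: float
) -> List[float]:
    """Lower each already-seen token's logit by penalty * occurrence count."""
    adjusted = list(logits)
    for token, count in counts.items():
        adjusted[token] -= penalty * count
    return adjusted


def generate(
    next_logits: Callable[[List[int]], List[float]],
    max_tokens: int,
    penalty: float = 0.5,
) -> List[int]:
    """Greedy decoding with a repetition penalty; stops early on END_OF_TEXT."""
    output: List[int] = []
    counts: Dict[int, int] = {}
    for _ in range(max_tokens):
        logits = penalize_repetitions(next_logits(output), counts, penalty)
        # Greedy choice: index of the largest adjusted logit.
        token = max(range(len(logits)), key=logits.__getitem__)
        if token == END_OF_TEXT:
            break
        output.append(token)
        counts[token] = counts.get(token, 0) + 1
    return output
```

Because the penalty grows with each repeat, a token that keeps winning eventually drops below its competitors, which is what lets the end-of-text token surface and end the loop.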