Commit graph

310 commits

Author SHA1 Message Date
Andrei Betlen 7230599593 Disable mmap when applying lora weights. Closes #107 2023-04-23 14:53:17 -04:00
Andrei Betlen e99caedbbd Update llama.cpp 2023-04-22 19:50:28 -04:00
Andrei Betlen 643b73e155 Bump version 2023-04-21 19:38:54 -04:00
Andrei Betlen 1eb130a6b2 Update llama.cpp 2023-04-21 17:40:27 -04:00
Andrei Betlen ba3959eafd Update llama.cpp 2023-04-20 05:15:31 -04:00
Andrei Betlen 207adbdf13 Bump version 2023-04-20 01:48:24 -04:00
Andrei Betlen 3d290623f5 Update llama.cpp 2023-04-20 01:08:15 -04:00
Andrei Betlen e4647c75ec Add use_mmap flag to server 2023-04-19 15:57:46 -04:00
Andrei Betlen 207ebbc8dc Update llama.cpp 2023-04-19 14:02:11 -04:00
Andrei Betlen 0df4d69c20 If lora base is not set avoid re-loading the model by passing NULL 2023-04-18 23:45:25 -04:00
Andrei Betlen 95c0dc134e Update type signature to allow for null pointer to be passed. 2023-04-18 23:44:46 -04:00
Andrei Betlen 453e517fd5 Add separate lora_base path for applying LoRA to quantized models using original unquantized model weights. 2023-04-18 10:20:46 -04:00
Andrei Betlen 32ca803bd8 Merge branch 'main' of github.com:abetlen/llama_cpp_python into main 2023-04-18 02:22:39 -04:00
Andrei Betlen b2d44aa633 Update llama.cpp 2023-04-18 02:22:35 -04:00
Andrei 4ce6670bbd Merge pull request #87 from SagsMug/main: Fix TypeError in low_level chat 2023-04-18 02:11:40 -04:00
Andrei Betlen eb7f278cc6 Add lora_path parameter to Llama model 2023-04-18 01:43:44 -04:00
Andrei Betlen 35abf89552 Add bindings for LoRA adapters. Closes #88 2023-04-18 01:30:04 -04:00
Andrei Betlen 3f68e95097 Update llama.cpp 2023-04-18 01:29:27 -04:00
Mug 1b73a15e62 Merge branch 'main' of https://github.com/abetlen/llama-cpp-python 2023-04-17 14:45:42 +02:00
Mug 53d17ad003 Fixed end of text wrong type, and fix n_predict behaviour 2023-04-17 14:45:28 +02:00
Andrei Betlen b2a24bddac Update docs 2023-04-15 22:31:14 -04:00
Andrei Betlen e38485a66d Bump version. 2023-04-15 20:27:55 -04:00
Andrei Betlen 89856ef00d Bugfix: only eval new tokens 2023-04-15 17:32:53 -04:00
Andrei Betlen 887f3b73ac Update llama.cpp 2023-04-15 12:16:05 -04:00
Andrei Betlen 92c077136d Add experimental cache 2023-04-15 12:03:09 -04:00
Andrei Betlen a6372a7ae5 Update stop sequences for chat 2023-04-15 12:02:48 -04:00
Andrei Betlen 83b2be6dc4 Update chat parameters 2023-04-15 11:58:43 -04:00
Andrei Betlen 62087514c6 Update chat prompt 2023-04-15 11:58:19 -04:00
Andrei Betlen 02f9fb82fb Bugfix 2023-04-15 11:39:52 -04:00
Andrei Betlen 3cd67c7bd7 Add type annotations 2023-04-15 11:39:21 -04:00
Andrei Betlen d7de0e8014 Bugfix 2023-04-15 00:08:04 -04:00
Andrei Betlen e90e122f2a Use clear 2023-04-14 23:33:18 -04:00
Andrei Betlen ac7068a469 Track generated tokens internally 2023-04-14 23:33:00 -04:00
Andrei Betlen 25b646c2fb Update llama.cpp 2023-04-14 23:32:05 -04:00
Andrei Betlen 6e298d8fca Set kv cache size to f16 by default 2023-04-14 22:21:19 -04:00
Andrei Betlen 9c8c2c37dc Update llama.cpp 2023-04-14 10:01:57 -04:00
Andrei Betlen 6c7cec0c65 Fix completion request 2023-04-14 10:01:15 -04:00
Andrei Betlen 6153baab2d Clean up logprobs implementation 2023-04-14 09:59:33 -04:00
Andrei Betlen 26cc4ee029 Fix signature for stop parameter 2023-04-14 09:59:08 -04:00
Andrei Betlen 7dc0838fff Bump version 2023-04-13 00:35:05 -04:00
Andrei Betlen 6595ad84bf Add field to disable resetting between generations 2023-04-13 00:28:00 -04:00
Andrei Betlen 22fa5a621f Revert "Deprecate generate method" (reverts commit 6cf5876538) 2023-04-13 00:19:55 -04:00
Andrei Betlen 4f5f99ef2a Formatting 2023-04-12 22:40:12 -04:00
Andrei Betlen 0daf16defc Enable logprobs on completion endpoint 2023-04-12 19:08:11 -04:00
Andrei Betlen 19598ac4e8 Fix threading bug. Closes #62 2023-04-12 19:07:53 -04:00
Andrei Betlen 005c78d26c Update llama.cpp 2023-04-12 14:29:00 -04:00
Andrei Betlen c854c2564b Don't serialize stateful parameters 2023-04-12 14:07:14 -04:00
Andrei Betlen 2f9b649005 Style fix 2023-04-12 14:06:22 -04:00
Andrei Betlen 6cf5876538 Deprecate generate method 2023-04-12 14:06:04 -04:00
Andrei Betlen b3805bb9cc Implement logprobs parameter for text completion. Closes #2 2023-04-12 14:05:11 -04:00