Commit graph

308 commits

Author SHA1 Message Date
Andrei Betlen c088a2b3a7 Un-skip tests 2023-05-01 15:46:03 -04:00
Andrei Betlen bf3d0dcb2c Fix tests 2023-05-01 15:28:46 -04:00
Andrei Betlen 5034bbf499 Bump version 2023-05-01 15:23:59 -04:00
Andrei Betlen f073ef0571 Update llama.cpp 2023-05-01 15:23:01 -04:00
Andrei Betlen 9ff9cdd7fc Fix import error 2023-05-01 15:11:15 -04:00
Andrei Betlen 2f8a3adaa4 Temporarily skip sampling tests. 2023-05-01 15:01:49 -04:00
Andrei Betlen dbe0ad86c8 Update test dependencies 2023-05-01 14:50:01 -04:00
Andrei Betlen 350a1769e1 Update sampling api 2023-05-01 14:47:55 -04:00
Andrei Betlen 7837c3fdc7 Fix return types and import comments 2023-05-01 14:02:06 -04:00
Andrei Betlen 55d6308537 Fix test dependencies 2023-05-01 11:39:18 -04:00
Andrei Betlen ccf1ed54ae Merge branch 'main' of github.com:abetlen/llama_cpp_python into main 2023-05-01 11:35:14 -04:00
Andrei 79ba9ed98d Merge pull request #125 from Stonelinks/app-server-module-importable
Make app server module importable
2023-05-01 11:31:08 -04:00
Andrei Betlen 80184a286c Update llama.cpp 2023-05-01 10:44:28 -04:00
Lucas Doyle efe8e6f879 llama_cpp server: slight refactor to init_llama function
Define an init_llama function that starts llama with supplied settings instead of just doing it in the global context of app.py

This allows the test to be less brittle by not needing to mess with os.environ, then importing the app
2023-04-29 11:42:23 -07:00
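
Note: the refactor described above moves Llama construction out of module import time and into an explicit call. A minimal sketch of that pattern, assuming illustrative names and defaults (init_llama is named in the commit; everything else here is a placeholder, not the repo's exact API):

```python
# Sketch only: the Llama instance is no longer created as a side effect of
# importing the module, so tests can import it without touching os.environ.
from typing import Optional

import llama_cpp

llama: Optional[llama_cpp.Llama] = None  # unset at import time


def init_llama(model_path: str, n_ctx: int = 2048) -> None:
    """Create the module-level Llama instance from explicit settings."""
    global llama
    llama = llama_cpp.Llama(model_path=model_path, n_ctx=n_ctx)
```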
Lucas Doyle 6d8db9d017 tests: simple test for server module 2023-04-29 11:42:20 -07:00
Lucas Doyle 468377b0e2 llama_cpp server: app is now importable, still runnable as a module 2023-04-29 11:41:25 -07:00
Andrei 755f9fa455 Merge pull request #118 from SagsMug/main
Fix UnicodeDecodeError permanently
2023-04-29 07:19:01 -04:00
Mug 18a0c10032 Remove excessive errors="ignore" and add utf8 test 2023-04-29 12:19:22 +02:00
Andrei Betlen 523825e91d Update README 2023-04-28 17:12:03 -04:00
Andrei Betlen e00beb13b5 Update README 2023-04-28 17:08:18 -04:00
Andrei Betlen 5423d047c7 Bump version 2023-04-28 15:33:08 -04:00
Andrei Betlen ea0faabae1 Update llama.cpp 2023-04-28 15:32:43 -04:00
Mug b7d14efc8b Python weirdness 2023-04-28 13:20:31 +02:00
Mug eed61289b6 Don't detect off tokens, detect off detokenized utf8 2023-04-28 13:16:18 +02:00
Mug 3a98747026 One day, I'll fix off-by-1 errors permanently too 2023-04-28 12:54:28 +02:00
Mug c39547a986 Detect multi-byte responses and wait 2023-04-28 12:50:30 +02:00
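
Note: the four commits above deal with streamed output that ends in the middle of a multi-byte UTF-8 character (the source of the UnicodeDecodeError fixed in PR #118). A rough sketch of the idea, not the repo's code: buffer raw bytes and only decode once they form valid UTF-8, instead of decoding every chunk with errors="ignore".

```python
def stream_text(byte_chunks):
    """Yield decoded text, holding back incomplete multi-byte sequences."""
    buf = b""
    for chunk in byte_chunks:
        buf += chunk
        try:
            yield buf.decode("utf-8")
            buf = b""
        except UnicodeDecodeError:
            # Partial multi-byte character: wait for the next chunk.
            continue
    if buf:
        yield buf.decode("utf-8", errors="replace")


# "é" and "€" are each split across two chunks here.
print(list(stream_text([b"caf\xc3", b"\xa9 ", b"\xe2\x82", b"\xac"])))
```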
Andrei Betlen 9339929f56 Update llama.cpp 2023-04-26 20:00:54 -04:00
Mug 5f81400fcb Also ignore errors on input prompts 2023-04-26 14:45:51 +02:00
Mug 3c130f00ca Remove try catch from chat 2023-04-26 14:38:53 +02:00
Mug be2c961bc9 Merge branch 'main' of https://github.com/abetlen/llama-cpp-python 2023-04-26 14:38:09 +02:00
Mug c4a8491d42 Fix decode errors permanently 2023-04-26 14:37:06 +02:00
Andrei Betlen cbd26fdcc1 Update llama.cpp 2023-04-25 19:03:41 -04:00
Andrei Betlen 3cab3ef4cb Update n_batch for server 2023-04-25 09:11:32 -04:00
Andrei Betlen cc706fb944 Add ctx check and re-order __init__. Closes #112 2023-04-25 09:00:53 -04:00
Andrei Betlen 996e31d861 Bump version 2023-04-25 01:37:07 -04:00
Andrei Betlen 848c83dfd0 Add FORCE_CMAKE option 2023-04-25 01:36:37 -04:00
Andrei Betlen 9dddb3a607 Bump version 2023-04-25 00:19:44 -04:00
Andrei Betlen d484c5634e Bugfix: Check cache keys as prefix to prompt tokens 2023-04-24 22:18:54 -04:00
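
Note: a sketch of what a prefix check like the one named above can look like (illustrative, not the project's implementation): a cached entry is reusable when its key tokens are a prefix of the new prompt's tokens, and the longest such prefix wins.

```python
from typing import Dict, List, Optional, Tuple


def longest_prefix_key(
    cache: Dict[Tuple[int, ...], object],
    prompt_tokens: List[int],
) -> Optional[Tuple[int, ...]]:
    """Return the cache key that is the longest prefix of prompt_tokens."""
    best: Optional[Tuple[int, ...]] = None
    for key in cache:
        if list(key) == prompt_tokens[: len(key)]:
            if best is None or len(key) > len(best):
                best = key
    return best


cache = {(1, 2): "state-a", (1, 2, 3): "state-b", (9,): "state-c"}
print(longest_prefix_key(cache, [1, 2, 3, 4]))  # -> (1, 2, 3)
```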
Andrei Betlen b75fa96bf7 Update docs 2023-04-24 19:56:57 -04:00
Andrei Betlen cbe95bbb75 Add cache implementation using llama state 2023-04-24 19:54:41 -04:00
Andrei Betlen 2c359a28ff Merge branch 'main' of github.com:abetlen/llama_cpp_python into main 2023-04-24 17:51:27 -04:00
Andrei Betlen 197cf80601 Add save/load state api for Llama class 2023-04-24 17:51:25 -04:00
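
Note: a hedged usage sketch of the state API mentioned above; the method names (save_state/load_state) and the model path are assumptions inferred from the commit message, not taken from the diff.

```python
from llama_cpp import Llama

llm = Llama(model_path="./models/ggml-model-q4_0.bin")  # placeholder path
llm("Q: What is the capital of France? A:", max_tokens=8)

state = llm.save_state()   # snapshot the evaluated context
llm("Continue:", max_tokens=8)
llm.load_state(state)      # rewind to the snapshot
```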
Andrei Betlen c4c332fc51 Update llama.cpp 2023-04-24 17:42:09 -04:00
Andrei Betlen 280a047dd6 Update llama.cpp 2023-04-24 15:52:24 -04:00
Andrei Betlen 86f8e5ad91 Refactor internal state for Llama class 2023-04-24 15:47:54 -04:00
Andrei f37456133a Merge pull request #108 from eiery/main
Update n_batch default to 512 to match upstream llama.cpp
2023-04-24 13:48:09 -04:00
Andrei Betlen 02cf881317 Update llama.cpp 2023-04-24 09:30:10 -04:00
eiery aa12d8a81f Update llama.py
update n_batch default to 512 to match upstream llama.cpp
2023-04-23 20:56:40 -04:00
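
Note: n_batch is the number of prompt tokens evaluated per batch; the change above only raises the default to 512, and callers can still set it explicitly (the model path below is a placeholder).

```python
from llama_cpp import Llama

# Matches the new default; smaller values trade throughput for memory.
llm = Llama(model_path="./models/ggml-model-q4_0.bin", n_batch=512)
```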
Andrei Betlen 7230599593 Disable mmap when applying lora weights. Closes #107 2023-04-23 14:53:17 -04:00
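
Note: a hedged sketch of what the commit above implies for callers: a LoRA adapter modifies the loaded weights, so the model cannot stay read-only memory-mapped and mmap is turned off when a LoRA path is given. Parameter names follow the Llama constructor; the paths are placeholders.

```python
from llama_cpp import Llama

llm = Llama(
    model_path="./models/ggml-model-q4_0.bin",  # placeholder
    lora_path="./loras/my-adapter.bin",         # placeholder
    use_mmap=False,  # what the commit does automatically when lora_path is set
)
```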
Andrei Betlen e99caedbbd Update llama.cpp 2023-04-22 19:50:28 -04:00