Commit graph

531 commits

Author SHA1 Message Date
kddubey 6b2e0e05b4
perf: Don't convert logprobs arrays to lists (#1021) 2023-12-18 14:28:12 -05:00
Brandon Roberts 62944df142
Bugfix: Remove f16_kv, add offload_kqv field (#1019)
F16_KV appears to have been removed here: af99c6fbfc

This addresses two issues:

 - #995, which requests adding the KV cache offloading param
 - #1006, a NULL-pointer exception when using embeddings (introduced by
   leaving f16_kv in the fields struct)
2023-12-18 14:27:11 -05:00
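A hedged sketch of how the new field might be used from Python; the model path is a placeholder:

```python
from llama_cpp import Llama

# offload_kqv mirrors llama.cpp's KV-cache offloading option added here;
# the model path is hypothetical.
llm = Llama(model_path="./models/llama-2-7b.Q4_K_M.gguf", offload_kqv=True)
```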
Daniele Morotti f1c631dc53
Bug fixed with n_ctx=0 (#1015)
If `n_ctx` is set to 0, the code should use the maximum context length of the selected model, but it didn't. There was a problem with the initialization of this parameter and a related problem with `n_batch`.
2023-12-16 18:59:50 -05:00
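A minimal sketch of the behavior this fix restores (model path hypothetical): passing `n_ctx=0` asks the library to fall back to the context length the model was trained with.

```python
from llama_cpp import Llama

# n_ctx=0 means "use the model's own maximum context length"; before this
# fix that fallback (and the related n_batch setup) misbehaved.
llm = Llama(model_path="./models/mistral-7b.Q4_K_M.gguf", n_ctx=0)
print(llm.n_ctx())  # should report the model's trained context size
```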
kddubey 5a8944672f
Fix logits_to_logprobs for 2-D and 3-D logits (#1002)
* Fix logits_to_logprobs for 2-D and 3-D logits

* Set dtype to single

* Test size
2023-12-16 18:59:26 -05:00
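A log-softmax over the last axis handles 1-D, 2-D, and 3-D logits uniformly; this sketch (not the library's exact code) illustrates the idea behind this fix and the NumPy implementation from #991 below, including the single-precision output mentioned above:

```python
import numpy as np

def logits_to_logprobs(logits: np.ndarray) -> np.ndarray:
    # Log-softmax over the vocabulary (last) axis, so 1-D, 2-D, and
    # 3-D inputs are all handled the same way.
    m = np.max(logits, axis=-1, keepdims=True)
    shifted = logits - m  # subtract the max for numerical stability
    log_z = np.log(np.sum(np.exp(shifted), axis=-1, keepdims=True))
    return (shifted - log_z).astype(np.float32)
```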
Andrei Betlen 534b1ea9b5 Update llama.cpp 2023-12-16 18:57:43 -05:00
Andrei Betlen cbce061ffd Bump version 2023-12-13 21:52:29 -05:00
yhfgyyf 8b4db732bd
Add qwen chat format (#1005) 2023-12-13 21:43:43 -05:00
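The chat formats added in these commits (qwen here; Pygmalion, Zephyr, and the baichuan formats below) are selected by name via the `chat_format` parameter; a hedged usage sketch with a hypothetical model path:

```python
from llama_cpp import Llama

llm = Llama(
    model_path="./models/qwen-7b-chat.Q4_K_M.gguf",  # placeholder path
    chat_format="qwen",  # the format name registered by this PR
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello!"}]
)
```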
Andrei Betlen 690c563b60 Merge branch 'main' of github.com:abetlen/llama_cpp_python into main 2023-12-13 21:43:19 -05:00
Andrei Betlen c0fc0a1e82 Update llama.cpp 2023-12-13 21:43:16 -05:00
Radoslav Gerganov 8e44a32075
Add support for running the server with SSL (#994) 2023-12-11 20:47:11 -05:00
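Assuming the server exposes the new settings as CLI flags named after its settings fields (`--ssl_keyfile`, `--ssl_certfile`), launching with TLS might look like this; the model, key, and certificate paths are placeholders:

```python
import subprocess

# Hypothetical invocation: the flag names are assumed to follow the
# server settings fields added in this PR.
subprocess.run([
    "python", "-m", "llama_cpp.server",
    "--model", "./models/llama-2-7b.Q4_K_M.gguf",
    "--ssl_keyfile", "./key.pem",
    "--ssl_certfile", "./cert.pem",
])
```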
Tanner Hobson ef22e478db
Replace logits_to_logprobs implementation with numpy equivalent to llama.cpp (#991)
See #990. This change makes the logits_to_logprobs function equivalent to the version in the llama.cpp repository. It uses NumPy, so it is much faster than the previous version.
2023-12-11 20:46:27 -05:00
zocainViken ac35f68e4d
Fix UnsupportedOperation: fileno in suppress_stdout_stderr (#961)
* Bug fixing

* The llava example from the README raised `UnsupportedOperation: fileno`; quick fix by checking `hasattr`

* Multi-modal params fix: add `logits = True` to make llava work

---------

Co-authored-by: Andrei <abetlen@gmail.com>
2023-12-11 20:44:51 -05:00
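The fix pattern described above, reduced to a minimal sketch (not the library's exact code): only redirect streams that actually expose a file descriptor.

```python
import sys

def can_redirect(stream) -> bool:
    # Notebooks and some wrappers replace sys.stdout with objects that
    # lack fileno(), which previously raised UnsupportedOperation.
    if not hasattr(stream, "fileno"):
        return False
    try:
        stream.fileno()
        return True
    except Exception:
        return False

print(can_redirect(sys.stdout))
```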
chiensen b938cccf05
Add Pygmalion chat format (#986) 2023-12-11 20:44:04 -05:00
Andrei Betlen c1e73e73a3 Bump version 2023-12-11 10:26:42 -05:00
Andrei Betlen ec26f364cc Remove f16_kv 2023-12-11 10:25:37 -05:00
Andrei Betlen f1edc66b21 Update llama.cpp 2023-12-11 10:21:35 -05:00
kddubey b069d06346
Fix #891 (#952) 2023-11-29 05:39:52 -05:00
Andrei Betlen ad963a0961 Bump version 2023-11-28 04:58:20 -05:00
Andrei Betlen e3941d9c67 Make building llava optional 2023-11-28 04:55:21 -05:00
Andrei Betlen 7f3704b896 Bump version 2023-11-27 19:14:25 -05:00
Andrei Betlen 396dbf0b2b docs: Improve low-level docstrings 2023-11-27 19:03:02 -05:00
Andrei Betlen a928893d03 Merge branch 'main' of github.com:abetlen/llama_cpp_python into main 2023-11-26 15:57:13 -05:00
Andrei Betlen 6308f21d5e docs: Update Llama docs 2023-11-26 15:56:40 -05:00
Gardner Bickford c2d63a7148
fix: Typo in the Open Orca chat format #874 (#947) 2023-11-26 15:39:18 -05:00
Andrei Betlen f03a38e62a Update llama.cpp 2023-11-26 15:38:22 -05:00
Andrei Betlen 1a7bf2037b docs: Update openapi endpoint names 2023-11-24 03:39:29 -05:00
Andrei Betlen 4026166e68 docs: Update completion and chat_completion parameter docstrings 2023-11-24 03:24:19 -05:00
Andrei Betlen 8c3aa7858b Merge branch 'main' of github.com:abetlen/llama_cpp_python into main 2023-11-24 00:15:36 -05:00
Andrei Betlen de2e2bc083 misc fix verbose printing in functionary model 2023-11-23 20:14:23 -05:00
Andrei Betlen 36048d46af Update llama.cpp 2023-11-23 16:26:00 -05:00
mrfakename d68fc07b1b
Add Zephyr format (#937) 2023-11-23 01:20:08 -05:00
caiyesd 4184835078
Add chat format to support baichuan (#938)
Signed-off-by: caiyesd <caiyesd@gmail.com>
2023-11-23 01:19:50 -05:00
Andrei Betlen c647f01609 Add from_json_schema to LlamaGrammar 2023-11-23 00:27:00 -05:00
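A hedged usage sketch of the new classmethod; the schema here is illustrative:

```python
import json
from llama_cpp import LlamaGrammar

schema = json.dumps({
    "type": "object",
    "properties": {"name": {"type": "string"}},
    "required": ["name"],
})
# Builds a grammar that constrains generation to JSON matching the schema.
grammar = LlamaGrammar.from_json_schema(schema)
```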
Andrei Betlen be1f64d569 docs: Add docstrings from llama.cpp 2023-11-23 00:26:26 -05:00
Andrei Betlen b6bb7ac76a docs: Add Llama class example 2023-11-22 23:10:04 -05:00
caiyesd b8f29f4bf0
Add baichuan-2 chat format (#936)
Signed-off-by: caiyesd <caiyesd@gmail.com>
2023-11-22 06:08:06 -05:00
Andrei Betlen 8b6ca22846 Fix type warnings for json schema grammar converter 2023-11-21 13:32:00 -05:00
Andrei Betlen 230fc8b535 Bump version 2023-11-21 05:04:55 -05:00
Andrei Betlen 128dc4731f Fix #569 2023-11-21 04:39:05 -05:00
Andrei Betlen 7a3f87846b Format 2023-11-21 04:02:20 -05:00
Andrei Betlen 422ebc89ce Fix: Add logit_bias to all completion api methods 2023-11-21 04:01:36 -05:00
Andrei Betlen 07e47f55ba Add support for logit_bias outside of server api. Closes #827 2023-11-21 03:59:46 -05:00
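With `logit_bias` now accepted by the completion methods directly, usage might look like this; the model path, token id, and bias value are illustrative:

```python
from llama_cpp import Llama

llm = Llama(model_path="./models/llama-2-7b.Q4_K_M.gguf")  # placeholder path
out = llm.create_completion(
    "The capital of France is",
    max_tokens=8,
    logit_bias={15043: -100.0},  # illustrative token id; a large negative
                                 # bias effectively bans the token
)
```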
Maarten ter Huurne c21edb6908
Do not set grammar to None for new LlamaGrammar objects (#834)
* Do not set `grammar` to `None` for new `LlamaGrammar` objects

The `grammar` attribute is written by `init()`, but that method always
returns `None`, so `__init__()` would then discard the previously
written object.

* Add minimal test for grammar parsing
2023-11-21 00:23:18 -05:00
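The bug pattern described above, reduced to a sketch: assigning the return value of a method that mutates `self` but returns `None` clobbers the attribute it just wrote.

```python
class GrammarSketch:
    def __init__(self, rules):
        self.init(rules)  # correct: call for its side effect only
        # Buggy variant: self.grammar = self.init(rules)
        # init() sets self.grammar but returns None, so the assignment
        # would immediately discard the freshly built object.

    def init(self, rules):
        self.grammar = f"<compiled {rules}>"  # stand-in for the real build
```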
mrfakename ef65fc5ff4
Add MistralLite, Intel, and OpenChat prompt formats (#927)
* Add MistralLite format

* Update llama_chat_format.py

* Update llama_chat_format.py
2023-11-21 00:19:25 -05:00
TK-Master b8438f70b5
Added support for min_p (#921)
* Added support for min_p

My small contribution to this great project.

Ref: https://github.com/ggerganov/llama.cpp/pull/3841

Closes: https://github.com/abetlen/llama-cpp-python/issues/911

* Fix for negative temp (sample_softmax)
2023-11-20 23:21:33 -05:00
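`min_p` sampling (from the llama.cpp PR linked above) drops tokens whose probability falls below `min_p` times the top token's probability; a hedged usage sketch with a hypothetical model path:

```python
from llama_cpp import Llama

llm = Llama(model_path="./models/llama-2-7b.Q4_K_M.gguf")  # placeholder path
out = llm.create_completion(
    "Once upon a time",
    max_tokens=32,
    temperature=0.8,
    min_p=0.05,  # keep only tokens with p >= 0.05 * p(top token)
)
```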
Andrei Betlen a34d480141 Fix #929 2023-11-20 22:50:59 -05:00
Andrei Betlen 2c2afa320f Update llama.cpp 2023-11-20 14:11:33 -05:00
Andrei Betlen f2901d840e Bump version 2023-11-14 14:10:00 -05:00
Andrei Betlen 01846a76b9 Bump version 2023-11-10 16:36:12 -05:00
Andrei Betlen b7e60b66f4 Bump version 2023-11-10 06:21:24 -05:00