Commit graph (1599 commits)

Author  SHA1  Message  Date
Andrei Betlen 125b2358c9 feat: Update llama.cpp 2024-03-28 12:06:46 -04:00
Andrei Betlen 901fe02461 feat: Update llama.cpp 2024-03-26 22:58:53 -04:00
Andrei Betlen b64fa4e2c0 feat: Update llama.cpp 2024-03-25 23:09:07 -04:00
Andrei Betlen a93b9149f8 feat: Update llama.cpp 2024-03-25 11:10:14 -04:00
Andrei Betlen 364678bde5 feat: Update llama.cpp 2024-03-24 12:27:49 -04:00
Andrei Betlen d11ccc3036 fix(server): minor type fixes 2024-03-23 17:14:15 -04:00
Andrei Betlen c1325dcdfb fix: tool_call missing first token. 2024-03-22 23:44:04 -04:00
Andrei Betlen e325a831f0 feat: Update llama.cpp 2024-03-22 23:43:29 -04:00
Andrei Betlen c89be28ef9 feat: Update llama.cpp 2024-03-20 20:50:47 -04:00
Andrei Betlen 3db03b7302 feat: Update llama.cpp 2024-03-20 13:27:43 -04:00
bretello 740f3f3812 fix: set LLAMA_METAL_EMBED_LIBRARY=on on MacOS arm64 (#1289) 2024-03-20 12:46:09 -04:00
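For readers hitting Metal shader-library load errors on Apple Silicon, the flag referenced in #1289 can also be passed through at install time. A sketch, assuming a CMake-based source build of the wheel (package name and `CMAKE_ARGS` mechanism are from the project's README; the exact need for a force-reinstall depends on your environment):

```shell
# Rebuild llama-cpp-python from source with the Metal shader library
# embedded in the binary, so it does not need to be located at runtime.
CMAKE_ARGS="-DLLAMA_METAL_EMBED_LIBRARY=on" \
  pip install --force-reinstall --no-cache-dir llama-cpp-python
```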
Andrei Betlen f7decc9562 docs: Add chat examples to openapi ui 2024-03-19 10:52:53 -04:00
Andrei 60d8498f21 feat: Add tools/functions variables to Jinja2ChatFormatter, add function response formatting for all simple chat formats (#1273) 2024-03-19 04:55:57 -04:00
    * Add tools/functions variables to Jinja2ChatFormatter
      Also fixed missing tools/tool_choices parameters in chat_formatter_to_chat_completion_handler().
    * Set grammar when doing explicit function calling
    * Add function / tool response for all chat formats
    Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Andrei Betlen 18d7ce918f feat: Update llama.cpp 2024-03-19 04:40:24 -04:00
Andrei Betlen 7d4a5ec59f Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main 2024-03-18 11:37:33 -04:00
Andrei Betlen bf64752535 chore: Bump version 2024-03-18 11:37:30 -04:00
Jeffrey Fong 8a60c7bc8c fix: Fix and optimize functionary chat handler (#1282) 2024-03-18 10:40:57 -04:00
    * fix functionary chat logic
    * further fixes
    Co-authored-by: Andrei <abetlen@gmail.com>
Andrei Betlen 8d298b4750 feat: Update llama.cpp 2024-03-18 10:26:36 -04:00
Andrei Betlen 6eb25231e4 feat: Update llama.cpp 2024-03-15 12:58:45 -04:00
Andrei Betlen 20e6815252 fix: json mode 2024-03-15 12:58:34 -04:00
Andrei Betlen 1a9b8af2dd feat: Update llama.cpp 2024-03-14 11:46:48 -04:00
Andrei Betlen 4084aabe86 fix: set default pooling type to unspecified 2024-03-14 10:04:57 -04:00
Andrei Betlen d318cc8b83 fix: Set default pooling_type to mean, check for null pointer. 2024-03-14 09:17:41 -04:00
Andrei Betlen dd0ee56217 feat: Update llama.cpp 2024-03-13 15:57:35 -04:00
Andrei Betlen 08e910f7a7 feat: Update llama.cpp 2024-03-10 23:45:05 -04:00
Andrei Betlen a7281994d8 chore: Bump version 2024-03-08 21:14:44 -05:00
Andrei Betlen 919fca9f2b Merge branch 'main' of https://github.com/abetlen/llama-cpp-python into main 2024-03-08 21:10:56 -05:00
Andrei Betlen d02a9cf16f Fixed json strings grammar by blacklisting character control set. Closes #1259 2024-03-08 21:10:53 -05:00
Felipe Lorenz c139f8b5d5 feat: Add endpoints for tokenize, detokenize and count tokens (#1136) 2024-03-08 21:09:00 -05:00
    * Add endpoint to count tokens
    * Add tokenize and detokenize endpoints
    * Change response key to tokens for tokenize endpoint
    * Fix dependency bug
    * Cleanup
    * Remove example added by mistake
    * Move tokenize, detokenize, and count to Extras namespace. Tag existing endpoints
    Co-authored-by: Andrei Betlen <abetlen@gmail.com>
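Per the PR description in #1136, these endpoints live under an Extras namespace on the OpenAI-compatible server, and the tokenize response key is `tokens`. A usage sketch, assuming a llama-cpp-python server already running on localhost:8000 (paths and field names are inferred from the commit messages, so check the server's OpenAPI UI for the authoritative schema):

```shell
# Tokenize a string; the response carries a "tokens" array.
curl -s http://localhost:8000/extras/tokenize \
  -H "Content-Type: application/json" \
  -d '{"input": "Hello, world!"}'

# Count tokens without returning them.
curl -s http://localhost:8000/extras/tokenize/count \
  -H "Content-Type: application/json" \
  -d '{"input": "Hello, world!"}'

# Turn a token array back into text.
curl -s http://localhost:8000/extras/detokenize \
  -H "Content-Type: application/json" \
  -d '{"tokens": [1, 2, 3]}'
```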
Kevin Cao 1f3156d4f2 fix: Check for existence of clip model path (#1264) 2024-03-08 21:00:10 -05:00
Douglas Hanley 2811014bae feat: Switch embed to llama_get_embeddings_seq (#1263) 2024-03-08 20:59:35 -05:00
    * switch to llama_get_embeddings_seq
    * Remove duplicate definition of llama_get_embeddings_seq
    Co-authored-by: Andrei <abetlen@gmail.com>
Andrei Betlen 40c6b54f68 feat: Update llama.cpp 2024-03-08 20:58:50 -05:00
Andrei Betlen 93dc56ace8 Update llama.cpp 2024-03-06 01:32:00 -05:00
Andrei Betlen 87a6e5797e feat: Update llama.cpp 2024-03-03 11:27:04 -05:00
Andrei Betlen 13177aae0f chore: Bump version 2024-03-02 22:46:40 -05:00
Kenneth Hoste 663659f730 docs: fix small typo in README: 'model know how' -> 'model knows how' (#1244) 2024-03-02 22:20:41 -05:00
    Co-authored-by: Andrei <abetlen@gmail.com>
Andrei Betlen 0e70984fb6 feat: Update llama.cpp 2024-03-02 22:20:04 -05:00
Andrei Betlen d5df431278 chore: Bump version 2024-03-01 13:15:16 -05:00
Andrei Betlen 97aa3a153d docs: Add information re: auto chat formats. Closes #1236 2024-03-01 13:10:25 -05:00
Andrei Betlen f062a7f51d feat: Update llama.cpp 2024-03-01 12:57:16 -05:00
Douglas Hanley cf1fdd8a9a docs: fix typo in README.md embeddings example. (#1232) 2024-02-29 13:55:50 -05:00
Andrei Betlen 8c71725d53 fix: Remove deprecated cfg sampling functions 2024-02-28 14:37:07 -05:00
Andrei Betlen 727d60c28a misc: Format 2024-02-28 14:27:40 -05:00
Andrei Betlen 0d37ce52b1 feat: Update llama.cpp 2024-02-28 14:27:16 -05:00
Andrei Betlen ffcd4b2636 chore: Bump version 2024-02-28 01:38:32 -05:00
Sigbjørn Skjæret c36ab15e68 fix: eos/bos_token set correctly for Jinja2ChatFormatter and automatic chat formatter (#1230) 2024-02-28 01:30:31 -05:00
    The token strings were not correctly retrieved (empty).
Andrei Betlen fea33c9b94 feat: Update llama.cpp 2024-02-27 12:22:17 -05:00
Andrei 4d574bd765 feat(server): Add support for pulling models from Huggingface Hub (#1222) 2024-02-26 14:35:08 -05:00
    * Basic support for hf pull on server
    * Add hf_model_repo_id setting
    * Update README
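With the `hf_model_repo_id` setting from #1222, the server can fetch a GGUF file from the Hugging Face Hub at startup instead of requiring a local path. A sketch of the invocation; the repo id shown is only an example, and the use of `--model` as a filename glob within the repo is an assumption based on the PR, so verify against the current server docs:

```shell
# Start the OpenAI-compatible server, pulling the model from the Hub.
python -m llama_cpp.server \
  --hf_model_repo_id Qwen/Qwen1.5-0.5B-Chat-GGUF \
  --model '*q8_0.gguf'
```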
Andrei Betlen b3e358dee4 docs: Add example of local image loading to README 2024-02-26 11:58:33 -05:00
Andrei Betlen afe1e445c9 chore: Bump version 2024-02-26 11:43:24 -05:00