Commit graph

34 commits

Author SHA1 Message Date
Andrei Betlen b8fc1c7d83 feat: Add ability to load chat format from huggingface autotokenizer or tokenizer_config.json files. 2024-01-18 21:21:37 -05:00
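A minimal sketch of the idea behind this feature, assuming only the standard tokenizer_config.json layout and the jinja2 API (the function name here is illustrative, not the library's):

```python
import json

import jinja2


def chat_formatter_from_tokenizer_config(path: str):
    """Build a chat formatter from a Hugging Face tokenizer_config.json."""
    with open(path) as f:
        config = json.load(f)

    env = jinja2.Environment(loader=jinja2.BaseLoader())
    template = env.from_string(config["chat_template"])

    def format_messages(messages: list) -> str:
        # HF chat templates expect `messages` plus the model's special tokens.
        return template.render(
            messages=messages,
            bos_token=config.get("bos_token", ""),
            eos_token=config.get("eos_token", ""),
        )

    return format_messages
```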
Austin 6bfe98bd80
Integration of Jinja2 Templating (#875)
* feat: Add support for jinja templating

Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>

* fix: Refactor chat formatter and update interface for jinja templates

- Simplify the `llama2_template` in `llama_jinja_format.py` by removing unnecessary line breaks, improving readability without affecting functionality.
- Update `ChatFormatterInterface` constructor to accept a more generic `Optional[object]` type for the template parameter, enhancing flexibility.
- Introduce a `template` property to `ChatFormatterInterface` for standardized access to the template string.
- Replace `MetaSingleton` metaclass with `Singleton` for the `ChatFormatterFactory` to streamline the singleton implementation.

These changes improve code readability, preserve usability, and keep the chat formatter's design-pattern usage consistent.

* Add outline for Jinja2 templating integration documentation

Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>

* Add jinja2 as a dependency with version range for Hugging Face transformers compatibility

Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>

* Update jinja2 version constraint for mkdocs-material compatibility

Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>

* Fix attribute name in AutoChatFormatter

- Changed attribute name from `self._renderer` to `self._environment`

---------

Signed-off-by: teleprint-me <77757836+teleprint-me@users.noreply.github.com>
2024-01-17 09:47:52 -05:00
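Putting the pieces of this PR together, a minimal sketch of the design it describes (simplified; the actual `llama_jinja_format.py` may differ):

```python
import abc
from typing import Optional

import jinja2


class ChatFormatterInterface(abc.ABC):
    def __init__(self, template: Optional[object] = None):
        self._template = template

    @property
    def template(self) -> Optional[object]:
        return self._template

    @abc.abstractmethod
    def __call__(self, messages: list) -> str:
        ...


class AutoChatFormatter(ChatFormatterInterface):
    def __init__(self, template: Optional[str] = None):
        super().__init__(template)
        # Per the final commit in this PR: `_environment`, not `_renderer`.
        self._environment = jinja2.Environment(loader=jinja2.BaseLoader())

    def __call__(self, messages: list) -> str:
        return self._environment.from_string(self.template or "").render(
            messages=messages
        )
```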
Mark Neumann c689ccc728
Fix Pydantic model parsing (#1087) 2024-01-15 10:45:57 -05:00
kddubey 5a8944672f
Fix logits_to_logprobs for 2-D and 3-D logits (#1002)
* Fix logits_to_logprobs for 2-D and 3-D logits

* Set dtype to single

* Test size
2023-12-16 18:59:26 -05:00
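The computation underneath is a log-softmax over the vocabulary (last) axis; a shape-agnostic sketch along the lines of this fix, in single precision:

```python
import numpy as np


def logits_to_logprobs(logits) -> np.ndarray:
    logits = np.asarray(logits, dtype=np.single)
    # Subtract the per-row maximum for numerical stability, then apply
    # logprob = logit - logsumexp(logits) along the last axis. Reducing
    # over axis=-1 works unchanged for 1-D, 2-D, and 3-D input.
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    return shifted - np.log(np.sum(np.exp(shifted), axis=-1, keepdims=True))
```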
Andrei Betlen 9515467439 tests: add mock_kv_cache placeholder functions 2023-11-22 06:02:21 -05:00
Andrei Betlen 0ea244499e tests: avoid constantly reallocating logits 2023-11-22 04:31:05 -05:00
Andrei Betlen 0a7e05bc10 tests: don't mock sampling functions 2023-11-22 04:12:32 -05:00
Andrei Betlen d7388f1ffb Use mock_llama for all tests 2023-11-21 18:13:19 -05:00
Maarten ter Huurne c21edb6908
Do not set grammar to None for new LlamaGrammar objects (#834)
* Do not set `grammar` to `None` for new `LlamaGrammar` objects

The `grammar` attribute is set by `init()`, but that method always
returns `None`, so assigning its return value in `__init__()` overwrote
the freshly built grammar object with `None`.

* Add minimal test for grammar parsing
2023-11-21 00:23:18 -05:00
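The bug pattern described above, in schematic form (simplified; `build_grammar` is a stand-in for the real parsing internals):

```python
def build_grammar(parsed_grammar):
    """Stand-in for the real grammar construction."""
    return object()


class LlamaGrammar:
    def __init__(self, parsed_grammar):
        # Before the fix, __init__ effectively did:
        #     self.grammar = self.init(parsed_grammar)
        # Since init() returns None, the grammar object it had just stored
        # on self.grammar was immediately overwritten with None.
        self.init(parsed_grammar)

    def init(self, parsed_grammar) -> None:
        self.grammar = build_grammar(parsed_grammar)  # writes the attribute
        # ... no return statement, so the method returns None
```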
Andrei Betlen 3dc21b2557 tests: Improve llama.cpp mock 2023-11-20 23:23:18 -05:00
Andrei Betlen 2c2afa320f Update llama.cpp 2023-11-20 14:11:33 -05:00
Andrei Betlen e32ecb0516 Fix tests 2023-11-10 05:39:42 -05:00
Andrei Betlen e214a58422 Refactor Llama class internals 2023-11-06 09:16:36 -05:00
Andrei ab028cb878
Migrate inference to llama_batch and llama_decode api (#795)
* Add low-level batching notebook

* fix: tokenization of special characters: (#850)

It should behave like llama.cpp, where most out-of-the-box usages
handle special characters correctly

* Update CHANGELOG

* Cleanup

* Fix runner label

* Update notebook

* Use llama_decode and batch api

* Support logits_all parameter

---------

Co-authored-by: Antoine Lizee <antoine.lizee@gmail.com>
2023-11-02 20:13:57 -04:00
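A hedged sketch of the low-level flow this PR migrates to, assuming the llama.cpp batch API of this era (`llama_batch_init` / `llama_decode`); the library's actual field handling may differ:

```python
import llama_cpp


def decode_tokens(ctx, tokens: list) -> None:
    # One sequence, token input only (embd=0), logits for the last position.
    batch = llama_cpp.llama_batch_init(len(tokens), 0, 1)
    try:
        batch.n_tokens = len(tokens)
        for i, tok in enumerate(tokens):
            batch.token[i] = tok
            batch.pos[i] = i
            batch.n_seq_id[i] = 1
            batch.seq_id[i][0] = 0
            batch.logits[i] = i == len(tokens) - 1  # logits_all would set every slot
        # llama_decode supersedes the older llama_eval call.
        if llama_cpp.llama_decode(ctx, batch) != 0:
            raise RuntimeError("llama_decode failed")
    finally:
        llama_cpp.llama_batch_free(batch)
```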
Antoine Lizee 4d4e0f11e2 fix: tokenization of special characters: (#850)
It should behave like llama.cpp, where most out-of-the-box usages
handle special characters correctly
2023-11-02 14:28:14 -04:00
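An illustration of the behaviour being fixed, assuming the `special=` flag on the high-level `Llama.tokenize` from this era (the model path is a placeholder):

```python
from llama_cpp import Llama

llm = Llama(model_path="./models/model.gguf", verbose=False)

text = b"<s>[INST] Hello [/INST]"
# With special-token parsing enabled, "<s>" maps to the single BOS token id
# instead of being tokenized as the literal characters "<", "s", ">".
print(llm.tokenize(text, add_bos=False, special=True))
print(llm.tokenize(text, add_bos=False, special=False))
```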
Andrei Betlen ef03d77b59 Enable finish reason tests 2023-10-19 02:56:45 -04:00
Andrei Betlen cbeef36510 Re-enable tests completion function 2023-10-19 02:55:29 -04:00
Andrei Betlen 1a1c3dc418 Update llama.cpp 2023-09-28 22:42:03 -04:00
janvdp f49b6d7c67 add test to see if llama_cpp.__version__ exists 2023-09-05 21:10:05 +02:00
Andrei Betlen 4887973c22 Update llama.cpp 2023-08-27 12:59:20 -04:00
Andrei Betlen 3a29d65f45 Update llama.cpp 2023-08-26 23:36:24 -04:00
Andrei Betlen 8ac59465b9 Strip leading space when de-tokenizing. 2023-08-25 04:56:48 -04:00
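Context for this one-liner: SentencePiece-style vocabularies mark word boundaries with a leading space, so before this fix a tokenize/detokenize round trip gained one. A sketch of the symptom (model path is a placeholder):

```python
from llama_cpp import Llama

llm = Llama(model_path="./models/model.gguf", verbose=False)

raw = llm.detokenize(llm.tokenize(b"Hello", add_bos=False))
print(raw)               # b' Hello' before this fix: an injected leading space
print(raw.lstrip(b" "))  # b'Hello' once the leading space is stripped
```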
Andrei Betlen 3674e5ed4e Update model path 2023-08-24 01:01:20 -04:00
Andrei Betlen 01a010be52 Fix llama_cpp and Llama type signatures. Closes #221 2023-05-19 11:59:33 -04:00
Andrei Betlen 46e3c4b84a Fix 2023-05-01 22:41:54 -04:00
Andrei Betlen 9eafc4c49a Refactor server to use factory 2023-05-01 22:38:46 -04:00
Andrei Betlen c088a2b3a7 Un-skip tests 2023-05-01 15:46:03 -04:00
Andrei Betlen 2f8a3adaa4 Temporarily skip sampling tests. 2023-05-01 15:01:49 -04:00
Lucas Doyle efe8e6f879 llama_cpp server: slight refactor to init_llama function
Define an init_llama function that starts llama with the supplied settings, instead of doing it in the global context of app.py.

This makes the tests less brittle: they no longer need to modify os.environ before importing the app.
2023-04-29 11:42:23 -07:00
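Schematically, the pattern looks like this (names are illustrative rather than the actual llama_cpp.server code; pydantic v1 was current at the time):

```python
from typing import Optional

from fastapi import FastAPI
from llama_cpp import Llama
from pydantic import BaseSettings  # pydantic v1


class Settings(BaseSettings):
    model: str  # read from the MODEL env var when not passed explicitly


llama: Optional[Llama] = None


def init_llama(settings: Optional[Settings] = None) -> None:
    """Create the global Llama instance from explicit settings."""
    global llama
    if settings is None:
        settings = Settings()  # falls back to environment variables
    llama = Llama(model_path=settings.model)


def create_app(settings: Optional[Settings] = None) -> FastAPI:
    init_llama(settings)
    return FastAPI(title="llama.cpp server")
```

A test can then call create_app(Settings(model="./models/test.bin")) directly, with no environment mutation.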
Lucas Doyle 6d8db9d017 tests: simple test for server module 2023-04-29 11:42:20 -07:00
Mug 18a0c10032 Remove excessive errors="ignore" and add utf8 test 2023-04-29 12:19:22 +02:00
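Why a blanket errors="ignore" is lossy when streaming: a multi-byte UTF-8 character split across two detokenized chunks disappears instead of surviving once both halves arrive. A minimal demonstration:

```python
chunk = "é".encode("utf-8")  # b'\xc3\xa9', one character in two bytes
first, second = chunk[:1], chunk[1:]

print(first.decode("utf-8", errors="ignore"))  # '' -- partial byte silently dropped
print((first + second).decode("utf-8"))        # 'é' once both halves are buffered
```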
Mug 5f81400fcb Also ignore errors on input prompts 2023-04-26 14:45:51 +02:00
Andrei Betlen e96a5c5722 Make Llama instance pickleable. Closes #27 2023-04-05 06:52:17 -04:00
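One common way to make a wrapper around a native handle pickleable, schematically (`_load_native_context` is a stand-in, not the real internals): persist only the constructor arguments and rebuild the context on unpickle.

```python
def _load_native_context(model_path, n_ctx):
    """Stand-in for the real llama.cpp context creation."""
    return object()


class Llama:
    def __init__(self, model_path: str, n_ctx: int = 512):
        self.model_path = model_path
        self.n_ctx = n_ctx
        self._ctx = _load_native_context(model_path, n_ctx)  # not picklable

    def __getstate__(self):
        # Persist only the constructor arguments, not the native handle.
        return {"model_path": self.model_path, "n_ctx": self.n_ctx}

    def __setstate__(self, state):
        # Rebuild the native context from the saved arguments on unpickle.
        self.__init__(**state)
```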
Andrei Betlen c3972b61ae Add basic tests. Closes #24 2023-04-05 03:23:15 -04:00