llama.cpp/examples
Andrei ab028cb878
Migrate inference to llama_batch and llama_decode api (#795)
* Add low-level batching notebook

* fix: tokenization of special characters (#850)

It should behave like llama.cpp, where most out-of-the-box usages
treat special characters accordingly

* Update CHANGELOG

* Cleanup

* Fix runner label

* Update notebook

* Use llama_decode and batch api

* Support logits_all parameter

---------

Co-authored-by: Antoine Lizee <antoine.lizee@gmail.com>
2023-11-02 20:13:57 -04:00
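For readers following the migration, here is a minimal sketch of the llama_batch / llama_decode flow this commit moves the examples to, written against llama_cpp's ctypes-level bindings. The model path, prompt, and buffer size are placeholders, and the exact Python signatures (the NUMA flag to llama_backend_init, the special argument to llama_tokenize, llama_get_logits_ith) have shifted between versions, so treat it as illustrative rather than exact:

```python
import llama_cpp

# Required before any other call; the boolean NUMA flag matches bindings
# of this era (later versions use llama_backend_init() + llama_numa_init()).
llama_cpp.llama_backend_init(False)

model_params = llama_cpp.llama_model_default_params()
model = llama_cpp.llama_load_model_from_file(b"./model.gguf", model_params)  # placeholder path
ctx = llama_cpp.llama_new_context_with_model(model, llama_cpp.llama_context_default_params())

# Tokenize the prompt into a fixed-size ctypes buffer; the final True asks
# the tokenizer to handle special tokens, per the #850 fix above.
prompt = b"Hello, world"
max_tokens = 64
tokens = (llama_cpp.llama_token * max_tokens)()
n_tokens = llama_cpp.llama_tokenize(
    model, prompt, len(prompt), tokens, max_tokens, True, True  # add_bos, special
)

# Build a batch: one sequence, logits requested only for the final token.
# Setting logits[i] for every position corresponds to logits_all above.
batch = llama_cpp.llama_batch_init(max_tokens, 0, 1)  # n_tokens, embd, n_seq_max
batch.n_tokens = n_tokens
for i in range(n_tokens):
    batch.token[i] = tokens[i]
    batch.pos[i] = i
    batch.n_seq_id[i] = 1
    batch.seq_id[i][0] = 0
    batch.logits[i] = i == n_tokens - 1

if llama_cpp.llama_decode(ctx, batch) != 0:
    raise RuntimeError("llama_decode failed")

logits = llama_cpp.llama_get_logits_ith(ctx, n_tokens - 1)  # pointer to a vocab-sized array

llama_cpp.llama_batch_free(batch)
llama_cpp.llama_free(ctx)
llama_cpp.llama_free_model(model)
llama_cpp.llama_backend_free()
```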
high_level_api examples fastapi_server: deprecate 2023-05-01 22:34:23 -07:00
low_level_api Add NUMA support; low-level API users must now explicitly call llama_backend_init at the start of their programs (see the initialization sketch below). 2023-09-13 23:00:43 -04:00
notebooks Migrate inference to llama_batch and llama_decode api (#795) 2023-11-02 20:13:57 -04:00
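The explicit initialization required by the low_level_api change above, as a small sketch assuming this era's boolean NUMA flag (later versions split it into llama_backend_init() plus llama_numa_init()):

```python
import llama_cpp

# Must run before any other llama_cpp call; True enables NUMA optimizations.
llama_cpp.llama_backend_init(True)
try:
    ...  # load the model, build batches, call llama_decode (see the sketch above)
finally:
    llama_cpp.llama_backend_free()  # matching teardown at program exit
```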