llama.cpp/examples
Andrei ab028cb878
Migrate inference to llama_batch and llama_decode api (#795)
* Add low-level batching notebook

* fix: tokenization of special characters (#850)

It should behave like llama.cpp, where most out-of-the-box usages
treat special characters accordingly

* Update CHANGELOG

* Cleanup

* Fix runner label

* Update notebook

* Use llama_decode and batch api

* Support logits_all parameter

---------

Co-authored-by: Antoine Lizee <antoine.lizee@gmail.com>
2023-11-02 20:13:57 -04:00
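For readers following the migration, here is a minimal sketch of the llama_batch / llama_decode flow this commit moves the examples to, written against llama_cpp's ctypes-level bindings. The model path, prompt, and buffer size are placeholders, and the exact Python signatures (the NUMA flag to llama_backend_init, the special argument to llama_tokenize, llama_get_logits_ith) have shifted between versions, so treat it as illustrative rather than exact:

```python
import llama_cpp

# Required before any other call; the boolean NUMA flag matches bindings
# of this era (later versions use llama_backend_init() + llama_numa_init()).
llama_cpp.llama_backend_init(False)

model_params = llama_cpp.llama_model_default_params()
model = llama_cpp.llama_load_model_from_file(b"./model.gguf", model_params)  # placeholder path
ctx = llama_cpp.llama_new_context_with_model(model, llama_cpp.llama_context_default_params())

# Tokenize the prompt into a fixed-size ctypes buffer; the final True asks
# the tokenizer to handle special tokens, per the #850 fix above.
prompt = b"Hello, world"
max_tokens = 64
tokens = (llama_cpp.llama_token * max_tokens)()
n_tokens = llama_cpp.llama_tokenize(
    model, prompt, len(prompt), tokens, max_tokens, True, True  # add_bos, special
)

# Build a batch: one sequence, logits requested only for the final token.
# Setting logits[i] for every position corresponds to logits_all above.
batch = llama_cpp.llama_batch_init(max_tokens, 0, 1)  # n_tokens, embd, n_seq_max
batch.n_tokens = n_tokens
for i in range(n_tokens):
    batch.token[i] = tokens[i]
    batch.pos[i] = i
    batch.n_seq_id[i] = 1
    batch.seq_id[i][0] = 0
    batch.logits[i] = i == n_tokens - 1

if llama_cpp.llama_decode(ctx, batch) != 0:
    raise RuntimeError("llama_decode failed")

logits = llama_cpp.llama_get_logits_ith(ctx, n_tokens - 1)  # pointer to a vocab-sized array

llama_cpp.llama_batch_free(batch)
llama_cpp.llama_free(ctx)
llama_cpp.llama_free_model(model)
llama_cpp.llama_backend_free()
```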
high_level_api examples fastapi_server: deprecate 2023-05-01 22:34:23 -07:00
low_level_api Add NUMA support; low-level API users must now explicitly call llama_backend_init at the start of their programs (see the initialization sketch below). 2023-09-13 23:00:43 -04:00
notebooks Migrate inference to llama_batch and llama_decode api (#795) 2023-11-02 20:13:57 -04:00
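The explicit initialization required by the low_level_api change above, as a small sketch assuming this era's boolean NUMA flag (later versions split it into llama_backend_init() plus llama_numa_init()):

```python
import llama_cpp

# Must run before any other llama_cpp call; True enables NUMA optimizations.
llama_cpp.llama_backend_init(True)
try:
    ...  # load the model, build batches, call llama_decode (see the sketch above)
finally:
    llama_cpp.llama_backend_free()  # matching teardown at program exit
```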