llama.cpp

Andrei Betlen 556c7edf47 Truncate max_tokens if it exceeds context length	2023-06-09 10:57:36 -04:00
server	Fix cache implementation breaking changes	2023-06-08 13:19:23 -04:00
__init__.py	Black formatting	2023-03-24 14:59:29 -04:00
llama.py	Truncate max_tokens if it exceeds context length	2023-06-09 10:57:36 -04:00
llama_cpp.py	Allow both .so and .dylib extensions for macos	2023-06-08 00:27:19 -04:00
llama_types.py	Allow first logprob token to be null to match openai api	2023-05-19 02:04:57 -04:00