llama.cpp

Author	SHA1	Message	Date
Andrei Betlen	196650ccb2	Update model paths to be more clear they should point to file	2023-04-09 22:45:55 -04:00
MillionthOdin16	c283edd7f2	Set n_batch to default values and reduce thread count. Change batch size to the llama.cpp default of 8: I've seen issues in llama.cpp where batch size affects the quality of generations (it shouldn't), but in case that's still an issue I changed it to the default. Set the auto-determined number of threads to half the system CPU count: ggml will sometimes lock cores at 100% while doing nothing. This is being addressed, but it can cause a bad experience for the user if cores are pegged at 100%.	2023-04-05 18:17:29 -04:00
Andrei Betlen	e1b5b9bb04	Update fastapi server example	2023-04-05 14:44:26 -04:00
Andrei Betlen	c8e13a78d0	Re-organize examples folder	2023-04-05 04:10:13 -04:00
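
The defaults described in commit c283edd7f2 (a batch size of 8 and a thread count of half the system CPU count) can be sketched roughly as below. This is only an illustration of the idea in the commit message, not the code from that commit: the names `LLAMA_CPP_DEFAULT_N_BATCH`, `default_n_threads`, and `settings` are hypothetical, and the floor of one thread is an added assumption so that single-core machines still get a usable value.

```python
import multiprocessing

# Batch size stated in the commit message as the llama.cpp default at the time.
LLAMA_CPP_DEFAULT_N_BATCH = 8


def default_n_threads(cpu_count=None):
    """Return half the logical CPU count, but never less than 1.

    Hypothetical helper: halving follows the commit message; the floor of 1
    is an assumption added for machines that report a single CPU.
    """
    if cpu_count is None:
        cpu_count = multiprocessing.cpu_count()
    return max(cpu_count // 2, 1)


# Hypothetical settings dictionary combining both defaults.
settings = {
    "n_batch": LLAMA_CPP_DEFAULT_N_BATCH,
    "n_threads": default_n_threads(),
}

if __name__ == "__main__":
    print(settings)
```

Halving the thread count trades some throughput for leaving cores free, which matches the commit's stated concern about ggml pegging cores at 100% while idle.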