llama.cpp/examples/high_level_api
Lucas Doyle 0fcc25cdac examples fastapi_server: deprecate
This commit "deprecates" the example FastAPI server: it remains runnable, but now points folks at the module if they want to learn more.

Rationale:

Currently there exist two server implementations in this repo:

- `llama_cpp/server/__main__.py`, the module that's runnable by consumers of the library with `python3 -m llama_cpp.server`
- `examples/high_level_api/fastapi_server.py`, which is probably a copy-pasted example by folks hacking around

IMO this is confusing. As a new user of the library, I see that both have been updated relatively recently, but comparing them side by side shows they have diverged.

The one in the module seems better:
- supports `logits_all`
- supports `use_mmap`
- has experimental cache support (with a mutex involved)
- parts of its streaming support were moved around more recently than in `fastapi_server.py`
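
For illustration, the "stay runnable but point at the module" approach described at the top of this message could look roughly like the sketch below. This is not the code in `fastapi_server.py`: the notice wording, `DEPRECATION_NOTICE`, `warn_deprecated`, and `main` are all hypothetical names chosen for the example. Only the `python3 -m llama_cpp.server` command comes from this commit message.

```python
import warnings

# Hypothetical notice text; the actual wording in fastapi_server.py may differ.
DEPRECATION_NOTICE = (
    "examples/high_level_api/fastapi_server.py is deprecated; "
    "run the maintained server with `python3 -m llama_cpp.server` instead."
)


def warn_deprecated() -> str:
    """Emit the notice as a DeprecationWarning and return its text."""
    warnings.warn(DEPRECATION_NOTICE, DeprecationWarning, stacklevel=2)
    return DEPRECATION_NOTICE


def main() -> None:
    # Point the reader at the module first, then carry on so the example
    # keeps running instead of exiting.
    print(warn_deprecated())
    # ... the original example's server start-up would follow here ...


if __name__ == "__main__":
    main()
```

Printing the notice as well as raising a `DeprecationWarning` is a deliberate choice in this sketch: Python can hide `DeprecationWarning` depending on the active warning filters, so the printed line makes sure someone running the example by hand still sees the pointer.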
2023-05-01 22:34:23 -07:00
fastapi_server.py examples fastapi_server: deprecate 2023-05-01 22:34:23 -07:00
high_level_api_embedding.py Update model paths to be more clear they should point to file 2023-04-09 22:45:55 -04:00
high_level_api_inference.py Update model paths to be more clear they should point to file 2023-04-09 22:45:55 -04:00
high_level_api_streaming.py Update model paths to be more clear they should point to file 2023-04-09 22:45:55 -04:00
langchain_custom_llm.py Update model paths to be more clear they should point to file 2023-04-09 22:45:55 -04:00