Fast LLM speculative inference server for consumer hardware.
Updated Jun 25, 2026 - C++