How it works:
Request arrives at an OpenAI-compatible endpoint
Classifier detects task type + complexity
Adaptive router picks the highest-scoring model for that cell
Quality feedback (user ratings + LLM judge) continuously improves routing
Change 2 lines in your code. That's it.
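The "2 lines" are the base URL and the API key. As a minimal stdlib-only sketch (the host, path, and "auto" model name here are assumptions for illustration, not Provara's documented API), your existing OpenAI-style request simply gets aimed at the router:

```python
import json
import urllib.request

# The two changed lines: where the request goes, and which key it carries.
# (Hypothetical values; substitute your own deployment's URL and key.)
BASE_URL = "https://your-provara-host/v1"  # line 1: point at the router
API_KEY = "your-provara-key"               # line 2: router-issued key

def build_chat_request(messages, model="auto"):
    """Build the same OpenAI-compatible chat request your client already sends."""
    body = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )

req = build_chat_request([{"role": "user", "content": "Hello"}])
```

Because the endpoint is OpenAI-compatible, the rest of your code (SDK calls, response parsing) is untouched.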
But it's more than a router; it's a full platform:
Request logs with replay + diff view
Time-series analytics (cost, latency p50/p95/p99)
A/B testing between models
Guardrails (PII redaction)
Prompt template versioning
Spend/latency alerting
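To make the guardrails bullet concrete, here is a minimal PII-redaction sketch; the regex patterns and placeholder format are illustrative assumptions, not Provara's actual rules:

```python
import re

# Illustrative PII patterns (email, US SSN) -- real guardrails would cover more.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace each PII match with a typed placeholder before logging/routing."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

redacted = redact("Contact jane@example.com, SSN 123-45-6789")
```

Running redaction before the request is logged or forwarded means neither the request logs nor the upstream provider ever see the raw values.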
Self-hosted with Docker. Your data never leaves your infrastructure.
Supports OpenAI, Anthropic, Google, Mistral, xAI, Ollama, and any OpenAI-compatible provider.
BYOK — bring your own API keys.
The routing gets smarter over time. An LLM-as-judge automatically scores responses, building quality data per model per task type.
After enough feedback, the router stops sending complex prompts to the cheapest model and starts picking the best one.
No manual configuration needed.
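The per-model, per-task-type feedback loop can be sketched as a table of running quality averages; this is an assumed mechanism for illustration, not Provara's internal implementation, and the model names are hypothetical:

```python
from collections import defaultdict

class AdaptiveRouter:
    """Track a running quality average per (task_type, model) cell; pick the best."""

    def __init__(self, models):
        self.models = models
        self.scores = defaultdict(lambda: {"total": 0.0, "n": 0})

    def record(self, task_type, model, score):
        """Fold in one judge/user rating in [0, 1] for this cell."""
        cell = self.scores[(task_type, model)]
        cell["total"] += score
        cell["n"] += 1

    def pick(self, task_type):
        """Highest mean score wins; fall back to the first model before any feedback."""
        rated = [
            (m, self.scores[(task_type, m)])
            for m in self.models
            if self.scores[(task_type, m)]["n"] > 0
        ]
        if not rated:
            return self.models[0]
        return max(rated, key=lambda mc: mc[1]["total"] / mc[1]["n"])[0]

router = AdaptiveRouter(["cheap-model", "strong-model"])
router.record("code", "cheap-model", 0.4)   # judge scores a cheap-model answer low
router.record("code", "strong-model", 0.9)  # and a strong-model answer high
```

Once the "code" cell has ratings, the router prefers the stronger model for that task type while unrated task types still fall back to the default.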
GitHub: https://github.com/syndicalt/provara
Live demo: https://www.provara.xyz
This article was originally published by DEV Community and written by Nicholas Blanchard.