OpenAI-compatible proxy for Claude with streaming, analytics, caching, rate limiting, and a beautiful admin dashboard. Deploy in minutes.
Trusted by developers worldwide
Production-ready features out of the box. No configuration needed.
Full SSE streaming with OpenAI-compatible format. Token-by-token responses for instant feedback.
Cache responses to identical requests to reduce API calls and costs. Configurable TTL and automatic cleanup.
Per-key rate limits using a sliding-window algorithm. Protect your API from abuse.
Automatic retries with exponential backoff, plus model fallback on rate limits or errors.
Real-time dashboards with request logs, latency tracking, cost estimation, and model performance.
Create, revoke, and manage API keys with per-key quotas and usage tracking.
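To make the per-key sliding-window rate limit in the feature list concrete, here is a minimal sketch of the general technique. It is not the proxy's actual implementation: the class name, parameters, and in-memory storage are all assumptions chosen for illustration.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical limiter: at most `limit` requests per key in any `window` seconds."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the behaviour can be checked without waiting
        self.hits = defaultdict(deque)  # API key -> timestamps of recent accepted requests

    def allow(self, key):
        now = self.clock()
        recent = self.hits[key]
        # Drop timestamps that have slid out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return False  # over the limit: the request would be rejected
        recent.append(now)
        return True
```

A limiter created as `SlidingWindowLimiter(limit=2, window=60)` would accept two calls to `allow("some-key")` within a minute and reject a third, while other keys are counted separately.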
Track every request, monitor performance, and optimize costs with comprehensive analytics and visualizations.
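The retry behaviour named in the feature list (exponential backoff with model fallback) could look roughly like the sketch below. Everything here is an assumption for illustration: `RetryableError`, `call_with_fallback`, and the `send` callback are hypothetical names, and the proxy's real retry policy and delays may differ.

```python
import random
import time


class RetryableError(Exception):
    """Hypothetical stand-in for a rate-limit or transient upstream error."""


def call_with_fallback(send, models, max_retries=3, base_delay=0.5, sleep=time.sleep):
    """Try each model in order, retrying each one with exponential backoff."""
    if not models:
        raise ValueError("at least one model is required")
    last_error = None
    for model in models:
        for attempt in range(max_retries):
            try:
                return send(model)
            except RetryableError as err:
                last_error = err
                if attempt < max_retries - 1:
                    # Delay doubles on each attempt; a little jitter spreads retries out.
                    delay = base_delay * (2 ** attempt)
                    sleep(delay + random.uniform(0, delay / 10))
        # Retries exhausted for this model: fall through to the next one.
    raise last_error
```

Here `send` is any callable that takes a model name and either returns a response or raises `RetryableError`. The last error is re-raised only after every model in the list has used up its retries.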
Works with the existing OpenAI SDK. Just change the base URL.
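from openai import OpenAI

# Point the OpenAI client at the proxy instead of api.openai.com.
client = OpenAI(
    api_key="sk-your-api-key",
    base_url="http://localhost:3000/v1",
)

response = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[{
        "role": "user",
        "content": "Hello!",
    }],
    stream=True,
)

# Print tokens as they arrive; skip chunks without choices, and treat None content as empty.
for chunk in response:
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="")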
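If the full reply is needed as a single string rather than printed token by token, the streamed chunks can be joined. The sketch below assumes chunks shaped like the OpenAI SDK's streaming objects (`choices[0].delta.content`). `collect_stream` and `fake_chunk` are hypothetical helpers, not part of the SDK or the proxy, and the fake chunks only exist so the example runs without a server.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Join the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # some chunks may carry no choices at all
            continue
        text = chunk.choices[0].delta.content
        if text:  # content can be None, e.g. on a role-only or final chunk
            parts.append(text)
    return "".join(parts)


def fake_chunk(text):
    """Build an object shaped like a streaming chunk, for offline experiments."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


reply = collect_stream([fake_chunk(None), fake_chunk("Hel"), fake_chunk("lo!")])
```

With a real client, the same helper could be called as `collect_stream(response)` on the streaming response from the example above.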
curl http://localhost:3000/v1/chat/completions \
  -H "Authorization: Bearer sk-your-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [
      {
        "role": "user",
        "content": "Hello!"
      }
    ],
    "stream": true
  }'