Frequently Asked Questions
General
What is Perf?
Perf is an AI runtime orchestrator that sits between your application and LLM providers. We automatically select the optimal model for each request based on your cost, quality, and reliability requirements.How does Perf reduce costs?
Perf analyzes each request and intelligently selects the most cost-effective model that can handle it. Simple queries get sent to cheaper models like GPT-4o Mini, while complex tasks use more powerful models. This typically reduces costs by 40-60% compared to using a single premium model.Is Perf compatible with the OpenAI API?
Yes, Perf is fully OpenAI-compatible. You can replace your OpenAI base URL with Perf’s endpoint and everything will work seamlessly, with added benefits of cost optimization and quality control.Pricing & Billing
How does Perf pricing work?
You only pay for the actual model usage. Perf charges the standard rate of whichever model we select for your request, with no markup. We make money through our enterprise plans with additional features.Is there a free tier?
Yes, we offer a free tier with 10,000 requests per month to get started. Perfect for testing and small projects.Can I set cost limits?
Yes, you can set cost limits at multiple levels:- Per-request maximum (
max_cost_per_callparameter) - Daily/monthly account limits in the dashboard
- Team-wide budgets for enterprise plans
Technical
What’s the latency overhead?
Perf adds minimal latency overhead (typically 20-50ms) for orchestration decisions. Our intelligent caching and model selection algorithms are optimized for speed.Do you support streaming?
Yes, Perf fully supports streaming responses using Server-Sent Events (SSE), just like the OpenAI API.How do you ensure data privacy?
Your prompt data is only used to process your request and is not stored permanently. All requests are proxied directly to the provider. We log metadata (model used, tokens, latency) for analytics purposes.Platform Features
What analytics do you provide?
Perf provides analytics through our API:- Request logs and history
- Cost breakdown by model and task type
- Performance metrics (latency, success rate)
- Usage statistics
Can multiple team members access the same account?
Team management features are coming soon. For now, you can share API keys with your team.Do you offer SLA guarantees?
We provide automatic failover across providers to ensure reliability. Enterprise SLA guarantees are available on request.Can I export my data?
Yes, you can access all logs and metrics data via our API. JSON format is supported.Getting Started
How long does integration take?
Most teams are up and running in under 30 minutes. If you’re already using OpenAI, it’s as simple as changing your base URL and adding your Perf API key.Do you provide migration support?
Yes, our team provides migration support for all paid plans. We’ll help you migrate from your existing LLM setup and optimize your configuration.Can I test Perf before committing?
Absolutely. Sign up for our free tier and test with your actual use cases. No credit card required.Where can I get help?
- Documentation: docs.withperf.pro
- Email Support: support@withperf.pro
- Enterprise Support: Available 24/7 for enterprise customers
Troubleshooting
What if a request fails?
Perf includes automatic retry logic with intelligent fallback. If a request fails with one provider, we automatically retry with an alternative provider and model.How do I monitor quality?
Perf provides quality validation including:- Response completeness checks
- Format validation for extraction tasks
- Automatic retry for low-quality responses