Frequently Asked Questions

General

What is Perf?

Perf is an AI runtime orchestrator that sits between your application and LLM providers. We automatically select the optimal model for each request based on your cost, quality, and reliability requirements.

How does Perf reduce costs?

Perf analyzes each request and intelligently selects the most cost-effective model that can handle it. Simple queries get sent to cheaper models like GPT-4o Mini, while complex tasks use more powerful models. This typically reduces costs by 40-60% compared to using a single premium model.

Which LLM providers does Perf support?

Perf integrates with all major LLM providers:
  • OpenAI (GPT-4o, GPT-4o Mini, o1, o1-mini)
  • Anthropic (Claude Sonnet 4.5, Claude Haiku 4.5)
  • Google (Gemini 1.5 Pro, Gemini 1.5 Flash)
  • Meta (Llama 3.1 405B, 70B)
  • Mistral AI (Mistral Large)
  • Alibaba (Qwen 2.5 72B)

Is Perf compatible with the OpenAI API?

Yes, Perf is fully OpenAI-compatible. You can replace your OpenAI base URL with Perf’s endpoint and everything will work seamlessly, with the added benefits of cost optimization and quality control.
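A minimal sketch of what "drop-in replacement" means in practice, using only the Python standard library. The base URL and key format below are placeholders, not Perf's real values; the request shape is the standard OpenAI chat completions schema.

```python
import json
import urllib.request

# Placeholder endpoint and key: substitute your real Perf values.
PERF_BASE_URL = "https://api.perf.example/v1"
PERF_API_KEY = "pf-your-key"

def chat_request(messages, **params):
    """Build an OpenAI-style chat completion request aimed at Perf."""
    body = {"messages": messages, **params}
    return urllib.request.Request(
        PERF_BASE_URL + "/chat/completions",
        data=json.dumps(body).encode(),
        headers={
            "Authorization": "Bearer " + PERF_API_KEY,
            "Content-Type": "application/json",
        },
    )

req = chat_request([{"role": "user", "content": "Hello"}])
# Send with urllib.request.urlopen(req) once the endpoint and key are real.
```

If you use the official OpenAI SDK instead, the same swap applies: point its `base_url` at Perf's endpoint and use your Perf API key; the rest of your code is unchanged.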

Pricing & Billing

How does Perf pricing work?

You only pay for actual model usage. Perf charges the standard rate of whichever model we select for your request, with no markup; we make money on enterprise plans that include additional features.

Is there a free tier?

Yes, we offer a free tier with 10,000 requests per month to get started. Perfect for testing and small projects.

Can I set cost limits?

Yes, you can set cost limits at multiple levels:
  • Per-request maximum (max_cost_per_call parameter)
  • Daily/monthly account limits in the dashboard
  • Team-wide budgets for enterprise plans

Technical

What’s the latency overhead?

Perf adds minimal latency overhead (typically 20-50ms) for orchestration decisions. Our intelligent caching and model selection algorithms are optimized for speed.

Do you support streaming?

Yes, Perf fully supports streaming responses using Server-Sent Events (SSE), just like the OpenAI API.
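Since the FAQ says streaming works "just like the OpenAI API", a client can reuse the standard SSE chunk format: each event is a `data: {...}` line carrying a delta, terminated by `data: [DONE]`. A small parser under that assumption:

```python
import json

def parse_sse_chunks(lines):
    """Yield parsed JSON chunks from OpenAI-style SSE lines."""
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip keep-alives and blank lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            return  # end of stream
        yield json.loads(payload)

# Sample stream as it would arrive over the wire:
sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
text = "".join(
    c["choices"][0]["delta"]["content"] for c in parse_sse_chunks(sample)
)
# text == "Hello"
```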

Can I use specific models?

Yes, you can override the automatic orchestration by specifying a model parameter in your request. This gives you full control when needed.
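The difference between orchestrated and pinned requests is just the presence of the `model` field. A sketch, assuming the standard OpenAI field name and using an example model ID from the supported list above:

```python
# Omitting "model" lets Perf pick; setting it pins one specific model.
auto_body = {"messages": [{"role": "user", "content": "Hi"}]}
pinned_body = {**auto_body, "model": "gpt-4o-mini"}  # bypass orchestration
```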

Do you support multimodal inputs?

Yes, Perf supports images, PDFs, and other multimodal inputs. The system will automatically select models that support the input format.
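Given the OpenAI compatibility above, an image input would use the standard multi-part content format: a list mixing `text` and `image_url` parts, with the image inlined as a base64 data URL. A sketch with stand-in image bytes:

```python
import base64

png_bytes = b"\x89PNG\r\n\x1a\n"  # stand-in for real image data
data_url = "data:image/png;base64," + base64.b64encode(png_bytes).decode()

# OpenAI-style multimodal message; Perf routes it to a vision-capable model.
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": data_url}},
    ],
}
```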

How do you ensure data privacy?

We are SOC 2 Type II compliant and never store your prompt data. All requests are proxied directly to the provider, and only metadata (model used, token counts, cost) is logged.

Platform Features

What analytics do you provide?

The Perf dashboard provides:
  • Real-time request monitoring
  • Cost breakdown by model and task type
  • Performance metrics (latency, success rate)
  • Quality scores and user feedback
  • ROI tracking and savings reports

Can multiple team members access the same account?

Yes, our Team and Enterprise plans support multiple users with role-based access control. You can set different permissions for developers, analysts, and administrators.

Do you offer SLA guarantees?

Yes, our paid plans include 99.9% uptime SLA with automatic failover across providers to ensure reliability.

Can I export my data?

Yes, you can export all logs, metrics, and analytics data via our API or through the dashboard. We support JSON, CSV, and webhook integrations.

Getting Started

How long does integration take?

Most teams are up and running in under 30 minutes. If you’re already using OpenAI, it’s as simple as changing your base URL and adding your Perf API key.

Do you provide migration support?

Yes, our team provides migration support for all paid plans. We’ll help you migrate from your existing LLM setup and optimize your configuration.

Can I test Perf before committing?

Absolutely. Sign up for our free tier and test with your actual use cases. No credit card required.

Where can I get help?

See the Troubleshooting answers below for common issues; for anything else, contact options are listed at the end of this page.

Troubleshooting

What if a request fails?

Perf includes automatic retry logic with intelligent fallback. If a request fails with one provider, we automatically retry with an alternative provider and model.
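Perf performs this fallback server-side, so your client code normally needn't do anything. Purely to illustrate the pattern the answer describes, here is a client-side sketch with hypothetical provider names and a simulated failure:

```python
def call_with_fallback(providers, send):
    """Try each provider in order; return the first successful response."""
    last_error = None
    for provider in providers:
        try:
            return send(provider)
        except RuntimeError as exc:
            last_error = exc  # remember failure, fall through to next
    raise last_error or RuntimeError("no providers configured")

# Simulated transport: the first provider fails, the second succeeds.
calls = []
def flaky_send(provider):
    calls.append(provider)
    if provider == "openai":
        raise RuntimeError("provider outage")
    return "ok from " + provider

result = call_with_fallback(["openai", "anthropic"], flaky_send)
# result == "ok from anthropic"
```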

How do I monitor quality?

The dashboard provides quality scoring based on:
  • Response completeness
  • Format validation
  • User feedback (when provided)
  • Latency and error rates
You can also set quality thresholds to automatically retry low-quality responses.
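The retry-on-low-quality behavior can be pictured as a loop like the one below. This is an illustrative sketch of the idea, not Perf's actual implementation; the scoring function here is a toy stand-in.

```python
def ensure_quality(generate, score, threshold=0.7, max_attempts=3):
    """Regenerate until a response scores at or above threshold.

    Falls back to the best-scoring attempt if none clears the bar.
    """
    best, best_score = None, -1.0
    for _ in range(max_attempts):
        response = generate()
        s = score(response)
        if s >= threshold:
            return response
        if s > best_score:
            best, best_score = response, s
    return best

# Toy demo: the first draft scores low, the second clears the threshold.
drafts = iter(["meh", "a complete, well-formatted answer"])
result = ensure_quality(
    lambda: next(drafts),
    lambda r: 0.9 if "complete" in r else 0.2,
)
```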

Can I provide feedback on responses?

Yes, you can submit feedback via our API or dashboard. This helps Perf learn your preferences and improve orchestration decisions over time.
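A sketch of what a feedback submission might carry. The field names and the request ID below are hypothetical; consult the Perf API reference for the real schema.

```python
# Hypothetical feedback payload: field names are illustrative, not the
# documented Perf schema.
feedback = {
    "request_id": "req_123",  # hypothetical ID of the original call
    "rating": "thumbs_up",
    "comment": "Accurate and well formatted.",
}
```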

Still have questions?

Contact our team at [email protected] or join our community.