Perf Platform Documentation

Welcome to Perf, the intelligent AI runtime orchestrator that optimizes your LLM applications for cost, quality, and reliability.

What is Perf?

Perf is a production-grade AI infrastructure layer that sits between your application and LLM providers (OpenAI, Anthropic, and more). It automatically routes each request to the optimal model based on your task, budget, and quality requirements, and it continuously learns from millions of inferences to improve over time.

Why Perf?

The Problem

Building production LLM applications is complex:

Cost Uncertainty: Costs vary by up to 30x across providers and models
Quality Variance: Different models excel at different tasks
Reliability Issues: Providers experience downtime and rate limits
Manual Optimization: Teams spend weeks tuning model selection
Hidden Failures: Poor outputs slip through without validation

The Solution

Perf provides:

Intelligent Routing: Automatically select the best model for each request
Cost Control: Enforce budgets and automatically optimize spend
Quality Validation: Detect and retry poor outputs before they reach users
Zero Downtime: Automatic failover across providers
Learning System: Performance improves as you use it

Key Features

Smart Model Routing

Automatic task classification (extraction, reasoning, code generation, etc.)
Complexity-aware model selection
Per-customer preference learning
Real-time provider health monitoring

Cost Optimization

Per-request budget constraints
Automatic model downgrade when needed
Up to 60% cost savings vs GPT-4o
Transparent per-call billing

Quality Assurance

Output validation (format, quality, completeness)
Automatic retry logic
Intelligent fallback escalation
User feedback integration

Enterprise Observability

Real-time performance dashboards
Detailed call logs and traces
Custom metrics and alerts
ROI tracking and reporting

Production-Ready

99.9% uptime SLA
SOC 2 Type II compliant
Role-based access control
Audit logs and compliance reporting

Quick Start

1. Get Your API Key

2. Make Your First Request

curl https://api.withperf.pro/v1/chat \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "Extract key points from: ..."}],
    "max_cost_per_call": 0.01
  }'
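
The request above can be wrapped in a small shell function so that the prompt and the per-call budget become parameters. This is a minimal sketch, assuming only the endpoint, headers, and the `messages` and `max_cost_per_call` fields shown in the example. The function names `perf_payload` and `perf_chat` and the `PERF_API_KEY` environment variable are hypothetical, not part of Perf's API. The response format is not described on this page, so the sketch prints it unmodified.

```shell
#!/bin/sh
# Hypothetical helpers: the names below are illustrative, not part of Perf.

# Build the JSON request body.
#   $1 = prompt text (must not contain double quotes or backslashes,
#        because this sketch does no JSON escaping)
#   $2 = maximum cost for this call, passed through as max_cost_per_call
perf_payload() {
  printf '{"messages": [{"role": "user", "content": "%s"}], "max_cost_per_call": %s}' "$1" "$2"
}

# Send the body to the endpoint from the Quick Start example.
# Aborts with a message if PERF_API_KEY (an assumed variable name) is unset.
perf_chat() {
  perf_payload "$1" "$2" | curl -sS https://api.withperf.pro/v1/chat \
    -H "Authorization: Bearer ${PERF_API_KEY:?set PERF_API_KEY first}" \
    -H "Content-Type: application/json" \
    -d @-
}
```

For example, after `export PERF_API_KEY=...`, calling `perf_chat "Extract key points from: ..." 0.01` sends the same request as the curl command above. Keeping the payload builder separate from the network call makes it possible to inspect the body before sending it.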

