Perf Platform Documentation

Welcome to Perf - the intelligent AI runtime orchestrator that optimizes your LLM applications for cost, quality, and reliability while providing unified access to text, image, audio, and video generation.

What is Perf?

Perf is an AI infrastructure layer that sits between your application and AI providers (OpenAI, Anthropic, Google, Stability AI, Runway, and more). We provide:

Unified API - One API for text, images, audio, and video generation
Intelligent Orchestration - Automatically select the optimal model based on task, budget, and quality requirements
Schema Enforcement - Validate and auto-repair LLM outputs against your JSON schemas
Continuous Learning - Performance improves as we learn from millions of inferences

Why Perf?

The Problem

Building production AI applications is complex:

Fragmented APIs - Different providers have different APIs, authentication, and response formats
Cost Uncertainty - Model costs vary by 30x or more across providers and models
Unreliable Outputs - LLMs return malformed JSON, hallucinate, or refuse requests
Manual Optimization - Teams spend weeks tuning model selection and prompts
Provider Lock-in - Switching providers requires significant code changes

The Solution

Perf provides:

One API, All Modalities - Text, images, audio, and video through OpenAI-compatible endpoints
Intelligent Orchestration - Automatically select the best model for each request
Schema Enforcement - Define JSON schemas, and we validate and auto-repair outputs against them
Cost Control - Enforce budgets and automatically optimize spend
Zero Lock-in - Switch models/providers without code changes

Key Features

Multi-Modal Generation

Text - Chat completions with intelligent model selection
Images - DALL-E 3, Stable Diffusion 3, Flux, Ideogram, Imagen
Audio - Text-to-speech (TTS) and speech-to-text (Whisper)
Video - Veo 3, Runway Gen-3, Luma Dream Machine, Pika
Voice Agents - Real-time conversational AI agents with custom instructions, a knowledge base, and content safety

Smart Model Selection

Automatic task classification (extraction, reasoning, code, vision, audio)
Complexity-aware model selection
Per-customer preference learning
Real-time provider health monitoring

Schema Enforcement

Define JSON schemas for structured outputs
Automatic validation and repair
Type coercion and format correction
Per-project default schemas
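
To make "validation and repair" and "type coercion" concrete, here is a minimal client-side sketch in Python. It is an illustration only, not Perf's implementation: the `SCHEMA` contents and the `coerce_value` and `validate_and_repair` helpers are hypothetical, and only the `type` and `required` keywords of JSON Schema are handled.

```python
import json

# Hypothetical schema for illustration; only "type" and "required" are honored.
SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "active": {"type": "boolean"},
    },
    "required": ["name", "age"],
}


def coerce_value(value, expected_type):
    """Return value converted to expected_type, or raise ValueError."""
    if expected_type == "string":
        if isinstance(value, str):
            return value
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            return str(value)
    elif expected_type == "integer":
        if isinstance(value, bool):
            raise ValueError("expected integer, got boolean")
        if isinstance(value, int):
            return value
        if isinstance(value, float) and value.is_integer():
            return int(value)
        if isinstance(value, str):
            return int(value.strip())  # raises ValueError on non-numeric text
    elif expected_type == "boolean":
        if isinstance(value, bool):
            return value
        if isinstance(value, str) and value.strip().lower() in ("true", "false"):
            return value.strip().lower() == "true"
    raise ValueError(f"cannot coerce {value!r} to {expected_type}")


def validate_and_repair(raw_text, schema):
    """Parse model output, coerce each known field, and collect errors."""
    data = json.loads(raw_text)
    repaired, errors = {}, []
    for field, spec in schema["properties"].items():
        if field not in data:
            if field in schema.get("required", []):
                errors.append(f"missing required field: {field}")
            continue
        try:
            repaired[field] = coerce_value(data[field], spec["type"])
        except ValueError as exc:
            errors.append(f"{field}: {exc}")
    return repaired, errors
```

Calling `validate_and_repair('{"name": "Ada", "age": "36", "active": "true"}', SCHEMA)` coerces `"36"` to the integer `36` and `"true"` to `True`, and returns an empty error list. A missing required field is reported rather than invented.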

Policy Enforcement

Routing Policies - Control model selection, set cost limits, block providers
Content Policies - Detect and redact PII, filter sensitive terms
Compliance - HIPAA-ready PII detection, audit logging
Governance - Policy templates for common use cases
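
As a rough picture of what "detect and redact PII" can mean, the sketch below redacts two simple patterns with regular expressions. This is a simplified illustration, not the detection Perf performs: the `PII_PATTERNS` table and `redact_pii` function are hypothetical, and real PII detection covers far more than email addresses and US Social Security number formats.

```python
import re

# Hypothetical, deliberately small pattern table for illustration.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact_pii(text):
    """Replace each match with a [LABEL] placeholder and report which labels were found."""
    found = []
    for label, pattern in PII_PATTERNS.items():
        text, count = pattern.subn(f"[{label}]", text)
        if count:
            found.append(label)
    return text, found
```

For example, `redact_pii("Contact jane@example.com, SSN 123-45-6789.")` returns the text `"Contact [EMAIL], SSN [SSN]."` together with the labels `["EMAIL", "SSN"]`, which could feed an audit log.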

Tools Library

Web Search - Real-time web search for up-to-date information
Documents/RAG - Upload and query documents with semantic search
Memory - Persistent conversation context across sessions
Coming Soon - External actions (Slack, GitHub, Jira, and more)

Cost Optimization

Per-request budget constraints
Automatic downgrade to a cheaper model when a request would otherwise exceed its budget
Up to 60% cost savings vs GPT-4
Transparent per-call billing
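
The idea behind per-request budgets and model downgrade can be sketched as follows. Everything here is an assumption made for illustration: the model names, prices, and quality scores in `CATALOG` are made up, and `pick_model` is a hypothetical helper, not how Perf selects models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str
    cost_per_1k_tokens: float  # USD; made-up figures for illustration
    quality: int               # higher is better; made-up ranking


# Hypothetical catalog, not real models or prices.
CATALOG = [
    ModelOption("large-model", 0.03, 3),
    ModelOption("medium-model", 0.01, 2),
    ModelOption("small-model", 0.001, 1),
]


def pick_model(estimated_tokens, max_cost_usd, catalog=CATALOG):
    """Return the highest-quality model whose estimated cost fits the budget."""
    for option in sorted(catalog, key=lambda m: m.quality, reverse=True):
        estimated_cost = option.cost_per_1k_tokens * estimated_tokens / 1000
        if estimated_cost <= max_cost_usd:
            return option, estimated_cost
    raise ValueError("no model fits the budget")
```

With these made-up numbers, a 2,000-token request with a $0.10 budget stays on `large-model`, while the same request with a $0.03 budget is downgraded to `medium-model`.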

Production-Ready

Automatic failover across providers
Quality validation with retry logic
Real-time dashboards and logs
Enterprise observability
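
Failover combined with quality validation and retries follows a simple pattern, sketched below. This is a generic illustration under stated assumptions, not Perf's internals: `call_with_failover` and the two stub providers are hypothetical, and a production version would normally add backoff between attempts.

```python
def call_with_failover(providers, request, is_valid, max_attempts_per_provider=2):
    """Try providers in order; retry on errors or invalid output, then move on."""
    errors = []
    for name, call in providers:
        for attempt in range(1, max_attempts_per_provider + 1):
            try:
                response = call(request)
            except Exception as exc:  # broad on purpose for this sketch
                errors.append(f"{name} attempt {attempt}: {exc}")
                continue
            if is_valid(response):
                return name, response
            errors.append(f"{name} attempt {attempt}: failed validation")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stub providers standing in for real API clients.
def flaky_provider(request):
    raise ConnectionError("provider unavailable")


def backup_provider(request):
    return {"text": f"echo: {request}"}
```

Passing `[("primary", flaky_provider), ("backup", backup_provider)]` with a validator such as `lambda r: bool(r.get("text"))` exhausts both attempts on the primary and then returns the backup's response.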

Quick Start

1. Get Your API Key

2. Make Requests

# Text generation
curl https://api.withperf.pro/v1/chat \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Hello!"}]}'

# Image generation
curl https://api.withperf.pro/v1/images/generations \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "A sunset over mountains", "model": "dall-e-3"}'
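
The text generation request above can also be built from Python using only the standard library. This is a minimal sketch that mirrors the curl call: the URL, headers, and body come from that example, while the `build_chat_request` and `send` helper names are hypothetical and the assumption that the response body is JSON is not confirmed here.

```python
import json
import urllib.request

API_BASE = "https://api.withperf.pro/v1"


def build_chat_request(api_key, content):
    """Build (but do not send) the same POST request as the curl example."""
    body = json.dumps(
        {"messages": [{"role": "user", "content": content}]}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/chat",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and decode the body, assuming a JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

To make the call, pass the built request to `send`, for example `send(build_chat_request("YOUR_API_KEY", "Hello!"))`. Keeping construction separate from sending makes the request easy to inspect or log before it leaves your application.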

3. View Analytics

Monitor usage, costs, and performance in the Dashboard.

API Reference

Text Generation

Chat API - Intelligent model selection for text
Streaming API - Real-time token streaming

Voice Agents

Overview - Build conversational voice AI agents
JavaScript SDK - Drop-in SDK for web apps
WebSocket Protocol - Raw protocol reference
Python Integration - Server-side integration

Media Generation

Image Generation - DALL-E, Stable Diffusion, Flux, and more
Audio API - Text-to-speech and transcription
Video Generation - Veo, Runway, Luma, Pika

Governance & Quality

Schema Enforcement - JSON schema validation and auto-repair
Policies API - Routing rules, PII detection, content filtering

Agentic Tools

Tools API - Web search, documents/RAG, conversation memory

Analytics

Metrics API - Analytics and monitoring
Logs API - Debugging and audit trails

Documentation

Getting Started

Platform Guide

Advanced

Support

Documentation: docs.withperf.pro
Email: support@withperf.pro
Status: status.withperf.pro
