# Voice Agents

> Build conversational voice AI agents with Perf

Build real-time conversational voice agents powered by your custom instructions, knowledge base, and content safety policies. Voice agents handle speech recognition, natural language understanding, response generation, and text-to-speech, all through a single WebSocket connection.

## How It Works

```
Your App → WebSocket → Perf → LLM + TTS + STT
              ↕
  Audio streaming (PCM16, 16kHz, mono)
```

1. Your application opens a WebSocket connection to Perf
2. Perf establishes a real-time voice pipeline (speech-to-text, LLM, text-to-speech)
3. Your app streams microphone audio to Perf and receives agent audio and transcripts back
4. Content safety policies are evaluated on every turn

## Quick Start

### 1. Create a Voice Agent

In the [Perf Dashboard](https://dashboard.withperf.pro/dashboard/voice/agents), click **Create Agent** and configure:

* **Name**: A label for your agent (e.g. "Customer Support")
* **System Prompt**: Instructions that define the agent's behavior
* **Voice**: Choose from available voices
* **First Message**: What the agent says when a conversation starts
* **Content Policy** (optional): Attach a policy for PII redaction, blocked terms, or custom safety criteria

### 2. Add the SDK

The fastest way to integrate is the [PerfVoice JavaScript SDK](./sdk):

```html
```

That's it. The SDK handles microphone capture, audio encoding, WebSocket protocol, audio playback, interruptions, and ping/pong keepalive.

### 3. Test It

Click your start button, allow microphone access, and speak. You should hear the agent respond and see transcripts in the console.

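
The SDK takes care of audio encoding for you, so nothing like the following is needed in a browser integration. For readers curious about what such a step generally involves, here is a minimal Python sketch that converts floating-point microphone samples into the signed 16-bit little-endian PCM described on this page. The function name `float_to_pcm16` and the clipping and scaling choices are assumptions for illustration, not the SDK's actual implementation.

```python
import struct

SAMPLE_RATE_HZ = 16_000  # from this page: PCM16, 16kHz, mono


def float_to_pcm16(samples):
    """Convert float samples in [-1.0, 1.0] to little-endian signed 16-bit PCM.

    Hypothetical helper: the scaling factor (32767) and hard clipping are
    common conventions, assumed here rather than taken from the Perf SDK.
    """
    ints = []
    for sample in samples:
        clipped = max(-1.0, min(1.0, sample))  # keep values inside the int16 range
        ints.append(int(round(clipped * 32767)))
    # "<" selects little-endian byte order, "h" a signed 16-bit integer
    return struct.pack("<%dh" % len(ints), *ints)


pcm = float_to_pcm16([0.0, -1.0, 1.0, 2.5])  # 2.5 is out of range and gets clipped
```

Each sample occupies two bytes, so one second of mono audio at 16,000 Hz is 32,000 bytes before any transport encoding.
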
## Features

| Feature                   | Description                                              |
| ------------------------- | -------------------------------------------------------- |
| **Real-time streaming**   | Sub-second latency from speech to agent response         |
| **Interruption handling** | Users can interrupt the agent mid-sentence               |
| **Custom voices**         | Choose from multiple voice options                       |
| **Knowledge base (RAG)**  | Attach document collections for grounded answers         |
| **Web search**            | Enable real-time web search for up-to-date information   |
| **Content safety**        | PII detection, blocked terms, custom criteria            |
| **Loop detection**        | Automatic detection and breaking of conversational loops |
| **Transcripts**           | Real-time agent and user transcripts via events          |

## Integration Options

| Method             | Best For                             | Docs                              |
| ------------------ | ------------------------------------ | --------------------------------- |
| **JavaScript SDK** | Web apps, fastest integration        | [SDK Reference](./sdk)            |
| **Raw WebSocket**  | Full control, custom audio pipelines | [WebSocket Protocol](./websocket) |
| **Python**         | Server-side, IVR systems, telephony  | [Python Integration](./python)    |

## Authentication

Voice agent connections require two parameters:

| Parameter  | Description                                    |
| ---------- | ---------------------------------------------- |
| `api_key`  | Your project API key (format: `pk_live_...`)   |
| `agent_id` | The voice agent ID (from the dashboard or API) |

These are passed as query parameters on the WebSocket URL:

```
wss://api.withperf.pro/v1/voice/conversation?api_key=YOUR_API_KEY&agent_id=YOUR_AGENT_ID
```

## Audio Format

All audio is streamed as **PCM 16-bit, 16kHz, mono, little-endian**:

| Property    | Value                           |
| ----------- | ------------------------------- |
| Encoding    | PCM signed 16-bit integer       |
| Sample rate | 16,000 Hz                       |
| Channels    | 1 (mono)                        |
| Byte order  | Little-endian                   |
| Transport   | Base64-encoded in JSON messages |

## Content Safety

Voice agents support the same content safety policies as the rest of the Perf platform:

* **Blocked terms**: Prevent specific words or phrases in agent responses
* **PII detection**: Detect and redact personally identifiable information
* **Custom criteria**: Define LLM-evaluated safety rules (e.g. "Agent must not provide medical advice")
* **Filler phrases**: Play natural filler audio while safety evaluation runs

Configure policies in the [Dashboard](https://dashboard.withperf.pro/dashboard/policies) and attach them to your voice agent.

## Next Steps

* [JavaScript SDK Reference](./sdk): Full SDK API documentation
* [WebSocket Protocol](./websocket): Raw WebSocket integration for advanced use cases
* [Python Integration](./python): Server-side Python integration
* [Content Policies](../api-reference/policies): Configure safety policies
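
To tie the Authentication and Audio Format sections together, the sketch below builds the connection URL from the two query parameters and Base64-encodes one chunk of PCM audio, using only the Python standard library. The helper names (`build_voice_url`, `chunk_size_bytes`, `encode_chunk`) and the 20 ms chunk duration are assumptions chosen for illustration. The JSON message envelope that carries the Base64 string is defined on the WebSocket Protocol page and is deliberately not reproduced here.

```python
import base64
from urllib.parse import urlencode

# Endpoint and audio properties as stated on this page.
BASE_URL = "wss://api.withperf.pro/v1/voice/conversation"
SAMPLE_RATE_HZ = 16_000
BYTES_PER_SAMPLE = 2  # signed 16-bit PCM
CHANNELS = 1          # mono


def build_voice_url(api_key, agent_id):
    """Return the WebSocket URL with api_key and agent_id as query parameters."""
    query = urlencode({"api_key": api_key, "agent_id": agent_id})
    return BASE_URL + "?" + query


def chunk_size_bytes(duration_ms):
    """Number of raw PCM bytes in a chunk of the given duration."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * duration_ms // 1000


def encode_chunk(pcm_bytes):
    """Base64-encode raw PCM bytes so they can be placed in a JSON string."""
    return base64.b64encode(pcm_bytes).decode("ascii")


url = build_voice_url("YOUR_API_KEY", "YOUR_AGENT_ID")
silence = bytes(chunk_size_bytes(20))  # 20 ms of silence; chunk length is an assumption
payload = encode_chunk(silence)
```

Using `urlencode` rather than string concatenation keeps the URL valid if a key or agent ID ever contains characters that need escaping. Base64 expands the payload by roughly a third, so a 640-byte chunk becomes an 856-character string.
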