Model integration · Perplexity

Perplexity for answer workflows.

Use perplexity/sonar-pro where grounded answers earn their cost, without adding another client or SDK to your app. Spend stays visible per request.

$129/month SaaS. Bring your own model keys. No inference markup.

Three steps to connect.

Pass Perplexity through

Perplexity's Sonar models answer with live search grounding. Access is currently OpenRouter-backed on your own key; native Perplexity key support is not yet available.

Keep the OpenAI surface

Point your client at https://api.proxyllm.ai/v1 and send Perplexity model names through the same gateway as your other model families.

Cap exploratory spend

Search-grounded calls get expensive quickly. Scoped sub-keys, budget caps, and request logs keep that traffic accountable.
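
A client-side guard can complement those gateway controls while you experiment. The sketch below is not ProxyLLM's budget-cap feature, whose configuration this page does not describe. `SpendGuard`, its field names, and every price in it are hypothetical placeholders, so substitute the rates you actually pay before relying on the numbers.

```python
from dataclasses import dataclass, field


class BudgetExceeded(RuntimeError):
    """Raised when recorded spend has reached the cap."""


@dataclass
class SpendGuard:
    """Hypothetical client-side spend tracker; not a ProxyLLM API."""

    cap_usd: float
    # Placeholder prices per million tokens; substitute your real rates.
    input_usd_per_mtok: float = 1.0
    output_usd_per_mtok: float = 1.0
    # Placeholder flat fee per request; leave at 0 if none applies.
    per_request_usd: float = 0.0
    spent_usd: float = 0.0
    log: list = field(default_factory=list)

    def check(self) -> None:
        """Call before each request; refuse once the cap is reached."""
        if self.spent_usd >= self.cap_usd:
            raise BudgetExceeded(
                f"spent ${self.spent_usd:.4f} of ${self.cap_usd:.4f}"
            )

    def record(self, label: str, prompt_tokens: int, completion_tokens: int) -> float:
        """Call after each request with the token counts from its usage data."""
        cost = (
            prompt_tokens * self.input_usd_per_mtok
            + completion_tokens * self.output_usd_per_mtok
        ) / 1_000_000 + self.per_request_usd
        self.spent_usd += cost
        self.log.append((label, cost))
        return cost


# Example with made-up rates of 3 and 15 USD per million tokens.
guard = SpendGuard(cap_usd=0.01, input_usd_per_mtok=3.0, output_usd_per_mtok=15.0)
guard.check()                  # under the cap, so this passes
guard.record("q1", 1000, 500)  # adds (1000*3 + 500*15) / 1e6 = 0.0105 USD
```

After that one recorded call the running total is past the 0.01 cap, so the next `guard.check()` raises `BudgetExceeded`. Because the check runs before a request and the cost is only known afterwards, the guard can overshoot by at most one call. A gateway-side cap remains the authoritative limit.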

Grounded answers, tracked.

Call Perplexity models through the ProxyLLM OpenAI-compatible endpoint on your own key.

client.py

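from openai import OpenAI

# Point the standard OpenAI client at the ProxyLLM gateway.
client = OpenAI(
    base_url="https://api.proxyllm.ai/v1",
    api_key="pk_live_...",
)

# Perplexity models are addressed by name through the same endpoint.
r = client.chat.completions.create(
    model="perplexity/sonar-pro",
    messages=[{"role": "user", "content": "Find the current answer and cite sources."}],
)

# Print the answer text from the first choice.
print(r.choices[0].message.content)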
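
To keep per-request spend visible in your own code, the call above can be wrapped in a small helper that returns the answer together with its token counts. This is a minimal sketch: `ask` is a hypothetical name, and it assumes the response carries the OpenAI SDK's usual `choices` and `usage` fields, which this page does not confirm for Perplexity models routed through the gateway.

```python
def ask(client, question, model="perplexity/sonar-pro"):
    """Send one question and return the answer text with token counts.

    `client` is any object exposing the OpenAI SDK's
    `chat.completions.create` method, such as the client built above.
    """
    r = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": question}],
    )
    usage = getattr(r, "usage", None)  # tolerate responses without usage data
    return {
        "answer": r.choices[0].message.content,
        "prompt_tokens": getattr(usage, "prompt_tokens", None),
        "completion_tokens": getattr(usage, "completion_tokens", None),
    }
```

Calling `ask(client, "Find the current answer and cite sources.")` returns a plain dict, so the token counts can be logged or compared against a budget without touching the SDK's response object again.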

Codex Hosted · the main feature

Run your AI workloads on your ChatGPT subscription.

ProxyLLM runs OpenAI's Codex for you, signed in with your own ChatGPT account. Your apps call one OpenAI-compatible endpoint and the work bills to your flat plan instead of per-token API pricing.

$129/month · normal SaaS pricing

Put a cap on search-heavy calls.

Expose Perplexity to products and agents through scoped sub-keys with spend limits instead of unrestricted provider access. No markup on inference.
