Perplexity for answer workflows.
Use perplexity/sonar-pro where grounded answers earn their cost, without adding another client library to your app. Spend stays visible per request.
$129/month SaaS. Bring your own model keys. No inference markup.
Three steps to connect.
Pass Perplexity through
Perplexity's Sonar models answer with live search grounding. For now, access runs through OpenRouter on your own key; native Perplexity key support is not available yet.
Keep the OpenAI surface
Point your client at https://api.proxyllm.ai/v1 and send Perplexity model names through the same gateway as your other model families.
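As a rough illustration of "same gateway, different model name", the sketch below builds (but does not send) two OpenAI-style chat requests using only the Python standard library. The `/chat/completions` path and Bearer-token header follow the usual OpenAI-compatible convention and are assumed here rather than taken from ProxyLLM documentation. `build_chat_request` is a hypothetical helper and `example/other-model` is a placeholder model name.

```python
import json
import urllib.request

# Gateway base URL as given in the text above.
BASE_URL = "https://api.proxyllm.ai/v1"


def build_chat_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build, without sending, an OpenAI-style chat completions request.

    Assumes the conventional /chat/completions path and Bearer auth.
    """
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Only the model string changes between model families.
# "example/other-model" is a placeholder, not a real model name.
grounded = build_chat_request("perplexity/sonar-pro", "What changed this week?", "pk_live_...")
other = build_chat_request("example/other-model", "Summarize this note.", "pk_live_...")
```

Both requests target the same URL with the same headers, so switching to a Perplexity model is a one-string change in the request body.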
Cap exploratory spend
Search-grounded calls get expensive quickly. Scoped sub-keys, budget caps, and request logs keep that traffic accountable.
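To make the idea of a budget cap concrete, here is a minimal client-side sketch. It is not ProxyLLM's mechanism: the text describes caps enforced on scoped sub-keys at the gateway, and this example only mirrors that logic locally. `BudgetGuard`, `estimate_cost`, and every price in it are hypothetical and chosen only for the arithmetic.

```python
from dataclasses import dataclass


def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  usd_per_1k_in: float, usd_per_1k_out: float,
                  usd_per_request: float = 0.0) -> float:
    """Estimate one call's cost from token counts and a flat per-request fee.

    All rates are caller-supplied; none are real provider prices.
    """
    return (prompt_tokens / 1000 * usd_per_1k_in
            + completion_tokens / 1000 * usd_per_1k_out
            + usd_per_request)


@dataclass
class BudgetGuard:
    """Local spend tracker for one key (illustrative only)."""
    cap_usd: float
    spent_usd: float = 0.0

    def can_spend(self, estimated_usd: float) -> bool:
        # Allow the call only if it would keep total spend within the cap.
        return self.spent_usd + estimated_usd <= self.cap_usd

    def record(self, actual_usd: float) -> None:
        # Add the cost of a completed call to the running total.
        self.spent_usd += actual_usd


# Hypothetical rates: $0.003 per 1k input tokens, $0.015 per 1k output
# tokens, plus a $0.005 flat fee for the search-grounded request.
cost = estimate_cost(1000, 500, 0.003, 0.015, 0.005)

guard = BudgetGuard(cap_usd=0.02)
first_allowed = guard.can_spend(cost)   # first call fits under the cap
guard.record(cost)
second_allowed = guard.can_spend(cost)  # a second identical call would exceed it
```

A gateway-side cap has the advantage that it cannot be bypassed by a misbehaving client; a local guard like this one is mainly useful for failing fast before a request is sent.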
Grounded answers, tracked.
Call Perplexity models through the ProxyLLM OpenAI-compatible endpoint on your own key.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.proxyllm.ai/v1",
    api_key="pk_live_...",
)

r = client.chat.completions.create(
    model="perplexity/sonar-pro",
    messages=[{"role": "user", "content": "Find the current answer and cite sources."}],
)
Run your AI workloads on your ChatGPT subscription.
ProxyLLM runs OpenAI's Codex for you, signed in with your own ChatGPT account. Your apps call one OpenAI-compatible endpoint and the work bills to your flat plan instead of per-token API pricing.
Put a cap on search-heavy calls.
Expose Perplexity to products and agents through scoped sub-keys with spend limits instead of unrestricted provider access. No markup on inference.