Stop losing customers to long AI response times.
Classify. Extract. Score. Summarize. Respond. Run one after another, five prompts that could finish together take five times as long. Blitz fans them out in parallel with a single call.
Free on every account. 0% markup on inference. Bring your own keys.
Without vs. with ProxyLLM
Eight prompts at a little over 0.9 seconds each add up to about 7.5 seconds of spinner when run in a row. In parallel, the batch takes as long as its slowest call: 0.94 seconds.
Total = sum of every individual call. User watches a spinner.
Total = slowest single call. User does not see a spinner.
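The arithmetic behind that comparison is easy to check. Here is a minimal sketch; the latency values are illustrative assumptions chosen to match the figures above, not measurements, and the function names are made up for this example.

```javascript
// Hypothetical per-call latencies in seconds (illustrative, not measured).
const latencies = [0.94, 0.93, 0.94, 0.92, 0.94, 0.94, 0.94, 0.94];

// Sequential: each call waits for the previous one, so the times add up.
function sequentialTotal(times) {
  return times.reduce((sum, t) => sum + t, 0);
}

// Parallel: every call starts together, so the batch takes as long as the slowest call.
function parallelTotal(times) {
  return Math.max(...times);
}

console.log(sequentialTotal(latencies).toFixed(1)); // "7.5"
console.log(parallelTotal(latencies).toFixed(2)); // "0.94"
```

The parallel figure is a best case: it assumes nothing throttles the batch, such as a provider rate limit or a concurrency cap.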
One call. All results.
Drop the for-loop. Replace it with a Blitz request that takes an array of prompts.
const results = []
for (const prompt of prompts) {
  const r = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: prompt }],
  })
  results.push(r)
}
// 8 prompts · 7.5s total

const { results } = await proxyllm.blitz({
  model: "gpt-4o-mini",
  prompts,
  max_usd: 0.50,
})
// 8 prompts · 0.94s total
// rate-limit aware, partial-failure handled

What people use Blitz for.
Anywhere you have a for-loop around an LLM call, Blitz fits.
Classify, then extract
Two passes on the same input. Classify the message, then extract structured fields. Blitz runs both at once and returns when both are done.
Multi-aspect scoring
Score a piece of content on tone, accuracy, brand fit, and risk. Four separate calls, one Blitz request, all four results back at once.
Fan-out summarization
Summarize 30 documents. Sequential is a coffee break. Blitz finishes before you switch tabs.
A/B prompt evaluation
Same input, three prompt variants. Compare outputs side by side. Blitz gets you all three in the time of one.
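Each of these use cases comes down to building an array of prompts from one input. As a sketch, here is how the multi-aspect scoring case might be prepared. The helper name `buildScoringPrompts` and the prompt wording are assumptions for illustration, not a recommended template.

```javascript
// Aspects named in the multi-aspect scoring use case.
const ASPECTS = ["tone", "accuracy", "brand fit", "risk"];

// Build one prompt per aspect for the same piece of content.
// The wording is a placeholder; tune it for your own scoring rubric.
function buildScoringPrompts(content) {
  return ASPECTS.map(
    (aspect) =>
      `Score the following content for ${aspect} on a scale of 1 to 10. ` +
      `Reply with the number only.\n\n${content}`
  );
}

const prompts = buildScoringPrompts("Our new plan ships next week.");
console.log(prompts.length); // 4

// The array then goes into a single Blitz request, as in the earlier snippet:
// const { results } = await proxyllm.blitz({ model: "gpt-4o-mini", prompts })
```

The same shape works for A/B evaluation: map over prompt variants instead of aspects and keep the input fixed.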
Same tokens. Same bill. 8x faster.
Blitz is free on every account. Sign up, drop in your keys, swap the for-loop for one call.
Run your AI workloads on your ChatGPT subscription.
ProxyLLM runs OpenAI's Codex for you, signed in with your own ChatGPT account. Your apps call one OpenAI-compatible endpoint and the work bills to your flat plan instead of per-token API pricing. Blitz fans the calls out. Codex Hosted is the lane where they run at a flat price.
Questions on Blitz.
What if one of the prompts fails?
Blitz returns partial results with per-prompt status. The rest of your batch is unaffected. You decide whether to retry the failures or move on.
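Deciding what to retry might look like the sketch below. The result shape is an assumption: this page does not document the fields, so `index`, `status`, `output`, and `error` are hypothetical names. Check the actual response before relying on them.

```javascript
// ASSUMPTION: each result looks like { index, status, output?, error? }
// with status "ok" or "error". These field names are hypothetical.
function splitByStatus(results) {
  const succeeded = [];
  const failed = [];
  for (const r of results) {
    (r.status === "ok" ? succeeded : failed).push(r);
  }
  return { succeeded, failed };
}

// Collect the original prompts whose calls failed, ready for a retry batch.
function promptsToRetry(prompts, results) {
  return splitByStatus(results).failed.map((r) => prompts[r.index]);
}

// Example with made-up data.
const samplePrompts = ["classify", "extract", "summarize"];
const sampleResults = [
  { index: 0, status: "ok", output: "billing" },
  { index: 1, status: "error", error: "timeout" },
  { index: 2, status: "ok", output: "Customer asks about an invoice." },
];
console.log(promptsToRetry(samplePrompts, sampleResults)); // [ 'extract' ]
```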
Does it respect rate limits?
Yes. Blitz is rate-limit aware across providers. It will back off, queue, and reschedule based on the limits your keys have. You can also set a hard concurrency cap.
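To make the idea of a concurrency cap concrete, here is a generic sketch in plain JavaScript. It is not how Blitz is implemented and it does no rate-limit backoff; `mapWithConcurrency` and `fakeCall` are hypothetical names used only for this illustration.

```javascript
// Run `worker` over `items` with at most `limit` calls in flight at once.
// Results keep input order. Errors are captured per item instead of aborting the batch.
async function mapWithConcurrency(items, limit, worker) {
  if (limit < 1) throw new RangeError("limit must be at least 1");
  const results = new Array(items.length);
  let next = 0;

  async function runLane() {
    while (next < items.length) {
      const i = next++; // claim the next index; safe because JS runs one task at a time
      try {
        results[i] = { status: "ok", value: await worker(items[i], i) };
      } catch (error) {
        results[i] = { status: "error", error };
      }
    }
  }

  // Start up to `limit` lanes; each pulls new work as soon as it finishes a call.
  const lanes = Array.from({ length: Math.min(limit, items.length) }, runLane);
  await Promise.all(lanes);
  return results;
}

// Demo with a fake worker that only waits; a real worker would call the model.
let inFlight = 0;
let peak = 0;
async function fakeCall(prompt) {
  inFlight++;
  peak = Math.max(peak, inFlight);
  await new Promise((resolve) => setTimeout(resolve, 10));
  inFlight--;
  return prompt.toUpperCase();
}

const demo = mapWithConcurrency(["a", "b", "c", "d", "e"], 2, fakeCall);
demo.then((out) => console.log(out.map((r) => r.value), "peak:", peak));
```

With a cap of 2, the demo never has more than two calls in flight, and a failure in one item leaves the others untouched.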
Can I set a cost ceiling?
Yes. Set max_usd on a Blitz call to cap its spend in US dollars. ProxyLLM stops dispatching when the cap is hit and returns whatever finished.
Does Blitz work with my Codex subscription?
Yes. Blitz distributes calls across whatever credentials you have configured. If your Codex container can absorb half the batch, it does. The rest falls back to API keys.