Flat-price AI, explained in practice.
Guides on Codex Hosted, the ChatGPT subscription as an API backend, OpenAI cost math, and the Codex CLI. Written for teams that ship.
Codex Hosted
9 articles
Codex Hosted Savings: 5 Real Workload Scenarios
Five worked examples from a $250 solo dev bill to a $14,000 agent fleet, with full arithmetic. Savings run 40 to 97 percent, and every capacity figure is labeled an estimate.
Codex Hosted Setup: From ChatGPT Account to Endpoint in 5 Minutes
Connect your ChatGPT account with device-code sign-in, let your isolated container start, set one base URL, and send your first request. Every step, with commands.
Codex Hosted vs Running Codex Yourself: An Honest Comparison
DIY Codex on a laptop, VPS, or CI runner is free and fine for personal scripts. The honest ops comparison: uptime, auth refresh, queueing, logs, and the real costs.
Codex Hosted: 25 Questions, Answered Directly
Twenty-five direct answers about Codex Hosted: what it is, how sign-in works, streaming, models, usage limits, billing, and where OpenAI's terms stand.
Fallback Lanes: How Codex Hosted Survives Usage Limits
Codex Hosted orders your credentials into lanes: ChatGPT accounts first, your API key last. How failover works, what the request log shows, and how to size the stack.
How Codex Hosted Billing Works: $129 Flat, No Markup
ProxyLLM charges $129/month flat, no inference markup. OpenAI bills your ChatGPT plan separately. What each invoice covers, what the dashboard tracks, and the free tier's scope.
One Account, One Container: How Codex Hosted Isolation Works
Device-code sign-in, no password custody, one isolated container per ChatGPT account, never pooled, AES-256-GCM for stored keys. The Codex Hosted security model, explained.
What Is Codex Hosted? ProxyLLM's Main Feature, Explained
Codex Hosted runs OpenAI's official Codex CLI on our servers, signed in with your ChatGPT account, and exposes it as an OpenAI-compatible endpoint billed to your flat plan.
What Works with Codex Hosted (and What Doesn't)
If a tool accepts an OpenAI base URL, it works with Codex Hosted. The caveats, honestly: complete responses, Codex's model surface, and what stays on your API key.
ChatGPT subscription as API
16 articles
Can You Turn a ChatGPT Account Into an API Key?
Not into a key: OpenAI issues API keys only on its separate API platform. But a ChatGPT account can become an endpoint, because plans include Codex and Codex runs from code.
Can You Use Your ChatGPT Subscription as an API?
Yes, through OpenAI's Codex CLI. Here is how ChatGPT-plan Codex usage works, what it costs next to the API, and how to expose it as an endpoint.
ChatGPT Business and Enterprise: Codex and Programmatic Use
ChatGPT Business and Enterprise include Codex on every seat, not API credits. How teams run seats programmatically, with admin controls and seat capacity math.
ChatGPT Plus vs the API for Coding: 2026 Breakeven Math
ChatGPT Plus vs the OpenAI API for coding, at June 2026 GPT-5.x prices: breakeven tables at 1M, 10M, and 100M tokens, and why agent volumes settle the question.
ChatMock and DIY Codex Proxies: Setup, Risks, Hosted Alternative
ChatMock, codex-openai-proxy, and CLIProxyAPI turn a ChatGPT login into a local OpenAI-compatible API. They work. Here is what running one actually involves.
Does ChatGPT Plus Include API Access? What You Actually Get
ChatGPT Plus includes no API key and no API credits: the API bills separately. But Plus does include Codex, which runs programmatically, an estimated $700 of monthly capacity.
Flat-Rate OpenAI API: Does It Exist in 2026?
OpenAI does not sell a flat-rate API. Three workarounds exist: gray-market resellers, DIY Codex proxies, and subscription-backed endpoints. A risk-ranked survey.
Is There an API for ChatGPT Pro?
No. ChatGPT Pro ships no API key and no credits. It does include Codex with the largest individual plan windows: an estimated $3,500 to $14,000 of monthly API-equivalent capacity.
Self-Hosting a Codex Proxy vs ProxyLLM: Total Cost of Ownership
A fair TCO comparison: VPS cost, setup hours, auth-refresh breakage, and queueing against a $129 flat fee. With the arithmetic, and the cases where DIY wins.
Sign In with ChatGPT: How Subscriptions Became Credentials
Sign in with ChatGPT turns a subscription into a credential: where the program started, how the device-code flow works, and which tools can bill work to your plan.
The "Unlimited OpenAI API" Myth: What Subscriptions Really Give You
No unlimited OpenAI API exists at any price. What ChatGPT subscriptions actually provide: rolling usage windows, fair-use rules, and how to plan around them.
Use Your ChatGPT Subscription in Make.com Scenarios
Call a Codex-backed OpenAI endpoint from Make.com with the HTTP module: exact module settings, response mapping, and what an AI-heavy scenario costs both ways.
Use Your ChatGPT Subscription in n8n (No API Key Billing)
Point n8n's OpenAI credential at a Codex-backed base URL and your workflow's AI calls bill to your flat ChatGPT plan. Setup steps, caveats, and the cost math.
Use Your ChatGPT Subscription in Zapier Workflows
Zapier's native OpenAI actions cannot change the API base URL. Webhooks by Zapier can: the exact POST setup that bills your AI steps to a flat ChatGPT plan.
Why Agent Workloads Flip the API-vs-Subscription Math
One agent task is 5 to 50 model calls, and context regrows every step. Worked loop math showing where per-token billing breaks and flat plan windows win.
Why Your OpenAI API Bill Is Separate from ChatGPT
ChatGPT subscriptions and the OpenAI API are two products with two bills. Why your Plus or Pro payment never covers API usage, and where the one documented bridge sits.
OpenAI costs
16 articles
Estimating Your OpenAI API Costs: A Calculator Walkthrough
Estimate OpenAI API costs from token counts: 0.75 words per token, the per-request formula, volume with a buffer, then map the total to a ChatGPT plan tier.
Fixed-Cost LLM Inference: Options for Predictable AI Bills
Three real paths to a flat LLM bill: self-hosted open models on rented GPUs, provider capacity commitments, and subscription-backed Codex. Honest tradeoff table included.
GPT-5.5 API Cost: Per-Token Prices and Real Workload Math
GPT-5.5 costs $5 per million input tokens and $30 per million output as of June 2026. What a PR review, a 100-document run, and an agent session cost, with the math shown.
How to Calculate AI Agent Costs (Multi-Step, Tool Calls, Retries)
Agents pay for loops, not single completions. The formula: steps x calls x tokens x retry rate, with worked GPT-5 examples and the multipliers public calculators miss.
How to Cap OpenAI API Spending Before It Caps You
Set OpenAI's budget alerts and monthly limits, then add hard per-app caps with scoped sub-keys. A runaway GPT-5.5 loop burns about $576 an hour; here is the containment stack.
How to Reduce OpenAI API Costs: Every Lever That Works in 2026
Seven levers that cut OpenAI API bills, with worked numbers for each: model routing, prompt caching, batch jobs, prompt trimming, output caps, monitoring, and a flat-rate lane.
LLM Cost per User: The SaaS Math Nobody Shows
Cost per user = actions x calls x tokens x price. Worked example: $1.82 at the mean, $18 at p95 against $29 ARPU. Margin bands, and how flat capacity caps tail risk.
OpenAI API Pricing Explained (2026): Models, Tokens, Gotchas
OpenAI API prices per million tokens, June 2026: GPT-5.5 $5/$30, GPT-5 $1.25/$10, Mini $0.25/$2, plus cached input, batch discounts, and the gotchas that inflate bills.
OpenAI API vs ChatGPT Subscription: The Real Cost Math
Per-token API pricing vs flat ChatGPT plans, with worked examples. Where the crossover sits and when a subscription-backed setup wins.
OpenAI Costs Suddenly Spiked? Find the Cause in 10 Minutes
A spike has one of five causes: a loop bug, a model bump, context growth, lost cache hits, or traffic. The 10-minute diagnostic, in order, with math to confirm each.
OpenAI Prompt Caching: How It Works and What It Saves
OpenAI bills cached input at $0.125 per million tokens on GPT-5 versus $1.25 fresh, as of June 2026. The prefix rules, cache-friendly prompt structure, and hit-rate math.
Reasoning Tokens: The Hidden Multiplier in Your OpenAI Bill
Reasoning tokens bill as output tokens at full price. A 150-token answer can bill 1,500 tokens, turning a $525 monthly estimate into $2,550. How to see and control them.
The Cheapest OpenAI Model That Still Does the Job
GPT-5 Nano at $0.05/$0.40 per million tokens handles classification and extraction; Mini covers most production text. The decision table, with a worked 94x cost spread.
The OpenAI Batch API: When 50% Off Is Worth the Wait
The Batch API takes 50 percent off OpenAI token prices for jobs returned within 24 hours. How it works, which workloads qualify, and when a flat subscription lane beats it.
What a 24/7 AI Agent Actually Costs: API vs Subscription
A 30-day worked scenario at three volumes: $239 to $7,155 a month on metered GPT-5 pricing, against flat ChatGPT plan setups at $149 to $329. Estimates, math shown.
Why Is My OpenAI Bill So High? A Diagnostic Checklist
Eight causes explain almost every oversized OpenAI bill: reasoning tokens, context bloat, agent loops, retries, wrong model, no caching, uncapped output, forgotten keys.
Codex CLI
21 articles
AGENTS.md: Steering Codex with a Repo Contract
AGENTS.md is the instruction file Codex reads before working in your repo. What belongs in it, what to leave out, and a complete example for an exec-driven repository.
Can GitHub Actions Use Your ChatGPT Plan for Codex?
Mechanically possible, but the official action expects an API key, and a personal plan session on shared runners blurs OpenAI's account rules. Three clean patterns, compared.
Codex Auth: API Key vs ChatGPT Sign-In, Compared
ChatGPT sign-in bills Codex to your flat plan with usage windows. An API key bills per token with per-minute limits. How to choose, and when to run both as lanes.
Codex CLI in Docker: A Working Container Setup
A Dockerfile for OpenAI's Codex CLI, a named volume for ~/.codex auth, device login inside the container, and why the sandbox flag changes when Docker is the boundary.
codex exec Cookbook: 12 Non-Interactive Recipes
Twelve copy-paste codex exec recipes: PR review, TODO extraction, doc and changelog generation, cron digests, webhook handlers, jq pipelines, and resumed multi-stage chains.
codex exec: The Complete Guide to Non-Interactive Codex
How OpenAI's codex exec works: syntax, auth on headless machines, CI usage, resume, and when a hosted container beats running it yourself.
Codex in CI/CD: Pipelines, Gates, and Non-Interactive Runs
How to run OpenAI's Codex CLI in CI/CD: codex exec for pre-merge checks, sandbox and approval presets, output piping, and the right auth for shared vs personal runners.
Codex Limits: ChatGPT Plus vs Pro vs Business
Every ChatGPT plan includes Codex; the window sizes differ. How Plus, Pro, and Business capacity compares, which tier fits which workload, and the cost math as estimates.
codex login Without a Browser: Device Auth Step by Step
Run codex login --device-auth on the headless box, approve the code at chatgpt.com from any device, and the CLI saves its tokens to ~/.codex/auth.json. Full walkthrough.
Codex On-Demand Credits: Pricing, Draining, and When to Skip Them
Codex credits let Plus and Pro users keep working after a limit. How they bill, why users report fast drain on agent work, and when a second account is the cheaper valve.
Codex Rate Limit Errors: Causes and Fixes
A Codex rate limit error means one of four things: the 5-hour window, the weekly cap, drained credits, or an API 429. How to tell which one, and the fix for each.
Codex Usage Limits, Explained (Updated June 2026)
How Codex limits work: rolling 5-hour windows, weekly caps on some plans, one shared usage pool, and on-demand credits. The full system, with links to OpenAI's live numbers.
Codex with Multiple ChatGPT Accounts: Strategies That Work
The Codex CLI signs in one ChatGPT account at a time. The workarounds that exist: separate CODEX_HOME directories, re-login juggling, and gateway fallback lanes.
Codex's 5-Hour Window: What Counts and What Resets
Codex meters usage over a rolling window of roughly five hours. What draws it down, including reasoning time, how resets behave, and how to pace heavy sessions.
How to Check Your Codex Usage and Remaining Limits
Check Codex usage with /status in the CLI, ChatGPT's usage surfaces, or per-request gateway logs. Where to see your 5-hour and weekly limits and when they reset.
OpenAI Codex CLI: Getting Started in 10 Minutes
Install OpenAI's Codex CLI with npm or Homebrew, sign in with your ChatGPT account, run your first task, and set sane defaults. A practical setup guide.
Run Codex CLI Headless: Servers, SSH, and No-Browser Auth
Run Codex CLI on a headless server: install, device-code auth with no browser, codex exec for scripts, auth.json handling, and systemd patterns that survive reboots.
Run Codex on a Schedule: Cron, Webhooks, and Always-On Jobs
Schedule codex exec with cron or systemd timers, wire failure alerts with exit codes and webhooks, and pick the job's home: laptop, VPS, or a hosted endpoint.
Running Codex CLI on a VPS: The Complete Setup
Rent a $5 box, install Codex CLI, sign in with device auth, keep it alive with tmux or systemd, and guard auth.json. The full VPS walkthrough, plus the honest hosted bridge.
The Codex GitHub Action: PR Reviews and Quality Gates
Set up OpenAI's official Codex GitHub Action with an OPENAI_API_KEY repo secret, post PR reviews as comments, and gate merges on a Codex verdict. Working YAML included.
The Codex Weekly Limit: How It Works and How to Live With It
Some ChatGPT plans cap Codex over a 7-day cycle on top of the 5-hour window. How the weekly limit resets, why agent work burns it fast, and how to keep shipping.
Comparisons
15 articles
Are Cheap OpenAI API Resellers Legit? The Gray Market, Examined
Most cheap OpenAI API resellers run on bulk accounts and resold keys OpenAI's terms prohibit. How the gray market sources capacity, how it ends, and the clean alternative.
Cheap OpenAI API Access: Every Path, Ranked by Risk
Five ways to pay less for OpenAI: official discounts, subscription-backed capacity, DIY proxies, gray-market resellers, and shared keys, ranked by what can go wrong.
Codex vs Claude Code: The Cost Structure Comparison
Codex bills through flat ChatGPT plan windows; Claude Code meters programmatic use through Agent SDK credits or API keys. The structures and policies, compared.
Helicone Alternatives After the Mintlify Acquisition
Helicone joined Mintlify in 2026, and teams are rechecking their LLM observability options: Langfuse, Portkey, gateway built-in logs, and lightweight request logging.
LiteLLM vs OpenRouter: Self-Hosted Router or Hosted Marketplace?
LiteLLM is an open-source router you run yourself, with no markup. OpenRouter is a hosted marketplace charging about 5.5% on credits. How to choose on fees, ops, and models.
NanoGPT Review: What $8/Month Flat Actually Buys
An honest NanoGPT review: the $8/month flat tier is real value for individuals on open-weight models, with fair-use limits on pooled capacity. Who it fits and who it doesn't.
NanoGPT vs OpenRouter: Flat Subscription Meets Pay-Per-Token
NanoGPT sells a flat $8/month tier on capacity pooled across its own accounts; OpenRouter meters 400+ models per token plus a ~5.5% credit fee. How to pick between them.
OpenRouter Alternatives in 2026 (Including the Flat-Rate Option)
Five OpenRouter alternatives compared on the axis most lists skip: cost model. LiteLLM, Portkey, Requesty, direct provider keys, and ProxyLLM's flat-rate lane.
OpenRouter vs Direct OpenAI: When the Middleman Earns His Fee
OpenRouter adds roughly 5.5% on credits and a BYOK fee on top of OpenAI's list prices. The fee math at three volumes, what the marketplace buys, and when direct keys win.
Per-Token vs Flat-Rate LLM Pricing: A Decision Framework
Per-token billing prices variance; flat plans price a ceiling. A decision table by utilization, burst shape, and risk tolerance for picking an LLM pricing model.
Portkey Alternatives: Gateways, Routers, and Flat-Rate Lanes
Portkey bundles a gateway, observability, and guardrails. The real alternatives sorted by job: LiteLLM, OpenRouter, Langfuse, Helicone, and a flat-rate lane for OpenAI spend.
ProxyLLM Alternatives: DIY Proxies, Gateways, and Going Direct
The honest list: ChatMock or CLIProxyAPI if you self-host, OpenRouter for model breadth, LiteLLM for routing, the direct API for low spend. And who should not buy from us.
ProxyLLM vs LiteLLM: Hosted Flat-Rate vs Self-Hosted Router
LiteLLM is a free self-hosted router across many providers. ProxyLLM is a hosted flat-rate lane for OpenAI volume. Different layers, and they compose well.
ProxyLLM vs OpenRouter: Different Problems, Different Tools
OpenRouter sells model breadth per token; ProxyLLM sells flat OpenAI volume on your own ChatGPT plan. What each is for, when to run both, and who should buy neither.
The Best LLM Gateways in 2026, by Workload
No single best LLM gateway exists; workloads pick winners. OpenRouter for multi-model apps, LiteLLM for platform teams, Portkey for observability, ProxyLLM for flat OpenAI volume.
Integrations
12 articles
Controlling AI Costs in GitHub Actions
CI runs AI on every push, with retries and matrix jobs multiplying calls. Capped per-repo keys, request logs per key, and when CI belongs on a key vs a plan.
Cursor and Your ChatGPT Subscription: What Works in 2026
Cursor's OpenAI key override can point chat at a subscription-backed endpoint. What survives the override, what stays on Cursor's stack, and when it pays off.
Dify Apps Without Metered OpenAI Billing
Add ProxyLLM as an OpenAI-compatible provider in Dify and app traffic bills to a flat ChatGPT plan. Per-app sub-keys with hard caps make each app's spend visible.
Flowise Agent Flows on Flat OpenAI Capacity
Point Flowise's ChatOpenAI credential at a flat OpenAI lane: the config, worked cost math for an agent flow that loops and retries, and caps per deployment.
LangChain on a Flat-Rate OpenAI Lane
ChatOpenAI accepts a base_url, so LangChain chains and agents can bill to a flat ChatGPT plan through Codex Hosted. Python and JS setup, plus worked agent-loop math.
LlamaIndex Ingestion Without the Token Meter
LlamaIndex ingestion is a bulk LLM workload: three extractors over 10,000 chunks is 30,000 model calls. The api_base setup and the worked math, metered vs flat.
MCP Tool Loops and the Cost of Agency
MCP hosts re-send every tool schema with every model call, and each tool result triggers another call. The worked math of tool-loop spend and what a flat lane changes.
Node.js OpenAI SDK Behind a Flat-Rate Endpoint
Point the OpenAI Node.js SDK at ProxyLLM with baseURL or env vars. Services and serverless functions bill OpenAI calls to a flat ChatGPT subscription.
opencode on Your ChatGPT Subscription via ProxyLLM
Add ProxyLLM to opencode.json with @ai-sdk/openai-compatible and coding sessions bill to your ChatGPT plan. The config block, session cost math, and limit behavior.
Python OpenAI SDK: One base_url Change for Flat-Rate Calls
How to point the OpenAI Python SDK at ProxyLLM with base_url. Batch scripts and notebooks bill to a flat ChatGPT subscription instead of the per-token meter.
The OpenAI-Compatible API, by Hand: curl Examples
Raw curl requests against an OpenAI-compatible endpoint: headers, request body, response shape, and error envelope, with flat-rate billing behind the URL.
Vercel AI SDK with ProxyLLM: Provider Setup and Caveats
Configure createOpenAICompatible with ProxyLLM's base URL and OpenAI calls bill to a flat ChatGPT plan. Setup code, the streaming caveat, and per-environment keys.
AI agency
7 articles
AI Agency Pricing Models: 5 Structures with Real Numbers
Retainer, pass-through, capped, value-based, and hybrid pricing for AI agency work, each with worked numbers and its failure mode, plus what flat COGS changes.
AI Agency Unit Economics: Where Margin Actually Leaks
Revenue per client is flat; cost per client is metered. The agentic margin concept, a quantified leak inventory (retries, scope creep, model bumps), and the math to plug it.
Forecasting OpenAI Costs for Client Projects
A working estimation method for agencies: prototype a thin slice, sample real token counts, multiply by steps and retries, quote a range, then track it weekly.
Should You Pass API Costs Through to Clients?
Pass through when usage is client-driven, absorb when it is small and stable, cap when clients need certainty. Failure modes, sample contract language, and the flat-capacity case.
The AI Agency Margin Problem: When OpenAI Is Your Biggest Vendor
API costs scale with every client you add. How agencies price AI work, where margins leak, and what flat subscription-backed capacity changes.
The Real Cost of an AI Agency Tech Stack (2026)
Fixed tools run $50-420 a month. The model line is the only one that scales with clients, and at 25 clients it is about 89% of the stack. The itemized table and how to flatten it.
White-Label AI Infrastructure for Agencies
White-label AI for agencies usually means chatbot builders. The margin layer sits below: a sub-key per client, budget caps, and cost reports that carry your brand.
Policies & limits
6 articles
Anthropic's Agent SDK Credits: The June 2026 Rules, Explained
Anthropic now includes metered Agent SDK credits with Claude plans: $20 on Pro, $100 on Max 5x, $200 on Max 20x. What they permit, what stays prohibited, and the math.
Is Codex Hosted Against OpenAI's Terms? An Honest Reading
codex exec is documented functionality and Codex is included in ChatGPT plans. What the terms say, what they don't, and why OpenAI keeps the final call.
Sharing an OpenAI Account: What the Terms Actually Say
Yes, it is prohibited: OpenAI's terms say you may not share credentials or make your account available to anyone else. What that covers, and why owning two accounts differs.
Third-Party Tools and Your OpenAI Account: A Risk Hygiene Guide
OpenAI restricts accounts for shared credentials, scraping-style extraction, and resold access, not for tool choice. The six-question checklist to vet any tool, including us.
What Happens When You Hit Your Codex Usage Limit?
ChatGPT plan limits reset on rolling windows. How limits behave, and how multi-account and API-key fallback keep production requests flowing.
Why We Don't Support Claude Code (And Won't)
Anthropic prohibits third-party services from routing requests through Claude subscription credentials. The policy, the January 2026 crackdown, and what it means.
Run your AI workloads on your ChatGPT subscription.
ProxyLLM runs OpenAI's Codex for you, signed in with your own ChatGPT account. Your apps call one OpenAI-compatible endpoint and the work bills to your flat plan instead of per-token API pricing.