Blog

Flat-price AI, explained in practice.

Guides on Codex Hosted, the ChatGPT subscription as an API backend, OpenAI cost math, and the Codex CLI. Written for teams that ship.

Codex Hosted

9 articles

Codex Hosted Savings: 5 Real Workload Scenarios

Five worked examples from a $250 solo dev bill to a $14,000 agent fleet, with full arithmetic. Savings run 40 to 97 percent, and every capacity figure is labeled an estimate.

Read

Codex Hosted Setup: From ChatGPT Account to Endpoint in 5 Minutes

Connect your ChatGPT account with device-code sign-in, let your isolated container start, set one base URL, and send your first request. Every step, with commands.

Read

Codex Hosted vs Running Codex Yourself: An Honest Comparison

DIY Codex on a laptop, VPS, or CI runner is free and fine for personal scripts. The honest ops comparison: uptime, auth refresh, queueing, logs, and the real costs.

Read

Codex Hosted: 25 Questions, Answered Directly

Twenty-five direct answers about Codex Hosted: what it is, how sign-in works, streaming, models, usage limits, billing, and where OpenAI's terms stand.

Read

Fallback Lanes: How Codex Hosted Survives Usage Limits

Codex Hosted orders your credentials into lanes: ChatGPT accounts first, your API key last. How failover works, what the request log shows, and how to size the stack.

Read

How Codex Hosted Billing Works: $129 Flat, No Markup

ProxyLLM charges $129/month flat, no inference markup. OpenAI bills your ChatGPT plan separately. What each invoice covers, what the dashboard tracks, and the free tier's scope.

Read

One Account, One Container: How Codex Hosted Isolation Works

Device-code sign-in, no password custody, one isolated container per ChatGPT account, never pooled, AES-256-GCM for stored keys. The Codex Hosted security model, explained.

Read

What Is Codex Hosted? ProxyLLM's Main Feature, Explained

Codex Hosted runs OpenAI's official Codex CLI on our servers, signed in with your ChatGPT account, and exposes it as an OpenAI-compatible endpoint billed to your flat plan.

Read

What Works with Codex Hosted (and What Doesn't)

If a tool accepts an OpenAI base URL, it works with Codex Hosted. The caveats, honestly: complete responses, Codex's model surface, and what stays on your API key.

Read

ChatGPT subscription as API

16 articles

Can You Turn a ChatGPT Account Into an API Key?

Not into a key: OpenAI issues API keys only on its separate API platform. But a ChatGPT account can become an endpoint, because plans include Codex and Codex runs from code.

Read

Can You Use Your ChatGPT Subscription as an API?

Yes, through OpenAI's Codex CLI. Here is how ChatGPT-plan Codex usage works, what it costs next to the API, and how to expose it as an endpoint.

Read

ChatGPT Business and Enterprise: Codex and Programmatic Use

ChatGPT Business and Enterprise include Codex on every seat, not API credits. How teams run seats programmatically, with admin controls and seat capacity math.

Read

ChatGPT Plus vs the API for Coding: 2026 Breakeven Math

ChatGPT Plus vs the OpenAI API for coding, at June 2026 GPT-5.x prices: breakeven tables at 1M, 10M, and 100M tokens, and why agent volumes settle the question.

Read

ChatMock and DIY Codex Proxies: Setup, Risks, Hosted Alternative

ChatMock, codex-openai-proxy, and CLIProxyAPI turn a ChatGPT login into a local OpenAI-compatible API. They work. Here is what running one actually involves.

Read

Does ChatGPT Plus Include API Access? What You Actually Get

ChatGPT Plus includes no API key and no API credits: the API bills separately. But Plus does include Codex, which runs programmatically and is worth an estimated $700 of monthly capacity.

Read

Flat-Rate OpenAI API: Does It Exist in 2026?

OpenAI does not sell a flat-rate API. Three workarounds exist: gray-market resellers, DIY Codex proxies, and subscription-backed endpoints. A risk-ranked survey.

Read

Is There an API for ChatGPT Pro?

No. ChatGPT Pro ships no API key and no credits. It does include Codex with the largest individual plan windows: an estimated $3,500 to $14,000 of monthly API-equivalent capacity.

Read

Self-Hosting a Codex Proxy vs ProxyLLM: Total Cost of Ownership

A fair TCO comparison: VPS cost, setup hours, auth-refresh breakage, and queueing against a $129 flat fee. With the arithmetic, and the cases where DIY wins.

Read

Sign In with ChatGPT: How Subscriptions Became Credentials

Sign in with ChatGPT turns a subscription into a credential: where the program started, how the device-code flow works, and which tools can bill work to your plan.

Read

The "Unlimited OpenAI API" Myth: What Subscriptions Really Give You

No unlimited OpenAI API exists at any price. What ChatGPT subscriptions actually provide: rolling usage windows, fair-use rules, and how to plan around them.

Read

Use Your ChatGPT Subscription in Make.com Scenarios

Call a Codex-backed OpenAI endpoint from Make.com with the HTTP module: exact module settings, response mapping, and what an AI-heavy scenario costs both ways.

Read

Use Your ChatGPT Subscription in n8n (No API Key Billing)

Point n8n's OpenAI credential at a Codex-backed base URL and your workflow's AI calls bill to your flat ChatGPT plan. Setup steps, caveats, and the cost math.

Read

Use Your ChatGPT Subscription in Zapier Workflows

Zapier's native OpenAI actions cannot change the API base URL. Webhooks by Zapier can: the exact POST setup that bills your AI steps to a flat ChatGPT plan.

Read

Why Agent Workloads Flip the API-vs-Subscription Math

One agent task is 5 to 50 model calls, and context regrows every step. Worked loop math showing where per-token billing breaks and flat plan windows win.

Read

Why Your OpenAI API Bill Is Separate from ChatGPT

ChatGPT subscriptions and the OpenAI API are two products with two bills. Why your Plus or Pro payment never covers API usage, and where the one documented bridge sits.

Read

OpenAI costs

16 articles

Estimating Your OpenAI API Costs: A Calculator Walkthrough

Estimate OpenAI API costs from token counts: 0.75 words per token, the per-request formula, volume with a buffer, then map the total to a ChatGPT plan tier.

Read

Fixed-Cost LLM Inference: Options for Predictable AI Bills

Three real paths to a flat LLM bill: self-hosted open models on rented GPUs, provider capacity commitments, and subscription-backed Codex. Honest tradeoff table included.

Read

GPT-5.5 API Cost: Per-Token Prices and Real Workload Math

GPT-5.5 costs $5 per million input tokens and $30 per million output tokens as of June 2026. What a PR review, a 100-document run, and an agent session cost, with the math shown.

Read

How to Calculate AI Agent Costs (Multi-Step, Tool Calls, Retries)

Agents pay for loops, not single completions. The formula: steps × calls × tokens × retry rate, with worked GPT-5 examples and the multipliers public calculators miss.

Read

How to Cap OpenAI API Spending Before It Caps You

Set OpenAI's budget alerts and monthly limits, then add hard per-app caps with scoped sub-keys. A runaway GPT-5.5 loop burns about $576 an hour; here is the containment stack.

Read

How to Reduce OpenAI API Costs: Every Lever That Works in 2026

Seven levers that cut OpenAI API bills, with worked numbers for each: model routing, prompt caching, batch jobs, prompt trimming, output caps, monitoring, and a flat-rate lane.

Read

LLM Cost per User: The SaaS Math Nobody Shows

Cost per user = actions × calls × tokens × price. Worked example: $1.82 at the mean, $18 at p95 against $29 ARPU. Margin bands, and how flat capacity caps tail risk.

Read

OpenAI API Pricing Explained (2026): Models, Tokens, Gotchas

OpenAI API prices per million tokens, June 2026: GPT-5.5 $5/$30, GPT-5 $1.25/$10, Mini $0.25/$2, plus cached input, batch discounts, and the gotchas that inflate bills.

Read

OpenAI API vs ChatGPT Subscription: The Real Cost Math

Per-token API pricing vs flat ChatGPT plans, with worked examples. Where the crossover sits and when a subscription-backed setup wins.

Read

OpenAI Costs Suddenly Spiked? Find the Cause in 10 Minutes

A spike has one of five causes: a loop bug, a model bump, context growth, lost cache hits, or traffic. The 10-minute diagnostic, in order, with math to confirm each.

Read

OpenAI Prompt Caching: How It Works and What It Saves

OpenAI bills cached input at $0.125 per million tokens on GPT-5 versus $1.25 fresh, as of June 2026. The prefix rules, cache-friendly prompt structure, and hit-rate math.

Read

Reasoning Tokens: The Hidden Multiplier in Your OpenAI Bill

Reasoning tokens bill as output tokens at full price. A 150-token answer can bill 1,500 tokens, turning a $525 monthly estimate into $2,550. How to see and control them.

Read

The Cheapest OpenAI Model That Still Does the Job

GPT-5 Nano at $0.05/$0.40 per million tokens handles classification and extraction; Mini covers most production text. The decision table, with a worked 94x cost spread.

Read

The OpenAI Batch API: When 50% Off Is Worth the Wait

The Batch API takes 50 percent off OpenAI token prices for jobs returned within 24 hours. How it works, which workloads qualify, and when a flat subscription lane beats it.

Read

What a 24/7 AI Agent Actually Costs: API vs Subscription

A 30-day worked scenario at three volumes: $239 to $7,155 a month on metered GPT-5 pricing, against flat ChatGPT plan setups at $149 to $329. Estimates, math shown.

Read

Why Is My OpenAI Bill So High? A Diagnostic Checklist

Eight causes explain almost every oversized OpenAI bill: reasoning tokens, context bloat, agent loops, retries, wrong model, no caching, uncapped output, forgotten keys.

Read

Codex CLI

21 articles

AGENTS.md: Steering Codex with a Repo Contract

AGENTS.md is the instruction file Codex reads before working in your repo. What belongs in it, what to leave out, and a complete example for an exec-driven repository.

Read

Can GitHub Actions Use Your ChatGPT Plan for Codex?

Mechanically possible, but the official action expects an API key, and a personal plan session on shared runners blurs OpenAI's account rules. Three clean patterns, compared.

Read

Codex Auth: API Key vs ChatGPT Sign-In, Compared

ChatGPT sign-in bills Codex to your flat plan with usage windows. An API key bills per token with per-minute limits. How to choose, and when to run both as lanes.

Read

Codex CLI in Docker: A Working Container Setup

A Dockerfile for OpenAI's Codex CLI, a named volume for ~/.codex auth, device login inside the container, and why the sandbox flag changes when Docker is the boundary.

Read

codex exec Cookbook: 12 Non-Interactive Recipes

Twelve copy-paste codex exec recipes: PR review, TODO extraction, doc and changelog generation, cron digests, webhook handlers, jq pipelines, and resumed multi-stage chains.

Read

codex exec: The Complete Guide to Non-Interactive Codex

How OpenAI's codex exec works: syntax, auth on headless machines, CI usage, resume, and when a hosted container beats running it yourself.

Read

Codex in CI/CD: Pipelines, Gates, and Non-Interactive Runs

How to run OpenAI's Codex CLI in CI/CD: codex exec for pre-merge checks, sandbox and approval presets, output piping, and the right auth for shared vs personal runners.

Read

Codex Limits: ChatGPT Plus vs Pro vs Business

Every ChatGPT plan includes Codex; the window sizes differ. How Plus, Pro, and Business capacity compares, which tier fits which workload, and the cost math as estimates.

Read

codex login Without a Browser: Device Auth Step by Step

Run codex login --device-auth on the headless box, approve the code at chatgpt.com from any device, and the CLI saves its tokens to ~/.codex/auth.json. Full walkthrough.

Read

Codex On-Demand Credits: Pricing, Draining, and When to Skip Them

Codex credits let Plus and Pro users keep working after a limit. How they bill, why users report fast drain on agent work, and when a second account is the cheaper valve.

Read

Codex Rate Limit Errors: Causes and Fixes

A Codex rate limit error means one of four things: the 5-hour window, the weekly cap, drained credits, or an API 429. How to tell which one, and the fix for each.

Read

Codex Usage Limits, Explained (Updated June 2026)

How Codex limits work: rolling 5-hour windows, weekly caps on some plans, one shared usage pool, and on-demand credits. The full system, with links to OpenAI's live numbers.

Read

Codex with Multiple ChatGPT Accounts: Strategies That Work

The Codex CLI signs in one ChatGPT account at a time. The workarounds that exist: separate CODEX_HOME directories, re-login juggling, and gateway fallback lanes.

Read

Codex's 5-Hour Window: What Counts and What Resets

Codex meters usage over a rolling window of roughly five hours. What draws it down, including reasoning time, how resets behave, and how to pace heavy sessions.

Read

How to Check Your Codex Usage and Remaining Limits

Check Codex usage with /status in the CLI, ChatGPT's usage surfaces, or per-request gateway logs. Where to see your 5-hour and weekly limits and when they reset.

Read

OpenAI Codex CLI: Getting Started in 10 Minutes

Install OpenAI's Codex CLI with npm or Homebrew, sign in with your ChatGPT account, run your first task, and set sane defaults. A practical setup guide.

Read

Run Codex CLI Headless: Servers, SSH, and No-Browser Auth

Run Codex CLI on a headless server: install, device-code auth with no browser, codex exec for scripts, auth.json handling, and systemd patterns that survive reboots.

Read

Run Codex on a Schedule: Cron, Webhooks, and Always-On Jobs

Schedule codex exec with cron or systemd timers, wire failure alerts with exit codes and webhooks, and pick the job's home: laptop, VPS, or a hosted endpoint.

Read

Running Codex CLI on a VPS: The Complete Setup

Rent a $5 box, install Codex CLI, sign in with device auth, keep it alive with tmux or systemd, and guard auth.json. The full VPS walkthrough, plus the honest hosted bridge.

Read

The Codex GitHub Action: PR Reviews and Quality Gates

Set up OpenAI's official Codex GitHub Action with an OPENAI_API_KEY repo secret, post PR reviews as comments, and gate merges on a Codex verdict. Working YAML included.

Read

The Codex Weekly Limit: How It Works and How to Live With It

Some ChatGPT plans cap Codex over a 7-day cycle on top of the 5-hour window. How the weekly limit resets, why agent work burns it fast, and how to keep shipping.

Read

Comparisons

15 articles

Are Cheap OpenAI API Resellers Legit? The Gray Market, Examined

Most cheap OpenAI API resellers run on bulk accounts and resold keys that OpenAI's terms prohibit. How the gray market sources capacity, how it ends, and the clean alternative.

Read

Cheap OpenAI API Access: Every Path, Ranked by Risk

Five ways to pay less for OpenAI: official discounts, subscription-backed capacity, DIY proxies, gray-market resellers, and shared keys, ranked by what can go wrong.

Read

Codex vs Claude Code: The Cost Structure Comparison

Codex bills through flat ChatGPT plan windows; Claude Code meters programmatic use through Agent SDK credits or API keys. The structures and policies, compared.

Read

Helicone Alternatives After the Mintlify Acquisition

Helicone joined Mintlify in 2026, and teams are rechecking their LLM observability options: Langfuse, Portkey, gateway built-in logs, and lightweight request logging.

Read

LiteLLM vs OpenRouter: Self-Hosted Router or Hosted Marketplace?

LiteLLM is an open-source router you run yourself, with no markup. OpenRouter is a hosted marketplace charging about 5.5% on credits. How to choose on fees, ops, and models.

Read

NanoGPT Review: What $8/Month Flat Actually Buys

An honest NanoGPT review: the $8/month flat tier is real value for individuals on open-weight models, with fair-use limits on pooled capacity. Who it fits and who it doesn't.

Read

NanoGPT vs OpenRouter: Flat Subscription Meets Pay-Per-Token

NanoGPT sells a flat $8/month tier on capacity pooled across its own accounts; OpenRouter meters 400+ models per token plus a ~5.5% credit fee. How to pick between them.

Read

OpenRouter Alternatives in 2026 (Including the Flat-Rate Option)

Five OpenRouter alternatives compared on the axis most lists skip: cost model. LiteLLM, Portkey, Requesty, direct provider keys, and ProxyLLM's flat-rate lane.

Read

OpenRouter vs Direct OpenAI: When the Middleman Earns His Fee

OpenRouter adds roughly 5.5% on credits and a BYOK fee on top of OpenAI's list prices. The fee math at three volumes, what the marketplace buys, and when direct keys win.

Read

Per-Token vs Flat-Rate LLM Pricing: A Decision Framework

Per-token billing prices variance; flat plans price a ceiling. A decision table by utilization, burst shape, and risk tolerance for picking an LLM pricing model.

Read

Portkey Alternatives: Gateways, Routers, and Flat-Rate Lanes

Portkey bundles a gateway, observability, and guardrails. The real alternatives sorted by job: LiteLLM, OpenRouter, Langfuse, Helicone, and a flat-rate lane for OpenAI spend.

Read

ProxyLLM Alternatives: DIY Proxies, Gateways, and Going Direct

The honest list: ChatMock or CLIProxyAPI if you self-host, OpenRouter for model breadth, LiteLLM for routing, the direct API for low spend. And who should not buy from us.

Read

ProxyLLM vs LiteLLM: Hosted Flat-Rate vs Self-Hosted Router

LiteLLM is a free self-hosted router across many providers. ProxyLLM is a hosted flat-rate lane for OpenAI volume. Different layers, and they compose well.

Read

ProxyLLM vs OpenRouter: Different Problems, Different Tools

OpenRouter sells model breadth per token; ProxyLLM sells flat OpenAI volume on your own ChatGPT plan. What each is for, when to run both, and who should buy neither.

Read

The Best LLM Gateways in 2026, by Workload

No single best LLM gateway exists; workloads pick winners. OpenRouter for multi-model apps, LiteLLM for platform teams, Portkey for observability, ProxyLLM for flat OpenAI volume.

Read

Integrations

12 articles

Controlling AI Costs in GitHub Actions

CI runs AI on every push, with retries and matrix jobs multiplying calls. Capped per-repo keys, request logs per key, and when CI belongs on a key vs a plan.

Read

Cursor and Your ChatGPT Subscription: What Works in 2026

Cursor's OpenAI key override can point chat at a subscription-backed endpoint. What survives the override, what stays on Cursor's stack, and when it pays off.

Read

Dify Apps Without Metered OpenAI Billing

Add ProxyLLM as an OpenAI-compatible provider in Dify and app traffic bills to a flat ChatGPT plan. Per-app sub-keys with hard caps make each app's spend visible.

Read

Flowise Agent Flows on Flat OpenAI Capacity

Point Flowise's ChatOpenAI credential at a flat OpenAI lane: the config, worked cost math for an agent flow that loops and retries, and caps per deployment.

Read

LangChain on a Flat-Rate OpenAI Lane

ChatOpenAI accepts a base_url, so LangChain chains and agents can bill to a flat ChatGPT plan through Codex Hosted. Python and JS setup, plus worked agent-loop math.

Read

LlamaIndex Ingestion Without the Token Meter

LlamaIndex ingestion is a bulk LLM workload: three extractors over 10,000 chunks is 30,000 model calls. The api_base setup and the worked math, metered vs flat.

Read

MCP Tool Loops and the Cost of Agency

MCP hosts re-send every tool schema with every model call, and each tool result triggers another call. The worked math of tool-loop spend and what a flat lane changes.

Read

Node.js OpenAI SDK Behind a Flat-Rate Endpoint

Point the OpenAI Node.js SDK at ProxyLLM with baseURL or env vars. Services and serverless functions bill OpenAI calls to a flat ChatGPT subscription.

Read

opencode on Your ChatGPT Subscription via ProxyLLM

Add ProxyLLM to opencode.json with @ai-sdk/openai-compatible and coding sessions bill to your ChatGPT plan. The config block, session cost math, and limit behavior.

Read

Python OpenAI SDK: One base_url Change for Flat-Rate Calls

How to point the OpenAI Python SDK at ProxyLLM with base_url. Batch scripts and notebooks bill to a flat ChatGPT subscription instead of the per-token meter.

Read

The OpenAI-Compatible API, by Hand: curl Examples

Raw curl requests against an OpenAI-compatible endpoint: headers, request body, response shape, and error envelope, with flat-rate billing behind the URL.

Read

Vercel AI SDK with ProxyLLM: Provider Setup and Caveats

Configure createOpenAICompatible with ProxyLLM's base URL and OpenAI calls bill to a flat ChatGPT plan. Setup code, the streaming caveat, and per-environment keys.

Read

AI agency

7 articles

AI Agency Pricing Models: 5 Structures with Real Numbers

Retainer, pass-through, capped, value-based, and hybrid pricing for AI agency work, each with worked numbers and its failure mode, plus what flat COGS changes.

Read

AI Agency Unit Economics: Where Margin Actually Leaks

Revenue per client is flat; cost per client is metered. The agentic margin concept, a quantified leak inventory (retries, scope creep, model bumps), and the math to plug it.

Read

Forecasting OpenAI Costs for Client Projects

A working estimation method for agencies: prototype a thin slice, sample real token counts, multiply by steps and retries, quote a range, then track it weekly.

Read

Should You Pass API Costs Through to Clients?

Pass through when usage is client-driven, absorb when it is small and stable, cap when clients need certainty. Failure modes, sample contract language, and the flat-capacity case.

Read

The AI Agency Margin Problem: When OpenAI Is Your Biggest Vendor

API costs scale with every client you add. How agencies price AI work, where margins leak, and what flat subscription-backed capacity changes.

Read

The Real Cost of an AI Agency Tech Stack (2026)

Fixed tools run $50 to $420 a month. The model line is the only one that scales with clients, and at 25 clients it is about 89% of the stack. The itemized table and how to flatten it.

Read

White-Label AI Infrastructure for Agencies

White-label AI for agencies usually means chatbot builders. The margin layer sits below: a sub-key per client, budget caps, and cost reports that carry your brand.

Read