OpenAI API Pricing Explained (2026): Models, Tokens, Gotchas
OpenAI API prices per million tokens, June 2026: GPT-5.5 $5/$30, GPT-5 $1.25/$10, Mini $0.25/$2, plus cached input, batch discounts, and the gotchas that inflate bills.
OpenAI prices API usage per million tokens, with separate rates for input and output set per model. As of June 2026 the spread runs from $0.05 per million input tokens on GPT-5 Nano to $30 per million output tokens on GPT-5.5, with two standing discounts: 90 percent off cached input on GPT-5 and 50 percent off everything through the Batch API. The full table, the worked request math, and the line items that surprise people are below; the live source is OpenAI’s pricing page.
How the pricing model works
Three rules govern every bill.
You pay per token, OpenAI’s unit of text. A token is roughly four characters or three quarters of an English word, so 1,000 words is about 1,330 tokens.
Input and output are priced separately, and output costs six to eight times more across the GPT-5 family. Input is everything you send: system prompt, conversation history, tool schemas, the user message. Output is everything the model generates, including reasoning tokens you never see.
The API is stateless. A 20-turn conversation resends its history every turn, which means an unbounded chat conversation’s API cost grows with the square of its length, even when users type short messages.
The bill is tokens × rate ÷ 1,000,000, summed across input and output, per model.
The June 2026 price table
| Model | Input /1M | Output /1M | Batch input /1M | Batch output /1M |
|---|---|---|---|---|
| GPT-5.5 | $5.00 | $30.00 | $2.50 | $15.00 |
| GPT-5.4 | $2.50 | $15.00 | $1.25 | $7.50 |
| GPT-5 | $1.25 | $10.00 | $0.625 | $5.00 |
| GPT-5 Mini | $0.25 | $2.00 | $0.125 | $1.00 |
| GPT-5 Nano | $0.05 | $0.40 | $0.025 | $0.20 |
| o4-mini | $0.55 | $2.20 | $0.275 | $1.10 |
Two discounts modify the table. Cached input: when requests repeat a prompt prefix of 1,024 tokens or more, the repeated part bills at a steep discount, $0.125 per million on GPT-5 (90 percent off); per-model cached rates sit alongside the regular rates on the pricing page. Batch: jobs submitted through the Batch API and returned within 24 hours bill at half price on both sides, shown above.
OpenAI adjusts prices as models rotate. Treat this table as a June 2026 snapshot and the pricing page as the source of truth.
What a real request costs
A typical production request: 2,000 input tokens (system prompt, context, user message) and 500 output tokens.
| Model | Input math | Output math | Per request | Per 100k requests |
|---|---|---|---|---|
| GPT-5.5 | 2,000 × $5 ÷ 1M = $0.0100 | 500 × $30 ÷ 1M = $0.0150 | $0.0250 | $2,500 |
| GPT-5.4 | $0.0050 | $0.0075 | $0.0125 | $1,250 |
| GPT-5 | $0.0025 | $0.0050 | $0.0075 | $750 |
| GPT-5 Mini | $0.0005 | $0.0010 | $0.0015 | $150 |
| GPT-5 Nano | $0.0001 | $0.0002 | $0.0003 | $30 |
| o4-mini | $0.0011 | $0.0011 | $0.0022 | $220 |
The same request costs 83 times more on GPT-5.5 than on GPT-5 Nano. Model selection is where API budgets are won or lost. The flagship’s economics get their own breakdown in GPT-5.5 API cost.
The gotchas that inflate bills
Reasoning tokens bill at the output rate. Reasoning models think before they answer, and the thinking is counted in completion_tokens even though it never appears in the response. A 150-token visible answer can bill 1,500 output tokens. The mechanics and the worked multiplier are in reasoning tokens explained.
History resends every turn. Stateless requests mean conversation cost compounds. Cap history or summarize it; do not ship unbounded chat.
Cached input needs byte-identical prefixes. A timestamp or user ID at the top of the system prompt drops the hit rate to zero. Structure rules are in prompt caching explained.
The batch discount has a clock. Half price applies only to jobs that can wait up to 24 hours. The workload test is in the Batch API explained.
ChatGPT and the API bill separately. A $200 Pro subscription does not offset a dollar of API spend. The two-products split, and the one bridge between them, is covered in why your OpenAI API bill is separate from ChatGPT.
Estimating a monthly bill
Multiply a per-request cost from the table by monthly volume. The GPT-5 request above at 30,000 requests a day: $0.0075 × 30,000 = $225 a day, about $6,750 a month. Then apply the discounts you can actually capture: caching on repeated prefixes, batch on async jobs, a cheaper model on the simple share.
Run your own token counts through the calculator; it prices the metered bill and shows which setup covers it cheapest.
Frequently asked questions
How much does the OpenAI API cost in 2026?
As of June 2026: GPT-5.5 costs $5 per million input tokens and $30 per million output, GPT-5.4 $2.50/$15, GPT-5 $1.25/$10, GPT-5 Mini $0.25/$2, and GPT-5 Nano $0.05/$0.40. Cached input and the Batch API discount these rates further. OpenAI adjusts prices over time, so check openai.com/api/pricing for current numbers.
Why are output tokens more expensive than input tokens?
Generating tokens costs more compute than reading them, so OpenAI prices output six to eight times higher than input across the GPT-5 family. Reasoning tokens also bill at the output rate, which makes output the side of the bill that surprises people.
Does paying for ChatGPT Plus reduce OpenAI API prices?
No. ChatGPT subscriptions and the OpenAI API are separate products with separate billing, and a Plus or Pro plan does not discount API tokens. The one bridge is Codex, which is included in ChatGPT plans and can run programmatically.
What is cached input pricing on the OpenAI API?
When requests repeat the same prompt prefix of 1,024 tokens or more, OpenAI bills the repeated part at a steep discount. On GPT-5 cached input costs $0.125 per million tokens instead of $1.25, a 90 percent reduction, as of June 2026. It happens automatically when prompts are structured with static content first.