Codex Hosted Setup: From ChatGPT Account to Endpoint in 5 Minutes

Connect your ChatGPT account with device-code sign-in, let your isolated container start, set one base URL, and send your first request. Every step, with commands.

Setup is four moves: connect a ChatGPT account with OpenAI’s device-code sign-in, wait about a minute for your isolated container, point OPENAI_BASE_URL at https://api.proxyllm.ai/v1, and send a request. A fifth, optional move adds fallback lanes. Five minutes covers it, and every step below comes with the exact commands.

If you want the concept before the commands, "What is Codex Hosted?" explains the feature in two minutes.

What you need before starting

  • A ChatGPT account. Codex is included in ChatGPT plans, and a paid plan (Plus, Pro, Business) is what carries real programmatic capacity.
  • A ProxyLLM account with Codex Hosted enabled ($129/month, no inference markup).
  • Optional: an OpenAI API key, if you want a metered fallback lane behind the subscription.

Step 1: connect your ChatGPT account

In the dashboard, choose to connect a ChatGPT account. The sign-in is OpenAI’s device-code flow, the same one the Codex CLI uses on headless servers (documented at developers.openai.com/codex/auth):

1. The dashboard requests a device code from OpenAI and shows it to you.
2. You open the verification link, sign in at chatgpt.com as yourself,
   and approve the code.
3. OpenAI issues the session directly into your container.

The exchange happens between you and OpenAI. We never see your password, and there is nothing for us to store except the session OpenAI places in your container. If device-code auth is new territory, "codex login without a browser" walks through the flow in detail.
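You never write this code yourself, but it can help to see what a device-code client does while you approve the code. The sketch below follows the generic OAuth 2.0 device authorization grant (RFC 8628): poll, keep waiting on `authorization_pending`, back off on `slow_down`, stop on anything else. Whether OpenAI's flow uses exactly these error codes is an assumption, and `wait_for_approval` and its `poll` callable are hypothetical names, not part of the Codex CLI or the ProxyLLM dashboard.

```python
import time
from typing import Callable, Dict


def wait_for_approval(
    poll: Callable[[], Dict],
    interval: float = 5.0,
    expires_in: float = 900.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll until the user approves the device code (RFC 8628 style).

    `poll` is a hypothetical callable that asks the authorization server
    once and returns its JSON response as a dict.
    """
    waited = 0.0
    while waited < expires_in:
        result = poll()
        error = result.get("error")
        if error is None:
            return result  # approved: this is the token/session response
        if error == "slow_down":
            interval += 5.0  # RFC 8628: add five seconds between polls
        elif error != "authorization_pending":
            # e.g. access_denied or expired_token: stop polling
            raise RuntimeError(f"device authorization failed: {error}")
        sleep(interval)
        waited += interval
    raise TimeoutError("device code expired before approval")
```

Passing `sleep` as a parameter keeps the loop testable without real waiting; in production it defaults to `time.sleep`.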

Step 2: wait for your container (about a minute)

Each connected account gets its own container running the official, unmodified Codex CLI. One account, one container, never shared and never pooled. There is nothing to install on your side; the dashboard marks the container ready when the session lands.

Step 3: point your stack at the endpoint

One base URL and one key. Your ProxyLLM key goes where the OpenAI key used to go:

export OPENAI_BASE_URL="https://api.proxyllm.ai/v1"
export OPENAI_API_KEY="pllm_your_key_here"   # ProxyLLM key, not an OpenAI key

The same swap in Python:

from openai import OpenAI

client = OpenAI(
    base_url="https://api.proxyllm.ai/v1",
    api_key="pllm_your_key_here",
)

Anything that accepts an OpenAI base URL works the same way: official SDKs, n8n, LangChain, plain curl. The full capability map, including what should stay on an API key, is in "What works with Codex Hosted".
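For tools that do not use an SDK, the same two settings can be read from the environment with the standard library alone. This is a minimal sketch, not an official client: `build_request` and `DEFAULT_BASE_URL` are hypothetical names, and the helper only constructs the request so you can inspect it before anything is sent.

```python
import os
import urllib.request

# Assumption: fall back to the documented endpoint if the variable is unset.
DEFAULT_BASE_URL = "https://api.proxyllm.ai/v1"


def build_request(path, body=None, env=None):
    """Build (but do not send) a request against the configured endpoint."""
    env = os.environ if env is None else env
    base_url = env.get("OPENAI_BASE_URL", DEFAULT_BASE_URL).rstrip("/")
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    headers = {"Authorization": f"Bearer {api_key}"}
    if body is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        f"{base_url}/{path.lstrip('/')}",
        data=body,  # bytes for POST, None for GET
        headers=headers,
        method="POST" if body is not None else "GET",
    )
```

Stripping the trailing slash from the base URL and the leading slash from the path avoids a doubled `//` when either side is written with one.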

Step 4: send the first request

curl -s "$OPENAI_BASE_URL/chat/completions" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5",
    "messages": [{"role": "user", "content": "Confirm you are running on my subscription."}]
  }'
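The same request can be sent from Python without any third-party package. This is a sketch under two assumptions: the environment variables from Step 3 are set, and the response follows the usual chat-completions layout (`choices[0].message.content`). The function names `chat_payload`, `extract_text`, and `send_chat` are illustrative.

```python
import json
import os
import urllib.request


def chat_payload(prompt, model="gpt-5"):
    """Build the same JSON body the curl example sends."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}


def extract_text(response):
    """Return the assistant text from a chat-completions-shaped dict."""
    return response["choices"][0]["message"]["content"]


def send_chat(prompt, model="gpt-5", timeout=120):
    """POST one chat request and return the decoded JSON response."""
    base_url = os.environ["OPENAI_BASE_URL"].rstrip("/")
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(chat_payload(prompt, model)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `extract_text(send_chat("Confirm you are running on my subscription."))` returns the reply text as a string.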

Two things to notice in the result. First, the payload is standard OpenAI shape, so existing parsing code keeps working. Second, the response arrives complete rather than as a token stream: the Codex lane returns finished responses, and only API-key lanes stream. To see which models your Codex lane currently serves, ask the endpoint instead of trusting a blog post:

curl -s "$OPENAI_BASE_URL/models" -H "Authorization: Bearer $OPENAI_API_KEY"
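In a script, the model list can be fetched once at startup and checked before sending work. The sketch below assumes the endpoint returns the usual OpenAI-style listing, an object with a `data` array whose items carry an `id`; `model_ids` and `fetch_models` are hypothetical helper names.

```python
import json
import os
import urllib.request


def model_ids(listing):
    """Return sorted model ids from an OpenAI-style /models listing."""
    return sorted(item["id"] for item in listing.get("data", []))


def fetch_models(timeout=30):
    """GET /models from the configured endpoint and decode the JSON."""
    base_url = os.environ["OPENAI_BASE_URL"].rstrip("/")
    request = urllib.request.Request(
        f"{base_url}/models",
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

A guard such as `if "gpt-5" not in model_ids(fetch_models()): ...` then fails fast instead of surfacing a model error on the first real request.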

Step 5 (optional): add fallback lanes

Plan limits reset on rolling windows, and a clean setup decides in advance where overflow goes. In the dashboard, add lanes in the order you want them tried:

1. Codex - account A     flat, your primary subscription window
2. Codex - account B     flat, optional second account you own
3. OpenAI API key        metered, catches overflow until a window resets

Stored API keys are encrypted at rest with AES-256-GCM, and every entry in the request log names the lane that served it. Lane ordering, failover behavior, and sizing guidance get their own page: "Fallback lanes explained".
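The ordering rule is easy to picture as code. The following is an illustrative model of "try lanes in order, move on when a window is exhausted", not ProxyLLM's gateway implementation; `Lane`, `LaneExhausted`, and `route` are hypothetical names, and real failover conditions may be broader than a single exception type.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


class LaneExhausted(Exception):
    """Raised by a lane whose window or quota is currently used up."""


@dataclass
class Lane:
    name: str
    send: Callable[[dict], dict]  # sends one request, returns the response


def route(lanes: List[Lane], request: dict) -> Tuple[str, dict]:
    """Try lanes in their configured order; return (lane name, response)."""
    for lane in lanes:
        try:
            return lane.name, lane.send(request)
        except LaneExhausted:
            continue  # this lane is out of capacity: try the next one
    raise RuntimeError("all lanes exhausted")
```

Returning the lane name alongside the response mirrors the idea that every served request can be attributed to the lane that handled it.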

Your first week: what to watch

  • The request log. Per-request, per-lane records. If traffic spills to the API key, you see it the day it happens, not on the invoice.
  • Observed usage against the window. The dashboard shows what your plan absorbed. Size a second account from data, not guesses.
  • One real workload first. Move a single cron job or agent, confirm the logs look right, then migrate the rest. Tool-specific configuration lives in "Integrations".

Setup is the entire migration: no code changes beyond a base URL, no SDK forks, no new request format. If you have not priced the move yet, the calculator maps your current API bill to a plan tier in thirty seconds.

Frequently asked questions

How long does Codex Hosted setup take?

About five minutes. Device-code sign-in takes around a minute, the container starts in about another minute, and the rest is setting OPENAI_BASE_URL in your app and sending a test request.

Do I need to install anything to use Codex Hosted?

No. The Codex CLI runs in your hosted container, not on your machine. Your side only needs the ability to send HTTPS requests, which every OpenAI SDK and HTTP client already has.

What do I set as the base URL?

Set OPENAI_BASE_URL to https://api.proxyllm.ai/v1 and use your ProxyLLM key as the API key. Any tool that accepts an OpenAI base URL then routes through your subscription-backed container.

How does the ChatGPT sign-in work?

Through OpenAI's documented device-code flow. The dashboard requests a code, you approve it at chatgpt.com while signed in as yourself, and OpenAI issues the session directly into your container. Your password never passes through ProxyLLM.

Can I add fallback lanes after setup?

Yes, anytime. Connect a second ChatGPT account you own or add your own OpenAI API key in the dashboard, and the gateway uses them in order when your primary plan's window is exhausted.
