Model integration · Vertex AI

Vertex AI with portable Gemini access.

Keep Vertex where Google Cloud policy needs it. Send portable Gemini traffic through ProxyLLM on your own OpenRouter key, with budget caps and request logs around it.


$129/month SaaS. Bring your own model keys. No inference markup.

Three steps to connect.

Keep Vertex auth where it is

Vertex AI runs on Google Cloud projects, locations, and IAM. ProxyLLM does not handle Vertex credentials natively today, so that authentication stays in your GCP setup.

Pass Gemini through OpenRouter

For Gemini traffic that does not need a GCP project behind it, send requests with google/-prefixed model names to https://api.proxyllm.ai/v1 on your own OpenRouter key.

Split by governance

Use Vertex for GCP-governed workloads. Everything portable gets scoped sub-keys, budget caps, and request logs in ProxyLLM.
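The split above can be sketched as a small routing helper. This is a minimal illustration, not a ProxyLLM API: the Workload and Route types, the gcpGoverned flag, and the routeFor function are hypothetical names introduced here. Only the base URL and the google/ model name come from this page.

```typescript
// Hypothetical routing helper. The types and field names below are
// illustrative assumptions, not part of any ProxyLLM or Vertex SDK.
type Lane = "vertex" | "proxyllm";

interface Workload {
  name: string;
  // True when Google Cloud policy (project, location, IAM) must govern the request.
  gcpGoverned: boolean;
}

interface Route {
  lane: Lane;
  // Set only for the portable lane; Vertex keeps its existing configuration.
  baseURL?: string;
  model?: string;
}

const PROXYLLM_BASE_URL = "https://api.proxyllm.ai/v1";

function routeFor(
  workload: Workload,
  portableModel: string = "google/gemini-flash-1.5",
): Route {
  if (workload.gcpGoverned) {
    // GCP-governed workloads stay on Vertex with their own auth.
    return { lane: "vertex" };
  }
  // Everything portable goes through the OpenAI-compatible endpoint.
  return { lane: "proxyllm", baseURL: PROXYLLM_BASE_URL, model: portableModel };
}

console.log(routeFor({ name: "incident-summary", gcpGoverned: false }));
```

Keeping the decision in one function means the governance rule lives in a single place, and callers only see which lane and model to use.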

Gemini outside the project.

Use google/ model names through ProxyLLM when a request does not need a Vertex project behind it.

fallback.ts

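import OpenAI from "openai";

// Point the OpenAI SDK at ProxyLLM's OpenAI-compatible endpoint.
const client = new OpenAI({
  baseURL: "https://api.proxyllm.ai/v1",
  apiKey: "pk_live_...",
});

// Top-level await requires an ES module context.
const r = await client.chat.completions.create({
  // A google/ model name; no Vertex project is involved in this request.
  model: "google/gemini-flash-1.5",
  messages: [{ role: "user", content: "Summarize this GCP incident." }],
});

// Print the text of the first completion choice.
console.log(r.choices[0].message.content);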

Codex Hosted · the main feature

Run your AI workloads on your ChatGPT subscription.

ProxyLLM runs OpenAI's Codex for you, signed in with your own ChatGPT account. Your apps call one OpenAI-compatible endpoint and the work bills to your flat plan instead of per-token API pricing.


$129/month · normal SaaS pricing

Avoid a single-cloud model lane.

Portable Gemini traffic runs on your own key with caps and logs. GCP-governed workloads stay on Vertex, because ProxyLLM does not claim native Vertex credential handling today.
