GPT Image 2 is OpenAI's new flagship image generation model, announced on April 21, 2026 under the product name "ChatGPT Images 2.0". The API model ID is gpt-image-2, with a pinned snapshot of gpt-image-2-2026-04-21. It is the direct successor to gpt-image-1 (March 2025) and DALL·E 3; gpt-image-1 is now deprecated, and the DALL·E models retire on May 12, 2026.

OpenAI's framing on launch day is that this is its "most capable image generation model yet". Notably, it is also the company's first image model with O-series reasoning built in. In OpenAI's own words, the model is "built for production workflows, where images need to be accurate, readable, on-brand, localized, formatted for the destination surface, and usable without heavy cleanup."

This post breaks down what shipped, why it matters for production work, and how to actually call the API.

The four official capability pillars

OpenAI organized the launch around four pillars. Each one is a direct answer to a long-standing pain point with gpt-image-1 and DALL·E 3.

1. Asset Creation

More aspect ratios and resolutions up to 2K, targeted at apps, ads, product flows, social, presentations, and docs. The previous generation maxed out at 1536×1024 with a thin set of ratios; GPT Image 2 widens the surface to cover the formats real product teams actually ship in.
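If you want to request a specific aspect ratio at the new ceiling, a small helper can turn a ratio into a "WIDTHxHEIGHT" string. This is a rough sketch, not the documented size list: it assumes "2K" means a 2048-pixel long edge, that dimensions should be multiples of 16, and that the API accepts sizes in this form. None of that is confirmed here, and `fit_size` is a hypothetical helper, so check the model card for the sizes that are actually accepted.

```python
def fit_size(aspect_w: int, aspect_h: int, long_edge: int = 2048, multiple: int = 16) -> str:
    """Return a "WIDTHxHEIGHT" string for the given aspect ratio.

    Assumptions (not from the launch post): the long edge is capped at
    `long_edge` pixels and both sides are rounded down to `multiple`.
    """
    if aspect_w <= 0 or aspect_h <= 0:
        raise ValueError("aspect ratio terms must be positive")

    # Pin the longer side to long_edge and scale the shorter side to match.
    if aspect_w >= aspect_h:
        width, height = long_edge, long_edge * aspect_h / aspect_w
    else:
        width, height = long_edge * aspect_w / aspect_h, long_edge

    def snap(value: float) -> int:
        # Round down to the nearest multiple, never below one multiple.
        return max(multiple, int(value) // multiple * multiple)

    return f"{snap(width)}x{snap(height)}"
```

For example, `fit_size(16, 9)` gives a landscape size for presentations and `fit_size(9, 16)` the matching portrait size for social stories.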

2. Text-Heavy Visuals

Stronger structured generation — diagrams, infographics, charts, posters, comics — and significantly improved multilingual text rendering. Latin scripts are no longer the only first-class citizens; Japanese, Korean, Cyrillic, Arabic, and dense CJK layouts now render legibly, including non-trivial cases like manga panels and infographics with stacked labels.

3. Control & Precision

More reliable instruction-following, detail preservation, and composition. If you have ever asked the previous generation for "two specific objects, one in each corner" and gotten one or the other, this is the pillar that addresses that failure mode.

4. Reasoning Integration

This is the architectural story. GPT Image 2 is the first OpenAI image model that uses O-series reasoning — it thinks before it draws. With reasoning models, it can research the prompt, transform inputs, generate variations, and self-check the result. For workflows like "render an infographic that summarizes this PDF" or "produce a poster localized to four markets," the model is no longer just painting — it is reasoning over the brief first.

Why "thinks before it draws" matters

Previous image models, whether diffusion or autoregressive, went more or less straight from prompt to pixels: a diffusion model denoises a random starting point, an autoregressive model emits image tokens one at a time, and neither plans the result first. GPT Image 2 inserts a reasoning step before visual generation begins. That sounds incremental, but it changes the shape of what the model can do:

Multi-step briefs become reliable. "Build a quarterly performance poster from these numbers, in the brand color, with the headline in Korean" stops being three separate retry rounds.
Self-check closes obvious failure modes. The model can catch its own mis-rendered text or wrong aspect ratio before returning a result.
Tool-aware generation. Combined with reasoning models in the broader stack, it can transform structured input (CSVs, JSON, snippets of source) into the visuals that summarize them.
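To make the structured-input point concrete, here is a minimal sketch of turning a small CSV into a text brief that can be sent as the prompt. It only shows the prompt-building side; `brief_from_csv` and the wording of the brief are illustrative choices of ours, not an OpenAI format.

```python
import csv
import io


def brief_from_csv(csv_text: str, title: str, language: str = "English") -> str:
    """Build an infographic brief that lists every CSV row explicitly."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("CSV has no data rows")

    lines = [
        f'Create an infographic titled "{title}".',
        f"All visible text must be in {language}.",
        "Render exactly these data points, with no invented values:",
    ]
    # One bullet per row, so the numbers are stated rather than implied.
    for row in rows:
        fields = ", ".join(f"{key}: {value}" for key, value in row.items())
        lines.append(f"- {fields}")
    # Ask for the self-check step in so many words.
    lines.append("Before finishing, check that every label matches the data above.")
    return "\n".join(lines)
```

Spelling out each value in the brief gives the reasoning step something precise to check the rendered labels against.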

This is the key reason OpenAI is positioning GPT Image 2 as a production tool rather than a creative toy. The Image Arena results back the framing — gpt-image-2 took #1 in every category on launch, with a +242 point lead in Text-to-Image (the largest lead ever recorded on that leaderboard).

Multilingual text rendering, in practice

Multilingual text was the most-requested fix from the gpt-image-1 era, and it is the area where GPT Image 2 makes the biggest visible jump. Where the old model would scramble non-Latin glyphs or invent characters, the new one handles:

Japanese vertical layouts and manga panels with readable speech bubbles
Korean Hangul without the typical block-spacing failures
Cyrillic signage with full-length text: storefronts, posters, transit signs
Arabic with correct right-to-left flow and connected forms
Dense infographics with stacked labels, axis text, and footnotes that stay aligned

For e-commerce, app screenshots, posters, and any layout where copy lives inside the frame, this is the difference between "needs a designer to fix" and "ship it."
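When the same layout ships in several languages, it helps to put the exact copy and its reading direction into the prompt. The sketch below uses Python's standard `unicodedata` module to detect right-to-left scripts; `localized_prompts` and the prompt wording are our own illustration, and the helper does not detect vertical Japanese layouts, which you would need to request separately.

```python
import unicodedata
from typing import Dict


def text_direction(copy: str) -> str:
    """Return "rtl" if the first strongly directional character is right-to-left."""
    for ch in copy:
        bidi = unicodedata.bidirectional(ch)
        if bidi in ("R", "AL"):  # Hebrew-style and Arabic-style letters
            return "rtl"
        if bidi == "L":  # Latin, Cyrillic, CJK, Hangul, and so on
            return "ltr"
    return "ltr"  # digits or punctuation only: fall back to left-to-right


def localized_prompts(layout: str, headlines: Dict[str, str]) -> Dict[str, str]:
    """Build one prompt per locale, stating the copy and its direction."""
    prompts = {}
    for locale, headline in headlines.items():
        rtl = text_direction(headline) == "rtl"
        direction = "right-to-left" if rtl else "left-to-right"
        prompts[locale] = (
            f'{layout} Headline text, rendered exactly as written: "{headline}". '
            f"Lay the headline out {direction}."
        )
    return prompts
```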

How it compares

Feature	GPT Image 2	gpt-image-1 (Mar 2025)	DALL·E 3
Status	Live	Deprecated	Retires May 12, 2026
Reasoning	O-series, integrated	None	None
Max resolution	Up to 2K	1536×1024	1024×1792
Aspect ratios	Wide range	3 fixed ratios	3 fixed ratios
Multilingual text	Strong (CJK, Cyrillic, Arabic)	Latin-first, weak CJK	Latin-first
Image inputs	High-fidelity	Yes	Limited
API endpoints	`images/generations`, `images/edits`	Same	Same
Image Arena rank	#1 every category, +242 in T2I	—	—

DALL·E retirement note: OpenAI is retiring both DALL·E 2 and DALL·E 3 on May 12, 2026. GPT Image 2 becomes the default image model across ChatGPT and the OpenAI API. If you are still calling dall-e-3, the migration window is short.
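Before the cutoff, it is worth finding every place your codebase still names a DALL·E model. A minimal sketch using only the standard library follows; the file suffixes and the helper names `find_legacy_models` and `scan_tree` are our assumptions, so adjust them to your repo.

```python
import re
from pathlib import Path
from typing import Dict, List, Tuple

# Matches the two retiring model IDs, dall-e-2 and dall-e-3.
LEGACY_MODEL = re.compile(r"dall-e-[23]")


def find_legacy_models(text: str) -> List[Tuple[int, str]]:
    """Return (line number, model ID) for each legacy reference in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in LEGACY_MODEL.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


def scan_tree(root: str, suffixes=(".py", ".ts", ".js", ".json", ".yaml", ".yml")) -> Dict[str, List[Tuple[int, str]]]:
    """Walk a directory and report files that still mention a legacy model."""
    report = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            hits = find_legacy_models(text)
            if hits:
                report[str(path)] = hits
    return report
```

Running `scan_tree(".")` from the repo root gives a per-file list of lines to update.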

API quick-start

GPT Image 2 is accessible through the same image endpoints as the previous generation, so the migration is mostly a model-ID swap. The API model card documents the full surface.

Endpoints

POST /v1/images/generations — text-to-image
POST /v1/images/edits — image editing (with image input)

Modalities — input: text + image. Output: image.
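Because the edits endpoint takes an image as input, the request is a multipart form upload rather than JSON. The sketch below builds such a request with the standard library only and does not send it. The field names `model`, `prompt`, and `image` follow the existing images/edits endpoint; that gpt-image-2 accepts the same fields is an assumption based on the "same endpoints" claim, and the helper names are ours.

```python
import os
import uuid
import urllib.request
from typing import Dict, Optional, Tuple

EDITS_URL = "https://api.openai.com/v1/images/edits"


def encode_multipart(fields: Dict[str, str], file_field: str, filename: str,
                     file_bytes: bytes, mime: str = "image/png") -> Tuple[bytes, str]:
    """Encode text fields plus one file as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode("utf-8")
        )
    # The file part carries a filename and its own content type.
    header = (f"--{boundary}\r\n"
              f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
              f"Content-Type: {mime}\r\n\r\n").encode("utf-8")
    parts.append(header + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_edit_request(prompt: str, image_bytes: bytes, model: str = "gpt-image-2",
                       api_key: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST request for the edits endpoint."""
    key = api_key or os.environ.get("OPENAI_API_KEY", "")
    body, content_type = encode_multipart(
        {"model": model, "prompt": prompt}, "image", "input.png", image_bytes
    )
    return urllib.request.Request(
        EDITS_URL, data=body, method="POST",
        headers={"Authorization": f"Bearer {key}", "Content-Type": content_type},
    )
```

Pass the returned request to `urllib.request.urlopen` to send it; in most projects the official SDK will be the more convenient route.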

Model ID — gpt-image-2. Pin a snapshot with gpt-image-2-2026-04-21 if you need stable behavior.
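Here is a minimal text-to-image sketch against the generations endpoint, using only the standard library. It assumes the request body (`model`, `prompt`, `size`) and the base64 response field (`data[0].b64_json`) work as they do for gpt-image-1; the launch material quoted here does not spell that out, so treat both as assumptions and confirm against the model card. The helper names are ours.

```python
import base64
import json
import os
import urllib.request
from typing import Optional

API_URL = "https://api.openai.com/v1/images/generations"
PINNED_MODEL = "gpt-image-2-2026-04-21"  # snapshot ID from this post


def build_generation_request(prompt: str, model: str = "gpt-image-2",
                             size: str = "1024x1024",
                             api_key: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for text-to-image."""
    key = api_key or os.environ.get("OPENAI_API_KEY", "")
    body = json.dumps({"model": model, "prompt": prompt, "size": size}).encode("utf-8")
    return urllib.request.Request(
        API_URL, data=body, method="POST",
        headers={"Authorization": f"Bearer {key}", "Content-Type": "application/json"},
    )


def save_first_image(response_json: dict, path: str) -> int:
    """Decode data[0].b64_json (assumed response shape) and write it to path."""
    raw = base64.b64decode(response_json["data"][0]["b64_json"])
    with open(path, "wb") as fh:
        fh.write(raw)
    return len(raw)


def generate(prompt: str, path: str, **kwargs) -> int:
    """Send the request and save the first image; blocks until it is ready."""
    request = build_generation_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=300) as response:
        return save_first_image(json.load(response), path)
```

Calling `generate("A transit map poster", "poster.png", model=PINNED_MODEL)` pins the snapshot, which keeps behavior stable when the alias moves to a newer version.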

Pricing (per million tokens)

Token type	Input	Cached input	Output
Image tokens	$8.00	$2.00	$30.00
Text tokens	$5.00	$1.25	$10.00

Cached-input pricing is the main cost lever in production: if your prompts share a stable preamble (brand guidelines, style constraints, layout rules), the cached portion is billed at a quarter of the normal input rate ($2.00 vs. $8.00 per million image tokens, $1.25 vs. $5.00 per million text tokens).
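The arithmetic is easy to check with a small estimator built from the table above. The prices come from this post; the token counts in any call are placeholders, since how many tokens a given prompt or image consumes is not covered here.

```python
# USD per million tokens, copied from the pricing table in this post.
PRICES = {
    "image": {"input": 8.00, "cached_input": 2.00, "output": 30.00},
    "text": {"input": 5.00, "cached_input": 1.25, "output": 10.00},
}


def request_cost(kind: str, input_tokens: int, output_tokens: int,
                 cached_tokens: int = 0) -> float:
    """Return the USD cost of one request for a single token type.

    cached_tokens is the part of input_tokens billed at the cached rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    price = PRICES[kind]
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * price["input"]
        + cached_tokens * price["cached_input"]
        + output_tokens * price["output"]
    )
    return total / 1_000_000
```

With a 2,000-token text prompt where 1,500 tokens are a cached preamble, the input cost drops from $0.0100 to $0.004375: the cached 1,500 tokens cost $0.001875 instead of $0.0075, a 75% saving on that portion.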

Availability rollout

April 21, 2026 — announcement
April 22, 2026 — ChatGPT and Codex users get access
Early May 2026 — API rollout begins (see the developer community post for the live status)
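Since the API rollout is staged, one way to find out whether your account has access yet is to look for the model ID in the response from the models list endpoint (GET /v1/models). A small sketch follows; that gpt-image-2 shows up in this list once enabled is our assumption, not something the launch post states.

```python
import json
import os
import urllib.request
from typing import Optional

MODELS_URL = "https://api.openai.com/v1/models"


def model_available(model_id: str, models_response: dict) -> bool:
    """Check a parsed /v1/models response for a given model ID."""
    return any(entry.get("id") == model_id for entry in models_response.get("data", []))


def fetch_models(api_key: Optional[str] = None) -> dict:
    """Fetch the list of models visible to this API key (makes a network call)."""
    key = api_key or os.environ["OPENAI_API_KEY"]
    request = urllib.request.Request(MODELS_URL, headers={"Authorization": f"Bearer {key}"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

In a deploy script, `model_available("gpt-image-2", fetch_models())` can gate the switch from the old model ID to the new one.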

What is missing today

The launch surface is intentionally narrow. Worth knowing before you architect around it:

No streaming. You wait for the full image; you cannot incrementally render.
No function calling.
No structured outputs.
No fine-tuning yet.

For most product use cases this does not bite. If you were planning to build a pipeline that fine-tunes a brand-specific image model on your own asset library, that capability is not on the menu at launch — track the model card for updates.
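Without streaming, each request blocks until the full image is ready, so throughput comes from running requests side by side. The sketch below shows one way to do that with a thread pool. `generate_one` is a placeholder for whatever function makes your actual API call, and the worker count is an arbitrary default rather than a documented rate limit.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List, Union


def generate_batch(prompts: List[str], generate_one: Callable[[str], bytes],
                   max_workers: int = 4) -> Dict[str, Union[bytes, Exception]]:
    """Run blocking image calls in parallel and collect results per prompt.

    A failed call is stored as its exception instead of aborting the batch.
    Prompts are used as keys, so duplicates collapse to a single entry.
    """
    results: Dict[str, Union[bytes, Exception]] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(generate_one, prompt): prompt for prompt in prompts}
        for future in as_completed(futures):
            prompt = futures[future]
            try:
                results[prompt] = future.result()
            except Exception as exc:  # keep going; report the failure per prompt
                results[prompt] = exc
    return results
```

Threads fit here because the work is network-bound; the caller can retry only the prompts whose result is an exception.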

What this means for builders

Migrate off DALL·E 3 now. API shutoff is May 12, 2026. The lift is mostly a model-ID change; see the API quick-start above.
Audit your prompt library. GPT Image 2 reads instructions much more literally than DALL·E 3, and reasoning-integrated generation rewards prompts that state intent precisely. Our curated GPT Image 2 prompt library shows patterns that work, with verified output images.
Reach for cached-input pricing. If you are running a high-volume generation pipeline, structuring prompts so the brand/style preamble is reusable cuts the input bill by ~75% on that portion.
Text-in-image is finally production-ready. If you have parked features waiting for correct CJK or RTL rendering, unpark them.
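For the migration itself, the model-ID swap can be a small payload rewrite. The sketch below also drops `style` and `response_format`, which the existing API documents as DALL·E-only parameters; that gpt-image-2 rejects them the same way gpt-image-1 does is an assumption, and `migrate_payload` is a hypothetical helper. Sizes and quality values are passed through untouched, so review those by hand.

```python
from typing import List, Tuple

TARGET_MODEL = "gpt-image-2"
# Parameters assumed to be unsupported by the new model (DALL·E-only today).
DALLE_ONLY_PARAMS = ("style", "response_format")


def migrate_payload(old: dict) -> Tuple[dict, List[str]]:
    """Return a new request payload plus the names of dropped parameters."""
    dropped = [key for key in old if key in DALLE_ONLY_PARAMS]
    new = {key: value for key, value in old.items() if key not in DALLE_ONLY_PARAMS}
    new["model"] = TARGET_MODEL
    return new, dropped
```

Logging the dropped list during rollout makes it easy to spot call sites that relied on DALL·E-specific behavior.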

Try it now

GPT Image 2 is live in our generator via the gpt-image-2 model — no waitlist, no migration work. Open the canvas and drop a prompt, or browse the curated prompt library for ship-ready examples with verified output images you can copy and tweak.

New signups get 10 free credits — enough to see the new text rendering and reasoning integration on your own prompts before spending anything.

We will keep this post current as the API rollout completes and OpenAI ships follow-on capabilities. Subscribe via our changelog for updates.
