GPT Image 2 is the unofficial name for OpenAI's next-generation image model. It's the visual output of a much larger initiative, OpenAI's multimodal reasoning model codenamed "Spud", and early leaks suggest it's the biggest jump in AI image quality since DALL·E 3 shipped.
This post sums up what's publicly known as of April 2026 and what it means for anyone building on top of OpenAI's image API.
## The leak: three "duct tape" models
In the first week of April 2026, three anonymous image models appeared on LM Arena's blind testing grid under matching codenames:
- maskingtape-alpha
- gaffertape-alpha
- packingtape-alpha
Within hours, developer @levelsio flagged the outputs as dramatically better than the current public leaderboard leaders. The three models were pulled from LM Arena within 24 hours. Shortly after, ChatGPT Plus and Pro users began reporting an A/B test where ordinary image requests were randomly routed to a noticeably sharper, more literate image engine.
Taken together with OpenAI's internal naming conventions (earlier pre-release models used codenames like "Chestnut" and "Hazelnut"), the community consensus is that the tape trio is GPT Image 2 in gray-box testing.
## What makes GPT Image 2 different
### 1. Text rendering crossed 99% accuracy
This is the headline upgrade. GPT Image 1.5 already cleared ~95% on English, but tripped on long signage and non-Latin scripts. GPT Image 2 produces:
- Long printed receipts with real item names and correct decimal alignment
- App-UI screenshots with real button copy and punctuation
- Handwritten notes that stay legible across lines
- Chinese, Japanese, Korean, and Arabic text as a first-class output
For e-commerce, posters, UI mockups, and anything with copy in the frame, this is the difference between "needs a designer to fix" and "ship it."
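Getting that copy right still starts on the prompt side: the surest way to keep in-frame text faithful is to put the exact, pre-formatted text into the prompt itself. Here is a minimal sketch in plain Python (no SDK dependency); the helper name, item data, and prompt wording are illustrative, not from any OpenAI documentation:

```python
def receipt_prompt(store: str, items: list[tuple[str, float]]) -> str:
    """Build an image prompt whose embedded receipt text is already
    decimal-aligned, so the model only has to render it faithfully."""
    width = max(len(name) for name, _ in items)
    lines = [f"{name.ljust(width)}  {price:>7.2f}" for name, price in items]
    total = sum(price for _, price in items)
    lines.append(f"{'TOTAL'.ljust(width)}  {total:>7.2f}")
    body = "\n".join(lines)
    return (
        f"A photorealistic printed receipt from '{store}'. "
        f"Render this text exactly, monospaced, decimals aligned:\n{body}"
    )

prompt = receipt_prompt("Corner Deli", [("Espresso", 3.50), ("Bagel", 2.25)])
print(prompt)
```

The point of the alignment step is that the model is asked to reproduce a string, not to do arithmetic and typesetting on its own, which is where earlier generations drifted.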
### 2. World-knowledge reasoning
Old-generation image models paint shapes they've seen before. GPT Image 2 reasons before painting. The community test that went viral: ask for a desk with a sticky note reading "call Mina at 9" and a watch. GPT Image 2 draws the watch hands at 9 o'clock: it reads the note, interprets the time, and projects it onto the physical layout of a watch face. That's not image generation; that's multimodal inference.
### 3. Native 4K, real aspect ratios
GPT Image 1.5 topped out at 1536×1024 and three aspect ratios. Early GPT Image 2 outputs confirm native 4096×4096 and full 16:9 widescreen, with 9:16 vertical expected at launch. For video thumbnails, presentation slides, and social-video stills, the upscaler pipeline is gone.
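If you render for fixed layouts today, a small helper that snaps a target frame to the nearest supported output size keeps your pipeline ready for whatever the final size list turns out to be. A sketch, with the candidate sizes taken from the leaked outputs above (the actual launch set is unconfirmed):

```python
# Candidate output sizes seen in early GPT Image 2 leaks; treat this
# table as a placeholder until OpenAI publishes the real list.
CANDIDATE_SIZES = [
    (4096, 4096),  # native square 4K
    (4096, 2304),  # 16:9 widescreen
    (2304, 4096),  # 9:16 vertical (expected at launch)
]

def nearest_size(target_w: int, target_h: int) -> tuple[int, int]:
    """Pick the candidate size whose aspect ratio is closest to the target."""
    target = target_w / target_h
    return min(CANDIDATE_SIZES, key=lambda wh: abs(wh[0] / wh[1] - target))

print(nearest_size(1920, 1080))  # 16:9 request -> (4096, 2304)
```

Swapping the table for the official size list at launch is the only change this code should ever need.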
### 4. Sub-3-second generation
Because GPT Image 2 is autoregressive instead of diffusion-based, it paints in a single forward pass. Early latency clocks in at under 3 seconds per 1024² image, a 3–4× improvement over the last generation. For iterative prompt-refinement loops, this changes the ergonomics entirely.
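Concretely, sub-3-second latency turns refinement into an interactive loop rather than a batch job. A sketch of that loop with the API call stubbed out, since the real GPT Image 2 model name and parameters are unannounced:

```python
import time

def generate(prompt: str) -> bytes:
    """Stand-in for the image API call; swap in a real client here once
    the GPT Image 2 model identifier and parameters are public."""
    time.sleep(0.01)  # simulate generation latency
    return f"<image for: {prompt}>".encode()

def refine(prompt: str, tweaks: list[str]) -> list[tuple[str, float, bytes]]:
    """Run a prompt-refinement loop, timing each attempt."""
    results = []
    for tweak in [""] + tweaks:
        attempt = f"{prompt} {tweak}".strip()
        start = time.perf_counter()
        image = generate(attempt)
        results.append((attempt, time.perf_counter() - start, image))
    return results

runs = refine("neon alley at dusk", ["add rain", "add rain, low camera angle"])
```

At ~10 seconds per image this loop is something you walk away from; at under 3 seconds you stay in it, which is the ergonomic shift the latency numbers imply.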
### 5. Character lock and region control
Two new controls show up in third-party demos:
- Character lock: pin a subject across an entire batch. Useful for comic panels, character sheets, and product catalogs.
- Region-based prompting: describe per-region content in one prompt instead of wiring up ControlNet. "Top-left neon sign. Bottom-right rusty mech." One model call, no nodes.
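Since no official region syntax exists yet, the safest way to use this today is to generate the single prompt string programmatically so your region specs stay structured in your own code. A sketch mimicking the phrasing seen in demos; the helper and its format are assumptions, not an OpenAI API:

```python
def region_prompt(scene: str, regions: dict[str, str]) -> str:
    """Fold per-region directions into one prompt string, in the
    'Region: content.' style seen in third-party demos."""
    parts = [scene.rstrip(".") + "."]
    for region, content in regions.items():
        parts.append(f"{region.capitalize()}: {content.rstrip('.')}.")
    return " ".join(parts)

p = region_prompt(
    "A rain-soaked cyberpunk street",
    {"top-left": "neon sign", "bottom-right": "rusty mech"},
)
print(p)
```

Keeping regions as a dict means that if OpenAI later ships a structured region parameter, you change one function instead of every call site.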
## What's under the hood: "Spud"
GPT Image 2 is not a standalone image project. It's the visual output of Spud, OpenAI's next frontier model: a natively multimodal mixture-of-experts (MoE) architecture trained on text, image, audio, and video tokens together. Pre-training completed in late March 2026, per multiple independent sources.
The critical architectural difference: before any pixel is drawn, Spud runs a reasoning step over the prompt. Greg Brockman has called this generation "a model designed to move the economy" rather than to chase benchmarks; the emphasis is on agentic, production workflows, not pixel-art bragging rights.
## Release timeline
OpenAI has not confirmed a public launch date. Three hints bracket the window:
- March 24, 2026: OpenAI shut down Sora, citing "focus compute on next-generation products." Sora was burning $15M/day in inference at peak against just $2.1M in lifetime revenue. That compute is almost certainly being redirected to Spud / Image 2.
- May 12, 2026: DALL·E 2 and DALL·E 3 are permanently removed from OpenAI's API. A successor-shaped hole appears in the product lineup right at this date.
- April 2026: LM Arena gray-box tests and ChatGPT A/B tests are running. Historically, OpenAI ships 4–8 weeks after this phase.
Most external analysts bracket the official launch between mid-April and early June 2026.
## How GPT Image 2 compares
| | GPT Image 2 (expected) | Nano Banana Pro (Gemini 3 Pro) | MAI-Image-2 |
|---|---|---|---|
| Architecture | Autoregressive multimodal | Optimized diffusion | Diffusion (MS stack) |
| Text rendering | >99%, multi-script | Near-perfect, Latin-first | Strong on business graphics |
| World knowledge | Strongest (reasons before drawing) | Medium | Medium |
| Speed | <3s per image | 3–5× faster than GPT Image 1.5 | Fastest at its price tier |
| Max resolution | 4096² | Very high | 1024² optimized |
| Best for | Production assets, multilingual, reasoning-heavy | Photorealistic portraits, fast exploration | High-volume enterprise |
Nano Banana Pro still wins on raw photographic portraits and sheer speed. MAI-Image-2 wins on per-image cost. GPT Image 2 wins on instruction adherence, long text, and physical-world reasoning: the three things that actually block AI-generated assets from shipping to production.
## What this means for builders
- DALL·E is dead. If you are still calling `dall-e-3`, migrate now; the API shuts off May 12, 2026.
- The OpenAI image API will break old prompts. GPT Image 2 reads prompts much more literally than DALL·E 3. Audit your prompt library.
- Multi-model routing wins. Different models have different sweet spots and different pricing. A thin abstraction layer (like the one gptimage2.design sits on) is worth building now.
- Text-in-image is finally real. If you had a feature parked waiting for correct CJK rendering \u2014 unpark it.
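The routing layer doesn't need to be elaborate. A minimal sketch of the idea: classify each job by the sweet spots in the comparison table above and dispatch accordingly. The model identifiers here are placeholders, since none of these models' final API names are confirmed:

```python
from dataclasses import dataclass

@dataclass
class Job:
    needs_long_text: bool = False
    multilingual: bool = False
    photorealistic_portrait: bool = False
    budget_sensitive: bool = False

def route(job: Job) -> str:
    """Route a job to a model by its sweet spot. Model names are
    placeholders; wire in real API identifiers at launch."""
    if job.needs_long_text or job.multilingual:
        return "gpt-image-2"       # text rendering and reasoning
    if job.photorealistic_portrait:
        return "nano-banana-pro"   # raw photographic quality
    if job.budget_sensitive:
        return "mai-image-2"       # lowest per-image cost
    return "gpt-image-2"           # sensible default

print(route(Job(multilingual=True)))  # -> gpt-image-2
```

Even this much buys you the ability to reroute traffic the day one provider changes pricing, without touching call sites.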
We'll update this post when OpenAI ships the official launch and pricing. Follow updates on our changelog.

