What is GPT Image 2? Everything we know about OpenAI's next image model

Apr 17, 2026

GPT Image 2 is the unofficial name for OpenAI's next-generation image model. It's the visual output of a much larger initiative (OpenAI's multimodal reasoning model codenamed "Spud"), and early leaks suggest it's the biggest jump in AI image quality since DALL·E 3 shipped.

This post sums up what's publicly known as of April 2026 and what it means for anyone building on top of OpenAI's image API.

The leak: three "duct tape" models

In the first week of April 2026, three anonymous image models appeared on LM Arena's blind testing grid under matching codenames:

  • maskingtape-alpha
  • gaffertape-alpha
  • packingtape-alpha

Within hours, developer @levelsio flagged the outputs as dramatically better than the current public leaderboard leaders. The three models were pulled from LM Arena within 24 hours. Shortly after, ChatGPT Plus and Pro users began reporting an A/B test where ordinary image requests were randomly routed to a noticeably sharper, more literate image engine.

Taken together with OpenAI's internal naming conventions (earlier pre-release models used codenames like "Chestnut" and "Hazelnut"), the community consensus is that the tape trio is GPT Image 2 in gray-box testing.

What makes GPT Image 2 different

1. Text rendering crossed 99% accuracy

This is the headline upgrade. GPT Image 1.5 already cleared ~95% on English, but tripped on long signage and non-Latin scripts. GPT Image 2 produces:

  • Long printed receipts with real item names and correct decimal alignment
  • App-UI screenshots with real button copy and punctuation
  • Handwritten notes that stay legible across lines
  • Chinese, Japanese, Korean, and Arabic text as a first-class output

For e-commerce, posters, UI mockups, and anything with copy in the frame, this is the difference between "needs a designer to fix" and "ship it."
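One practical way to exploit reliable text rendering is to pin the exact copy in the prompt rather than describing it loosely. The helper below is a hypothetical sketch (not an OpenAI API): it serializes line items into verbatim receipt text, decimal-aligned, that you can embed in an image prompt.

```python
# Hypothetical helper: build verbatim receipt copy for a text-in-image prompt.
# Nothing here is an official API; it only prepares the exact text to render.
def receipt_prompt(store: str, items: list[tuple[str, float]]) -> str:
    """Serialize line items into right-aligned receipt text inside a prompt."""
    width = 28  # characters per receipt line
    lines = [store.center(width)]
    for name, price in items:
        amount = f"{price:.2f}"
        lines.append(f"{name:<{width - len(amount)}}{amount}")
    total = f"{sum(p for _, p in items):.2f}"
    lines.append(f"{'TOTAL':<{width - len(total)}}{total}")
    body = "\n".join(lines)
    return ("Photo of a printed paper receipt. Render this text verbatim, "
            "prices right-aligned:\n" + body)
```

The point of the pattern: the model no longer has to invent item names or math, so the "needs a designer to fix" failure mode shrinks to layout only.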

2. World-knowledge reasoning

Old-generation image models paint shapes they've seen before. GPT Image 2 reasons before painting. The community test that went viral: ask for a desk with a sticky note reading "call Mina at 9" and a watch. GPT Image 2 draws the watch hands at 9 o'clock: it reads the note, interprets the time, and projects it onto the physical layout of a watch face. That's not image generation; that's multimodal inference.

3. Native 4K, real aspect ratios

GPT Image 1.5 topped out at 1536×1024 and three aspect ratios. Early GPT Image 2 outputs confirm native 4096×4096 and full 16:9 widescreen, with 9:16 vertical expected at launch. For video thumbnails, presentation slides, and social-video stills, the upscaler pipeline is gone.
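If you want to wire aspect ratios into a pipeline today, a small lookup table is enough. Every value below is an assumption inferred from the leaked outputs (only 4096×4096 and 16:9 are reported; 9:16 is expected but unconfirmed), so treat it as a placeholder until OpenAI publishes official size parameters.

```python
# Assumed native sizes; placeholders, not an official spec.
NATIVE_SIZES = {
    "1:1": (4096, 4096),   # reported in early outputs
    "16:9": (4096, 2304),  # height derived from the 4096 width
    "9:16": (2304, 4096),  # vertical, expected at launch
}

def size_for(aspect: str) -> str:
    """Return a WxH size string for a supported aspect ratio."""
    w, h = NATIVE_SIZES[aspect]
    return f"{w}x{h}"
```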

4. Sub-3-second generation

Because GPT Image 2 is autoregressive instead of diffusion, it paints in a single forward pass. Early latency clocks in at under 3 seconds per 1024² image, a 3–4× improvement over the last generation. For iterative prompt refinement loops, this changes the ergonomics entirely.
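Sub-3-second latency makes budget-bounded refinement loops practical. The sketch below is illustrative only: `generate` is a stand-in for whatever image call you use, and the ~3 s figure comes from early reports, not a published SLA.

```python
import time

# Sketch of an iterative prompt-refinement loop with a wall-clock budget.
def refine(prompt: str, generate, rounds: int = 5, budget_s: float = 20.0):
    """Call generate() up to `rounds` times, stopping once the budget is spent."""
    start, outputs = time.monotonic(), []
    for i in range(rounds):
        if time.monotonic() - start > budget_s:
            break
        outputs.append(generate(f"{prompt} (revision {i + 1})"))
    return outputs
```

At ~3 s per image, a 20-second budget buys roughly six iterations, which is enough for a real edit conversation rather than a fire-and-wait workflow.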

5. Character lock and region control

Two new controls show up in third-party demos:

  • Character lock: pin a subject across an entire batch. Useful for comic panels, character sheets, and product catalogs.
  • Region-based prompting: describe per-region content in one prompt instead of wiring up ControlNet. "Top-left neon sign. Bottom-right rusty mech." One model call, no nodes.
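The region-prompting style above can be generated programmatically. This is a hypothetical helper, not a confirmed API surface: it just collapses per-region directives into the single prompt string that replaces a ControlNet graph.

```python
# Hypothetical helper: one prompt string from per-region directives.
# Region names are free-form text, not reserved keywords.
def compose_regional_prompt(scene: str, regions: dict[str, str]) -> str:
    """Join a base scene description with per-region content clauses."""
    parts = [scene.strip()]
    for region, content in regions.items():
        parts.append(f"{region}: {content.strip()}.")
    return " ".join(parts)
```

For example, `compose_regional_prompt("A rainy cyberpunk alley at night.", {"Top-left": "neon sign", "Bottom-right": "rusty mech"})` yields one model-call-ready prompt with both region clauses, no nodes.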

What's under the hood: "Spud"

GPT Image 2 is not a standalone image project. It's the visual output of Spud, OpenAI's next frontier model: a natively multimodal MoE architecture trained on text, image, audio, and video tokens together. Pre-training completed in late March 2026, according to multiple independent sources.

The critical architectural difference: before any pixel is drawn, Spud runs a reasoning step over the prompt. Greg Brockman has called this generation "a model designed to move the economy" rather than to chase benchmarks; the emphasis is on agentic, production workflows, not pixel-art bragging rights.

Release timeline

OpenAI has not confirmed a public launch date. Three hints bracket the window:

  • March 24, 2026: OpenAI shut down Sora, citing "focus compute on next-generation products." Sora was burning $15M/day in inference at peak against just $2.1M in lifetime revenue. That compute is almost certainly being redirected to Spud / Image 2.
  • May 12, 2026: DALL·E 2 and DALL·E 3 are permanently deprecated from OpenAI's API. A successor-shaped hole appears in the product lineup right at this date.
  • April 2026: LM Arena gray-box tests and ChatGPT A/B tests are running. Historically, OpenAI ships 4–8 weeks after this phase.

Most external analysts bracket the official launch between mid-April and early June 2026.

How GPT Image 2 compares

|  | GPT Image 2 (expected) | Nano Banana Pro (Gemini 3 Pro) | MAI-Image-2 |
| --- | --- | --- | --- |
| Architecture | Autoregressive multimodal | Optimized diffusion | Diffusion (MS stack) |
| Text rendering | >99%, multi-script | Near-perfect, Latin-first | Strong on business graphics |
| World knowledge | Strongest (reasons before drawing) | Medium | Medium |
| Speed | <3s per image | 3–5× faster than GPT Image 1.5 | Fastest at its price tier |
| Max resolution | 4096² | Very high | 1024² optimized |
| Best for | Production assets, multilingual, reasoning-heavy | Photorealistic portraits, fast exploration | High-volume enterprise |

Nano Banana Pro still wins on raw photographic portraits and on speed. MAI-Image-2 wins on per-image cost. GPT Image 2 wins on instruction adherence, long text, and physical-world reasoning: the three things that actually block AI-generated assets from shipping to production.

What this means for builders

  1. DALL·E is dead. If you are still calling dall-e-3, migrate now: the API shuts off May 12, 2026.
  2. The OpenAI image API will break old prompts. GPT Image 2 reads prompts much more literally than DALL\u00b7E 3. Audit your prompt library.
  3. Multi-model routing wins. Different models have different sweet spots and different pricing. A thin abstraction layer (like the one gptimage2.design sits on) is worth building now.
  4. Text-in-image is finally real. If you had a feature parked waiting for correct CJK rendering, unpark it.
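The routing layer from point 3 can be a few dozen lines. The sketch below is illustrative: the model ids and routing rules are assumptions for the example, not confirmed API identifiers, and real rules would also weigh price and latency.

```python
# Thin multi-model routing layer: first matching rule wins.
# Model ids are placeholders, not confirmed API names.
ROUTES = [
    # (predicate over task tags, model id), checked in order
    (lambda tags: "long_text" in tags or "multilingual" in tags, "gpt-image-2"),
    (lambda tags: "portrait" in tags, "nano-banana-pro"),
    (lambda tags: "bulk" in tags, "mai-image-2"),
]

def pick_model(tags: set[str], default: str = "gpt-image-2") -> str:
    """Return the first model whose predicate matches the task's tags."""
    for predicate, model in ROUTES:
        if predicate(tags):
            return model
    return default
```

Keeping the rules in data rather than scattered `if` statements means repricing or a new model launch is a one-line change, which is the whole point of the abstraction.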

We'll update this post when OpenAI ships the official launch and pricing. Follow updates on our changelog.

gptimage2.design Research
