DevReview

Claude Sonnet 5 vs GPT-5.6: Coding, Cost, Access

On this page
  1. The quick verdict
  2. Who wins what
  3. Where Claude Sonnet 5 wins
  4. Where GPT-5.6 wins
  5. Price and access
  6. So which should you choose?
  7. Sources and further reading

Claude Sonnet 5 vs GPT-5.6 is the coding-model question of the moment, and it comes with an unusual twist: only one of them is something you can actually use today. Anthropic shipped Claude Sonnet 5 on June 30, 2026, generally available on every plan, through the API, and inside Claude Code, its agentic coding tool. OpenAI previewed GPT-5.6 (a trio named Sol, Terra and Luna) days earlier, but as a limited preview for trusted partners, not something in ChatGPT yet. So the honest read is two-sided: on raw benchmarks the two trade blows, but on availability, price, and shipping today, the picture is lopsided. Here is where each one wins, with the launch numbers and no hype.

The short answer

Two flagships, one you can actually use. Claude Sonnet 5 is generally available today, including in Claude Code, costs about half as much on output as GPT-5.6's Sol tier, scores 63.2 on SWE-bench Pro, the real pull-request benchmark (no GPT-5.6 figure is published yet), and has a 1M-token context window. GPT-5.6 (Sol, Terra, Luna) tops terminal and agent-environment benchmarks and edges classic HumanEval, but it is a limited preview for trusted partners, not yet in ChatGPT.

Available now: Sonnet 5 GA; GPT-5.6 preview
~2x cheaper: Sonnet 5 output vs GPT-5.6 Sol
63.2: Sonnet 5 SWE-bench Pro

[Figure: answer card] The whole decision on one card. Ships today and cheaper, or tops the agentic benchmarks but hard to get.

For a year the frontier race has been Anthropic and OpenAI trading the lead every few months, and the last week of June 2026 was another exchange. Anthropic launched Claude Sonnet 5, a Sonnet-tier model that reaches what used to be Opus quality on coding, and made it available everywhere at once. OpenAI's entry is a preview of GPT-5.6, a three-model family with a headline terminal-benchmark record. But "available everywhere" versus "preview for trusted partners" is the difference that shapes this whole comparison, so we will treat it as a first-class factor, not a footnote. Here is the honest split, with the launch numbers in front of us.

One quick clarification, because the naming trips people up: Claude Code is Anthropic's agentic coding tool, a terminal assistant that edits your repo and runs commands. Claude Sonnet 5 is the model that now powers it. So "Claude Code Sonnet 5" is not a new product; it is Sonnet 5 doing the work inside Claude Code.

The quick verdict

Skimming? Here it is, by who you are.

| You are... | The pick | Why |
| Shipping code this week | Claude Sonnet 5 | Generally available, including in Claude Code |
| Cost-sensitive at scale | Claude Sonnet 5 | About half the output price of GPT-5.6 Sol |
| Building terminal or agent-environment tools | GPT-5.6 (Sol) | Tops Terminal-Bench; ultra mode adds subagents |
| A trusted partner with preview access | GPT-5.6 | Peak agentic scores, if you can get it |
| Working inside Claude Code | Claude Sonnet 5 | It is the default coding model there |

That top row is the whole story in miniature: for most people, only one of these two is a model you can actually run today.

Who wins what

The benchmark picture is genuinely split, and the numbers below are the launch figures. There is no clean apples-to-apples table because the two labs report on different tests, so here is the honest "who leads each dimension" view.

| Dimension | Edge | Detail |
| Classic coding (HumanEval) | GPT-5.6 | 92.3 vs 89.7 for Sonnet 5 |
| Terminal / agent-environment (Terminal-Bench 2.1) | GPT-5.6 | Sol 88.8, ultra mode 91.9, above every Claude including Opus 4.8 at 78.9 |
| Real pull requests (SWE-bench Pro) | Claude Sonnet 5 | 63.2 (GPT-5.6's figure not yet published) |
| Output price | Claude Sonnet 5 | About 15 dollars per million vs Sol's 30 |
| Available today | Claude Sonnet 5 | GA everywhere vs GPT-5.6's limited preview |

The pattern: GPT-5.6 owns the machine-operating, terminal-driving end of coding, where its ultra mode and subagents shine. Claude Sonnet 5 owns the write-and-fix-a-real-pull-request end, plus everything about actually getting your hands on it.

Where Claude Sonnet 5 wins

Three things, and they compound. First, you can use it now. It is generally available on every plan, on the API, and in Claude Code, which for most teams is the entire ballgame against a model still gated behind a partner preview. Second, price. At roughly 3 dollars per million input and 15 per million output (with an introductory 2 and 10 through August 31, 2026), it runs about half the output cost of GPT-5.6's Sol tier. On a real bill, where output dominates, that gap is the difference between a feature being affordable and not.

Third, the coding itself is near-Opus. On SWE-bench Pro, the benchmark built from actual repository pull requests and the one most teams trust as a proxy for production work, Sonnet 5 posts 63.2, and OpenAI has not yet published a GPT-5.6 figure to set against it. It has a 1,000,000-token context window, so a whole codebase fits; it accepts high-resolution images; and inside Claude Code it runs at high effort and drives agentic loops end to end. The honest caveat: on the most demanding terminal and abstract-reasoning tasks, GPT-5.6 is ahead, and if peak agentic performance is the whole job, that matters.

Where GPT-5.6 wins

GPT-5.6 is built for the frontier of agentic coding, and its headline number backs that up. On Terminal-Bench 2.1, which measures a model driving a terminal through a real task, Sol scores 88.8 and its new ultra mode reaches 91.9, higher than every Claude model on record, Opus 4.8 included. Ultra mode is the interesting part: it goes beyond a single agent by orchestrating subagents, and OpenAI paired it with a new maximum reasoning effort for the hardest problems. It also edges Sonnet 5 on the classic HumanEval coding test, 92.3 to 89.7.

The family structure is a genuine strength too. Sol is the flagship, Terra matches the previous GPT-5.5 at about half the price, and Luna is the fast, cheap tier, so you can route easy calls to a lighter model. The catch, and it is a big one right now, is access: GPT-5.6 is a limited preview for trusted partners through the API and Codex, not in ChatGPT, with general availability promised in the coming weeks. Until that lands, most of these strengths are on paper for most developers.
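To make the routing idea concrete, here is a minimal sketch of sending easy calls to a lighter tier. Everything in it is an assumption for illustration: the tier strings are placeholders rather than real API model identifiers, the 1-to-5 difficulty score is something the caller would have to supply, and the thresholds are arbitrary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    description: str
    difficulty: int  # assumed scale: 1 (trivial) to 5 (hardest), set by the caller


def pick_tier(task: Task) -> str:
    """Return a placeholder tier name for a task based on its difficulty score.

    The strings "luna", "terra" and "sol" are hypothetical labels, not
    confirmed API model names, and the cut-offs are illustrative only.
    """
    if task.difficulty <= 2:
        return "luna"   # fast, cheap tier for easy calls
    if task.difficulty <= 4:
        return "terra"  # mid tier for ordinary work
    return "sol"        # flagship, reserved for the hardest problems


if __name__ == "__main__":
    examples = (
        Task("rename a variable", 1),
        Task("refactor a module", 3),
        Task("debug a failing deployment script", 5),
    )
    for task in examples:
        print(f"{task.description}: {pick_tier(task)}")
```

In practice the hard part is the difficulty score itself; a fixed threshold like this is only a starting point to tune against your own traffic.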

Price and access

The split that actually decides a lot of projects, in one place. Prices are the launch and preview rate cards and move often, so treat them as the shape of the gap, not gospel.

| | Claude Sonnet 5 | GPT-5.6 |
| Input, per 1M | ~$3 (intro ~$2) | Sol ~$5 · Terra ~$2.50 · Luna ~$1 |
| Output, per 1M | ~$15 (intro ~$10) | Sol ~$30 · Terra ~$15 · Luna ~$6 |
| Context window | 1,000,000 | ~1,000,000 class |
| Availability | GA: all plans, API, Claude Code | Limited preview (API + Codex, trusted partners) |
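To see what the rate card means for a bill, the sketch below multiplies the approximate prices from the table above by a token volume. The prices are the article's rounded launch and preview figures; the workload of 200M input and 50M output tokens per month is a made-up example, so substitute your own numbers.

```python
# Approximate prices from the table above: (input, output) in USD per 1M tokens.
RATES = {
    "Claude Sonnet 5": (3.00, 15.00),
    "Claude Sonnet 5 (intro)": (2.00, 10.00),
    "GPT-5.6 Sol": (5.00, 30.00),
    "GPT-5.6 Terra": (2.50, 15.00),
    "GPT-5.6 Luna": (1.00, 6.00),
}


def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD of a workload at the listed per-million rates."""
    input_rate, output_rate = RATES[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


if __name__ == "__main__":
    # Hypothetical monthly workload, chosen only for illustration.
    monthly_input = 200_000_000
    monthly_output = 50_000_000
    for model in RATES:
        total = cost_usd(model, monthly_input, monthly_output)
        print(f"{model:<26} ${total:>10,.2f}")
```

With this mix, Sonnet 5 at standard rates comes to $1,350 against $2,500 for Sol, $1,250 for Terra and $500 for Luna, and $900 at the introductory rate. The "about half" figure applies to output tokens specifically, so the overall ratio shifts with how input-heavy your workload is.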

Access is where they diverge hardest, and it is worth being blunt about it: you can start building on Sonnet 5 in the next five minutes, and you probably cannot start building on GPT-5.6 at all unless you are on OpenAI's partner list.

[Figure: terminal showing Claude Sonnet 5 running in Claude Code with a single command, next to a GPT-5.6 call via the OpenAI API whose output notes that preview access is limited to trusted partners.]
One you launch with a command today; the other you call via API only if you have preview access.

That gap is the practical core of the whole comparison: benchmarks are close and split, but one model is in your terminal now and the other is behind a preview gate.

So which should you choose?

Two clean cases. If you are writing and shipping code, want near-Opus quality at Sonnet prices, care about cost at volume, or simply need a model you can use today, Claude Sonnet 5 is the pick, and it is the obvious one inside Claude Code. If you are building terminal-driven or computer-use agents, chasing the very top of the agentic benchmarks, and you have preview access (or are happy to wait for general availability), GPT-5.6, especially Sol with ultra mode, is the frontier.

For most developers reading this in mid-2026, the tiebreaker is not a benchmark but the calendar: Claude Sonnet 5 is here, and GPT-5.6 is almost here. When GPT-5.6 goes generally available, this becomes a much closer, benchmark-by-benchmark call. Until then, the model you can actually run has a very large advantage, and it happens to be an excellent one.

Sources and further reading

Frequently asked questions

Which is better for coding, Claude Sonnet 5 or GPT-5.6?

It is close, and they win different benchmarks. GPT-5.6 Sol edges the classic HumanEval test (92.3 to 89.7) and clearly leads terminal and agent-environment work, where its ultra mode scores 91.9 on Terminal-Bench 2.1, above every Claude on record including Opus 4.8. Claude Sonnet 5 posts 63.2 on SWE-bench Pro, the benchmark built from real pull requests, where no GPT-5.6 figure has been published yet, and it lands near Opus 4.8 quality for far less money. So GPT-5.6 for peak terminal and agentic tasks, Claude Sonnet 5 for real-world pull-request coding, and honestly both are excellent for everyday work.

Can I use GPT-5.6 right now?

Not generally, and this is the biggest practical difference. As of its June 2026 preview, GPT-5.6 Sol, Terra and Luna are limited to trusted partners through the API and the Codex tool. They are not in ChatGPT, and general availability is promised in the coming weeks. Claude Sonnet 5, by contrast, is available today: it is the default model on the Free and Pro plans, it is on Max, Team and Enterprise, it is on the API, and it runs in Claude Code. If you need to build something this week, only one of these two is actually an option for most people.

How much do Claude Sonnet 5 and GPT-5.6 cost?

Claude Sonnet 5 is about 3 dollars per million input tokens and 15 per million output, with an introductory 2 and 10 through August 31, 2026. GPT-5.6 comes in three tiers: Sol, the flagship, at roughly 5 and 30; Terra, which matches the previous GPT-5.5, at about 2.50 and 15; and Luna, the fast, cheap tier, at about 1 and 6. So Sonnet 5 costs roughly half of Sol on output. Terra matches Sonnet 5's standard output price and is slightly cheaper on input, and Luna undercuts it on both, if you can accept a smaller model. Prices move, so confirm the current rate cards before committing a budget.

What is Claude Code, and how does Sonnet 5 fit in?

Claude Code is Anthropic's agentic coding tool: a command-line assistant that reads your repo, edits files, runs commands, and drives a task to completion. It is not a model. Claude Sonnet 5 is the new model that powers it, running at high effort for coding and agentic loops. So Claude Code Sonnet 5 just means using Sonnet 5 through Claude Code, rather than a separate product. You can also point Claude Code at other Claude models, but Sonnet 5 is the default coding model on the Free and Pro plans.

What context window and reasoning do they have?

Claude Sonnet 5 ships with a 1,000,000-token context window and up to 128,000 tokens of output, with adaptive thinking whose depth you can dial from low up to xhigh or max. GPT-5.6 sits in the same roughly one-million-token class and adds a new maximum reasoning effort plus an ultra mode that goes beyond a single agent by using subagents. Both hold a large codebase in context comfortably, so for most work the window is not the deciding factor between them.
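If you want a rough check on whether your own codebase fits, the sketch below estimates a token count from file sizes. It rests on loud assumptions: the 4-characters-per-token ratio is a common rule of thumb rather than either vendor's tokenizer, the file extensions are arbitrary, and reserving the full 128,000 output tokens inside the window is a conservative guess about how the budget is shared.

```python
import os

CONTEXT_WINDOW_TOKENS = 1_000_000  # window size quoted in this article
CHARS_PER_TOKEN = 4                # rough heuristic, NOT a real tokenizer


def estimate_tokens(root, extensions=(".py", ".ts", ".go", ".md")):
    """Roughly estimate the token count of source files under root."""
    total_chars = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(extensions):
                path = os.path.join(dirpath, name)
                # Ignore undecodable bytes so odd files do not abort the scan.
                with open(path, encoding="utf-8", errors="ignore") as f:
                    total_chars += len(f.read())
    return total_chars // CHARS_PER_TOKEN


def fits_in_context(root, reserve_for_output=128_000):
    """Return True if the estimate plus an assumed output reserve fits the window."""
    return estimate_tokens(root) + reserve_for_output <= CONTEXT_WINDOW_TOKENS
```

Calling `fits_in_context("path/to/repo")` gives a quick yes or no. Treat it as an order-of-magnitude check, and count tokens with the provider's own tooling before relying on the answer.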