Claude Sonnet 5 becomes Claude Code's default model, with a full 1M-token context window
Anthropic's Claude Sonnet 5, released 30 June 2026, is now the default model in Claude Code and includes a full 1 million token context window at standard pricing, with no premium for large-context requests. Launch pricing of $2/$10 per million input/output tokens runs through 31 August 2026.
1 July 2026
Anthropic released Claude Sonnet 5 on 30 June 2026 and made it the default model in Claude Code and on Claude’s Free and Pro plans; it is also available to Max, Team, and Enterprise users. The detail worth flagging for anyone commissioning AI-assisted development: the full 1 million token context window is now included at standard pricing. A 9,000-token request and a 1,000,000-token request are billed at the same per-token rate, so there’s no surcharge for working across a large codebase in one pass.
Why the pricing structure matters more than the benchmark
Anthropic is running launch pricing of $2 per million input tokens and $10 per million output tokens through 31 August 2026, stepping up to $3/$15 after that. Sonnet 5 is described as the most agentic Sonnet model yet, with performance approaching Opus 4.8 at meaningfully lower cost, but the pricing structure is the more practical story. Large-context work (reviewing a whole repository, reasoning across a long spec, holding an entire codebase in context during a refactor) has historically been the expensive edge case in agentic coding. Removing the premium for it changes the economics of exactly the workflows agencies run most: multi-file refactors, whole-codebase migrations, and long-running autonomous tasks.
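To make the arithmetic concrete, here is a minimal sketch in Python that applies the quoted rates to a single request. The function name and the example token counts are our own illustration, not anything from Anthropic. The sketch assumes "through 31 August 2026" includes that day, and it ignores anything beyond the flat per-token rates (caching, batch discounts, or other billing adjustments would change the real figure).

```python
from datetime import date

# Rates in USD per million tokens, as quoted in this article.
LAUNCH_RATES = (2.00, 10.00)    # (input, output), through 31 August 2026
STANDARD_RATES = (3.00, 15.00)  # (input, output), after the launch period
LAUNCH_END = date(2026, 8, 31)  # assumption: the end date is inclusive


def estimate_cost_usd(input_tokens: int, output_tokens: int, on: date) -> float:
    """Estimate the cost of one request at flat per-token rates.

    Hypothetical helper for illustration only. The same rate applies
    whatever the size of the request, so there is no large-context tier.
    """
    input_rate, output_rate = LAUNCH_RATES if on <= LAUNCH_END else STANDARD_RATES
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Example: a whole-repository review with 800,000 input tokens
# and 20,000 output tokens (made-up figures).
during_launch = estimate_cost_usd(800_000, 20_000, date(2026, 7, 15))
after_launch = estimate_cost_usd(800_000, 20_000, date(2026, 9, 15))
print(f"During launch pricing: ${during_launch:.2f}")  # $1.80
print(f"After launch pricing:  ${after_launch:.2f}")   # $2.70
```

On these assumptions, the same large-context request costs $1.80 during the launch window and $2.70 afterwards. The number that moves is the rate card, not the size of the context.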
This follows Cursor’s move a day earlier to split Teams pricing into Standard and Premium tiers based on usage intensity. Both companies are converging on the same read: AI coding tool spend is no longer flat-rate, and the tools that make heavy, large-context agentic use cheaper rather than more expensive will win the procurement conversation.
So what
If your development partner is still pricing AI-assisted work as a flat add-on rather than accounting for which models and context sizes a job actually needs, they’re either overcharging you or under-delivering on the parts of a build where large context actually pays off: architecture review, cross-codebase refactors, and long-running agent tasks. We run Claude Code as part of our day-to-day workflow and factor model and token costs into project scoping rather than treating them as fixed overhead. More on how that fits our delivery process on the AI-assisted development page, or get in touch to talk through a project.