Claude in 2026: the features that matter

If you’ve been following the AI space at all, you’ve probably noticed that things are moving fast. Anthropic’s Claude has gone through several major iterations in a short time, and the features that shipped in the last few months are genuinely changing how we work with these models day-to-day. Rather than listing every announcement, I want to focus on the features that actually matter when you’re building software.

What we'll cover

I'll walk you through the Claude features I find most impactful for the kind of work we do at Sourcelabs. We'll be looking at:

  • The Claude 4.6 model family and what’s different
  • Adaptive thinking and why it replaces manual extended thinking
  • Million-token context windows
  • Agent teams in Claude Code
  • What this means for how we build

The 4.6 generation

In February 2026, Anthropic released Claude Opus 4.6, followed shortly by Sonnet 4.6. If you’ve been on Opus 4.5 or Sonnet 4.5, the jump is noticeable. Opus 4.6 achieved the highest score on Terminal-Bench 2.0, a benchmark that specifically measures agentic coding ability - the kind of work where the model needs to plan, execute, and course-correct over extended tasks in real codebases.

What stands out to me is not just that benchmarks improved. It’s the qualitative difference: Opus 4.6 plans more carefully, handles edge cases better and sustains longer agentic tasks without losing the plot. If you’ve ever had a model confidently go down the wrong path for twenty steps before you notice, you’ll appreciate this.

The pricing stayed the same across the board ($5/$25 per million input/output tokens for Opus, $3/$15 for Sonnet), which is worth noting since that’s not usually how these things go.

Which model should I use?

If you’re doing complex agentic work or need the best reasoning, go with Opus 4.6. For everyday development tasks where speed matters more, Sonnet 4.6 is an excellent balance. Haiku 4.5 remains the go-to for high-volume, latency-sensitive workloads. The model IDs are claude-opus-4-6, claude-sonnet-4-6 and claude-haiku-4-5 respectively.
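The rule of thumb above can be captured in a small helper. The mapping (and the workload labels) are mine, purely for illustration — they're not an official API:

```python
def pick_model(task: str) -> str:
    """Map a workload type to a Claude model ID (heuristic, not official)."""
    models = {
        "complex-agentic": "claude-opus-4-6",   # best reasoning, long agentic tasks
        "everyday-dev": "claude-sonnet-4-6",    # balance of speed and capability
        "high-volume": "claude-haiku-4-5",      # latency-sensitive, cost-conscious
    }
    return models[task]

print(pick_model("everyday-dev"))  # claude-sonnet-4-6
```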

Adaptive thinking

Extended thinking has been around for a while - the idea that Claude can show its work, reasoning through problems step by step before answering. What’s new with Opus 4.6 is adaptive thinking. Instead of you manually setting a budget_tokens value and hoping it’s the right amount, adaptive thinking lets the model decide how much reasoning a problem actually needs.

The API looks like this:

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=16000,
    thinking={"type": "adaptive"},
    messages=[...],
)

Simple question? Minimal thinking. Complex multi-step problem? It’ll allocate more reasoning budget automatically. You can also set an effort parameter with four levels (low, medium, high, max) to guide the balance between intelligence, speed and cost.

This matters because it removes a source of guesswork. Previously you had to decide upfront how much thinking budget to allocate, and getting it wrong meant either wasting tokens or not giving the model enough room to reason through a hard problem. Adaptive thinking handles this for you.
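If you do want to nudge the model, the effort levels can be expressed as part of the thinking config. A sketch — the exact field name and placement of `effort` is an assumption on my part, so check the API reference for the authoritative shape:

```python
def thinking_config(effort: str = "medium") -> dict:
    """Build an adaptive-thinking config at a given effort level.

    Note: passing effort inside the thinking dict is an assumption,
    not a confirmed API shape.
    """
    levels = {"low", "medium", "high", "max"}
    if effort not in levels:
        raise ValueError(f"effort must be one of {sorted(levels)}")
    return {"type": "adaptive", "effort": effort}

print(thinking_config("high"))  # {'type': 'adaptive', 'effort': 'high'}
```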

A note on summarised thinking

Starting with Claude 4, the API returns a summary of Claude’s internal reasoning rather than the full chain of thought. You are still billed for the full thinking tokens, not the summary. Keep this in mind when estimating costs - the billed token count will be higher than what you see in the response.
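To make the billing point concrete, here's a rough estimator at the Opus rates quoted above. It assumes thinking tokens are billed as output at their full (unsummarised) count — a back-of-the-envelope sketch, not an official pricing formula:

```python
def estimate_cost_usd(input_tokens: int, output_tokens: int, thinking_tokens: int,
                      input_rate: float = 5.0, output_rate: float = 25.0) -> float:
    """Rough cost estimate at Opus 4.6 rates ($ per million tokens).

    Assumes thinking tokens are billed as output at the full count,
    even though the response only shows a summary.
    """
    billed_output = output_tokens + thinking_tokens
    return (input_tokens * input_rate + billed_output * output_rate) / 1_000_000

# 10K tokens in, 2K visible out, 8K of (hidden) thinking:
print(round(estimate_cost_usd(10_000, 2_000, 8_000), 2))  # 0.3
```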

Million-token context windows

Both Opus 4.6 and Sonnet 4.6 now support a 1M token context window (in beta). That’s roughly 750,000 words. To put this in perspective: you could fit a medium-sized codebase, its documentation and its test suite into a single prompt.

Opus 4.6 particularly shines here. On the MRCR v2 benchmark (which tests the ability to find needles in a haystack across a million tokens), it scores 76% on the 8-needle variant. The previous best, Sonnet 4.5, managed 18.5%. That’s not an incremental improvement; it’s a fundamental shift in how reliably the model can use all that context.

To enable the 1M context window, you pass a beta header:

response = client.beta.messages.create(
    model="claude-opus-4-6",
    max_tokens=16000,
    betas=["context-1m-2025-08-07"],
    messages=[...],
)

There is a pricing premium for prompts exceeding 200K tokens ($10/$37.50 per million tokens for Opus), so it’s worth being deliberate about when you need the full context.
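Here's what that premium looks like in practice. This sketch assumes (as with earlier long-context pricing) that the premium rate applies to the entire request once the prompt crosses 200K tokens — verify against the current pricing page before budgeting:

```python
def opus_input_cost_usd(prompt_tokens: int) -> float:
    """Input cost at Opus 4.6 rates: $5/M up to 200K tokens, $10/M beyond.

    Assumes the premium rate covers the whole request once the threshold
    is exceeded - check the pricing page for the authoritative rule.
    """
    rate = 10.0 if prompt_tokens > 200_000 else 5.0  # $ per million input tokens
    return prompt_tokens * rate / 1_000_000

print(opus_input_cost_usd(150_000))  # 0.75
print(opus_input_cost_usd(500_000))  # 5.0
```

At half a million tokens of context, input alone costs more than six times a typical 150K-token prompt, which is why it pays to be deliberate.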

Agent teams in Claude Code

This one is closer to our day-to-day. Claude Code now supports assembling multiple Claude agents that work in parallel. If you’re using Claude Code for tasks that involve touching multiple parts of a codebase simultaneously - refactoring a service layer while updating its tests and documentation, for example - agents can now split this work across parallel tracks.

Combined with context compaction (which automatically summarises older context to make room for new work), this means Claude Code can sustain much longer working sessions without degrading. If you’ve ever had to restart a session because the model lost track of what it was doing thirty minutes in, this is the fix.

Claude Code itself has been on a trajectory - it crossed $1 billion in annualized revenue in December 2025 and has since expanded into Xcode, Microsoft 365 Copilot and Foundry. The tooling around it is maturing quickly.

Interleaved thinking with tools

One more feature worth calling out: interleaved thinking for tool use. Previously, extended thinking happened once at the start, then tool calls followed. Now Claude can reason between tool calls - it can think after receiving a tool result before deciding what to do next.

This is significant for agentic workflows. Instead of planning everything upfront and executing linearly, the model can adapt its approach based on what it learns from each tool interaction. Think of it as the difference between writing a script and having a conversation.

For Opus 4.6 this is enabled automatically with adaptive thinking. For Sonnet 4.6 and other Claude 4 models, you need the interleaved-thinking-2025-05-14 beta header.
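Putting the two configurations side by side, a request builder might look like this. The field shapes mirror the examples earlier in the post; the manual `budget_tokens` value for Sonnet is an illustrative choice, not a recommendation:

```python
def request_kwargs(model: str) -> dict:
    """Sketch: interleaved thinking is automatic for Opus 4.6 (via adaptive
    thinking); Sonnet 4.6 and other Claude 4 models need the beta flag and
    a manual thinking budget. Shapes follow the examples above."""
    if model == "claude-opus-4-6":
        return {
            "model": model,
            "max_tokens": 16000,
            "thinking": {"type": "adaptive"},
        }
    return {
        "model": model,
        "max_tokens": 16000,
        "thinking": {"type": "enabled", "budget_tokens": 8000},
        "betas": ["interleaved-thinking-2025-05-14"],
    }

print(request_kwargs("claude-sonnet-4-6")["betas"])
```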

Conclusion

What I find compelling about this latest round of updates is that they address real friction points rather than just pushing benchmark numbers. Adaptive thinking removes guesswork from reasoning budgets. The million-token context window makes it practical to work with entire codebases. Agent teams and interleaved thinking make longer, more complex tasks feasible.

If you’re already using Claude in your development workflow (and let’s face it, most of us are at this point), these features are worth exploring. The models have gotten meaningfully better at the kind of work that matters: sustained, multi-step tasks in real-world codebases where getting the details right is what counts.