Tech News

Claude Opus 4.7 Is Here: Best Coding Model, 3× Vision, Same Price

Anthropic's Claude Opus 4.7 is here, delivering massive gains in coding and vision at no extra cost. Discover the new features and benchmark results.

Eby Equipe BlueprintblogJun 196 min read

Claude Opus 4.7 Is Here: Best Coding Model, 3× Vision, Same Price

Struggling with AI models that stumble on complex coding tasks or lack the vision to handle your most demanding visual data? Anthropic has just released Claude Opus 4.7 to solve exactly that. This upgrade delivers their most powerful coding performance yet and triples your vision capabilities—all while keeping the price completely unchanged. It is available now on claude.ai, the API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry.

This is not an incremental update. SWE-bench Pro is up 10.9 percentage points. CursorBench is up 12 points. Vision resolution has tripled. And — a detail companies are happy to hear — the pricing has not changed.

SWE-bench Pro

64.3%

era 53,4% no Opus 4.6 (+10,9pp)

CursorBench

70%

era 58% (+12pp) — melhor coding do mercado

Visão

3.75MP

era 1,15MP — 3× mais resolução

What has actually changed

Opus 4.7 was built around three real problems that Opus 4.6 users were reporting: the model sometimes abandoned long tasks midway, sometimes delivered code that looked correct but failed review, and sometimes interpreted instructions more loosely than expected.

The three central bets of Opus 4.7 are directly aimed at these problems: persistence in long tasks, self-verification before reporting, and literal instruction following.

Benchmarks: where Opus 4.7 won and where it gave ground

Opus 4.6 vs Opus 4.7 comparison on key benchmarks

Opus 4.7 vs GPT-5.4 vs Gemini 3.1 Pro

Claude Opus 4.7 leads

SWE-bench Pro: 64.3% vs 57.7% (GPT) and 54.2% (Gemini) SWE-bench Verified: 87.6% CursorBench: 70% — best coding in IDE on the market MCP-Atlas (tool use): 77.3% vs 68.1% (GPT) Finance Agent: 64.4% vs 59.7% (Gemini) GDPVal-AA knowledge work: Elo 1,753 vs 1,674 (GPT)

Where it loses or ties

BrowseComp: 79.3% vs 89.3% (GPT) and 85.9% (Gemini) GPQA Diamond: 94.2% — practically tied (GPT: 94.4%, Gemini: 94.3%) Terminal-Bench 2.0: 69.4% vs 75.1% (GPT) Humanity's Last Exam: 54.7% vs 58.7% (GPT) CyberGym: intentional — cyber capabilities were reduced during training

3× better vision — what this changes in practice

Opus 4.6 processed images at up to 1,568px on the long side (1.15 megapixels). Opus 4.7 goes up to 2,576px (3.75 megapixels) — more than 3× more pixels.

In practice: dense technical diagrams, IDE screenshots, high-resolution PDF documents, design mockups, and complex financial charts arrive with true fidelity — not interpolated. The CharXiv visual reasoning with tools benchmark jumped from 84.7% to 91.0%.

The new xhigh level — fine control between quality and cost

Opus 4.6 had four effort levels: low, medium, high, and max. Opus 4.7 introduces a new level between high and max:

Effort level scale in Opus 4.7

xhigh is now the default for Claude Code across all plans. The logic is simple: if a task requires three attempts at high to get it right, one attempt at xhigh is usually cheaper in total — fewer retries, fewer tokens spent.

Task budgets, /ultrareview and cross-session memory

Three new features arriving with the model:

— Task budgets (public beta): set a token ceiling for autonomous agents. The model sees the counter decreasing and prioritizes the work, finishing cleanly instead of cutting off abruptly. Activate via header task-budgets-2026-03-13 + parameter output_config.task_budget.

— /ultrareview in Claude Code: new command that runs a dedicated review session, reads the entire diff, and flags what a careful human reviewer would detect. 3 free uses on Pro and Max plans at launch.

— Cross-session memory: Opus 4.7 is better at using file-system-based memory. It keeps important notes between long work sessions, reducing the context you need to paste at the start of each new session.

Attention on the 4.6 migration

Anthropic called it a "direct upgrade" but there are changes that affect token usage and behavior:

The elephant in the room: the Mythos Preview

Anthropic was transparent: Opus 4.7 does not match the Claude Mythos Preview, their most powerful model — which is not publicly available due to safety concerns.

The Mythos Preview was released last week to a select group of technology and cybersecurity companies as part of Project Glasswing. Opus 4.7 is the first model where Anthropic tested safeguards against use in cyberattacks — what they learn here will guide how they eventually release Mythos-level models at scale.

Pricing, availability, and model ID

Price identical to Opus 4.6: $5 per million input tokens and $25 per million output tokens. Prompt caching reduces by up to 90%. Batch processing reduces by 50%.

Model ID in the API: claude-opus-4-7. Available on: claude.ai (Pro, Max, Team, Enterprise), Anthropic API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry.

What remains of Opus 4.7

SWE-bench Pro: 64.3% (+10.9pp vs 4.6) — best coding model generally available on the market today.
CursorBench: 70% (+12pp) — the benchmark that measures what developers actually do day-to-day in an IDE.
3.75MP vision (was 1.15MP) — technical diagrams, screenshots, and mockups arrive with true fidelity.
xhigh: new effort level between high and max — now default in Claude Code. More quality without paying the max cost.
Task budgets (beta): token ceiling for autonomous agents. No more billing surprises on overnight runs.
/ultrareview in Claude Code: multi-agent parallel diff review with bug and design issue flagging.
Same price as Opus 4.6: $5/$25 per million tokens. There is no cheaper free upgrade than this.
BrowseComp regressed: 79.3% (was 83.7%). Agents that rely on intensive web research should test before migrating.
Breaking changes in migration: new tokenizer (1.0–1.35× more tokens), more literal instructions, mandatory adaptive thinking.

For most developers using Claude Code day-to-day, Opus 4.7 is a direct upgrade with no decision to make. Same price, better model.

For teams with agents in production, the migration requires attention: measure the impact of the new tokenizer, review prompts that relied on loose interpretation, and configure task budgets before turning on auto mode.

#AI #Claude

Claude Opus 4.7 Is Here: Best Coding Model, 3× Vision, Same Price

What has actually changed

Benchmarks: where Opus 4.7 won and where it gave ground

Opus 4.7 vs GPT-5.4 vs Gemini 3.1 Pro

3× better vision — what this changes in practice

The new xhigh level — fine control between quality and cost

Task budgets, /ultrareview and cross-session memory

Attention on the 4.6 migration

The elephant in the room: the Mythos Preview

Pricing, availability, and model ID

The elite dev's arsenal.

Amazon: 30,000 Layoffs and the Impact of AI on the Tech Market

China leads the open-source AI race — but the foundation is still American

Ubisoft Under Fire: Mass Strike Against Cuts and Mandatory Office Return