Anthropic · Closed source

Claude Sonnet 4.6

Parameters: undisclosed
Context: 1M tokens
Max output: 64K tokens
Architecture: dense transformer
Pricing (per 1M tokens): $3 in / $15 out

Benchmark scores

SWE-bench Verified: 79.6%
GPQA Diamond: 88.5%
LM Arena Elo: 1478
Available via: API, Chat, Batch, Agent SDK, Managed Agents

Claude Sonnet 4.6 is the best value in the Claude lineup: 79.6% on SWE-bench Verified (just 1.2 points behind Opus) at 40% lower cost.

Benchmarks

Benchmark             Score    vs Opus 4.6
SWE-bench Verified    79.6%    -1.2 pts
GPQA Diamond          88.5%    -5.8 pts
LM Arena Elo          1478     -26 pts

The gap between Sonnet and Opus is the narrowest it has ever been. For most tasks, the quality difference is imperceptible.

Pricing

Per 1M tokens    Price     vs Opus
Input            $3.00     40% cheaper
Output           $15.00    40% cheaper
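At these rates, per-request cost is simple arithmetic. A minimal sketch; the token counts in the example are illustrative, not from the source:

```python
# Estimate request cost at Sonnet 4.6's listed rates ($3 in / $15 out per 1M tokens).
IN_RATE = 3.00 / 1_000_000    # USD per input token
OUT_RATE = 15.00 / 1_000_000  # USD per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed per-token rates."""
    return input_tokens * IN_RATE + output_tokens * OUT_RATE

# Example: a 200K-token context producing a 4K-token reply.
cost = request_cost(200_000, 4_000)
print(f"${cost:.2f}")  # → $0.66
```

Note that output tokens dominate cost only for generation-heavy workloads; with large contexts, the input side is usually the bigger line item.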

Architecture & capabilities

  • Context: 1M tokens — same as Opus, at standard pricing
  • Output: Up to 64K tokens
  • Thinking modes: Adaptive, extended, and interleaved thinking — same as Opus
  • Shares all core Opus improvements
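A minimal request sketch reflecting the limits above, shaped like a call to Anthropic's Messages API. The model ID `claude-sonnet-4-6` and the thinking budget are assumptions for illustration, not confirmed values; check the official documentation before use:

```python
# Build the request parameters for a Sonnet 4.6 call with extended thinking.
# "claude-sonnet-4-6" is an assumed model ID; the 64K max_tokens matches the
# output ceiling listed above, and budget_tokens is an arbitrary example value.
params = {
    "model": "claude-sonnet-4-6",
    "max_tokens": 64_000,
    "thinking": {"type": "enabled", "budget_tokens": 10_000},
    "messages": [{"role": "user", "content": "Review this function for bugs."}],
}

# With the official SDK (pip install anthropic) and ANTHROPIC_API_KEY set:
#   import anthropic
#   client = anthropic.Anthropic()
#   response = client.messages.create(**params)
#   print(response.content)

print(params["model"])
```

Because Sonnet exposes the same thinking modes as Opus, swapping between the two should require changing only the `model` field, which is what makes it a practical default.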

Strengths

  • Best price/performance ratio in the frontier tier
  • 1M context at $3/$15 — cheaper than most competitors
  • Same thinking capabilities as Opus
  • Strong enough for daily coding, analysis, and agentic work

Weaknesses

  • Slightly weaker on graduate-level science reasoning (GPQA gap)
  • Lower LM Arena ranking — noticeable in head-to-head comparisons

When to use

Default choice for most tasks. Use Opus only when you need the absolute best reasoning quality or are working on PhD-level scientific analysis.