
Model Selection Strategy

Not every task deserves the same model. Sending every request through Opus when Haiku would do is expensive and slow. Sending architectural decisions through Haiku when they need Opus is risky. The right model selection strategy matches reasoning depth to task complexity — and switches mid-session when the task changes.


The Three-Tier Model

Opus — When Quality Outweighs Speed

Opus is the right choice when the cost of a wrong answer is high or when the problem genuinely requires deep, multi-step reasoning.

Use Opus for:

  • Architecture decisions and system design
  • Security analysis and threat modeling
  • Complex algorithmic problems with non-obvious solutions
  • Planning sessions that set direction for hours of downstream work
  • Any situation where Sonnet has gotten stuck in a loop or keeps producing the wrong answer

Opus costs more and responds more slowly. That trade-off is worth it when the stakes or the complexity are high.

Sonnet — The Default for Most Work

Sonnet hits the best cost-to-quality ratio for the majority of software development tasks. It is fast enough for interactive use, accurate enough for real implementation work, and cheap enough to use continuously throughout a session.

Use Sonnet for:

  • Feature implementation
  • Bug fixing and refactoring
  • Code review
  • Writing and editing documentation
  • Test authoring
  • Most debugging sessions

When in doubt, start here. Upgrade to Opus only when Sonnet demonstrably underperforms.

Haiku — Speed and Volume

Haiku is optimized for speed and low cost. It handles simple, mechanical tasks well and is the right choice when you need volume (many quick operations) or fast iteration cycles.

Use Haiku for:

  • Simple scripts and one-off utilities
  • Code formatting and minor style fixes
  • Boilerplate generation from a clear template
  • Quick lookups and single-question answers
  • High-volume repetitive tasks in automated pipelines

Switching Models

# Switch to Opus for a planning session
/model opus
# Switch back to Sonnet for implementation
/model sonnet
# Switch to Haiku for a batch of simple tasks
/model haiku

Model switches take effect immediately — the very next message uses the new model. You can switch as many times as you want within a session. The conversation history carries over; only the model handling the next response changes.

Default Model Configuration

{
  "model": "claude-sonnet-4-6"
}

Set this in .claude/settings.json for a project-level default, or ~/.claude/settings.json for a global default.
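
If the default needs to be set from a script (for example, when bootstrapping a repository), a minimal sketch might look like the following. The helper name `set_default_model` is hypothetical; the only details taken from this page are the `model` key and the two settings file locations. It merges the key into an existing file rather than overwriting it.

```python
import json
from pathlib import Path


def set_default_model(settings_path: Path, model: str) -> dict:
    """Merge a "model" key into a settings file, keeping any other keys."""
    settings = {}
    if settings_path.exists():
        # Preserve whatever is already configured (e.g. "effortLevel").
        settings = json.loads(settings_path.read_text(encoding="utf-8"))
    settings["model"] = model
    # Create the .claude directory if it does not exist yet.
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```

Calling `set_default_model(Path(".claude") / "settings.json", "claude-sonnet-4-6")` would target the project-level file; passing `Path.home() / ".claude" / "settings.json"` would target the global one.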


Effort Levels

Effort levels control how deeply the current model reasons about a task. They are independent of model choice — you can run Sonnet at high effort or Opus at low effort.

/effort low      # Fast, less thorough: mechanical tasks
/effort medium   # Balanced default: most work
/effort high     # Thorough analysis: bug investigation, review
/effort max      # Maximum reasoning depth: requires Opus 4.6
/effort auto     # Claude decides per task based on complexity signal

Configure the default in settings.json:

{
  "effortLevel": "medium"
}

When to Override Effort

| Situation | Effort Override |
| --- | --- |
| Debugging a subtle race condition | high |
| Reviewing a security-sensitive PR | high or max |
| Generating repetitive boilerplate | low |
| Quick syntax question | low |
| Routine feature implementation | medium (default) |
| Novel algorithmic problem | high |
| Production incident root cause | high |

Cost vs. Quality Matrix

| Task Type | Model | Effort | Reasoning |
| --- | --- | --- | --- |
| Architecture design | Opus | high | High stakes, worth the cost |
| Feature implementation | Sonnet | medium | Most tasks live here |
| Simple script / formatting | Haiku | low | Fast and cheap |
| Bug triage and root cause | Sonnet | high | Need thoroughness |
| Security review | Opus | max | Cannot miss threats |
| Documentation (volume) | Haiku | medium | Speed matters more |
| Code review | Sonnet | high | Need thoroughness |
| Refactoring a large module | Sonnet | medium | Standard implementation |
| Novel algorithmic design | Opus | high | Needs deep reasoning |
| Test generation (bulk) | Haiku | low | Mechanical, high volume |
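
For automated pipelines, the matrix above can be encoded as a plain lookup table. The sketch below is illustrative only: the snake_case task keys and the helper names `recommend` and `slash_commands` are assumptions made for this example, while the model and effort pairs are copied from the rows above.

```python
from typing import NamedTuple


class Choice(NamedTuple):
    model: str
    effort: str


# Model/effort pairs copied from the matrix; the keys are hypothetical labels.
MATRIX = {
    "architecture_design": Choice("opus", "high"),
    "feature_implementation": Choice("sonnet", "medium"),
    "simple_script": Choice("haiku", "low"),
    "bug_triage": Choice("sonnet", "high"),
    "security_review": Choice("opus", "max"),
    "bulk_documentation": Choice("haiku", "medium"),
    "code_review": Choice("sonnet", "high"),
    "large_refactor": Choice("sonnet", "medium"),
    "novel_algorithm": Choice("opus", "high"),
    "bulk_test_generation": Choice("haiku", "low"),
}

# Unknown task types fall back to the "when in doubt" default.
DEFAULT = Choice("sonnet", "medium")


def recommend(task_type: str) -> Choice:
    """Return the matrix row for a task type, or the Sonnet/medium default."""
    return MATRIX.get(task_type, DEFAULT)


def slash_commands(choice: Choice) -> list[str]:
    """Render a choice as the two slash commands used on this page."""
    return [f"/model {choice.model}", f"/effort {choice.effort}"]
```

Falling back to Sonnet at medium effort for unlisted task types mirrors the "start here" guidance rather than guessing a cheaper or more expensive tier.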

Subagent Model Selection

In multi-agent workflows, different agents in the same pipeline can use different models. Match model cost to the role:

Orchestrator (Opus, high effort)
└── Holds the plan, makes architectural decisions, reviews output
Implementation workers (Sonnet, medium effort)
└── Execute focused coding subtasks
Formatting/lint workers (Haiku, low effort)
└── Mechanical transformations, style fixes, boilerplate

This keeps total cost proportional to actual reasoning requirements across the pipeline rather than running everything at the highest tier.
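
To make the cost argument concrete, here is a rough sketch that compares a tiered pipeline with one that runs every role on Opus. The relative cost figures and task counts are invented for illustration and are not real pricing; only the ordering (Opus above Sonnet above Haiku) comes from this page. The `Role` type and `pipeline_cost` function are likewise hypothetical.

```python
from dataclasses import dataclass

# Hypothetical relative cost per subtask, NOT real pricing.
RELATIVE_COST = {"opus": 5.0, "sonnet": 1.0, "haiku": 0.25}


@dataclass(frozen=True)
class Role:
    name: str
    model: str
    effort: str
    tasks: int  # number of subtasks this role handles in one run (made up)


def pipeline_cost(roles: list[Role]) -> float:
    """Sum relative cost across roles; effort is ignored in this simple model."""
    return sum(RELATIVE_COST[r.model] * r.tasks for r in roles)


TIERED = [
    Role("orchestrator", "opus", "high", tasks=2),
    Role("implementation", "sonnet", "medium", tasks=10),
    Role("formatting", "haiku", "low", tasks=20),
]

# Same roles and task counts, but every role forced onto Opus.
ALL_OPUS = [Role(r.name, "opus", r.effort, r.tasks) for r in TIERED]
```

Under these assumed numbers, `pipeline_cost(TIERED)` is 25.0 against 160.0 for `ALL_OPUS`. The absolute values mean nothing; the point is that the gap scales with how much of the work is mechanical.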


Extended Thinking and Model Choice

Extended thinking — where Claude reasons through a problem step by step before answering — is available on all models but performs best on Opus 4.6. The ultrathink keyword activates extended thinking regardless of which model is active:

ultrathink through the failure modes of this distributed lock implementation

On Opus 4.6, ultrathink triggers the deepest available reasoning. On Sonnet, it still activates extended thinking but with less depth. On Haiku, the effect is minimal.

For tasks where you genuinely need ultrathink-level analysis, switch to Opus first:

/model opus
/effort max
ultrathink through the security implications of this authentication design

See Extended Thinking for full documentation on the ultrathink keyword and when to use it.


Model Selection Flow

flowchart TD
    TASK([New Task]) --> Q1{High stakes or\nnovel reasoning?}
    Q1 -- Yes --> OPUS["Use Opus\n/model opus"]
    Q1 -- No --> Q2{Simple or\nmechanical?}
    Q2 -- Yes --> HAIKU["Use Haiku\n/model haiku"]
    Q2 -- No --> SONNET["Use Sonnet\n/model sonnet (default)"]
    OPUS --> Q3{Effort level?}
    SONNET --> Q4{Effort level?}
    HAIKU --> Q5{Effort level?}
    Q3 -- Architecture/security --> MAXEFFORT["effort max or high"]
    Q3 -- Planning --> HIGHEFFORT["effort high"]
    Q4 -- Bug investigation/review --> HIGHEFFORT2["effort high"]
    Q4 -- Implementation --> MEDEFFORT["effort medium"]
    Q5 -- Any --> LOWEFFORT["effort low"]
    MAXEFFORT --> DONE([Execute])
    HIGHEFFORT --> DONE
    HIGHEFFORT2 --> DONE
    MEDEFFORT --> DONE
    LOWEFFORT --> DONE
    style TASK fill:#1e293b,color:#86efac,stroke:#334155
    style Q1 fill:#1e293b,color:#fcd34d,stroke:#334155
    style Q2 fill:#1e293b,color:#fcd34d,stroke:#334155
    style Q3 fill:#1e293b,color:#fcd34d,stroke:#334155
    style Q4 fill:#1e293b,color:#fcd34d,stroke:#334155
    style Q5 fill:#1e293b,color:#fcd34d,stroke:#334155
    style OPUS fill:#1e293b,color:#7dd3fc,stroke:#334155
    style SONNET fill:#1e293b,color:#7dd3fc,stroke:#334155
    style HAIKU fill:#1e293b,color:#7dd3fc,stroke:#334155
    style MAXEFFORT fill:#1e293b,color:#7dd3fc,stroke:#334155
    style HIGHEFFORT fill:#1e293b,color:#7dd3fc,stroke:#334155
    style HIGHEFFORT2 fill:#1e293b,color:#7dd3fc,stroke:#334155
    style MEDEFFORT fill:#1e293b,color:#7dd3fc,stroke:#334155
    style LOWEFFORT fill:#1e293b,color:#7dd3fc,stroke:#334155
    style DONE fill:#1e293b,color:#86efac,stroke:#334155
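
The decision flow above can also be read as two small functions. This is a sketch under stated assumptions: the function names and the activity labels (`"security"`, `"review"`, and so on) are hypothetical, and where the flowchart allows "max or high" for architecture and security work, the sketch arbitrarily picks max for security and high for everything else on Opus, which matches the Cost vs. Quality Matrix.

```python
def select_model(high_stakes_or_novel: bool, simple_or_mechanical: bool) -> str:
    """First two decisions in the flow: stakes first, then mechanical-ness."""
    if high_stakes_or_novel:
        return "opus"
    if simple_or_mechanical:
        return "haiku"
    return "sonnet"  # the default branch


def select_effort(model: str, activity: str) -> str:
    """Effort decision per model; activity labels are illustrative."""
    if model == "opus":
        # Assumption: security gets max; architecture and planning get high.
        return "max" if activity == "security" else "high"
    if model == "sonnet":
        return "high" if activity in {"bug_investigation", "review"} else "medium"
    return "low"  # Haiku: any activity
```

Note that the stakes check runs first, so a task that is both high-stakes and mechanical still lands on Opus.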

When to Force Opus Mid-Session

You started a session on Sonnet for implementation work. Switch to Opus immediately when:

  • Sonnet produces the wrong answer twice in a row on the same problem — it may be at the edge of its reasoning capability for this task
  • The task scope unexpectedly expands into architectural territory
  • You are about to make a decision that is hard to reverse (schema changes, API contracts, security design)
  • You are debugging something that has resisted multiple approaches and needs fresh, deep analysis

# Sonnet is stuck, so escalate to Opus
/model opus
ultrathink why this WebSocket reconnection logic produces intermittent race conditions

After the Opus session resolves the hard problem, switch back to Sonnet for the implementation:

# Hard part is understood — back to Sonnet for the code
/model sonnet
Now implement the fix we just designed
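
The "wrong twice in a row" trigger is easy to lose track of during a long session. One way to make it explicit, sketched here with a hypothetical `EscalationTracker` class that is not part of any tool, is to count consecutive misses per problem and reset the count on success.

```python
class EscalationTracker:
    """Counts consecutive wrong answers per problem and flags when to escalate."""

    def __init__(self, threshold: int = 2) -> None:
        # Two consecutive misses is the trigger named in the list above.
        self.threshold = threshold
        self._misses: dict[str, int] = {}

    def record(self, problem: str, correct: bool) -> bool:
        """Record an outcome; return True when switching to Opus is suggested."""
        if correct:
            # A correct answer breaks the streak for this problem.
            self._misses.pop(problem, None)
            return False
        self._misses[problem] = self._misses.get(problem, 0) + 1
        return self._misses[problem] >= self.threshold
```

Tracking per problem rather than per session keeps an unrelated earlier miss from triggering an escalation.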

Gotchas

Effort level persists across model switches. If you set /effort high and then switch models, the effort level stays at high. Reset it explicitly if needed after switching.

/effort max requires Opus 4.6. Max effort set on Sonnet or Haiku is either ignored or silently downgraded, so you get no warning that it did not apply. If you need max reasoning depth, switch with /model opus first.
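
The two gotchas above interact: because effort persists, switching away from Opus while at max leaves the session in the unsupported combination. The following toy model illustrates this. `SessionState` is a hypothetical class written for this example and does not reflect how the tool stores its state.

```python
from dataclasses import dataclass

EFFORTS = ("low", "medium", "high", "max", "auto")


@dataclass
class SessionState:
    model: str = "sonnet"
    effort: str = "medium"

    def switch_model(self, model: str) -> None:
        # Effort is deliberately left untouched: it persists across switches.
        self.model = model

    def set_effort(self, effort: str) -> None:
        if effort not in EFFORTS:
            raise ValueError(f"unknown effort level: {effort}")
        self.effort = effort

    def warnings(self) -> list[str]:
        """Flag the combination this page says is ignored or downgraded."""
        if self.effort == "max" and self.model != "opus":
            return ["max effort needs Opus; run /model opus or lower the effort"]
        return []
```

A check like `warnings()` after every model switch surfaces the mismatch that would otherwise pass silently.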

Haiku is not just “cheaper Sonnet.” Haiku is optimized for a different performance profile. It can confidently produce plausible-sounding wrong answers on tasks that require multi-step reasoning. Only use Haiku for tasks where you can quickly verify the output.

Auto effort is not magic. /effort auto lets Claude estimate task complexity, but Claude’s self-assessment is imperfect. For high-stakes tasks, always set effort explicitly rather than relying on auto.