Model Selection Strategy

Not every task deserves the same model. Routing every request through Opus when Haiku would do is expensive and slow; routing architectural decisions through Haiku when they need Opus is risky. A good model selection strategy matches reasoning depth to task complexity and switches mid-session when the task changes.

The Three-Tier Model

Opus — When Quality Outweighs Speed

Opus is the right choice when the cost of a wrong answer is high or when the problem genuinely requires deep, multi-step reasoning.

Use Opus for:

Architecture decisions and system design
Security analysis and threat modeling
Complex algorithmic problems with non-obvious solutions
Planning sessions that set direction for hours of downstream work
Any situation where Sonnet has gotten stuck in a loop or keeps producing the wrong answer

Opus costs more and responds more slowly. That trade-off is worth it when the stakes or the complexity are high.

Sonnet — The Default for Most Work

Sonnet hits the best cost-to-quality ratio for the majority of software development tasks. It is fast enough for interactive use, accurate enough for real implementation work, and cheap enough to use continuously throughout a session.

Use Sonnet for:

Feature implementation
Bug fixing and refactoring
Code review
Writing and editing documentation
Test authoring
Most debugging sessions

When in doubt, start here. Upgrade to Opus only when Sonnet demonstrably underperforms.

Haiku — Speed and Volume

Haiku is optimized for speed and low cost. It handles simple, mechanical tasks well and is the right choice when you need volume (many quick operations) or fast iteration cycles.

Use Haiku for:

Simple scripts and one-off utilities
Code formatting and minor style fixes
Boilerplate generation from a clear template
Quick lookups and single-question answers
High-volume repetitive tasks in automated pipelines

Switching Models

# Switch to Opus for a planning session
/model opus

# Switch back to Sonnet for implementation
/model sonnet

# Switch to Haiku for a batch of simple tasks
/model haiku

Model switches take effect immediately — the very next message uses the new model. You can switch as many times as you want within a session. The conversation history carries over; only the model handling the next response changes.

Default Model Configuration

{
  "model": "claude-sonnet-4-6"
}

Set this in .claude/settings.json for a project-level default, or ~/.claude/settings.json for a global default.
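To see which default would apply in a given checkout, the two files can be inspected in order. The following is a minimal sketch, not part of any official tooling: the helper names `read_model` and `resolve_default_model` are hypothetical, and it assumes the project-level file takes precedence over the global one.

```python
import json
from pathlib import Path
from typing import Optional


def read_model(settings_path: Path) -> Optional[str]:
    """Return the "model" value from a settings file, or None if absent."""
    if not settings_path.is_file():
        return None
    try:
        data = json.loads(settings_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return None  # treat an unparsable file as "no default set"
    model = data.get("model") if isinstance(data, dict) else None
    return model if isinstance(model, str) else None


def resolve_default_model(project_dir: Path, home_dir: Path) -> Optional[str]:
    """Check the project-level file first, then the global one.

    Assumption: project settings win over global settings.
    """
    candidates = (
        project_dir / ".claude" / "settings.json",
        home_dir / ".claude" / "settings.json",
    )
    for candidate in candidates:
        model = read_model(candidate)
        if model is not None:
            return model
    return None
```

Calling `resolve_default_model(Path.cwd(), Path.home())` returns the configured model string, or `None` when neither file sets one.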

Effort Levels

Effort levels control how deeply the current model reasons about a task. They are independent of model choice — you can run Sonnet at high effort or Opus at low effort.

/effort low      # Fast, less thorough — mechanical tasks
/effort medium   # Balanced default — most work
/effort high     # Thorough analysis — bug investigation, review
/effort max      # Maximum reasoning depth — requires Opus 4.6
/effort auto     # Claude decides per task based on complexity signals

Configure the default in settings.json:

{
  "effortLevel": "medium"
}

When to Override Effort

Situation	Effort Override
Debugging a subtle race condition	`high`
Reviewing a security-sensitive PR	`high` or `max`
Generating repetitive boilerplate	`low`
Quick syntax question	`low`
Routine feature implementation	`medium` (default)
Novel algorithmic problem	`high`
Production incident root cause	`high`

Cost vs. Quality Matrix

Task Type	Model	Effort	Rationale
Architecture design	Opus	high	High stakes, worth the cost
Feature implementation	Sonnet	medium	Most tasks live here
Simple script / formatting	Haiku	low	Fast and cheap
Bug triage and root cause	Sonnet	high	Need thoroughness
Security review	Opus	max	Cannot miss threats
Documentation (volume)	Haiku	medium	Speed matters more
Code review	Sonnet	high	Need thoroughness
Refactoring a large module	Sonnet	medium	Standard implementation
Novel algorithmic design	Opus	high	Needs deep reasoning
Test generation (bulk)	Haiku	low	Mechanical, high volume
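For scripted workflows, the matrix can be kept as a small lookup table. This is a sketch under stated assumptions: the dictionary keys are illustrative labels invented here, and falling back to Sonnet at medium effort for unknown task types simply mirrors the "when in doubt, start here" advice.

```python
from typing import NamedTuple


class Recommendation(NamedTuple):
    model: str
    effort: str


# Rows copied from the matrix above; the keys are hypothetical labels.
MATRIX = {
    "architecture_design": Recommendation("opus", "high"),
    "feature_implementation": Recommendation("sonnet", "medium"),
    "simple_script_or_formatting": Recommendation("haiku", "low"),
    "bug_triage": Recommendation("sonnet", "high"),
    "security_review": Recommendation("opus", "max"),
    "documentation_volume": Recommendation("haiku", "medium"),
    "code_review": Recommendation("sonnet", "high"),
    "large_refactor": Recommendation("sonnet", "medium"),
    "novel_algorithm": Recommendation("opus", "high"),
    "bulk_test_generation": Recommendation("haiku", "low"),
}

# Assumed fallback for task types that are not in the matrix.
DEFAULT = Recommendation("sonnet", "medium")


def recommend(task_type: str) -> Recommendation:
    """Look up the suggested model and effort for a task type."""
    return MATRIX.get(task_type, DEFAULT)


def slash_commands(task_type: str) -> list[str]:
    """Render the recommendation as the two commands to type."""
    rec = recommend(task_type)
    return [f"/model {rec.model}", f"/effort {rec.effort}"]
```

For example, `slash_commands("security_review")` yields the `/model opus` and `/effort max` pair from the matrix row.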

Subagent Model Selection

In multi-agent workflows, different agents in the same pipeline can use different models. Match model cost to the role:

Orchestrator (Opus, high effort)
  └── Holds the plan, makes architectural decisions, reviews output

Implementation workers (Sonnet, medium effort)
  └── Execute focused coding subtasks

Formatting/lint workers (Haiku, low effort)
  └── Mechanical transformations, style fixes, boilerplate

This keeps total cost proportional to actual reasoning requirements across the pipeline rather than running everything at the highest tier.
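One way to picture the role-to-tier mapping is as plain data that a dispatcher consults. The sketch below is illustrative only: `AgentProfile`, `PROFILES`, and `assign` are hypothetical names, and it does not call any real agent framework.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentProfile:
    model: str
    effort: str


# Role names are hypothetical labels for the three tiers described above.
PROFILES = {
    "orchestrator": AgentProfile(model="opus", effort="high"),
    "implementation": AgentProfile(model="sonnet", effort="medium"),
    "formatting": AgentProfile(model="haiku", effort="low"),
}


def assign(subtasks: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Group (role, description) subtasks by the model that should run them."""
    by_model: dict[str, list[str]] = defaultdict(list)
    for role, description in subtasks:
        profile = PROFILES[role]  # KeyError on an unknown role is deliberate
        by_model[profile.model].append(description)
    return dict(by_model)
```

Grouping by model makes it easy to check that only the orchestrator's work lands on the most expensive tier.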

Extended Thinking and Model Choice

Extended thinking — where Claude reasons through a problem step by step before answering — is available on all models but performs best on Opus 4.6. The ultrathink keyword activates extended thinking regardless of which model is active:

ultrathink through the failure modes of this distributed lock implementation

On Opus 4.6, ultrathink triggers the deepest available reasoning. On Sonnet, it still activates extended thinking but with less depth. On Haiku, the effect is minimal.

For tasks where you genuinely need ultrathink-level analysis, switch to Opus first:

/model opus
/effort max
ultrathink through the security implications of this authentication design

See Extended Thinking for full documentation on the ultrathink keyword and when to use it.

Model Selection Flow

flowchart TD
    TASK([New Task]) --> Q1{High stakes or\nnovel reasoning?}
    Q1 -- Yes --> OPUS["Use Opus\n/model opus"]
    Q1 -- No --> Q2{Simple or\nmechanical?}
    Q2 -- Yes --> HAIKU["Use Haiku\n/model haiku"]
    Q2 -- No --> SONNET["Use Sonnet\n/model sonnet (default)"]
    OPUS --> Q3{Effort level?}
    SONNET --> Q4{Effort level?}
    HAIKU --> Q5{Effort level?}
    Q3 -- Architecture/security --> MAXEFFORT["effort max or high"]
    Q3 -- Planning --> HIGHEFFORT["effort high"]
    Q4 -- Bug investigation/review --> HIGHEFFORT2["effort high"]
    Q4 -- Implementation --> MEDEFFORT["effort medium"]
    Q5 -- Any --> LOWEFFORT["effort low"]
    MAXEFFORT --> DONE([Execute])
    HIGHEFFORT --> DONE
    HIGHEFFORT2 --> DONE
    MEDEFFORT --> DONE
    LOWEFFORT --> DONE
    style TASK fill:#1e293b,color:#86efac,stroke:#334155
    style Q1 fill:#1e293b,color:#fcd34d,stroke:#334155
    style Q2 fill:#1e293b,color:#fcd34d,stroke:#334155
    style Q3 fill:#1e293b,color:#fcd34d,stroke:#334155
    style Q4 fill:#1e293b,color:#fcd34d,stroke:#334155
    style Q5 fill:#1e293b,color:#fcd34d,stroke:#334155
    style OPUS fill:#1e293b,color:#7dd3fc,stroke:#334155
    style SONNET fill:#1e293b,color:#7dd3fc,stroke:#334155
    style HAIKU fill:#1e293b,color:#7dd3fc,stroke:#334155
    style MAXEFFORT fill:#1e293b,color:#7dd3fc,stroke:#334155
    style HIGHEFFORT fill:#1e293b,color:#7dd3fc,stroke:#334155
    style HIGHEFFORT2 fill:#1e293b,color:#7dd3fc,stroke:#334155
    style MEDEFFORT fill:#1e293b,color:#7dd3fc,stroke:#334155
    style LOWEFFORT fill:#1e293b,color:#7dd3fc,stroke:#334155
    style DONE fill:#1e293b,color:#86efac,stroke:#334155
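The same decision flow can be written as two small functions. This is a sketch of the flowchart, not a definitive rule set: the activity labels are hypothetical, and where the chart says "effort max or high" the code picks `max` only for security work, which is an assumption borrowed from the matrix earlier in this section.

```python
def select_model(high_stakes_or_novel: bool, simple_or_mechanical: bool) -> str:
    """First two decision nodes: pick the model tier."""
    if high_stakes_or_novel:
        return "opus"
    if simple_or_mechanical:
        return "haiku"
    return "sonnet"  # the default branch


def select_effort(model: str, activity: str) -> str:
    """Effort nodes; `activity` uses hypothetical labels such as "security"."""
    if model == "opus":
        # The chart allows "max or high" for architecture/security.
        # Assumption: reserve max for security, use high for everything else.
        return "max" if activity == "security" else "high"
    if model == "sonnet":
        if activity in {"bug_investigation", "review"}:
            return "high"
        return "medium"
    return "low"  # Haiku: any activity


def plan(high_stakes_or_novel: bool, simple_or_mechanical: bool, activity: str) -> list[str]:
    """Return the slash commands implied by the chart for one task."""
    model = select_model(high_stakes_or_novel, simple_or_mechanical)
    return [f"/model {model}", f"/effort {select_effort(model, activity)}"]
```

Note that the high-stakes check runs first, so a task that is both mechanical and high stakes still goes to Opus, as in the chart.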

When to Force Opus Mid-Session

You started a session on Sonnet for implementation work. Switch to Opus immediately when:

Sonnet produces the wrong answer twice in a row on the same problem — it may be at the edge of its reasoning capability for this task
The task scope unexpectedly expands into architectural territory
You are about to make a decision that is hard to reverse (schema changes, API contracts, security design)
You are debugging something that has resisted multiple approaches and needs fresh, deep analysis

# Sonnet is stuck — escalate to Opus
/model opus
ultrathink why this WebSocket reconnection logic produces intermittent race conditions

After Opus resolves the hard problem, switch back to Sonnet for the implementation:

# Hard part is understood — back to Sonnet for the code
/model sonnet
Now implement the fix we just designed
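The escalation triggers listed in this section can be expressed as a simple check, which may help when a wrapper script or a personal checklist decides whether to switch. The sketch below assumes you judge answer correctness yourself; `EscalationTracker` and `should_escalate` are hypothetical names, and the threshold of two comes from the "twice in a row" rule.

```python
from dataclasses import dataclass


@dataclass
class EscalationTracker:
    """Counts consecutive wrong answers on the same problem."""

    threshold: int = 2  # "wrong answer twice in a row"
    consecutive_failures: int = 0

    def record(self, answer_was_correct: bool) -> bool:
        """Record one outcome; return True once the threshold is reached."""
        if answer_was_correct:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
        return self.consecutive_failures >= self.threshold


def should_escalate(
    repeated_failures: bool,
    scope_became_architectural: bool,
    hard_to_reverse_decision: bool,
    resisted_multiple_approaches: bool,
) -> bool:
    """Any single trigger is enough to justify /model opus."""
    return any(
        (
            repeated_failures,
            scope_became_architectural,
            hard_to_reverse_decision,
            resisted_multiple_approaches,
        )
    )
```

A correct answer resets the counter, so two failures separated by a success do not trigger a switch.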

Gotchas

Effort level persists across model switches. If you set /effort high and then switch models, the effort level stays at high. Reset it explicitly if needed after switching.

/effort max requires Opus 4.6. Setting max effort on Sonnet or Haiku will either be ignored or downgraded silently. If you need max reasoning depth, switch to Opus first with /model opus.
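The two gotchas above interact: effort carries over when the model changes, so a `max` setting can silently end up on a model that does not support it. The following sketch models that state in plain Python to make the failure mode visible. `Session` and `preflight` are hypothetical names, and the code only mirrors the behavior described here rather than inspecting a real session.

```python
from dataclasses import dataclass


@dataclass
class Session:
    model: str = "sonnet"
    effort: str = "medium"

    def switch_model(self, model: str) -> None:
        # Only the model changes; effort carries over, as noted above.
        self.model = model

    def set_effort(self, effort: str) -> None:
        self.effort = effort


def preflight(session: Session) -> list[str]:
    """Return warnings about model/effort combinations worth double-checking."""
    warnings = []
    if session.effort == "max" and session.model != "opus":
        warnings.append(
            f"/effort max on {session.model} may be ignored or downgraded; "
            "run /model opus first"
        )
    return warnings
```

Running `preflight` after every model switch catches the case where `max` was set on Opus and then carried over to Sonnet or Haiku.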

Haiku is not just “cheaper Sonnet.” Haiku is optimized for a different performance profile. It can confidently produce plausible-sounding wrong answers on tasks that require multi-step reasoning. Only use Haiku for tasks where you can quickly verify the output.

Auto effort is not magic. /effort auto lets Claude estimate task complexity, but Claude’s self-assessment is imperfect. For high-stakes tasks, always set effort explicitly rather than relying on auto.