Model Selection Strategy
Not every task deserves the same model. Sending every request through Opus when Haiku would do is expensive and slow. Sending architectural decisions through Haiku when they need Opus is risky. The right model selection strategy matches reasoning depth to task complexity — and switches mid-session when the task changes.
The Three-Tier Model
Opus — When Quality Outweighs Speed
Opus is the right choice when the cost of a wrong answer is high or when the problem genuinely requires deep, multi-step reasoning.
Use Opus for:
- Architecture decisions and system design
- Security analysis and threat modeling
- Complex algorithmic problems with non-obvious solutions
- Planning sessions that set direction for hours of downstream work
- Any situation where Sonnet has gotten stuck in a loop or keeps producing the wrong answer
Opus costs more and responds more slowly. That trade-off is worth it when the task has high stakes or high complexity.
Sonnet — The Default for Most Work
Sonnet hits the best cost-to-quality ratio for the majority of software development tasks. It is fast enough for interactive use, accurate enough for real implementation work, and cheap enough to use continuously throughout a session.
Use Sonnet for:
- Feature implementation
- Bug fixing and refactoring
- Code review
- Writing and editing documentation
- Test authoring
- Most debugging sessions
When in doubt, start here. Upgrade to Opus only when Sonnet demonstrably underperforms.
Haiku — Speed and Volume
Haiku is optimized for speed and low cost. It handles simple, mechanical tasks well and is the right choice when you need volume (many quick operations) or fast iteration cycles.
Use Haiku for:
- Simple scripts and one-off utilities
- Code formatting and minor style fixes
- Boilerplate generation from a clear template
- Quick lookups and single-question answers
- High-volume repetitive tasks in automated pipelines
Switching Models
```
# Switch to Opus for a planning session
/model opus

# Switch back to Sonnet for implementation
/model sonnet

# Switch to Haiku for a batch of simple tasks
/model haiku
```
Model switches take effect immediately: the very next message uses the new model. You can switch as many times as you want within a session. The conversation history carries over; only the model handling the next response changes.
Default Model Configuration
```json
{
  "model": "claude-sonnet-4-6"
}
```
Set this in .claude/settings.json for a project-level default, or ~/.claude/settings.json for a global default.
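If the settings file already holds other keys, rewriting it by hand risks dropping them. The following is a minimal Python sketch of merging a default model into an existing file. The helper name `set_default_model` is hypothetical, and the sketch assumes the file, when present, is plain JSON with a top-level object.

```python
import json
from pathlib import Path


def set_default_model(settings_path: Path, model: str) -> dict:
    """Set the "model" key in a settings file, preserving any other keys.

    Hypothetical helper: assumes the file, if it exists, holds a JSON object.
    """
    settings = {}
    if settings_path.exists():
        settings = json.loads(settings_path.read_text(encoding="utf-8"))
    settings["model"] = model
    # Create the .claude directory on first use.
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```

Calling `set_default_model(Path(".claude") / "settings.json", "claude-sonnet-4-6")` would set the project-level default; pointing it at `Path.home() / ".claude" / "settings.json"` would set the global one.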
Effort Levels
Effort levels control how deeply the current model reasons about a task. They are independent of model choice — you can run Sonnet at high effort or Opus at low effort.
```
/effort low      # Fast, less thorough: mechanical tasks
/effort medium   # Balanced default: most work
/effort high     # Thorough analysis: bug investigation, review
/effort max      # Maximum reasoning depth; requires Opus 4.6
/effort auto     # Claude decides per task based on complexity signal
```
Configure the default in settings.json:
```json
{
  "effortLevel": "medium"
}
```
When to Override Effort
| Situation | Effort Override |
|---|---|
| Debugging a subtle race condition | high |
| Reviewing a security-sensitive PR | high or max |
| Generating repetitive boilerplate | low |
| Quick syntax question | low |
| Routine feature implementation | medium (default) |
| Novel algorithmic problem | high |
| Production incident root cause | high |
Cost vs. Quality Matrix
| Task Type | Model | Effort | Reasoning |
|---|---|---|---|
| Architecture design | Opus | high | High stakes, worth the cost |
| Feature implementation | Sonnet | medium | Most tasks live here |
| Simple script / formatting | Haiku | low | Fast and cheap |
| Bug triage and root cause | Sonnet | high | Need thoroughness |
| Security review | Opus | max | Cannot miss threats |
| Documentation (volume) | Haiku | medium | Speed matters more |
| Code review | Sonnet | high | Need thoroughness |
| Refactoring a large module | Sonnet | medium | Standard implementation |
| Novel algorithmic design | Opus | high | Needs deep reasoning |
| Test generation (bulk) | Haiku | low | Mechanical, high volume |
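The matrix can also be read as a lookup table. The sketch below encodes its rows in Python and falls back to Sonnet at medium effort for unlisted task types, following the "when in doubt, start here" advice. The dictionary keys and the function names `recommend` and `slash_commands` are hypothetical labels chosen for illustration; only the model and effort values come from the table.

```python
from typing import NamedTuple


class Recommendation(NamedTuple):
    model: str
    effort: str


# Rows copied from the matrix above; the keys are hypothetical short labels.
MATRIX = {
    "architecture_design": Recommendation("opus", "high"),
    "feature_implementation": Recommendation("sonnet", "medium"),
    "simple_script": Recommendation("haiku", "low"),
    "bug_triage": Recommendation("sonnet", "high"),
    "security_review": Recommendation("opus", "max"),
    "documentation_volume": Recommendation("haiku", "medium"),
    "code_review": Recommendation("sonnet", "high"),
    "large_refactor": Recommendation("sonnet", "medium"),
    "novel_algorithm": Recommendation("opus", "high"),
    "bulk_test_generation": Recommendation("haiku", "low"),
}

# Fallback for task types the matrix does not list: Sonnet at medium effort.
DEFAULT = Recommendation("sonnet", "medium")


def recommend(task_type: str) -> Recommendation:
    """Return the matrix row for a task type, or the Sonnet default if unknown."""
    return MATRIX.get(task_type, DEFAULT)


def slash_commands(rec: Recommendation) -> list[str]:
    """Render a recommendation as the /model and /effort commands to type."""
    return [f"/model {rec.model}", f"/effort {rec.effort}"]
```

For example, `slash_commands(recommend("security_review"))` yields the two commands to enter before starting a security review.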
Subagent Model Selection
In multi-agent workflows, different agents in the same pipeline can use different models. Match model cost to the role:
```
Orchestrator (Opus, high effort)
└── Holds the plan, makes architectural decisions, reviews output

Implementation workers (Sonnet, medium effort)
└── Execute focused coding subtasks

Formatting/lint workers (Haiku, low effort)
└── Mechanical transformations, style fixes, boilerplate
```
This keeps total cost proportional to actual reasoning requirements across the pipeline rather than running everything at the highest tier.
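To make the cost argument concrete, here is a rough Python sketch that compares a tiered pipeline with one that runs every role on a single model. The role assignments follow the pipeline outline above. The `RELATIVE_COST` weights are made-up placeholders, not real pricing, and the names `AgentRole` and `pipeline_cost` are hypothetical; substitute current per-call costs before drawing any conclusions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentRole:
    model: str
    effort: str


# Role-to-tier assignments from the pipeline outline above.
ROLES = {
    "orchestrator": AgentRole("opus", "high"),
    "implementation": AgentRole("sonnet", "medium"),
    "formatting": AgentRole("haiku", "low"),
}

# ASSUMPTION: placeholder cost units per call, NOT real pricing.
RELATIVE_COST = {"opus": 25, "sonnet": 5, "haiku": 1}


def pipeline_cost(calls_per_role: dict, force_model: Optional[str] = None) -> int:
    """Sum placeholder cost units over all calls.

    If force_model is given, every role is billed at that model's weight
    instead of the weight of its assigned tier.
    """
    total = 0
    for role_name, calls in calls_per_role.items():
        model = force_model or ROLES[role_name].model
        total += calls * RELATIVE_COST[model]
    return total
```

With these placeholder weights, 3 orchestrator calls, 20 implementation calls, and 50 formatting calls come to 225 units when tiered, against 1825 units if every call ran on Opus. The absolute numbers mean nothing; the point is that the bulk of the calls sit in the cheapest tiers.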
Extended Thinking and Model Choice
Extended thinking — where Claude reasons through a problem step by step before answering — is available on all models but performs best on Opus 4.6. The ultrathink keyword activates extended thinking regardless of which model is active:
```
ultrathink through the failure modes of this distributed lock implementation
```
On Opus 4.6, ultrathink triggers the deepest available reasoning. On Sonnet, it still activates extended thinking but with less depth. On Haiku, the effect is minimal.
For tasks where you genuinely need ultrathink-level analysis, switch to Opus first:
```
/model opus
/effort max
ultrathink through the security implications of this authentication design
```
See Extended Thinking for full documentation on the ultrathink keyword and when to use it.
Model Selection Flow
When to Force Opus Mid-Session
You started a session on Sonnet for implementation work. Switch to Opus immediately when:
- Sonnet produces the wrong answer twice in a row on the same problem — it may be at the edge of its reasoning capability for this task
- The task scope unexpectedly expands into architectural territory
- You are about to make a decision that is hard to reverse (schema changes, API contracts, security design)
- You are debugging something that has resisted multiple approaches and needs fresh, deep analysis
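The triggers in this list can be written down as an explicit rule, which helps when a team wants a shared escalation checklist. The Python sketch below is one way to do that. `SessionState`, its field names, and the threshold of two failed approaches are assumptions made for illustration; the source only says "multiple approaches".

```python
from dataclasses import dataclass


@dataclass
class SessionState:
    """Hypothetical snapshot of the escalation signals listed above."""
    consecutive_wrong_answers: int = 0
    scope_is_architectural: bool = False
    decision_hard_to_reverse: bool = False
    failed_approaches: int = 0


def escalation_reasons(state: SessionState, max_failed_approaches: int = 2) -> list[str]:
    """Return every trigger that currently applies; an empty list means stay on Sonnet."""
    reasons = []
    if state.consecutive_wrong_answers >= 2:
        reasons.append("wrong answer twice in a row on the same problem")
    if state.scope_is_architectural:
        reasons.append("scope expanded into architectural territory")
    if state.decision_hard_to_reverse:
        reasons.append("decision is hard to reverse")
    # ASSUMPTION: "multiple approaches" is read as two or more by default.
    if state.failed_approaches >= max_failed_approaches:
        reasons.append("bug has resisted multiple approaches")
    return reasons


def should_switch_to_opus(state: SessionState) -> bool:
    """True when at least one escalation trigger applies."""
    return bool(escalation_reasons(state))
```

Returning the list of reasons rather than a bare boolean makes it easy to paste the justification into the first Opus prompt.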
```
# Sonnet is stuck: escalate to Opus
/model opus
ultrathink why this WebSocket reconnection logic produces intermittent race conditions
```
After the Opus session resolves the hard problem, switch back to Sonnet for the implementation:
```
# Hard part is understood: back to Sonnet for the code
/model sonnet
Now implement the fix we just designed
```
Gotchas
Effort level persists across model switches. If you set /effort high and then switch models, the effort level stays at high. Reset it explicitly if needed after switching.
/effort max requires Opus 4.6. On Sonnet or Haiku, a max effort setting is either ignored or silently downgraded. If you need max reasoning depth, switch with /model opus before setting it.
Haiku is not just “cheaper Sonnet.” Haiku is optimized for a different performance profile. It can confidently produce plausible-sounding wrong answers on tasks that require multi-step reasoning. Only use Haiku for tasks where you can quickly verify the output.
Auto effort is not magic. /effort auto lets Claude estimate task complexity, but Claude’s self-assessment is imperfect. For high-stakes tasks, always set effort explicitly rather than relying on auto.
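Several of these gotchas can be caught before a task starts. The following Python sketch checks a planned model and effort pair against them and returns warnings. The function name `check_plan` and its `high_stakes` and `easy_to_verify` flags are hypothetical; the model names are the short aliases used with /model, and nothing here talks to Claude itself.

```python
VALID_EFFORTS = {"low", "medium", "high", "max", "auto"}


def check_plan(model: str, effort: str, *, high_stakes: bool = False,
               easy_to_verify: bool = True) -> list[str]:
    """Return warnings for a planned (model, effort) pair.

    An empty list means none of the gotchas above apply.
    """
    if effort not in VALID_EFFORTS:
        raise ValueError(f"unknown effort level: {effort!r}")
    warnings = []
    # max effort on Sonnet or Haiku is ignored or silently downgraded.
    if effort == "max" and model != "opus":
        warnings.append("max effort requires Opus: run /model opus first")
    # Self-assessed complexity is imperfect, so pin effort for high-stakes work.
    if effort == "auto" and high_stakes:
        warnings.append("high-stakes task: set effort explicitly instead of auto")
    # Haiku can produce plausible-sounding wrong answers on multi-step reasoning.
    if model == "haiku" and not easy_to_verify:
        warnings.append("Haiku output is hard to verify here: prefer Sonnet or Opus")
    return warnings
```

Because effort persists across model switches, it is worth re-running a check like this after every /model change rather than only at the start of a session.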