Executive Summary

Ultrathink was a keyword specific to Claude Code v1 that triggered extended thinking with a budget of roughly 32K tokens. Claude Code v2 replaced the keyword levels with a “Thinking On/Off” toggle (TAB key). The Extended Thinking API remains the programmatic way to set an explicit thinking budget.

Three Distinct Concepts

| Feature | Where It Works | How to Activate | Token Budget |
|---|---|---|---|
| Ultrathink (deprecated) | Claude Code v1 only | Keywords: “think”, “think harder”, “ultrathink” | 4K → 10K → 32K |
| Thinking On/Off | Claude Code v2 | TAB key toggle | Configurable via settings |
| Extended Thinking API | API, Claude App | thinking: {type: "enabled", budget_tokens: N} | User-specified |

Ultrathink (Claude Code v1 - Deprecated)

What It Was

In Claude Code v1, specific phrases triggered increasing levels of thinking budget:

"think" → 4,000 tokens (routine debugging)
"megathink" → 10,000 tokens (architectural decisions)
"ultrathink" → 31,999 tokens (deep sustained reasoning)

Trigger Phrases

The system recognized various phrasings:

  • Low: “think”, “think about this”
  • Medium: “think hard”, “think harder”, “think really hard”, “think super hard”
  • High: “ultrathink”, “think intensely”, “think longer”
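The tiering described above can be sketched as a simple phrase lookup. This is a hypothetical reconstruction for illustration, not Claude Code's actual source: the names `TIERS` and `thinking_budget` are invented here, and the phrase lists are copied from this section. Tiers are checked from highest to lowest because the bare word "think" is a substring of every other trigger.

```python
# Hypothetical reconstruction of keyword-to-budget matching (not Claude Code's real code).
TIERS = [
    # Highest tier first, so "ultrathink" is not swallowed by the bare "think" match.
    (31_999, ("ultrathink", "think intensely", "think longer")),
    (10_000, ("megathink", "think hard", "think harder",
              "think really hard", "think super hard")),
    (4_000, ("think",)),
]


def thinking_budget(prompt: str) -> int:
    """Return the budget implied by the prompt, or 0 if no trigger phrase appears."""
    text = prompt.lower()
    for budget, phrases in TIERS:
        # Plain substring matching: simple, but it also fires on words like "rethink".
        if any(phrase in text for phrase in phrases):
            return budget
    return 0


print(thinking_budget("ultrathink this problem"))
print(thinking_budget("think about X"))
```

Substring matching keeps the sketch short; a real implementation would more likely match on word boundaries.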

Why It Was Removed

Claude Code v2 replaced the keyword-based levels with a single on/off toggle, so the thinking state is visible and set explicitly instead of depending on phrasing in the prompt.

Claude Code v2: Thinking On/Off

Current Approach

In Claude Code v2, the TAB key toggles thinking mode on and off:

  • Thinking On: Claude performs extended reasoning before responding
  • Thinking Off: Claude responds immediately without extended thinking

Configuration

The token budget is now set in Claude Code settings instead of being selected by keyword.

Migration Guide

| Claude Code v1 | Claude Code v2 |
|---|---|
| “ultrathink this problem” | Press TAB → Thinking On |
| “think about X” | Press TAB → Thinking On |
| Normal prompts | Thinking Off (default) |

Extended Thinking API

API Usage

{
  "model": "claude-opus-4-5-20251101",
  "max_tokens": 16000,
  "thinking": {
    "type": "enabled",
    "budget_tokens": 10000
  },
  "messages": [
    {
      "role": "user",
      "content": "Solve this complex problem..."
    }
  ]
}
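A request body of this shape can be assembled and sanity-checked before it is sent. The sketch below makes no network call and assumes two constraints from the extended thinking documentation: `budget_tokens` must be at least 1,024, and it must be smaller than `max_tokens` because thinking tokens count toward that limit. The helper name `build_thinking_request` is hypothetical.

```python
MIN_BUDGET_TOKENS = 1024  # assumed minimum thinking budget accepted by the API


def build_thinking_request(prompt: str, budget_tokens: int, max_tokens: int,
                           model: str = "claude-opus-4-5-20251101") -> dict:
    """Build a Messages API request body with extended thinking enabled."""
    if budget_tokens < MIN_BUDGET_TOKENS:
        raise ValueError(f"budget_tokens must be at least {MIN_BUDGET_TOKENS}")
    if budget_tokens >= max_tokens:
        # Thinking counts toward max_tokens, so the budget has to fit inside it.
        raise ValueError("budget_tokens must be less than max_tokens")
    return {
        "model": model,
        "max_tokens": max_tokens,
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
        "messages": [{"role": "user", "content": prompt}],
    }


body = build_thinking_request("Solve this complex problem...", 10_000, 16_000)
print(body["thinking"])
```

Validating locally turns a rejected request into an immediate, readable error.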

Token Budget Recommendations

| Task Complexity | Recommended Budget | Use Case |
|---|---|---|
| Simple | Skip extended thinking | Syntax fixes, formatting |
| Moderate | 2,000 - 5,000 tokens | Code review, debugging |
| Complex | 5,000 - 15,000 tokens | Architecture design, refactoring |
| Very Complex | 15,000 - 32,000 tokens | Novel algorithms, research |
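The recommendations above can be encoded as a lookup that picks a point inside each range. This is only a sketch: the `effort` parameter (0 for the low end of a range, 1 for the high end) and the function name are assumptions added for illustration, and the ranges are the ones listed in this section.

```python
from typing import Optional

# Ranges copied from the recommendations above; None means "skip extended thinking".
BUDGET_RANGES = {
    "simple": None,
    "moderate": (2_000, 5_000),
    "complex": (5_000, 15_000),
    "very_complex": (15_000, 32_000),
}


def recommended_budget(complexity: str, effort: float = 0.5) -> Optional[int]:
    """Pick a budget within the recommended range for the given complexity."""
    bounds = BUDGET_RANGES[complexity]
    if bounds is None:
        return None
    low, high = bounds
    effort = min(max(effort, 0.0), 1.0)  # clamp to [0, 1]
    return round(low + effort * (high - low))


print(recommended_budget("moderate", 0.0))
print(recommended_budget("simple"))
```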

Cost Implications

Extended thinking tokens are billed as output tokens, not input tokens:

Example: Opus 4.5 with a 10K thinking budget, fully used

Input rate: $5/M tokens; output rate: $25/M tokens
Thinking: 10,000 tokens × $25/M = $0.25
Output: 1,000 tokens × $25/M = $0.025
Total: $0.275 per request, plus the cost of the input tokens
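The arithmetic can be wrapped in a small helper. The sketch assumes thinking tokens are billed at the output rate, which is how Anthropic's pricing documentation describes extended thinking. Rates are passed in as parameters because they vary by model and change over time; the $5/M and $25/M figures in the call are illustrative and should be replaced with current pricing.

```python
def request_cost(input_tokens: int, thinking_tokens: int, output_tokens: int,
                 input_rate_per_m: float, output_rate_per_m: float) -> float:
    """Estimate the dollar cost of one request; thinking is billed as output."""
    billed_output = thinking_tokens + output_tokens
    total = input_tokens * input_rate_per_m + billed_output * output_rate_per_m
    return total / 1_000_000


# Illustrative rates only: $5 per million input tokens, $25 per million output tokens.
cost = request_cost(input_tokens=2_000, thinking_tokens=10_000, output_tokens=1_000,
                    input_rate_per_m=5.0, output_rate_per_m=25.0)
print(f"${cost:.3f}")
```

Because the budget is a ceiling and not a fixed charge, the billed thinking cost can come in below this estimate when Claude uses less than the full budget.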

Interleaved Thinking (Claude 4+)

What It Enables

With interleaved thinking, Claude can think between tool calls rather than only before the first response.

API Usage

curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: interleaved-thinking-2025-05-14" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 16000,
    "thinking": {
      "type": "enabled",
      "budget_tokens": 10000
    },
    "messages": [...]
  }'

Benefits

  • More sophisticated reasoning after receiving tool results
  • Better decision-making in multi-step workflows
  • Adaptive strategy based on intermediate findings
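To make the difference concrete, the sketch below models a conversation as a flat list of content blocks and checks whether any thinking happens after a tool result. The block types (`thinking`, `tool_use`, `tool_result`, `text`) are the API's; the flattened structure, the tool names, and all block contents are invented for illustration and do not match the nested shape of real API responses.

```python
# Hypothetical, flattened transcript: contents and tool names are made up.
TRANSCRIPT = [
    {"type": "thinking", "text": "I need the failing test output first."},
    {"type": "tool_use", "name": "run_tests"},
    {"type": "tool_result", "text": "2 failed: test_parse, test_export"},
    {"type": "thinking", "text": "Both failures touch the parser; read it next."},
    {"type": "tool_use", "name": "read_file"},
    {"type": "tool_result", "text": "parser source"},
    {"type": "text", "text": "The parser drops the trailing field."},
]


def thinks_between_tool_calls(blocks: list) -> bool:
    """Return True if a thinking block appears after at least one tool result."""
    seen_result = False
    for block in blocks:
        if block["type"] == "tool_result":
            seen_result = True
        elif block["type"] == "thinking" and seen_result:
            return True
    return False


print(thinks_between_tool_calls(TRANSCRIPT))      # interleaved shape
print(thinks_between_tool_calls(TRANSCRIPT[:3]))  # thinking only up front
```

Without interleaved thinking, a transcript would have the shape of the second call: one thinking block at the start and none after tool results arrive.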

The “Think” Tool (Different Concept)

Key Distinction

| Feature | When It Happens | Purpose |
|---|---|---|
| Extended Thinking | BEFORE generating response | Deep reasoning on the problem |
| Think Tool | DURING response generation | Pause to check if more info needed |

Think Tool Use Case

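# Mid-response, Claude calls the think tool and passes its reasoning as input
# (block id omitted):
{
  "type": "tool_use",
  "name": "think",
  "input": {
    "thought": "process_data has no error handling for edge cases yet. Check whether that is required before continuing."
  }
}

The think tool lets Claude pause mid-response and reason about what it has so far, such as tool results, before deciding whether it needs more information. The tool itself fetches nothing; it only records the thought.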
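In API terms, the think tool is an ordinary tool definition whose handler has no side effects beyond logging. The definition below is modeled on the one in Anthropic's article on the “think” tool, with the description wording paraphrased. The handler, the `thought_log` list, and the acknowledgement string are hypothetical.

```python
# Tool definition passed in the request's "tools" list (description paraphrased).
THINK_TOOL = {
    "name": "think",
    "description": (
        "Use the tool to think about something. It will not obtain new "
        "information or change any state; it only appends the thought to a log."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "thought": {"type": "string", "description": "A thought to think about."}
        },
        "required": ["thought"],
    },
}

thought_log = []  # hypothetical in-memory log


def handle_think(tool_input: dict) -> str:
    """Record the thought and return a short acknowledgement as the tool result."""
    thought_log.append(tool_input["thought"])
    return "Thought recorded."


print(handle_think({"thought": "Check whether edge cases need error handling."}))
```

The value comes from giving the model a dedicated place to reason mid-task, not from anything the handler computes.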

When to Use Each Mode

Use Extended Thinking (API/App) When:

✅ Solving novel problems without clear solutions
✅ Complex architectural decisions
✅ Multi-step mathematical reasoning
✅ Research and analysis tasks
✅ Code that requires exploring multiple approaches

Skip Extended Thinking When:

❌ Simple syntax fixes
❌ Formatting tasks
❌ Well-specified problems with clear instructions
❌ Iteration speed is critical
❌ Budget-conscious applications

Warning: Extended thinking can make Claude MORE verbose and LESS accurate on basic tasks while adding latency and cost.

Performance Characteristics

Accuracy vs Thinking Tokens

Claude’s accuracy improves roughly logarithmically with thinking tokens. The figures below are rough estimates:

1,000 tokens → Baseline
5,000 tokens → +10% accuracy
10,000 tokens → +15% accuracy
32,000 tokens → +20% accuracy

Each additional block of tokens buys less than the one before, and for most tasks returns diminish noticeably after ~15K tokens.
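Taking the estimated figures above at face value, the diminishing returns show up clearly when the gain is expressed per additional 1,000 thinking tokens. The sketch below is plain arithmetic on those four data points; the names are illustrative.

```python
# (thinking tokens, estimated accuracy gain in percentage points) from the list above.
POINTS = [(1_000, 0.0), (5_000, 10.0), (10_000, 15.0), (32_000, 20.0)]


def marginal_gains(points):
    """Accuracy points gained per extra 1,000 tokens between consecutive budgets."""
    gains = []
    for (t0, a0), (t1, a1) in zip(points, points[1:]):
        gains.append((a1 - a0) / ((t1 - t0) / 1_000))
    return gains


for (start, _), (end, _), gain in zip(POINTS, POINTS[1:], marginal_gains(POINTS)):
    print(f"{start:>6} -> {end:>6} tokens: {gain:.2f} points per 1K tokens")
```

On these estimates the first 4K extra tokens return 2.5 points per 1K, the next 5K return 1 point per 1K, and the final 22K return under a quarter of a point per 1K.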

Latency Impact

| Thinking Budget | Added Latency | Total Time (estimate) |
|---|---|---|
| No thinking | 0s | 2-5s |
| 5,000 tokens | +2-4s | 4-9s |
| 10,000 tokens | +4-8s | 6-13s |
| 32,000 tokens | +10-20s | 12-25s |

Common Misconceptions

❌ Myth: “Ultrathink” works in the API

Reality: The API requires an explicit budget_tokens parameter. Keywords like “ultrathink” have no effect outside Claude Code v1.

// ❌ This doesn't work
{"thinking": {"type": "ultrathink"}}
// ✅ This works
{"thinking": {"type": "enabled", "budget_tokens": 30000}}

❌ Myth: More thinking always = better results

Reality: Extended thinking can be counterproductive for:

  • Simple, well-defined tasks
  • Tasks requiring quick iteration
  • Tasks where Claude already has clear instructions

❌ Myth: Extended thinking = different model

Reality: It’s the same model spending more time reasoning before responding, not a different model architecture.

Best Practices

1. Start Conservative

Begin with lower budgets (5K-10K) and increase only if:

  • Responses lack depth
  • Claude makes preventable mistakes
  • Task clearly benefits from more reasoning

2. Match Budget to Task

Simple debugging → Skip thinking
Code review → 2K-5K tokens
Architecture design → 10K-15K tokens
Research problems → 15K-30K tokens

3. Monitor Cost vs Benefit

Track:

  • Success rate improvement vs baseline
  • Cost increase vs value gained
  • Time-to-solution vs thinking budget
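One way to combine these metrics is cost per successful result: the expected spend if a request is retried until it succeeds. The sketch assumes independent attempts with a fixed success rate, and the dollar figures and success rates in the example are invented for illustration.

```python
def cost_per_success(cost_per_request: float, success_rate: float) -> float:
    """Expected cost to obtain one success, assuming independent retries."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    # Expected number of attempts until the first success is 1 / success_rate.
    return cost_per_request / success_rate


# Hypothetical numbers: compare a cheap baseline with a pricier thinking-enabled run.
baseline = cost_per_success(cost_per_request=0.05, success_rate=0.60)
with_thinking = cost_per_success(cost_per_request=0.30, success_rate=0.90)
print(f"baseline: ${baseline:.3f}, with thinking: ${with_thinking:.3f}")
```

In this made-up case the baseline still wins on cost alone, so thinking would have to justify itself through saved retry time or the cost of acting on a wrong answer.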

4. Use in Agent Loops

Extended thinking is most valuable in agent loops where:

  • Single attempt must be highly accurate
  • Retry cost is high (time/money)
  • Wrong decisions compound over multiple steps
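The compounding effect is easy to quantify under a simplifying assumption: if every step succeeds independently with the same probability, the chance that an n-step chain completes without error is that probability raised to the power n. The function name and the accuracy figures below are illustrative.

```python
def chain_success(per_step_accuracy: float, steps: int) -> float:
    """Probability that every step in a chain succeeds, assuming independent steps."""
    return per_step_accuracy ** steps


for accuracy in (0.95, 0.99):
    print(f"{accuracy:.0%} per step over 10 steps: {chain_success(accuracy, 10):.1%}")
```

At 95% per-step accuracy a 10-step chain finishes cleanly only about 60% of the time, while 99% per step gives about 90%, which is why a small accuracy gain per decision matters more in long agent loops.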

Evolution Timeline

| Date | Change | Impact |
|---|---|---|
| 2025 Q1 | Extended Thinking API launched (with Claude 3.7 Sonnet) | Programmable thinking budgets |
| 2025 Q1 | Claude Code v1: Ultrathink keywords | Easy access via “ultrathink” trigger |
| 2025 Q1 | “Think” tool introduced | Mid-response reasoning |
| 2025 Q2 | Claude 4 + Interleaved Thinking | Think between tool calls |
| 2025 Q3 | Claude Code v2: TAB toggle | Deprecated keyword-based levels |
