ultrathink-vs-thinking-mode
Executive Summary
Ultrathink was a Claude Code v1-specific keyword triggering extended thinking with ~32K tokens. Claude Code v2 replaced it with a simple “Thinking On/Off” toggle (TAB key). The Extended Thinking API remains the programmatic way to control thinking budgets for all Claude interfaces.
Three Distinct Concepts
| Feature | Where It Works | How to Activate | Token Budget |
|---|---|---|---|
| Ultrathink (deprecated) | Claude Code v1 only | Keywords: “think”, “think harder”, “ultrathink” | 4K → 10K → 32K |
| Thinking On/Off | Claude Code v2 | TAB key toggle | Configurable via settings |
| Extended Thinking API | API, Claude App | thinking: {type: "enabled", budget_tokens: N} | User-specified |
Ultrathink (Claude Code v1 - Deprecated)
What It Was
In Claude Code v1, specific phrases triggered increasing levels of thinking budget:
- "think" → 4,000 tokens (routine debugging)
- "megathink" → 10,000 tokens (architectural decisions)
- "ultrathink" → 31,999 tokens (deep sustained reasoning)
Trigger Phrases
The system recognized various phrasings:
- Low: “think”, “think about this”
- Medium: “think hard”, “think harder”, “think really hard”, “think super hard”
- High: “ultrathink”, “think intensely”, “think longer”
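The keyword tiers above can be pictured as a simple lookup. The sketch below is illustrative only: the name `budget_for_prompt`, the substring matching, and the highest-tier-first order are assumptions, not Claude Code's actual implementation. The phrases and budgets are the ones listed in this section.

```python
# Illustrative only: phrases and budgets come from this section,
# not from Claude Code's source. All names here are hypothetical.
TIERS = [
    # Checked from highest to lowest so that "ultrathink" and
    # "think harder" are not swallowed by the plain "think" tier.
    (31_999, ("ultrathink", "think intensely", "think longer")),
    (10_000, ("megathink", "think hard", "think harder",
              "think really hard", "think super hard")),
    (4_000, ("think",)),
]


def budget_for_prompt(prompt: str) -> int:
    """Return the thinking budget implied by the prompt, or 0 if none."""
    text = prompt.lower()
    for budget, phrases in TIERS:
        if any(phrase in text for phrase in phrases):
            return budget
    return 0
```

Checking the highest tier first matters because every trigger phrase contains the word "think"; a naive first-match on the low tier would never reach the larger budgets.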
Why It Was Removed
Claude Code v2 simplified the UX by replacing keyword-based levels with a simple toggle, providing more transparent control.
Claude Code v2: Thinking On/Off
Current Approach
The TAB key toggles thinking mode on and off in Claude Code v2:
- Thinking On: Claude performs extended reasoning before responding
- Thinking Off: Claude responds immediately without extended thinking
Configuration
Token budget is now configurable in Claude Code settings rather than keyword-based.
Migration Guide
| Claude Code v1 | Claude Code v2 |
|---|---|
| “ultrathink this problem” | Press TAB → Thinking On |
| “think about X” | Press TAB → Thinking On |
| Normal prompts | Thinking Off (default) |
Extended Thinking API
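API Usage
Note that max_tokens must be larger than budget_tokens, because the thinking budget counts toward the response limit.

    {
      "model": "claude-opus-4-5-20251101",
      "max_tokens": 16000,
      "thinking": {
        "type": "enabled",
        "budget_tokens": 10000
      },
      "messages": [
        {
          "role": "user",
          "content": "Solve this complex problem..."
        }
      ]
    }

Token Budget Recommendations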
| Task Complexity | Recommended Budget | Use Case |
|---|---|---|
| Simple | Skip extended thinking | Syntax fixes, formatting |
| Moderate | 2,000 - 5,000 tokens | Code review, debugging |
| Complex | 5,000 - 15,000 tokens | Architecture design, refactoring |
| Very Complex | 15,000 - 32,000 tokens | Novel algorithms, research |
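The table above can be turned into a small helper that sizes a request payload. This is a sketch under stated assumptions: `build_request`, the complexity labels, and the choice of each range's midpoint are hypothetical, and the only API fields used are the ones shown in this document (`model`, `max_tokens`, `thinking`, `messages`).

```python
# Budget ranges copied from the table above. Picking the midpoint
# and every name below are illustrative assumptions.
BUDGET_RANGES = {
    "simple": None,                    # skip extended thinking
    "moderate": (2_000, 5_000),
    "complex": (5_000, 15_000),
    "very_complex": (15_000, 32_000),
}


def build_request(prompt: str, complexity: str, model: str,
                  answer_tokens: int = 4_096) -> dict:
    """Build a Messages API payload sized for the given complexity."""
    payload = {
        "model": model,
        "max_tokens": answer_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    budget_range = BUDGET_RANGES[complexity]
    if budget_range is not None:
        low, high = budget_range
        budget = (low + high) // 2
        payload["thinking"] = {"type": "enabled", "budget_tokens": budget}
        # max_tokens has to exceed the thinking budget, so reserve
        # room for the visible answer on top of it.
        payload["max_tokens"] = budget + answer_tokens
    return payload
```

For example, `build_request("Review this diff", "moderate", "claude-opus-4-5-20251101")` yields a 3,500-token thinking budget, while a `"simple"` request omits the `thinking` field entirely.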
Cost Implications
Extended thinking tokens are billed as output tokens, at the output rate of the model in use.
Example: 10K thinking budget, assuming $15/M input and $75/M output (rates vary by model, so check current pricing):
- Thinking: 10,000 tokens × $75/M = $0.75
- Output: 1,000 tokens × $75/M = $0.075
- Total: $0.825 per request, plus the prompt's input tokens at $15/M
Interleaved Thinking (Claude 4+)
What It Enables
With interleaved thinking, Claude can think between tool calls rather than only before the first response.
API Usage
    curl https://api.anthropic.com/v1/messages \
      -H "anthropic-version: 2023-06-01" \
      -H "anthropic-beta: interleaved-thinking-2025-05-14" \
      -H "x-api-key: $ANTHROPIC_API_KEY" \
      -H "content-type: application/json" \
      -d '{
        "model": "claude-sonnet-4-20250514",
        "max_tokens": 16000,
        "thinking": {
          "type": "enabled",
          "budget_tokens": 10000
        },
        "messages": [...]
      }'

Benefits
- More sophisticated reasoning after receiving tool results
- Better decision-making in multi-step workflows
- Adaptive strategy based on intermediate findings
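Interleaved thinking only matters when the request also offers tools, so a fuller request carries both the beta header and a tool list. The sketch below builds such a request without sending it. The function name `interleaved_request` and the `get_weather` tool are hypothetical; the beta header value is the one quoted above, and `2023-06-01` is the standard `anthropic-version` value.

```python
import json

API_URL = "https://api.anthropic.com/v1/messages"

# Hypothetical tool, included only so the model has something to call.
WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}


def interleaved_request(api_key, model, messages, tools,
                        budget_tokens=10_000, max_tokens=16_000):
    """Return (headers, json_body) for an interleaved-thinking request."""
    headers = {
        "x-api-key": api_key,
        "anthropic-version": "2023-06-01",
        "anthropic-beta": "interleaved-thinking-2025-05-14",
        "content-type": "application/json",
    }
    body = {
        "model": model,
        "max_tokens": max_tokens,
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
        "tools": tools,
        "messages": messages,
    }
    return headers, json.dumps(body)
```

The returned pair can be handed to any HTTP client as a POST to `API_URL`; keeping construction separate from sending makes the payload easy to inspect and test.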
The “Think” Tool (Different Concept)
Key Distinction
| Feature | When It Happens | Purpose |
|---|---|---|
| Extended Thinking | BEFORE generating response | Deep reasoning on the problem |
| Think Tool | DURING response generation | Pause to check if more info needed |
Think Tool Use Case
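    # Schematic only: Claude generates code, then pauses to think
    def process_data(data):
        # ... some code ...

    <think>
    Wait, I should check if the user wants error handling for edge cases.
    Let me ask before continuing.
    </think>

The think tool lets Claude pause mid-response and reason about whether it has the information it needs before continuing. In practice it is invoked as a tool call rather than as inline tags, so the snippet above is a schematic, not literal output.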
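On the API side, the think tool is an ordinary tool definition whose only input is the thought itself. The sketch below is modeled on the definition in Anthropic's write-up of the tool; the description wording is approximate, and the `handle_tool_use` dispatcher is a hypothetical helper, not part of any SDK.

```python
# Tool definition modeled on Anthropic's "think" tool write-up.
# The description text is paraphrased, not quoted.
THINK_TOOL = {
    "name": "think",
    "description": (
        "Use the tool to think about something. It does not fetch new "
        "information or change any state; it only records the thought."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "thought": {
                "type": "string",
                "description": "A thought to think about.",
            }
        },
        "required": ["thought"],
    },
}


def handle_tool_use(name: str, tool_input: dict, log: list) -> str:
    """Hypothetical dispatcher: 'think' has no side effect except logging."""
    if name == "think":
        log.append(tool_input["thought"])
        return "ok"
    raise ValueError(f"unknown tool: {name}")
```

Because the tool does nothing beyond recording text, its value comes entirely from giving the model a designated place to reason between other tool calls.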
When to Use Each Mode
Use Extended Thinking (API/App) When:
- ✅ Solving novel problems without clear solutions
- ✅ Complex architectural decisions
- ✅ Multi-step mathematical reasoning
- ✅ Research and analysis tasks
- ✅ Code that requires exploring multiple approaches
Skip Extended Thinking When:
- ❌ Simple syntax fixes
- ❌ Formatting tasks
- ❌ Well-specified problems with clear instructions
- ❌ Iteration speed is critical
- ❌ Budget-conscious applications
Warning: Extended thinking can make Claude MORE verbose and LESS accurate on basic tasks while adding latency and cost.
Performance Characteristics
Accuracy vs Thinking Tokens
Claude’s accuracy improves roughly logarithmically with thinking tokens. The figures below are rough estimates, not measurements:
- 1,000 tokens → Baseline
- 5,000 tokens → +10% accuracy
- 10,000 tokens → +15% accuracy
- 32,000 tokens → +20% accuracy
Diminishing returns after ~15K tokens for most tasks.
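One way to see the diminishing returns is to compute the gain per extra 1,000 thinking tokens between the estimated points above. The sketch uses only those estimates; the name `marginal_gains` is hypothetical, and the results inherit all the uncertainty of the inputs.

```python
# (thinking tokens, estimated gain in percentage points over the
# 1K-token baseline), taken from the estimates in this section.
ESTIMATES = [(1_000, 0), (5_000, 10), (10_000, 15), (32_000, 20)]


def marginal_gains(points):
    """Gain per extra 1,000 thinking tokens for each adjacent pair."""
    gains = []
    for (t0, a0), (t1, a1) in zip(points, points[1:]):
        per_thousand = (a1 - a0) / ((t1 - t0) / 1_000)
        gains.append((t0, t1, per_thousand))
    return gains


for start, end, gain in marginal_gains(ESTIMATES):
    print(f"{start:>6} -> {end:>6} tokens: {gain:.2f} points per 1K tokens")
```

On these figures the first 4,000 extra tokens buy 2.5 points per 1K, the next 5,000 buy 1.0, and the final 22,000 buy roughly 0.23, which is the shape the "diminishing returns" remark describes.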
Latency Impact
| Thinking Budget | Added Latency | Total Time (estimate) |
|---|---|---|
| No thinking | 0s | 2-5s |
| 5,000 tokens | +2-4s | 4-9s |
| 10,000 tokens | +4-8s | 6-13s |
| 32,000 tokens | +10-20s | 12-25s |
Common Misconceptions
❌ Myth: “Ultrathink” works in the API
Reality: The API requires explicit budget_tokens parameter. Keywords like “ultrathink” have no effect outside Claude Code v1.
    // ❌ This doesn't work
    {"thinking": {"type": "ultrathink"}}

    // ✅ This works
    {"thinking": {"type": "enabled", "budget_tokens": 30000}}

❌ Myth: More thinking always = better results
Reality: Extended thinking is counterproductive for:
- Simple, well-defined tasks
- Tasks requiring quick iteration
- Tasks where Claude already has clear instructions
❌ Myth: Extended thinking = different model
Reality: It’s the same model spending more time reasoning before responding, not a different model architecture.
Best Practices
1. Start Conservative
Begin with lower budgets (5K-10K) and increase only if:
- Responses lack depth
- Claude makes preventable mistakes
- Task clearly benefits from more reasoning
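The "start low, increase only if needed" advice amounts to an escalation loop. The sketch below assumes two caller-supplied functions, both hypothetical: `ask(budget)` wraps whatever API call you use and returns an answer, and `is_good_enough(answer)` encodes your own acceptance check.

```python
from typing import Callable, Optional, Sequence, Tuple


def escalate(ask: Callable[[int], str],
             is_good_enough: Callable[[str], bool],
             budgets: Sequence[int] = (5_000, 10_000, 20_000),
             ) -> Tuple[Optional[int], Optional[str]]:
    """Try budgets in ascending order and stop at the first acceptable answer.

    Returns (budget, answer) on success, or (None, last_answer) if no
    budget produced an acceptable answer.
    """
    last = None
    for budget in budgets:
        last = ask(budget)
        if is_good_enough(last):
            return budget, last
    return None, last
```

The budget ladder here is an arbitrary example; the point of the design is that a larger budget is paid for only after a cheaper attempt has demonstrably failed.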
2. Match Budget to Task
- Simple debugging → Skip thinking
- Code review → 2K-5K tokens
- Architecture design → 10K-15K tokens
- Research problems → 15K-30K tokens
3. Monitor Cost vs Benefit
Track:
- Success rate improvement vs baseline
- Cost increase vs value gained
- Time-to-solution vs thinking budget
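Tracking cost against benefit needs a per-request cost figure and some way to weigh it against success rate. The sketch below is one simple way to do that. Its function names are hypothetical, the rates are parameters because they differ by model, and the default of billing thinking tokens at the output rate is an assumption to confirm against current pricing.

```python
from typing import Optional


def request_cost(input_tokens: int, thinking_tokens: int, output_tokens: int,
                 input_rate: float, output_rate: float,
                 thinking_rate: Optional[float] = None) -> float:
    """Dollar cost of one request; rates are dollars per million tokens."""
    if thinking_rate is None:
        # Assumption: thinking tokens are billed like output tokens.
        thinking_rate = output_rate
    return (input_tokens * input_rate
            + thinking_tokens * thinking_rate
            + output_tokens * output_rate) / 1_000_000


def cost_per_success(cost: float, success_rate: float) -> float:
    """Expected spend per successful result when failures are retried."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost / success_rate
```

Comparing `cost_per_success` with and without thinking shows whether a higher success rate actually pays for the extra tokens; a budget that raises cost more than it raises the success rate makes each successful result more expensive.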
4. Use in Agent Loops
Extended thinking is most valuable in agent loops where:
- Single attempt must be highly accurate
- Retry cost is high (time/money)
- Wrong decisions compound over multiple steps
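The compounding effect in the last bullet is easy to quantify under a simplifying assumption: if each step succeeds independently with the same probability, the chance that an entire chain succeeds is that probability raised to the number of steps. The function name below is hypothetical, and real agent steps are rarely fully independent.

```python
def chain_success(per_step_accuracy: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent steps."""
    if not 0 <= per_step_accuracy <= 1:
        raise ValueError("per_step_accuracy must be in [0, 1]")
    return per_step_accuracy ** steps


for accuracy in (0.95, 0.99):
    print(f"{accuracy:.2f} per step over 10 steps: "
          f"{chain_success(accuracy, 10):.3f}")
```

Under this model, a 10-step loop at 95% per-step accuracy finishes cleanly only about 60% of the time, while 99% per step gives about 90%, which is why a modest per-step improvement can justify extra thinking in long loops.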
Evolution Timeline
| Date | Change | Impact |
|---|---|---|
| 2025 Q1 | Extended Thinking API launched (with Claude 3.7 Sonnet) | Programmable thinking budgets |
| 2025 H1 | Claude Code v1: Ultrathink keywords | Easy access via “ultrathink” trigger |
| 2025 Q1 | “Think” tool introduced | Mid-response reasoning |
| 2025 Q2 | Claude 4 + Interleaved Thinking | Think between tool calls |
| 2025 Q3 | Claude Code v2: TAB toggle | Deprecated keyword-based levels |
Sources
- The ultrathink mystery: does Claude really think harder?
- Claude Code Thinking Levels: From Think to Ultra-Think
- Building with extended thinking - Claude Docs
- Extended thinking - Amazon Bedrock
- Claude’s extended thinking
- The “think” tool: Enabling Claude to stop and think
- Document the difference between ultrathink and Thinking Mode
- Think, Megathink, Ultrathink: Claude Code’s Power Keywords Decoded