Purpose

This note explains when Claude Code performs instant auto-compaction versus delayed compaction, and why summarization is sometimes immediate while at other times you must wait several minutes.

Key Finding: Feature Rollback

The “instant” auto-compact feature was rolled back by Anthropic. What was advertised as “instant” in v2.0.64+ never functioned as intended in production, explaining why you rarely see instant compaction but occasionally experience it.

What Is Auto-Compact?

Auto-compact is Claude Code’s context-window management mechanism: it automatically summarizes the conversation as it approaches the context limit, so you can continue working without interruption.

How it works:

  • Monitors token usage after each model response
  • Triggers when context window reaches ~80-95% capacity (or 10-25% remaining)
  • Analyzes conversation to identify key information worth preserving
  • Creates a concise summary of previous interactions, decisions, and code changes
  • Replaces old messages with the summary
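
The steps above can be sketched roughly as follows. This is an illustration only, not Claude Code's actual implementation: the names ContextState, should_compact, and compact are hypothetical, and the 0.80 default is simply the lower end of the range quoted above.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ContextState:
    used_tokens: int
    window_tokens: int

    @property
    def used_fraction(self) -> float:
        return self.used_tokens / self.window_tokens


def should_compact(state: ContextState, threshold: float = 0.80) -> bool:
    """Return True once usage reaches the (assumed) trigger threshold."""
    return state.used_fraction >= threshold


def compact(messages: List[str], summarize: Callable[[List[str]], str]) -> List[str]:
    """Replace the old messages with a single summary message."""
    return [f"[summary] {summarize(messages)}"]


# Checked after each model response (hypothetical numbers).
history = ["first request", "first answer", "second request"]
state = ContextState(used_tokens=170_000, window_tokens=200_000)
if should_compact(state):
    history = compact(history, lambda msgs: f"{len(msgs)} earlier messages condensed")
```

The summarize callable stands in for the model call that produces the summary; everything else is plain bookkeeping.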

Two Types of Compaction

1. Traditional (Delayed) Compaction

Duration: 5+ minutes

Process:

  • Generates fresh summaries of the entire conversation in real-time
  • Blocks operations while processing
  • Model-side summarization of the full history, so latency grows with conversation length
  • Summary prompt injected as user turn when threshold exceeded
  • Claude generates structured summary wrapped in <summary></summary> tags
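
As a minimal sketch of the last step, the summary can be pulled out of a model response by matching the <summary></summary> tags. The function name extract_summary is hypothetical, and how Claude Code itself parses the response is not documented here.

```python
import re
from typing import Optional

# Non-greedy match across newlines between the summary tags.
SUMMARY_RE = re.compile(r"<summary>(.*?)</summary>", re.DOTALL)


def extract_summary(response_text: str) -> Optional[str]:
    """Return the text inside the first <summary> block, or None if absent."""
    match = SUMMARY_RE.search(response_text)
    return match.group(1).strip() if match else None
```

Returning None rather than an empty string lets the caller distinguish a missing summary from an empty one.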

When this happens: Most of the time in current Claude Code versions

2. Instant Compaction (Planned Feature)

Duration: Near-instant (seconds)

Process:

  • Uses pre-built session memory files updated incrementally in the background
  • Session memory stored at: ~/.claude/projects/[project-path]/[session-id]/session-memory/summary.md
  • Background summarization triggered after each message
  • /compact command simply loads the pre-written summary
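
To make the path layout concrete, here is a small sketch that builds the summary.md location from the pattern above and loads the file if it exists. The helper names are hypothetical, and the project-path and session-id components are passed in as parameters because their exact form is not specified here.

```python
from pathlib import Path
from typing import Optional


def session_memory_path(project_path: str, session_id: str,
                        home: Optional[Path] = None) -> Path:
    """Build the summary.md location following the pattern described above."""
    base = home if home is not None else Path.home()
    return (base / ".claude" / "projects" / project_path
            / session_id / "session-memory" / "summary.md")


def load_prebuilt_summary(path: Path) -> Optional[str]:
    """Return the pre-written summary, or None when no file exists."""
    if path.is_file():
        return path.read_text(encoding="utf-8")
    return None
```

A None result corresponds to the common case where no session memory was written, so compaction has to fall back to real-time summarization.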

Expected session memory structure:

  • Session title
  • Current status (discussion points, completed items, open questions)
  • Task specification
  • Files and functions referenced
  • Errors & corrections
  • Learnings and key results
  • Auto-updated work log

When this happens: Rarely, when the feature flag is temporarily enabled for testing

Why You Experience Both Behaviors

The Rollback Situation

According to an Anthropic developer's response (Dec 13, 2025):

“we rolled back instant auto compact. We will share our coms for when it is rolled back”

Evidence from user reports:

  • Out of 900+ sessions in one project, only 1 had a session-memory subfolder
  • Feature appeared to work briefly (Dec 11-12, 2025) for some users, then stopped
  • Manual creation of session-memory folder resulted in nothing being written to it
  • The feature was behind a feature flag and never properly deployed to production
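
A check like the first one in this list can be reproduced with a short script. This is a sketch under one assumption: that each session has its own directory under the project folder, as the session-memory path given earlier suggests. The function name count_session_memory is hypothetical.

```python
from pathlib import Path
from typing import Tuple


def count_session_memory(project_dir: Path) -> Tuple[int, int]:
    """Return (session directories, those containing a session-memory folder)."""
    sessions = [p for p in project_dir.iterdir() if p.is_dir()]
    with_memory = [s for s in sessions if (s / "session-memory").is_dir()]
    return len(sessions), len(with_memory)
```

Calling it on a project folder under ~/.claude/projects returns the two counts, so the share of sessions with session memory can be compared against the figure reported above.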

What This Means for You

Rare instant compaction: Occurs when:

  • You’re temporarily in a test group with the feature flag enabled
  • Feature flag briefly activated during testing/rollout attempts
  • Session happens to have pre-built session memory (extremely rare)

Typical delayed compaction: Occurs when:

  • Feature flag is disabled (default state after rollback)
  • Session-memory folder doesn’t exist or isn’t being updated
  • Standard real-time summarization is required

Auto-Compact Trigger Conditions

Primary Threshold

Context window reaches approximately 80-95% capacity (or 10-25% remaining)

Earlier Triggering Pattern (Post-2024 Updates)

Claude Code now triggers auto-compact much earlier than historical behavior:

  • Old behavior: Triggered at 90%+ context usage
  • New behavior: Triggers around 64-75% context usage
  • Reason: Implements a “completion buffer” to allow tasks to finish gracefully

Observed discrepancy:

  • Claude Code reports “10% context remaining”
  • Independent monitors show only 64% utilization
  • Gap: ~26 percentage points (“10% remaining” implies 90% used, versus 64% measured)

The Completion Buffer

Rather than compacting mid-operation, Claude Code maintains enough free space to complete the current task before resetting:

  • Reserves approximately 25% of context window (roughly 50k tokens in a 200k window)
  • Provides runway to “land the plane before resetting”
  • Prevents context collapse mid-operation
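
The buffer arithmetic can be written out as a small worked example. The function name and the 25% default are taken from the approximate figure above, not from any published setting.

```python
from typing import Tuple


def completion_buffer(window_tokens: int,
                      reserve_fraction: float = 0.25) -> Tuple[int, int]:
    """Return (reserved tokens, usable tokens before compaction triggers)."""
    reserved = int(window_tokens * reserve_fraction)
    return reserved, window_tokens - reserved


reserved, usable = completion_buffer(200_000)
print(reserved, usable)  # 50000 150000
```

With a 200k window, reserving 25% leaves 150k usable tokens, which puts the trigger at 75% usage.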

Historical problem: Sessions would run until 8-12% remaining context, causing constant interruptions

Manual Control with /compact

You can trigger compaction manually at any time:

/compact

When to use:

  • At natural breakpoints, such as task completion
  • When you want a smaller, more focused context
  • When you want more control over what is preserved than automatic compaction gives you
  • Before moving on to unrelated work

Performance: Manual /compact uses the same mechanism as auto-compact:

  • Instant if session-memory exists and feature flag enabled (rare)
  • Delayed (5+ minutes) in most cases

Summary Content

The built-in summary prompt instructs Claude to create structured continuation summaries including:

  • Task Overview
  • Current State
  • Important Discoveries
  • Next Steps
  • Context to Preserve
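
For illustration, a summary following these five headings might be laid out as below. The exact wording and formatting of the built-in prompt and its output are not shown in this document, so the layout and the render_summary helper are assumptions; only the section names and the <summary> wrapper come from the text.

```python
from typing import Dict

SECTIONS = [
    "Task Overview",
    "Current State",
    "Important Discoveries",
    "Next Steps",
    "Context to Preserve",
]


def render_summary(notes: Dict[str, str]) -> str:
    """Lay out notes under the five headings, wrapped in <summary> tags."""
    parts = [f"{name}:\n{notes.get(name, '(none)')}" for name in SECTIONS]
    return "<summary>\n" + "\n\n".join(parts) + "\n</summary>"
```

Sections with no notes are rendered as "(none)" so the structure stays stable across sessions.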

Workarounds & Best Practices

Since instant compaction is unreliable:

  1. Manual compaction at breakpoints: Use /compact when finishing tasks rather than waiting for auto-compact
  2. Use CLAUDE.md for persistence: Store important context in project memory files
  3. Smaller sessions: Start new sessions for unrelated tasks rather than continuing long conversations
  4. Monitor context usage: Use /context to check context-window usage proactively
  5. Plan for delays: Expect 5+ minute delays when compaction triggers

Timeline & Status

Date                  Event
v2.0.64 (Dec 2025)    “Instant auto-compact” announced
Dec 11-12, 2025       Feature briefly works for some users
Dec 13, 2025          Feature rolled back by Anthropic
Jan 2026              Feature remains rolled back

Current status: Traditional (delayed) compaction is the standard behavior. Instant compaction may be gradually re-enabled via feature flags in the future, but no timeline has been announced.

Sources