auto-compact-instant-vs-delayed

Purpose

This note explains when Claude Code performs instant auto-compaction versus delayed compaction, and why summarization is sometimes immediate while at other times you must wait several minutes.

Key Finding: Feature Rollback

The “instant” auto-compact feature was rolled back by Anthropic. What was advertised as “instant” in v2.0.64+ never functioned as intended in production, which is why instant compaction is rare and shows up only occasionally.

What Is Auto-Compact?

Auto-compact is Claude Code’s context window management system. It automatically summarizes the conversation as it approaches the context limit, so you can keep working without interruption.

How it works:

Monitors token usage after each model response
Triggers when context window reaches ~80-95% capacity (roughly 5-20% remaining)
Analyzes conversation to identify key information worth preserving
Creates a concise summary of previous interactions, decisions, and code changes
Replaces old messages with the summary
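
The steps above can be sketched in Python as follows. This only illustrates the threshold-and-replace idea and is not Claude Code's actual implementation: the `Message` type, the function names, the 80% default, and the `keep_last` parameter are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str     # "user" or "assistant"
    content: str
    tokens: int   # token count for this message (assumed to be known)


def should_compact(messages, window_tokens=200_000, trigger_fraction=0.80):
    """Return True once total usage reaches the trigger fraction of the window."""
    used = sum(m.tokens for m in messages)
    return used >= trigger_fraction * window_tokens


def compact(messages, summary_text, summary_tokens, keep_last=2):
    """Replace all but the last `keep_last` messages with one summary message."""
    split = max(len(messages) - keep_last, 0)
    recent = messages[split:]
    summary = Message(role="user", content=summary_text, tokens=summary_tokens)
    return [summary] + recent


# Hypothetical conversation that has used 171k of a 200k window.
history = [
    Message("user", "long task description", 90_000),
    Message("assistant", "long working transcript", 80_000),
    Message("user", "latest request", 1_000),
]
if should_compact(history):
    history = compact(history, "summary of earlier work", 2_000, keep_last=1)
```

Keeping the most recent message or two alongside the summary is a design choice in this sketch; the source only states that old messages are replaced by the summary.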

Two Types of Compaction

1. Traditional (Delayed) Compaction

Duration: 5+ minutes

Process:

Generates fresh summaries of the entire conversation in real-time
Blocks operations while processing
CPU-intensive summarization
Summary prompt injected as user turn when threshold exceeded
Claude generates structured summary wrapped in <summary></summary> tags

When this happens: Most of the time in current Claude Code versions
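
The last step of the delayed process, pulling the summary out of a response wrapped in `<summary></summary>` tags, can be illustrated with a short Python sketch. The function name and the choice to return `None` when no tags are found are assumptions made for illustration, not Claude Code's own code.

```python
import re

# Non-greedy match so only the first <summary>...</summary> pair is captured;
# DOTALL lets the summary span multiple lines.
SUMMARY_RE = re.compile(r"<summary>(.*?)</summary>", re.DOTALL)


def extract_summary(response_text):
    """Return the text inside the first <summary> block, or None if absent."""
    match = SUMMARY_RE.search(response_text)
    if match is None:
        return None
    return match.group(1).strip()


# Hypothetical model response used only as sample input.
sample = "Here is the summary.\n<summary>\nTask: fix login bug\n</summary>"
summary = extract_summary(sample)
```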

2. Instant Compaction (Planned Feature)

Duration: Near-instant (seconds)

Process:

Uses pre-built session memory files updated incrementally in the background
Session memory stored at: ~/.claude/projects/[project-path]/[session-id]/session-memory/summary.md
Background summarization triggered after each message
/compact command simply loads the pre-written summary

Expected session memory structure:

Session title
Current status (discussion points, completed items, open questions)
Task specification
Files and functions referenced
Errors & corrections
Learnings and key results
Auto-updated work log

When this happens: Rarely, when the feature flag is temporarily enabled for testing
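
If you do find a `summary.md`, a quick way to see how much of the expected structure it contains is to compare its headings against the list above. The sketch below assumes the file uses Markdown `#` headings whose text contains the section names; that format is an assumption, and the shortened section names and the function name are hypothetical.

```python
# Shortened forms of the section names listed above (assumed heading text).
EXPECTED_SECTIONS = [
    "Session title",
    "Current status",
    "Task specification",
    "Files and functions",
    "Errors & corrections",
    "Learnings",
    "Work log",
]


def missing_sections(summary_text):
    """Return the expected section names not found in any '#' heading line."""
    headings = [
        line.lstrip("#").strip().lower()
        for line in summary_text.splitlines()
        if line.startswith("#")
    ]
    return [
        name for name in EXPECTED_SECTIONS
        if not any(name.lower() in heading for heading in headings)
    ]


# Hypothetical file content used only as sample input.
sample = "# Session title\n\n## Current status\n- item\n\n## Errors & corrections\n"
missing = missing_sections(sample)
```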

Why You Experience Both Behaviors

The Rollback Situation

According to an Anthropic developer response (Dec 13, 2025):

“we rolled back instant auto compact. We will share our coms for when it is rolled back”

Evidence from user reports:

Out of 900+ sessions in one project, only 1 had a session-memory subfolder
Feature appeared to work briefly (Dec 11-12, 2025) for some users, then stopped
Manual creation of session-memory folder resulted in nothing being written to it
The feature was behind a feature flag and never properly deployed to production
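
To repeat the first check on your own machine, you can count how many session folders in a project contain a session-memory summary. The sketch below assumes the layout given earlier (one subdirectory per session under the project directory, with `session-memory/summary.md` inside it); the function name is hypothetical, and sessions that have no subdirectory of their own are not counted.

```python
from pathlib import Path


def count_session_memory(project_dir):
    """Return (sessions_with_summary, total_session_dirs) for one project."""
    project_dir = Path(project_dir)
    # Assumption: every subdirectory of the project directory is a session.
    session_dirs = [p for p in project_dir.iterdir() if p.is_dir()]
    with_summary = sum(
        1
        for session in session_dirs
        if (session / "session-memory" / "summary.md").is_file()
    )
    return with_summary, len(session_dirs)
```

Call it with the path to one project directory under `~/.claude/projects/`. A result such as one summary across many sessions would match the user reports above.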

What This Means for You

Rare instant compaction: Occurs when:

You’re temporarily in a test group with the feature flag enabled
Feature flag briefly activated during testing/rollout attempts
Session happens to have pre-built session memory (extremely rare)

Typical delayed compaction: Occurs when:

Feature flag is disabled (default state after rollback)
Session-memory folder doesn’t exist or isn’t being updated
Standard real-time summarization is required

Auto-Compact Trigger Conditions

Primary Threshold

Context window reaches approximately 80-95% capacity (roughly 5-20% remaining)

Earlier Triggering Pattern (Post-2024 Updates)

Claude Code now triggers auto-compact much earlier than historical behavior:

Old behavior: Triggered at 90%+ context usage
New behavior: Triggers around 64-75% context usage
Reason: Implements a “completion buffer” to allow tasks to finish gracefully

Observed discrepancy:

Claude Code reports “10% context remaining”
Independent monitors show only 64% utilization
Gap: ~26 percentage points (10% remaining implies 90% utilization, versus the 64% measured)
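
To compare the two readings, both have to be put on the same scale: a "remaining" percentage must first be converted to a "used" percentage. A small Python sketch of that arithmetic follows; the function names are hypothetical.

```python
def implied_usage_percent(reported_remaining_percent):
    """Convert a 'percent remaining' reading into 'percent used'."""
    return 100.0 - reported_remaining_percent


def reporting_gap(reported_remaining_percent, measured_usage_percent):
    """Difference, in percentage points, between implied and measured usage."""
    return implied_usage_percent(reported_remaining_percent) - measured_usage_percent


# Figures from the observed discrepancy: 10% remaining vs 64% measured usage.
gap = reporting_gap(10, 64)  # 26.0
```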

The Completion Buffer

Rather than compacting mid-operation, Claude Code maintains enough free space to complete the current task before resetting:

Reserves approximately 25% of context window (roughly 50k tokens in a 200k window)
Provides runway to “land the plane before resetting”
Prevents context collapse mid-operation
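
The buffer arithmetic is easy to check. This sketch uses the figures quoted above (a 25% reserve on a 200k-token window); the function name and parameters are hypothetical, and the real reserve may be computed differently.

```python
def compaction_trigger(window_tokens=200_000, reserve_fraction=0.25):
    """Return (reserved_tokens, trigger_point_tokens) for a given window."""
    reserve = int(window_tokens * reserve_fraction)
    trigger_at = window_tokens - reserve
    return reserve, trigger_at


reserve, trigger_at = compaction_trigger()
# reserve == 50_000 and trigger_at == 150_000, i.e. 75% of the window,
# which matches the upper end of the 64-75% range reported above.
```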

Historical problem: Sessions would run until 8-12% remaining context, causing constant interruptions

Manual Control with `/compact`

You can trigger compaction manually at any time:

/compact

When to use:

At natural breakpoints, such as right after completing a task
When you want a smaller, more focused context
When you want to control what gets preserved rather than leaving it to automatic compaction
Before moving on to unrelated work

Performance: Manual /compact uses the same mechanism as auto-compact:

Instant if session-memory exists and feature flag enabled (rare)
Delayed (5+ minutes) in most cases

Summary Content

The built-in summary prompt instructs Claude to create structured continuation summaries including:

Task Overview
Current State
Important Discoveries
Next Steps
Context to Preserve

Workarounds & Best Practices

Since instant compaction is unreliable:

Manual compaction at breakpoints: Use /compact when finishing tasks rather than waiting for auto-compact
Use CLAUDE.md for persistence: Store important context in project memory files
Smaller sessions: Start new sessions for unrelated tasks rather than continuing long conversations
Monitor context usage: Use /stats to track token usage proactively
Plan for delays: Expect 5+ minute delays when compaction triggers

Timeline & Status

Date	Event
v2.0.64 (Dec 2025)	“Instant auto-compact” announced
Dec 11-12, 2025	Feature briefly works for some users
Dec 13, 2025	Feature rolled back by Anthropic
Jan 2026	Feature remains rolled back

Current status: Traditional (delayed) compaction is the standard behavior. Instant compaction may be gradually re-enabled via feature flags in the future, but no timeline has been announced.

Related issues:

Issue #13664: Instant auto-compaction not instant
Issue #13239: Pre-Compact Auto-Save and Improved Summarization
Issue #14176: instant auto compact summary isn’t updated