EHS Caching Strategies for CFR Poker Training
Overview
This document compares two approaches to caching Expected Hand Strength (EHS) values during CFR poker training: random pre-computed caching vs. bucket-based caching. The analysis shows why random pre-computation is ineffective and how bucket-based abstraction dramatically improves cache hit rates.
1. State Space Complexity
State Space Sizes by Street
Coverage Analysis
| Street | Total Combinations | 100k Random Cache | Coverage % |
|---|---|---|---|
| Flop | ~26 million | 100,000 | 0.38% |
| Turn | ~305 million | 100,000 | 0.033% |
| River | ~2.8 billion | 100,000 | 0.0036% |
Key Problem: Pre-computing 100k random samples covers at most 0.033% of turn states and 0.0036% of river states, leading to a >99% cache miss rate during training.
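The coverage figures follow directly from the ratio of cache size to state space. A quick back-of-the-envelope check, using the approximate combination counts from the table above:

```python
# Expected cache coverage under the (unrealistic) assumption that
# training states are drawn uniformly from the full state space.
STATE_SPACE = {
    "flop": 26_000_000,
    "turn": 305_000_000,
    "river": 2_800_000_000,
}
CACHE_SIZE = 100_000

for street, total in STATE_SPACE.items():
    coverage = CACHE_SIZE / total
    print(f"{street}: {coverage:.4%} coverage, {1 - coverage:.4%} expected miss rate")
```

Under this uniform-access model the expected hit rate equals the coverage fraction, which is why the river miss rate is effectively 100%.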
2. Problem: Random Pre-computed EHS Approach
Why It Doesn’t Work
Flow: Current Random Pre-computation
Why cache misses are so high:
- Random paths: Each CFR iteration explores different random game trajectories
- Massive state space: 2.8B river combinations, only 100k cached
- No locality: Random pre-computation doesn’t align with training access patterns
- Single-use entries: Cached values are rarely visited again in subsequent iterations
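These effects can be illustrated with a toy simulation: draw uniformly random state indices from a river-sized space and check them against a 100k-entry randomly pre-computed cache. This is a simplified model of access patterns, not the actual CFR traversal:

```python
import random

random.seed(0)
SPACE = 2_800_000_000  # river-sized state space
cache = set(random.randrange(SPACE) for _ in range(100_000))

# Simulate one million cache lookups at uniformly random states
queries = 1_000_000
hits = sum(random.randrange(SPACE) in cache for _ in range(queries))
print(f"hit rate: {hits / queries:.6f}")  # effectively zero
```

The expected number of hits is queries × (cache size / state space) ≈ 36 out of a million, matching the >99% miss rate claimed above.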
3. Solution: Bucket-Based Caching
Abstraction Strategy
Instead of caching exact (hole, board) combinations, cache by abstracted buckets:
Bucket Space Reduction
| Strategy | Preflop | Flop | Turn | River | Total Combinations |
|---|---|---|---|---|---|
| Exact States | 1,326 | 26M | 305M | 2.8B | ~2.8 billion |
| Bucket Abstraction | 5 | 10 | 10 | 10 | 5,000 |
| Reduction Factor | 265x | 2.6Mx | 30.5Mx | 280Mx | 560,000x |
Example bucket scheme:
- 5 preflop buckets (pairs, suited, offsuit, etc.)
- 10 flop buckets (made hands, draws, etc.)
- 10 turn buckets (strength categories)
- 10 river buckets (final strength)
- Total: 5 × 10 × 10 × 10 = 5,000 bucket combinations
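As an illustration only, a 5-bucket preflop classifier matching the example scheme might look like the sketch below. The function name, the rank encoding, and the bucket boundaries are all assumptions for the example, not the project's actual abstraction:

```python
# Hypothetical 5-bucket preflop classifier.
# Hole cards are (rank, suit) tuples with ranks 2..14 (14 = ace).
def get_preflop_bucket(hole_cards):
    (r1, s1), (r2, s2) = hole_cards
    hi, lo = max(r1, r2), min(r1, r2)
    if hi == lo:
        return 0 if hi >= 10 else 1   # big pair / small pair
    if s1 == s2:
        return 2                      # suited
    return 3 if hi >= 12 else 4       # big offsuit / other offsuit

print(get_preflop_bucket([(14, "s"), (14, "h")]))  # pair of aces -> 0
```

Each exact hand maps to exactly one bucket, which is what gives the abstraction its full coverage of the state space.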
Flow: Bucket-Based Caching
Why cache hits are high:
- Locality: Many exact states map to same bucket
- Small space: Only 5,000 buckets vs. billions of states
- Reuse: Same buckets visited repeatedly across iterations
- Coverage: 100% of state space maps to some bucket
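The reuse effect can be demonstrated with a toy model: if 100,000 queries land on only 5,000 buckets, at most 5,000 of them can be cold misses, so the hit rate is at least 95% by construction (the bucket draw here stands in for the state-to-bucket mapping):

```python
import random

random.seed(1)
N_BUCKETS = 5_000
cache, hits, queries = set(), 0, 100_000

for _ in range(queries):
    bucket = random.randrange(N_BUCKETS)  # stand-in for state -> bucket mapping
    if bucket in cache:
        hits += 1                         # warm hit: bucket seen before
    else:
        cache.add(bucket)                 # cold miss: first visit to this bucket

print(f"hit rate: {hits / queries:.1%}")
```

With uniform access the cache warms up almost immediately; skewed real-world access (some buckets visited far more often) only improves on this.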
4. Comparison: Random vs. Bucket-Based
Performance Impact
| Metric | Random Pre-computation | Bucket-Based | Improvement |
|---|---|---|---|
| Cache Hit Rate | <1% | >90% | 90-100x |
| EHS Computations per Iteration | ~99% of visited states | ~10% of visited states | 10x reduction |
| Memory Footprint | 100k entries × 8 bytes = 800 KB | 5k entries × 8 bytes = 40 KB | 20x smaller |
| Training Speed | Baseline | 5-10x faster | Significant |
5. Implementation Considerations
Bucket-Based Caching Implementation
```python
class BucketEHSCache:
    def __init__(self):
        # Cache: (preflop_bucket, street_bucket) -> EHS value
        self.cache = {}

    def get_ehs(self, hole_cards, board):
        # Get abstraction buckets
        preflop_bucket = get_preflop_bucket(hole_cards)
        street_bucket = get_street_bucket(hole_cards, board)

        key = (preflop_bucket, street_bucket)
        if key in self.cache:
            return self.cache[key]  # Cache hit

        # Cache miss: compute average EHS for all hands in bucket
        bucket_ehs = compute_bucket_average_ehs(preflop_bucket, street_bucket, board)
        self.cache[key] = bucket_ehs
        return bucket_ehs
```

Key Design Decisions
- Bucket Granularity: balance between precision and cache size
  - Fewer buckets = higher hit rate, lower precision
  - More buckets = lower hit rate, higher precision
  - Sweet spot: 5-10 buckets per street
- Precomputation Strategy:
  - Pre-compute all 5,000 bucket EHS values before training
  - Or compute lazily during training (cold-start penalty)
- Bucket Definition:
  - Hand strength percentiles
  - Equity distributions
  - Potential (made hand + draws)
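A minimal sketch of the eager precomputation option, assuming a `compute_bucket_average_ehs` helper that stands in for the real EHS evaluator (here replaced by a dummy for illustration):

```python
# Fill the entire 5 x 10 x 10 x 10 bucket table before training starts.
# compute_bucket_average_ehs is assumed to return the average EHS for a bucket key.
def precompute_bucket_table(compute_bucket_average_ehs):
    table = {}
    for pf in range(5):                # preflop buckets
        for fl in range(10):           # flop buckets
            for tn in range(10):       # turn buckets
                for rv in range(10):   # river buckets
                    key = (pf, fl, tn, rv)
                    table[key] = compute_bucket_average_ehs(key)
    return table

table = precompute_bucket_table(lambda key: 0.5)  # dummy EHS function
print(len(table))  # 5000
```

The lazy alternative simply moves the `compute_bucket_average_ehs` call into the cache-miss branch of `get_ehs`, trading upfront cost for a cold-start penalty.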
Profiling Update: The Real Story
UPDATE (2025-12-03): After profiling, we discovered the analysis above is partially incorrect.
What Profiling Revealed
- Total `_compute_ehs` calls: 27,398
- Actual GPU compute calls: 720
- In-memory cache hits: 26,678 (97.4% hit rate!)

The existing in-memory `_ehs_cache` in `PostflopAbstraction` already achieves a 97.4% hit rate!
Why Pre-computed Tables Don’t Help
| Factor | Reality |
|---|---|
| In-memory cache hit rate | 97.4% already |
| Pre-computed table benefit | Marginal (<0.1% improvement) |
| Real bottleneck | Per-call GPU overhead (23.8ms each) |
Corrected Understanding
Actual Optimization Opportunities
| Optimization | Expected Impact | Effort |
|---|---|---|
| Reduce EHS samples (100→50) | 2x faster | Low |
| Batch EHS across tree branches | 3-5x faster | High |
| Optimize `_compute_indices` | 1.2x faster | Medium |
| Async GPU overlap | 1.5x faster | High |
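A minimal sketch of the batching idea, assuming a hypothetical vectorised evaluator `compute_ehs_batch`: callers enqueue queries while traversing the tree, and a single GPU call resolves them all at flush time, amortising the per-call overhead:

```python
# Sketch: amortise per-call GPU overhead by batching EHS queries.
# compute_ehs_batch is a hypothetical evaluator taking a list of
# (hole_cards, board) pairs and returning one EHS value per pair.
class BatchedEHS:
    def __init__(self, compute_ehs_batch):
        self.compute_ehs_batch = compute_ehs_batch
        self.pending = []

    def request(self, hole_cards, board):
        # Enqueue a query; the ticket indexes into the flushed result list.
        self.pending.append((hole_cards, board))
        return len(self.pending) - 1

    def flush(self):
        # One batched evaluation instead of len(pending) separate calls.
        results = self.compute_ehs_batch(self.pending)
        self.pending = []
        return results

batcher = BatchedEHS(lambda batch: [0.5 for _ in batch])  # dummy evaluator
t1 = batcher.request("AhKh", "QhJhTh")
t2 = batcher.request("2c2d", "QhJhTh")
print(batcher.flush())  # [0.5, 0.5]
```

With 720 GPU calls at 23.8ms each, collapsing even ten queries per call would cut the 17.1s of GPU time roughly tenfold, which is why this row carries the largest expected impact despite its high effort.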
Conclusion
The original analysis argued that random pre-computed EHS caching is ineffective. Profiling corrected each claim:

- Claimed: minuscule coverage of a massive state space. Actually: the in-memory cache already has a 97.4% hit rate.
- Claimed: random access patterns don't align with training. Actually: the cache grows organically along training paths.
- Claimed: >99% cache miss rate. Actually: only a 2.6% miss rate.
Bucket-based caching would NOT help significantly because:
- In-memory cache already provides excellent coverage
- The bottleneck is per-call GPU overhead, not cache misses
- ~720 GPU calls × 23.8ms = 17.1s regardless of caching strategy
Actual recommendation: Focus on reducing per-call GPU overhead:
- Reduce Monte Carlo samples from 100 to 50
- Batch EHS queries across multiple tree branches
- Optimize combinatorial index computation