Overview

This document compares two approaches to caching Expected Hand Strength (EHS) values during CFR poker training: random pre-computed caching vs. bucket-based caching. The analysis argues that random pre-computation is ineffective and that bucket-based abstraction dramatically improves cache hit rates; a profiling update at the end revises several of these conclusions.

1. State Space Complexity

State Space Sizes by Street

Poker State Space Growth:

  • Preflop: 1,326 hole card combinations
  • Flop: 1,326 × C(50,3) = ~26 million (1,326 holes × 19,600 boards)
  • Turn: 1,326 × C(50,4) = ~305 million (1,326 holes × 230,300 boards)
  • River: 1,326 × C(50,5) = ~2.8 billion (1,326 holes × 2,118,760 boards)
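
These counts are easy to check; a quick verification with Python's `math.comb`:

```python
# Verify the per-street state counts from the list above.
from math import comb

holes = comb(52, 2)             # 1,326 hole-card combinations
print(holes * comb(50, 3))      # flop:  25,989,600    (~26 million)
print(holes * comb(50, 4))      # turn:  305,377,800   (~305 million)
print(holes * comb(50, 5))      # river: 2,809,475,760 (~2.8 billion)
```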

Coverage Analysis

| Street | Total Combinations | 100k Random Cache | Coverage % |
|--------|--------------------|-------------------|------------|
| Flop   | ~26 million        | 100,000           | 0.38%      |
| Turn   | ~305 million       | 100,000           | 0.033%     |
| River  | ~2.8 billion       | 100,000           | 0.0036%    |

Key Problem: Pre-computing 100k random samples covers only 0.38% of flop states, 0.033% of turn states, and 0.0036% of river states, leading to a >99% cache miss rate during training.
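
The coverage column follows directly from the state counts; a short sketch reproducing it:

```python
# Coverage of a 100k-entry random cache per street (matches the table above).
from math import comb

cache_size = 100_000
holes = comb(52, 2)  # 1,326 hole-card combinations
for street, n_board in [("flop", 3), ("turn", 4), ("river", 5)]:
    states = holes * comb(50, n_board)
    print(f"{street}: {cache_size / states:.4%} coverage")
```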

2. Problem: Random Pre-computed EHS Approach

Why It Doesn’t Work

Random Pre-computed Cache

Across CFR training iterations 1-3, training visits different random game paths, none of which appear among the cache's 100k random samples:

  • Path A (hole: 7h8h, board: AsKdQc2h5s) → ❌ not in cache → cache MISS → compute EHS
  • Path B (hole: AhKh, board: 7s8d9c3h2c) → ❌ not in cache → cache MISS → compute EHS
  • Path C (hole: QsJs, board: 6d4h3c9sKc) → ❌ not in cache → cache MISS → compute EHS

Flow: Current Random Pre-computation

  1. Training encounters a state (hole + board).
  2. Hash lookup: is the exact state in the cache?
     • Hit (<0.1%): return the cached EHS.
     • Miss (>99.9%): compute EHS (~1000 simulations), then cache the result (likely never reused).
  3. Continue training.
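
For concreteness, a minimal sketch of the exact-state lookup described above; the class and the injected estimator are illustrative, not the project's actual code:

```python
# Minimal sketch of exact-state EHS caching (the failing approach).
class ExactEHSCache:
    def __init__(self, precomputed, estimator):
        # precomputed: {(hole, board): ehs} for ~100k random samples
        # estimator: callable(hole, board) -> EHS (~1000 simulations)
        self.cache = dict(precomputed)
        self.estimator = estimator

    def get_ehs(self, hole_cards, board):
        # Exact cards form the key, so a hit requires an exact state match
        key = (tuple(sorted(hole_cards)), tuple(sorted(board)))
        if key in self.cache:                    # <0.1% of lookups in practice
            return self.cache[key]
        ehs = self.estimator(hole_cards, board)  # expensive miss path
        self.cache[key] = ehs                    # cached, but likely never reused
        return ehs
```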

Why cache misses are so high:

  1. Random paths: Each CFR iteration explores different random game trajectories
  2. Massive state space: 2.8B river combinations, only 100k cached
  3. No locality: Random pre-computation doesn’t align with training access patterns
  4. Single-use entries: Cached values are rarely visited again in subsequent iterations

3. Solution: Bucket-Based Caching

Abstraction Strategy

Instead of caching exact (hole, board) combinations, cache by abstracted buckets:

Many Exact States → Single Bucket

Three different exact states map to the same bucket:

  • 7h8h + AsKdQc → EHS: 0.34
  • 7s8s + AhKcQd → EHS: 0.35
  • 7d8d + AcKhQs → EHS: 0.33

All three share Preflop Bucket 3 (mid suited connector) and Flop Bucket 2 (gutshot + weak draws), so the cache stores a single entry: Bucket EHS 0.34 (average).

Result: 3 different exact states → 1 cached bucket value → ✅ high reuse across iterations.

Bucket Space Reduction

| Strategy           | Preflop | Flop  | Turn   | River | Total Combinations |
|--------------------|---------|-------|--------|-------|--------------------|
| Exact States       | 1,326   | 26M   | 305M   | 2.8B  | ~2.8 billion       |
| Bucket Abstraction | 5       | 10    | 10     | 10    | 5,000              |
| Reduction Factor   | 265x    | 2.6Mx | 30.5Mx | 280Mx | 560,000x           |

Example bucket scheme:

  • 5 preflop buckets (pairs, suited, offsuit, etc.)
  • 10 flop buckets (made hands, draws, etc.)
  • 10 turn buckets (strength categories)
  • 10 river buckets (final strength)
  • Total: 5 × 10 × 10 × 10 = 5,000 bucket combinations

Flow: Bucket-Based Caching

  1. Training encounters a state (hole + board).
  2. Get the abstraction buckets: preflop bucket + street bucket.
  3. Lookup: is the bucket EHS in the cache?
     • Hit (>90%): return the bucket EHS.
     • Miss (<10%): compute the bucket EHS by sampling multiple hands in the bucket, then cache the bucket value (reused across many states).
  4. Continue training.

Why cache hits are high:

  1. Locality: Many exact states map to same bucket
  2. Small space: Only 5,000 buckets vs. billions of states
  3. Reuse: Same buckets visited repeatedly across iterations
  4. Coverage: 100% of state space maps to some bucket

4. Comparison: Random vs. Bucket-Based

Trade-off Decision

Random Pre-computation Approach:

  • ❌ Coverage: 0.38% (flop) down to 0.0036% (river)
  • ❌ Hit Rate: <1%
  • ❌ Wasted Computation: 99%+
  • ✅ Precision: Exact per state

Bucket-Based Approach:

  • ✅ Coverage: 100%
  • ✅ Hit Rate: >90%
  • ✅ Efficient: Shared computation
  • ⚠️ Precision: Averaged per bucket

Decision: accept a slight precision loss for a massive speedup and full coverage.

Performance Impact

| Metric                         | Random Pre-computation          | Bucket-Based                 | Improvement   |
|--------------------------------|---------------------------------|------------------------------|---------------|
| Cache Hit Rate                 | <1%                             | >90%                         | 90-100x       |
| EHS Computations per Iteration | ~99% of visited states          | ~10% of visited states       | 10x reduction |
| Memory Footprint               | 100k entries × 8 bytes = 800 KB | 5k entries × 8 bytes = 40 KB | 20x smaller   |
| Training Speed                 | Baseline                        | 5-10x faster                 | Significant   |

5. Implementation Considerations

Bucket-Based Caching Implementation

```python
class BucketEHSCache:
    """Caches EHS values by abstraction bucket rather than exact state.

    Relies on bucket helpers (get_preflop_bucket, get_street_bucket,
    compute_bucket_average_ehs) defined elsewhere in the abstraction layer.
    """

    def __init__(self):
        # Cache: (preflop_bucket, street_bucket) -> EHS value
        self.cache = {}

    def get_ehs(self, hole_cards, board):
        # Map the exact (hole, board) state to its abstraction buckets
        preflop_bucket = get_preflop_bucket(hole_cards)
        street_bucket = get_street_bucket(hole_cards, board)
        key = (preflop_bucket, street_bucket)

        if key in self.cache:
            return self.cache[key]  # Cache hit

        # Cache miss: compute the average EHS over hands in this bucket
        bucket_ehs = compute_bucket_average_ehs(
            preflop_bucket, street_bucket, board
        )
        self.cache[key] = bucket_ehs
        return bucket_ehs
```
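
A hypothetical usage sketch; the card encodings are illustrative and the bucket helpers are assumed to be importable from the abstraction layer:

```python
# Hypothetical usage; card encoding and bucket helpers are assumed.
cache = BucketEHSCache()

# First lookup for this bucket: a miss that computes the bucket-average EHS.
ehs = cache.get_ehs(hole_cards=["7h", "8h"], board=["As", "Kd", "Qc"])

# A different exact state in the same bucket (7s8s on AhKcQd) now hits
# the cached bucket value instead of recomputing.
ehs_again = cache.get_ehs(hole_cards=["7s", "8s"], board=["Ah", "Kc", "Qd"])
```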

Key Design Decisions

  1. Bucket Granularity: Balance between precision and cache size

    • Fewer buckets = higher hit rate, lower precision
    • More buckets = lower hit rate, higher precision
    • Sweet spot: 5-10 buckets per street
  2. Precomputation Strategy:

    • Pre-compute all 5,000 bucket EHS values before training
    • Or compute lazily during training (cold start penalty)
  3. Bucket Definition (a percentile-based sketch follows this list):

    • Hand strength percentiles
    • Equity distributions
    • Potential (made hand + draws)
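
As an illustration of the first option, a minimal percentile-bucketing sketch; the strength score and decile thresholds are assumptions, not the project's actual abstraction:

```python
import bisect

# Hypothetical percentile-based street bucketing. `strength` is an assumed
# hand-strength score in [0, 1]; `thresholds` are decile boundaries that
# would be estimated offline from sampled hands.
def percentile_bucket(strength, thresholds=(0.1, 0.2, 0.3, 0.4, 0.5,
                                            0.6, 0.7, 0.8, 0.9)):
    # bisect_right counts thresholds <= strength, giving a bucket in 0..9
    return bisect.bisect_right(thresholds, strength)

assert percentile_bucket(0.05) == 0   # weakest decile
assert percentile_bucket(0.34) == 3
assert percentile_bucket(0.95) == 9   # strongest decile
```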

Profiling Update: The Real Story

UPDATE (2025-12-03): After profiling, we discovered the analysis above is partially incorrect.

What Profiling Revealed

Total _compute_ehs calls: 27,398
Actual GPU compute calls: 720
In-memory cache hits: 26,678 (97.4% hit rate!)

The existing in-memory _ehs_cache in PostflopAbstraction already achieves a 97.4% hit rate!
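
A sketch of the kind of hit/miss instrumentation that yields numbers like these; the wrapper below is illustrative, not the project's actual profiler:

```python
from collections import Counter

class InstrumentedEHS:
    """Illustrative hit/miss counter around a cached EHS computation."""

    def __init__(self, compute_fn):
        self._ehs_cache = {}        # mirrors the existing in-memory cache
        self._compute_fn = compute_fn
        self.stats = Counter()

    def compute_ehs(self, key):
        self.stats["calls"] += 1
        if key in self._ehs_cache:
            self.stats["hits"] += 1           # served from memory
        else:
            self.stats["gpu_computes"] += 1   # falls through to the GPU path
            self._ehs_cache[key] = self._compute_fn(key)
        return self._ehs_cache[key]

# Hit rate = stats["hits"] / stats["calls"]; the profile above showed
# 26,678 / 27,398 ≈ 97.4%.
```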

Why Pre-computed Tables Don’t Help

| Factor                     | Reality                             |
|----------------------------|-------------------------------------|
| In-memory cache hit rate   | 97.4% already                       |
| Pre-computed table benefit | Marginal (<0.1% improvement)        |
| Real bottleneck            | Per-call GPU overhead (23.8ms each) |

Corrected Understanding

What we thought: pre-computed tables would provide the missing cache coverage.

What profiling actually showed:

  • The in-memory cache already has a 97.4% hit rate.
  • Only 720 EHS computations actually run per 200 iterations.
  • The bottleneck is per-call GPU overhead (23.8ms).

Actual Optimization Opportunities

| Optimization                   | Expected Impact | Effort |
|--------------------------------|-----------------|--------|
| Reduce EHS samples (100→50)    | 2x faster       | Low    |
| Batch EHS across tree branches | 3-5x faster     | High   |
| Optimize _compute_indices      | 1.2x faster     | Medium |
| Async GPU overlap              | 1.5x faster     | High   |

Conclusion

The original analysis claimed random pre-computed EHS caching is ineffective because:

  • ~~Minuscule coverage of massive state space~~ Actually: the in-memory cache already has a 97.4% hit rate
  • ~~Random access patterns don't align with training~~ Actually: the cache grows organically with training paths
  • ~~>99% cache miss rate~~ Actually: only a 2.6% miss rate

Bucket-based caching would NOT help significantly because:

  • In-memory cache already provides excellent coverage
  • The bottleneck is per-call GPU overhead, not cache misses
  • ~720 GPU calls × 23.8ms = 17.1s regardless of caching strategy

Actual recommendation: Focus on reducing per-call GPU overhead:

  1. Reduce Monte Carlo samples from 100 to 50
  2. Batch EHS queries across multiple tree branches (see the sketch below)
  3. Optimize combinatorial index computation
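
A minimal sketch of recommendation 2, assuming a hypothetical batched_ehs(states) estimator that evaluates many states in one GPU call to amortize the ~23.8ms per-call overhead:

```python
# Minimal sketch of batching EHS queries across tree branches. `batched_ehs`
# is a hypothetical estimator that evaluates many (hole, board) states in a
# single GPU call, amortizing the per-call overhead.
class EHSBatcher:
    def __init__(self, batched_ehs, batch_size=64):
        self.batched_ehs = batched_ehs
        self.batch_size = batch_size
        self.pending = []  # (state, callback) pairs awaiting evaluation

    def request(self, state, callback):
        # Queue a query; `callback` receives the EHS when the batch runs.
        self.pending.append((state, callback))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        # Issue one batched call for all pending queries.
        if not self.pending:
            return
        states, callbacks = zip(*self.pending)
        self.pending = []
        for callback, ehs in zip(callbacks, self.batched_ehs(list(states))):
            callback(ehs)
```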