test-info
Test Started: Fri Nov 21 10:06:03 PM PST 2025
Test Directory: Rate cards/runs/parallel-test-2025-11-21-22-06-03
Approaches Being Tested
Approach A: Iterative Rule Refinement
- Strategy: Algorithmic/pseudocode format with corrected rules
- Key Change: Multiple keys required for split, explicit data sourcing
Approach B: Example-Driven Learning
- Strategy: Show reference examples (split vs non-split services)
- Key Change: Let agent infer pattern from concrete examples
Approach C: Multi-Phase Agent
- Strategy: Analysis phase then execution phase
- Key Change: Agent plans first, we review the plan, then the agent executes
Approach D: Validation-Driven Iteration
- Strategy: Auto-validate and self-correct
- Key Change: Built-in validation loop with error feedback
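The validation loop in Approach D can be sketched roughly as follows. This is a minimal illustration, not the harness used in this run: `run_with_validation`, `generate`, `validate`, and `max_attempts` are hypothetical names, and the toy generator and validator stand in for the real agent and the real sheet checks.

```python
from typing import Callable, List, Tuple


def run_with_validation(
    generate: Callable[[List[str]], dict],
    validate: Callable[[dict], List[str]],
    max_attempts: int = 3,  # hypothetical cap on retries
) -> Tuple[dict, List[str], int]:
    """Generate output, validate it, and retry with the errors as feedback.

    Returns (last output, remaining errors, attempts used).
    """
    feedback: List[str] = []
    output: dict = {}
    for attempt in range(1, max_attempts + 1):
        output = generate(feedback)   # agent sees errors from the previous pass
        errors = validate(output)     # empty list means the output passed
        if not errors:
            return output, [], attempt
        feedback = errors
    return output, feedback, max_attempts


# Toy stand-ins (assumptions for illustration only).
def toy_generate(feedback: List[str]) -> dict:
    # Pretend agent: produces the right sheet count only after feedback.
    return {"sheets": 60 if feedback else 58}


def toy_validate(output: dict) -> List[str]:
    if output["sheets"] == 60:
        return []
    return [f"expected 60 sheets, got {output['sheets']}"]


result, errors, attempts = run_with_validation(toy_generate, toy_validate)
```

Capping the number of attempts keeps a run from looping indefinitely when the agent cannot fix an error, and returning the remaining errors makes a failed run easy to inspect.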
Success Metrics
- Target: 90%+ validation score (53/59 sheets is 89.8%, so 54/59 is needed to clear 90%)
- Sheet count: 60 total (59 rate cards + 1 summary)
- Correct weight splitting (14 services × 4 sheets = 56)
- Correct naming conventions
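The counts above can be checked with a few lines of arithmetic. This is a sketch only: the constant and function names are hypothetical, the passing threshold is derived from the 90% target rather than hard-coded, and the non-split count assumes the 56 split sheets are counted among the 59 rate cards.

```python
import math

RATE_CARD_SHEETS = 59          # from the sheet count above
SUMMARY_SHEETS = 1
SPLIT_SERVICES = 14
SHEETS_PER_SPLIT_SERVICE = 4
TARGET_RATIO = 0.90            # the "90%+" target

total_sheets = RATE_CARD_SHEETS + SUMMARY_SHEETS
split_sheets = SPLIT_SERVICES * SHEETS_PER_SPLIT_SERVICE

# Assumption: split sheets are part of the 59 rate cards, so the
# remainder belongs to non-split services.
non_split_sheets = RATE_CARD_SHEETS - split_sheets

# Smallest number of passing sheets that meets the target ratio.
min_passing = math.ceil(TARGET_RATIO * RATE_CARD_SHEETS)


def validation_score(passing: int) -> float:
    """Fraction of rate-card sheets that passed validation."""
    return passing / RATE_CARD_SHEETS


def meets_target(passing: int) -> bool:
    return validation_score(passing) >= TARGET_RATIO


print(total_sheets, split_sheets, non_split_sheets, min_passing)
print(f"{validation_score(53):.1%}", meets_target(53))
print(f"{validation_score(54):.1%}", meets_target(54))
```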