REFERENCE_STRUCTURE_ANALYSIS
Critical Discovery
The validation script revealed that no iteration so far has matched the reference output structure (0% match). The agent has been generating a fundamentally different sheet organization.
The Problem
What We’re Generating: ALL services split by weight (SUB1, 1LB, 6LB, 10LB)
What Reference Expects: ONLY services with “SUB 1LB” key are split; others remain single sheets
Reference File Structure
Total Sheets: 60 (59 rate cards + 1 summary)
Breakdown by Carrier
- DHL: 11 sheets
- ENDICIA: 5 sheets
- FEDEX: 9 sheets
- OSM: 8 sheets
- UPS: 22 sheets
- VEHO: 4 sheets
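The per-carrier counts above can be checked mechanically against a generated workbook's sheet names. The following is a minimal sketch, assuming the carrier is the second underscore-separated token of each sheet name (as in 02_DHL_SMPP_GRO_SUB1_2025); the function names are hypothetical and not part of the existing scripts.

```python
from collections import Counter

# Expected rate-card sheets per carrier, copied from the breakdown above.
EXPECTED_BY_CARRIER = {
    "DHL": 11, "ENDICIA": 5, "FEDEX": 9, "OSM": 8, "UPS": 22, "VEHO": 4,
}


def count_by_carrier(sheet_names):
    """Count sheets per carrier.

    Assumption: names look like NN_CARRIER_..._YEAR, so the carrier is the
    second token. Names that do not fit (e.g. a summary sheet) are skipped.
    """
    counts = Counter()
    for name in sheet_names:
        parts = name.split("_")
        if len(parts) >= 3 and parts[0].isdigit():
            counts[parts[1]] += 1
    return counts


def carrier_mismatches(sheet_names):
    """Return {carrier: (actual, expected)} for every carrier whose count differs."""
    actual = count_by_carrier(sheet_names)
    carriers = sorted(set(EXPECTED_BY_CARRIER) | set(actual))
    return {
        c: (actual.get(c, 0), EXPECTED_BY_CARRIER.get(c, 0))
        for c in carriers
        if actual.get(c, 0) != EXPECTED_BY_CARRIER.get(c, 0)
    }
```

An empty result from carrier_mismatches means every carrier has the expected number of sheets; it says nothing about whether the individual sheet names are right.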
Weight Splitting Rules
Rule 1: Services WITH “SUB 1LB 2025” Key → SPLIT INTO 4 SHEETS
These services get split into weight breakpoints:
- SUB1 (< 1 lb)
- 1LB (1-6 lbs)
- 6LB (6-10 lbs)
- 10LB (> 10 lbs)
Examples:
DHL SMPP GRO Service:
- 02_DHL_SMPP_GRO_SUB1_2025
- 03_DHL_SMPP_GRO_1LB_2025
- 04_DHL_SMPP_GRO_6LB_2025
- 05_DHL_SMPP_GRO_10LB_2025
FEDEX SMARTPOST:
- 20_FEDEX_SMARTP_SUB1_2025
- 21_FEDEX_SMARTP_1LB_2025
- 22_FEDEX_SMARTP_6LB_2025
- 23_FEDEX_SMARTP_10LB_2025
Services That Should Be Split (14 total):
- DHL ECOMMERCE - DHLECOMMERCE DHL SM PARCEL GROUND
- DHL ECOMMERCE - DHLECOMMERCE DHL SM PARCEL PLUS GROUND
- DHL ECOMMERCE - DHLECOMMERCE DHL SM PARCEL EXPEDITED
- DHL ECOMMERCE - DHLECOMMERCE DHL SM PARCEL PLUS EXPEDITED
- ENDICIA - ENDICIA GROUND ADVANTAGE
- FEDEX - FEDEX SMARTPOST
- OSM - OSMWORLDWIDE GROUND ADVANTAGE
- OSM - OSMWORLDWIDE PARCEL
- UPS - UPS SUREPOST OVER ONE POUND
- UPS - RETURN
- UPS - UPS GROUND SAVER - 1 LB OR GREATER
- UPS MI - UPS PARCEL SELECT OVER 1LB
- UPS MI - UPS PARCEL SELECT UNDER 1LB
- VEHO - VEHO GROUND
Rule 2: Services WITHOUT “SUB 1LB” Key → SINGLE SHEET (ALL WEIGHTS)
These services keep all weight ranges in ONE sheet.
Examples:
- 16_ENDICIA_PRIO_MAIL_2025 (all weights 0-150 lbs)
- 17_FEDEX_STD_OVERN_2025 (all weights 0-150 lbs)
- 18_FEDEX_2DAY_2025 (all weights 0-150 lbs)
- 24_FEDEX_GROUND_2025 (all weights 0-150 lbs)
- 38_UPS_3DAY_SEL_2025 (all weights 0-150 lbs)
Services That Should Be Single Sheet:
- ENDICIA PRIORITY MAIL
- FEDEX STANDARD OVERNIGHT
- FEDEX 2DAY
- FEDEX HOME DELIVERY
- FEDEX GROUND
- FEDEX PRIORITY OVERNIGHT
- UPS 3 DAY SELECT
- UPS 2ND DAY AIR
- UPS NEXT DAY AIR
- UPS NEXT DAY AIR SAVER
- UPS GROUND (both residential and commercial)
- All international services
- All specialized services (Asendia, Passport, etc.)
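Rules 1 and 2 reduce to a single check on the service mapping. The sketch below assumes the mapping is a dict whose keys are strings and that a split service has at least one key containing the text "SUB 1LB"; the actual mapping shape is not shown in this document, so the sample mappings and function names are hypothetical.

```python
# Marker text taken from the rules above; matching is case-insensitive here,
# which is an assumption rather than documented behaviour.
SPLIT_KEY_MARKER = "SUB 1LB"
SPLIT_SUFFIXES = ("SUB1", "1LB", "6LB", "10LB")


def needs_weight_split(service_mapping):
    """Rule 1 vs Rule 2: True when any mapping key contains the marker."""
    return any(SPLIT_KEY_MARKER in str(key).upper() for key in service_mapping)


def sheet_suffixes(service_mapping):
    """Four weight suffixes for split services, a single None otherwise."""
    if needs_weight_split(service_mapping):
        return SPLIT_SUFFIXES
    return (None,)


# Hypothetical mappings, for illustration only.
smartpost = {"SUB 1LB 2025": "sheet-a", "1LB 2025": "sheet-b"}
fedex_ground = {"ALL WEIGHTS 2025": "sheet-c"}

print(sheet_suffixes(smartpost))
print(sheet_suffixes(fedex_ground))
```

Using a substring test means both "SUB 1LB" and "SUB 1LB 2025" keys are treated as split markers, which covers the two spellings used in the rule headings.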
Weight Breakpoints
When splitting, use these breakpoints:
| Sheet Suffix | Weight Range | Description |
|---|---|---|
| SUB1 | < 1.0 lb | Typically 1-16 oz |
| 1LB | 1.0 - 6.0 lbs | Light packages |
| 6LB | 6.0 - 10.0 lbs | Medium packages |
| 10LB | > 10.0 lbs | Heavy packages |
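A small helper can map a weight to its sheet suffix. The table leaves the exact boundary handling open (6.0 lb appears in two ranges), so this sketch assumes 1.0 lb belongs to 1LB, 6.0 lb belongs to 6LB, and 10.0 lb belongs to 6LB because 10LB is listed as strictly greater than 10.0 lbs. The row format with a weight_lb field is hypothetical.

```python
from collections import defaultdict


def weight_suffix(weight_lb):
    """Return the sheet suffix for a weight in pounds.

    Boundary choices are assumptions: [0, 1) -> SUB1, [1, 6) -> 1LB,
    [6, 10] -> 6LB, (10, inf) -> 10LB.
    """
    if weight_lb < 0:
        raise ValueError(f"weight must be non-negative, got {weight_lb}")
    if weight_lb < 1.0:
        return "SUB1"
    if weight_lb < 6.0:
        return "1LB"
    if weight_lb <= 10.0:
        return "6LB"
    return "10LB"


def split_rows_by_weight(rows):
    """Group rate rows (dicts with a 'weight_lb' key) by sheet suffix."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[weight_suffix(row["weight_lb"])].append(row)
    return dict(buckets)
```

If the reference file turns out to place 6.0 lb or 10.0 lb rows differently, only the comparison operators in weight_suffix need to change.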
Sheet Naming Convention
Format: ##_CARRIER_SERVICE_[WEIGHT]_2025
Examples:
- Single sheet: 17_FEDEX_STD_OVERN_2025
- Split sheets: 20_FEDEX_SMARTP_SUB1_2025, 21_FEDEX_SMARTP_1LB_2025, 22_FEDEX_SMARTP_6LB_2025, 23_FEDEX_SMARTP_10LB_2025
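The naming format can be checked with a regular expression. This is a sketch under the assumption that the sequence number is always two digits and the year four digits, as in the examples; the pattern and the parse_sheet_name helper are illustrative, not taken from validate-output.py.

```python
import re

# NN_<carrier and service>_[WEIGHT]_YYYY, where WEIGHT is optional.
SHEET_NAME_RE = re.compile(
    r"^(?P<index>\d{2})_"
    r"(?P<body>.+?)"
    r"(?:_(?P<weight>SUB1|1LB|6LB|10LB))?"
    r"_(?P<year>\d{4})$"
)


def parse_sheet_name(name):
    """Split a sheet name into its parts, or return None if it does not match."""
    match = SHEET_NAME_RE.match(name)
    if match is None:
        return None
    return {
        "index": int(match.group("index")),
        "body": match.group("body"),      # carrier and service tokens
        "weight": match.group("weight"),  # None for single-sheet services
        "year": int(match.group("year")),
    }


print(parse_sheet_name("20_FEDEX_SMARTP_SUB1_2025"))
print(parse_sheet_name("17_FEDEX_STD_OVERN_2025"))
```

A parsed weight of None identifies a single-sheet service, so the same helper can also be used to count how many services in an output were split.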
Current Agent Issues
Issue 1: Splitting ALL Services
Problem: Agent splits every service by weight
Expected: Only split services with “SUB 1LB” mapping key
Issue 2: Missing Single-Sheet Services
Problem: Express/priority services are being split when they shouldn’t be
Expected: Keep all weights together for these services
Issue 3: Sheet Count Mismatch
Current Output: 50-55 sheets
Expected Output: 60 sheets
Gap: Missing ~5-10 single-sheet services
Validation Metrics
Using validate-output.py against reference file:
| Metric | Current | Target |
|---|---|---|
| Validation Score | 0/59 (0%) | 59/59 (100%) |
| Sheets Found | 1/59 (1.7%) | 59/59 (100%) |
| Structure Match | None | Exact match |
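The internals of validate-output.py are not shown here, so the following is only a sketch of how a name-level score like the one in the table could be computed: it compares the set of generated sheet names against the reference names and reports the matched fraction. The function name and the returned fields are assumptions.

```python
def validation_score(generated_names, reference_names):
    """Compare sheet names only (not sheet contents) against the reference."""
    expected = set(reference_names)
    if not expected:
        raise ValueError("reference_names must not be empty")
    generated = set(generated_names)
    matched = expected & generated
    return {
        "matched": len(matched),
        "expected": len(expected),
        "percent": 100.0 * len(matched) / len(expected),
        "missing": sorted(expected - generated),     # in reference, not generated
        "unexpected": sorted(generated - expected),  # generated, not in reference
    }


# Tiny illustration with names from the examples in this document.
reference = ["17_FEDEX_STD_OVERN_2025", "18_FEDEX_2DAY_2025"]
generated = ["17_FEDEX_STD_OVERN_2025", "18_FEDEX_2DAY_SUB1_2025"]
report = validation_score(generated, reference)
print(report["matched"], report["expected"], report["percent"])
```

The missing and unexpected lists make the failure mode visible: a single-sheet service that was split by mistake shows up as one missing name and several unexpected ones.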
Required Agent Changes
- Add weight-split detection logic:
  - Check if service mapping has “SUB 1LB” key
  - If YES → split into 4 sheets by weight breakpoints
  - If NO → create single sheet with all weights
- Implement correct naming:
  - Follow ##_CARRIER_SERVICE_[WEIGHT]_2025 format
  - Use proper weight suffixes (SUB1, 1LB, 6LB, 10LB)
- Match reference sheet order:
  - Maintain numerical sequence (01, 02, 03…)
  - Group by carrier, then service
- Validate against reference:
  - Run validate-output.py after generation
  - Target: 90%+ validation score (54+ of 59 sheets matched; 53/59 is only 89.8%)
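The first three changes above can be sketched together as a planning step that turns an ordered service list into sheet names. This assumes the input is already in reference order (grouped by carrier, then service) and that numbering starts at 02 because the examples begin there; both are assumptions, and plan_sheets is a hypothetical helper.

```python
WEIGHT_SUFFIXES = ("SUB1", "1LB", "6LB", "10LB")


def plan_sheets(services, first_index=2, year=2025):
    """Build sheet names in sequence.

    services: iterable of (carrier, service_abbrev, split) tuples, already in
    reference order. split=True yields four sheets, split=False yields one.
    first_index=2 is an assumption based on the examples starting at 02.
    """
    names = []
    index = first_index
    for carrier, service, split in services:
        suffixes = WEIGHT_SUFFIXES if split else (None,)
        for suffix in suffixes:
            parts = [f"{index:02d}", carrier, service]
            if suffix is not None:
                parts.append(suffix)
            parts.append(str(year))
            names.append("_".join(parts))
            index += 1
    return names


# One split service followed by one single-sheet service (illustrative order).
for name in plan_sheets([("DHL", "SMPP_GRO", True), ("ENDICIA", "PRIO_MAIL", False)]):
    print(name)
```

Keeping the split decision as an input flag separates the detection rule from the naming and numbering, so each can be corrected independently when validation fails.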
Success Criteria
- ✅ 60 total sheets (59 rate cards + 1 summary)
- ✅ 14 services split into 4 sheets each (56 sheets)
- ✅ Remaining services as single sheets (3 sheets)
- ✅ Validation score ≥ 90% (at least 54/59 sheets; 53/59 is only 89.8%)
- ✅ Correct metadata headers per sheet
- ✅ Correct rate table structure per sheet