REFERENCE_STRUCTURE_ANALYSIS

Critical Discovery

The validation script showed that no iteration so far has matched the reference output structure (0% validation score). The agent has been generating a fundamentally different sheet organization.

The Problem

What We’re Generating: ALL services split by weight (SUB1, 1LB, 6LB, 10LB)
What Reference Expects: ONLY services with a “SUB 1LB” key are split; all others remain single sheets

Reference File Structure

Total Sheets: 60 (59 rate cards + 1 summary)

Breakdown by Carrier

DHL: 11 sheets
ENDICIA: 5 sheets
FEDEX: 9 sheets
OSM: 8 sheets
UPS: 22 sheets
VEHO: 4 sheets
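
As a quick sanity check, the per-carrier counts can be summed and compared with the stated total. This sketch only restates the numbers listed above; the variable names are illustrative.

```python
# Per-carrier sheet counts copied from the breakdown above.
sheets_by_carrier = {
    "DHL": 11,
    "ENDICIA": 5,
    "FEDEX": 9,
    "OSM": 8,
    "UPS": 22,
    "VEHO": 4,
}

rate_cards = sum(sheets_by_carrier.values())
total_sheets = rate_cards + 1  # plus the single summary sheet

print(rate_cards, total_sheets)  # 59 60
```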

Weight Splitting Rules

Rule 1: Services WITH “SUB 1LB 2025” Key → SPLIT INTO 4 SHEETS

These services get split into weight breakpoints:

SUB1 (< 1 lb)
1LB (1-6 lbs)
6LB (6-10 lbs)
10LB (> 10 lbs)

Examples:

DHL SMPP GRO Service:
  02_DHL_SMPP_GRO_SUB1_2025
  03_DHL_SMPP_GRO_1LB_2025
  04_DHL_SMPP_GRO_6LB_2025
  05_DHL_SMPP_GRO_10LB_2025

FEDEX SMARTPOST:
  20_FEDEX_SMARTP_SUB1_2025
  21_FEDEX_SMARTP_1LB_2025
  22_FEDEX_SMARTP_6LB_2025
  23_FEDEX_SMARTP_10LB_2025

Services That Should Be Split (14 total):

DHL ECOMMERCE - DHLECOMMERCE DHL SM PARCEL GROUND
DHL ECOMMERCE - DHLECOMMERCE DHL SM PARCEL PLUS GROUND
DHL ECOMMERCE - DHLECOMMERCE DHL SM PARCEL EXPEDITED
DHL ECOMMERCE - DHLECOMMERCE DHL SM PARCEL PLUS EXPEDITED
ENDICIA - ENDICIA GROUND ADVANTAGE
FEDEX - FEDEX SMARTPOST
OSM - OSMWORLDWIDE GROUND ADVANTAGE
OSM - OSMWORLDWIDE PARCEL
UPS - UPS SUREPOST OVER ONE POUND
UPS - RETURN
UPS - UPS GROUND SAVER - 1 LB OR GREATER
UPS MI - UPS PARCEL SELECT OVER 1LB
UPS MI - UPS PARCEL SELECT UNDER 1LB
VEHO - VEHO GROUND

Rule 2: Services WITHOUT “SUB 1LB” Key → SINGLE SHEET (ALL WEIGHTS)

These services keep all weight ranges in ONE sheet.

Examples:

16_ENDICIA_PRIO_MAIL_2025 (all weights 0-150 lbs)
17_FEDEX_STD_OVERN_2025 (all weights 0-150 lbs)
18_FEDEX_2DAY_2025 (all weights 0-150 lbs)
24_FEDEX_GROUND_2025 (all weights 0-150 lbs)
38_UPS_3DAY_SEL_2025 (all weights 0-150 lbs)

Services That Should Be Single Sheet:

ENDICIA PRIORITY MAIL
FEDEX STANDARD OVERNIGHT
FEDEX 2DAY
FEDEX HOME DELIVERY
FEDEX GROUND
FEDEX PRIORITY OVERNIGHT
UPS 3 DAY SELECT
UPS 2ND DAY AIR
UPS NEXT DAY AIR
UPS NEXT DAY AIR SAVER
UPS GROUND (both residential and commercial)
All international services
All specialized services (Asendia, Passport, etc.)
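
Rules 1 and 2 reduce to a single check on the service mapping. The sketch below assumes the mapping is available as a dict keyed by strings; that shape, the function name, and the sample keys are hypothetical and not taken from the agent's code. A substring match is used so that both “SUB 1LB” and “SUB 1LB 2025” count as the split key.

```python
def needs_weight_split(service_mapping: dict) -> bool:
    """Return True if any mapping key contains 'SUB 1LB' (case-insensitive).

    Assumption: the service mapping is a dict whose keys are strings such as
    'SUB 1LB 2025'. The real mapping structure may differ.
    """
    return any("SUB 1LB" in str(key).upper() for key in service_mapping)


# Hypothetical mappings, for illustration only.
split_service = {"SUB 1LB 2025": "rates_a", "1LB 2025": "rates_b"}
single_service = {"ALL WEIGHTS 2025": "rates_c"}

print(needs_weight_split(split_service))   # True
print(needs_weight_split(single_service))  # False
```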

Weight Breakpoints

When splitting, use these breakpoints:

Sheet Suffix	Weight Range	Description
SUB1	< 1.0 lb	Typically 1-16 oz
1LB	1.0 - 6.0 lbs	Light packages
6LB	6.0 - 10.0 lbs	Medium packages
10LB	> 10.0 lbs	Heavy packages
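
The breakpoints can be expressed as a small lookup function. This is a minimal sketch: the table lists 6.0 lb in both the 1LB and 6LB ranges, so assigning exactly 6.0 lb to 6LB is an assumption here, and the function name is illustrative.

```python
def weight_suffix(weight_lb: float) -> str:
    """Map a package weight in pounds to a sheet suffix.

    Follows the table above: SUB1 is < 1.0 lb and 10LB is > 10.0 lb, so
    exactly 1.0 lb lands in 1LB and exactly 10.0 lb lands in 6LB.
    Assumption: exactly 6.0 lb goes to 6LB (the table is ambiguous there).
    """
    if weight_lb < 0:
        raise ValueError(f"weight must be non-negative, got {weight_lb}")
    if weight_lb < 1.0:
        return "SUB1"
    if weight_lb < 6.0:
        return "1LB"
    if weight_lb <= 10.0:
        return "6LB"
    return "10LB"


print([weight_suffix(w) for w in (0.5, 1.0, 6.0, 10.0, 10.5)])
```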

Sheet Naming Convention

Format: ##_CARRIER_SERVICE_[WEIGHT]_2025

Examples:

Single sheet: 17_FEDEX_STD_OVERN_2025
Split sheets:
- 20_FEDEX_SMARTP_SUB1_2025
- 21_FEDEX_SMARTP_1LB_2025
- 22_FEDEX_SMARTP_6LB_2025
- 23_FEDEX_SMARTP_10LB_2025
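
A name builder for this format might look like the sketch below. The function and parameter names are hypothetical. The abbreviated service codes (for example SMARTP or STD_OVERN) are passed in by the caller, since the abbreviation rules are not described here.

```python
from typing import Optional

VALID_SUFFIXES = ("SUB1", "1LB", "6LB", "10LB")


def sheet_name(index: int, carrier: str, service_code: str,
               suffix: Optional[str] = None, year: int = 2025) -> str:
    """Build a sheet name in the ##_CARRIER_SERVICE_[WEIGHT]_2025 format.

    The weight suffix is omitted for single-sheet services.
    """
    parts = [f"{index:02d}", carrier, service_code]
    if suffix is not None:
        if suffix not in VALID_SUFFIXES:
            raise ValueError(f"unknown weight suffix: {suffix!r}")
        parts.append(suffix)
    parts.append(str(year))
    return "_".join(parts)


print(sheet_name(17, "FEDEX", "STD_OVERN"))       # 17_FEDEX_STD_OVERN_2025
print(sheet_name(20, "FEDEX", "SMARTP", "SUB1"))  # 20_FEDEX_SMARTP_SUB1_2025
```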

Current Agent Issues

Issue 1: Splitting ALL Services

Problem: Agent splits every service by weight
Expected: Only split services with “SUB 1LB” mapping key

Issue 2: Missing Single-Sheet Services

Problem: Express/priority services are being split when they shouldn’t be
Expected: Keep all weights together for these services

Issue 3: Sheet Count Mismatch

Current Output: 50-55 sheets
Expected Output: 60 sheets
Gap: Missing ~5-10 single-sheet services

Validation Metrics

Using validate-output.py against reference file:

Metric	Current	Target
Validation Score	0/59 (0%)	59/59 (100%)
Sheets Found	1/59 (1.7%)	59/59 (100%)
Structure Match	None	Exact match
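
To illustrate how a sheets-found figure of this kind can be computed, the sketch below compares generated sheet names with reference names. It is not the logic of validate-output.py, which is not shown here; it checks names only, and the function name and sample lists are hypothetical.

```python
def sheet_match_score(generated, reference):
    """Return (matched_count, reference_count, missing_names).

    A reference sheet counts as matched only if a generated sheet has
    exactly the same name. Sheet contents are not inspected.
    """
    generated_set = set(generated)
    missing = [name for name in reference if name not in generated_set]
    return len(reference) - len(missing), len(reference), missing


# Illustrative lists: the generated output wrongly splits a single-sheet service.
reference = [
    "17_FEDEX_STD_OVERN_2025",
    "20_FEDEX_SMARTP_SUB1_2025",
    "21_FEDEX_SMARTP_1LB_2025",
]
generated = [
    "17_FEDEX_STD_OVERN_SUB1_2025",
    "20_FEDEX_SMARTP_SUB1_2025",
]

matched, total, missing = sheet_match_score(generated, reference)
print(f"{matched}/{total} matched; missing: {missing}")
```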

Required Agent Changes

Add weight-split detection logic:
- Check if service mapping has “SUB 1LB” key
- If YES → split into 4 sheets by weight breakpoints
- If NO → create single sheet with all weights
Implement correct naming:
- Follow ##_CARRIER_SERVICE_[WEIGHT]_2025 format
- Use proper weight suffixes (SUB1, 1LB, 6LB, 10LB)
Match reference sheet order:
- Maintain numerical sequence (01, 02, 03…)
- Group by carrier, then service
Validate against reference:
- Run validate-output.py after generation
- Target: roughly 90% validation score (53+ of 59 sheets matched)
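
The changes above can be combined into one pass that expands an ordered service list into numbered sheet names. This is a sketch under stated assumptions: the input tuple shape, the function name, and the `start_index` parameter are hypothetical, and the services are assumed to arrive already grouped by carrier, then service. A start index of 2 is used in the example only because it reproduces the DHL SMPP GRO names shown earlier.

```python
SUFFIXES = ("SUB1", "1LB", "6LB", "10LB")


def plan_sheet_names(services, start_index=2, year=2025):
    """Expand (carrier, service_code, is_split) tuples into sheet names.

    Split services produce four consecutive sheets, one per weight suffix.
    Other services produce one sheet without a suffix. The numeric prefix
    increases by one per sheet, in input order.
    """
    names = []
    index = start_index
    for carrier, service_code, is_split in services:
        suffixes = SUFFIXES if is_split else (None,)
        for suffix in suffixes:
            parts = [f"{index:02d}", carrier, service_code]
            if suffix is not None:
                parts.append(suffix)
            parts.append(str(year))
            names.append("_".join(parts))
            index += 1
    return names


# Hypothetical input: one split service followed by one single-sheet service.
plan = plan_sheet_names([("DHL", "SMPP_GRO", True), ("CARRIER", "SERVICE", False)])
for name in plan:
    print(name)
```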

Success Criteria

✅ 60 total sheets (59 rate cards + 1 summary)
✅ 14 services split into 4 sheets each (56 sheets)
✅ Remaining services as single sheets (3 sheets)
✅ Validation score of roughly 90% or better (at least 53/59 sheets)
✅ Correct metadata headers per sheet
✅ Correct rate table structure per sheet