Frontmatter-Free Proposal

Problem Statement

Currently, Lattice uses YAML frontmatter as an intermediate checkpoint:

Document → Claude extracts → YAML frontmatter → lattice sync → FalkorDB

Frontmatter was introduced because Claude Code needed somewhere to write its output. However:

Users rarely review/edit the YAML before syncing
The “human review” benefit is theoretical, not practical
It adds complexity (two-step process, schema validation)
Frontmatter clutters markdown files

Proposal: Remove frontmatter as a requirement. Entities go directly to FalkorDB.

Current Architecture

┌─────────────────────────────────────────────────────────────────────────┐
│                     CURRENT: Two-Phase Flow                             │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   Phase 1: Entity Extraction (Claude Code)                              │
│   ┌─────────────┐     ┌─────────────┐     ┌─────────────────┐          │
│   │  Document   │────▶│  Claude     │────▶│  YAML           │          │
│   │  content    │     │  /entity-   │     │  frontmatter    │          │
│   │             │     │  extract    │     │  (in .md file)  │          │
│   └─────────────┘     └─────────────┘     └────────┬────────┘          │
│                                                    │                    │
│   Phase 2: Graph Sync (Lattice CLI)                │                    │
│   ┌─────────────────────────────────────────────────▼────────┐          │
│   │  lattice sync                                            │          │
│   │  - Read frontmatter from .md files                       │          │
│   │  - Validate schema                                       │          │
│   │  - Upsert to FalkorDB                                    │          │
│   │  - Generate embeddings                                   │          │
│   └──────────────────────────────────────────────────────────┘          │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

Pain points:

Two commands needed (/entity-extract then /graph-sync)
Schema validation errors require re-editing frontmatter
Frontmatter bloats markdown files
Nobody actually reviews before sync

Proposed Architecture

┌─────────────────────────────────────────────────────────────────────────┐
│                   PROPOSED: Single-Phase Flow                           │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   /graph-sync (Claude Code + Lattice CLI)                               │
│   ┌─────────────┐     ┌─────────────┐     ┌─────────────────┐          │
│   │  Document   │────▶│  Claude     │────▶│  JSON output    │          │
│   │  content    │     │  extraction │     │  (structured)   │          │
│   └─────────────┘     └─────────────┘     └────────┬────────┘          │
│                                                    │                    │
│                                                    ▼                    │
│                                           ┌─────────────────┐          │
│                                           │  Lattice CLI    │          │
│                                           │  (pipes JSON    │          │
│                                           │   to FalkorDB)  │          │
│                                           └────────┬────────┘          │
│                                                    │                    │
│                                                    ▼                    │
│                                           ┌─────────────────┐          │
│                                           │   FalkorDB      │          │
│                                           │   (source of    │          │
│                                           │    truth)       │          │
│                                           └─────────────────┘          │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

Implementation Options

Option A: Claude → JSON → CLI Pipe

Claude outputs structured JSON, CLI parses and syncs:

# New /graph-sync command workflow
# -p (print mode) runs Claude Code non-interactively so its output can be piped
claude -p "Extract entities from doc.md as JSON" | lattice ingest --stdin

Slash command:

# /graph-sync command
1. Run `lattice status` to find changed docs
2. For each doc:
   - Read content
   - Extract entities (output as JSON)
   - Pipe to `lattice ingest` command
3. Report results
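
The loop described in these steps can be sketched as a small orchestrator. This is only an illustration: `SyncDeps`, `graphSync`, and every function on the interface are hypothetical names, and how Lattice actually lists changed documents or invokes Claude is not specified here. The dependencies are injected so the control flow can be exercised without a real CLI or model call.

```typescript
// Hypothetical dependencies; none of these names come from the Lattice codebase.
interface SyncDeps {
  listChangedDocs(): Promise<string[]>; // e.g. parsed from `lattice status`
  readDoc(path: string): Promise<string>;
  extract(path: string, content: string): Promise<unknown>; // the Claude extraction step
  ingest(extraction: unknown): Promise<void>; // e.g. piping to `lattice ingest`
}

interface SyncReport {
  synced: string[];
  failed: { doc: string; error: string }[];
}

// Processes documents one at a time and records failures instead of aborting,
// so a single bad extraction does not block the remaining documents.
async function graphSync(deps: SyncDeps): Promise<SyncReport> {
  const report: SyncReport = { synced: [], failed: [] };
  for (const doc of await deps.listChangedDocs()) {
    try {
      const content = await deps.readDoc(doc);
      const extraction = await deps.extract(doc, content);
      await deps.ingest(extraction);
      report.synced.push(doc);
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      report.failed.push({ doc, error });
    }
  }
  return report;
}
```

Collecting failures into the report keeps step 3 ("Report results") meaningful even when some documents fail.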

Claude output format:

{
  "document": "docs/topic/file.md",
  "title": "Document Title",
  "summary": "2-3 sentence summary for embeddings",
  "entities": [
    {"name": "FalkorDB", "type": "Technology", "description": "..."}
  ],
  "relationships": [
    {"source": "Lattice", "relation": "USES", "target": "FalkorDB"}
  ]
}
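
Because the CLI receives this JSON from a model, it is worth checking its shape before anything touches the graph. The following is a minimal sketch of such a check, assuming exactly the fields shown above; the type names and the `validateExtraction` and `parseExtraction` helpers are hypothetical, not part of Lattice.

```typescript
// Hypothetical types mirroring the JSON shape shown above.
interface ExtractedEntity {
  name: string;
  type: string;
  description: string;
}

interface ExtractedRelationship {
  source: string;
  relation: string;
  target: string;
}

interface ExtractionPayload {
  document: string;
  title: string;
  summary: string;
  entities: ExtractedEntity[];
  relationships: ExtractedRelationship[];
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function hasStrings(v: unknown, keys: string[]): boolean {
  if (!isRecord(v)) return false;
  const rec = v;
  return keys.every((k) => typeof rec[k] === "string");
}

// Returns a list of problems; an empty list means the payload is usable.
function validateExtraction(raw: unknown): string[] {
  if (!isRecord(raw)) return ["payload must be a JSON object"];
  const errors: string[] = [];
  for (const key of ["document", "title", "summary"]) {
    if (typeof raw[key] !== "string") errors.push(`${key} must be a string`);
  }
  const entities: unknown = raw["entities"];
  if (!Array.isArray(entities)) {
    errors.push("entities must be an array");
  } else {
    entities.forEach((e: unknown, i: number) => {
      if (!hasStrings(e, ["name", "type", "description"])) {
        errors.push(`entities[${i}] needs string name, type, description`);
      }
    });
  }
  const relationships: unknown = raw["relationships"];
  if (!Array.isArray(relationships)) {
    errors.push("relationships must be an array");
  } else {
    relationships.forEach((r: unknown, i: number) => {
      if (!hasStrings(r, ["source", "relation", "target"])) {
        errors.push(`relationships[${i}] needs string source, relation, target`);
      }
    });
  }
  return errors;
}

// Parses and validates in one step, throwing a single readable error.
function parseExtraction(json: string): ExtractionPayload {
  const raw: unknown = JSON.parse(json);
  const errors = validateExtraction(raw);
  if (errors.length > 0) throw new Error(`Invalid extraction: ${errors.join("; ")}`);
  return raw as ExtractionPayload;
}
```

Returning all problems at once, rather than failing on the first, gives a re-extraction prompt something concrete to correct.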

New CLI command:

lattice ingest --stdin              # Read JSON from stdin
lattice ingest --file extract.json  # Read from file
lattice ingest --doc docs/file.md   # Extract + ingest in one step

Option B: MCP Server for Direct Graph Access

Add FalkorDB as an MCP server, let Claude write directly:

// Claude could call MCP tools directly
mcp__falkordb__upsert_node({
  label: "Technology",
  name: "FalkorDB",
  description: "Graph database"
})

mcp__falkordb__upsert_relationship({
  source: "Lattice",
  relation: "USES",
  target: "FalkorDB"
})

Pros:

Simplest architecture (Claude → Graph directly)
No intermediate files or CLI commands

Cons:

Requires MCP server setup
Less control over batching/transactions
Harder to track what was synced

Option C: Hybrid - Optional Frontmatter

Keep frontmatter as optional, add direct sync:

lattice sync                    # Current: reads frontmatter
lattice sync --extract          # New: Claude extracts + syncs directly
lattice sync --extract doc.md   # Single doc, no frontmatter needed

Workflow:

# For reviewed docs (keep frontmatter)
/entity-extract doc.md
# Edit YAML if needed
lattice sync

# For bulk/trusted docs (skip frontmatter)
lattice sync --extract docs/

Recommended Approach: Option A (JSON Pipe)

Why Option A?

| Factor | Option A (JSON) | Option B (MCP) | Option C (Hybrid) |
|---|---|---|---|
| Complexity | Low | High (MCP setup) | Medium |
| Control | High (CLI validates) | Low (direct writes) | High |
| Debugging | Easy (inspect JSON) | Hard | Medium |
| Rollback | Re-run extraction | Messy | Medium |
| Batch ops | Natural | Manual | Natural |

Implementation Plan

Phase 1: New CLI Command

Add lattice ingest command:

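// Assumes nest-commander, which the decorator shapes suggest; the SyncService
// import path is a placeholder for wherever the existing service lives.
import { readFile } from 'node:fs/promises';
import { text } from 'node:stream/consumers';
import { Command, CommandRunner, Option } from 'nest-commander';
import { SyncService } from './sync.service';

interface IngestOptions {
  stdin?: boolean;
  file?: string;
}

@Command({
  name: 'ingest',
  description: 'Ingest extracted entities from JSON'
})
export class IngestCommand extends CommandRunner {
  constructor(private readonly syncService: SyncService) {
    super();
  }

  // nest-commander attaches @Option to parser methods, not to properties.
  @Option({ flags: '--stdin', description: 'Read JSON from stdin' })
  parseStdin(): boolean {
    return true;
  }

  @Option({ flags: '--file <path>', description: 'Read JSON from file' })
  parseFile(path: string): string {
    return path;
  }

  async run(_params: string[], options: IngestOptions = {}): Promise<void> {
    let json: string;
    if (options.stdin) {
      json = await text(process.stdin); // read stdin to EOF
    } else if (options.file) {
      json = await readFile(options.file, 'utf8');
    } else {
      throw new Error('Provide either --stdin or --file <path>');
    }

    // JSON.parse throws on malformed input, which aborts the ingest.
    const data = JSON.parse(json);
    await this.syncService.ingestExtraction(data);
  }
}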
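
What `ingestExtraction` does internally is not specified in this proposal. One plausible shape, sketched below under that caveat, is to turn each entity and relationship into a parameterized Cypher `MERGE` statement for FalkorDB. All names here (`CypherStatement`, `entityStatement`, `relationshipStatement`, `buildStatements`) are hypothetical, and executing the statements against the database is deliberately left out.

```typescript
interface Entity {
  name: string;
  type: string;
  description: string;
}

interface Relationship {
  source: string;
  relation: string;
  target: string;
}

interface CypherStatement {
  query: string;
  params: Record<string, string>;
}

// Labels and relationship types cannot be passed as Cypher parameters,
// so they are checked against a strict identifier pattern before interpolation.
const IDENTIFIER = /^[A-Za-z_][A-Za-z0-9_]*$/;

function assertIdentifier(value: string, kind: string): string {
  if (!IDENTIFIER.test(value)) {
    throw new Error(`Unsafe ${kind}: ${JSON.stringify(value)}`);
  }
  return value;
}

// MERGE makes the write idempotent: re-running an extraction updates the node.
function entityStatement(e: Entity): CypherStatement {
  const label = assertIdentifier(e.type, "label");
  return {
    query: `MERGE (n:${label} {name: $name}) SET n.description = $description`,
    params: { name: e.name, description: e.description },
  };
}

// Matches both endpoints by name; if either node is missing, nothing is created.
function relationshipStatement(r: Relationship): CypherStatement {
  const relType = assertIdentifier(r.relation, "relationship type");
  return {
    query: `MATCH (a {name: $source}), (b {name: $target}) MERGE (a)-[:${relType}]->(b)`,
    params: { source: r.source, target: r.target },
  };
}

// Entities come first so relationship statements can find their endpoints.
function buildStatements(entities: Entity[], relationships: Relationship[]): CypherStatement[] {
  return [...entities.map(entityStatement), ...relationships.map(relationshipStatement)];
}
```

Keeping statement construction separate from execution makes the output easy to inspect, which fits the debugging argument made for Option A.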

Phase 2: Update Slash Commands

New /graph-sync command:

---
description: Extract entities and sync to graph (no frontmatter)
model: sonnet
---

## Process

1. Run `lattice status` to find docs needing sync
2. For each document:
   a. Read document content
   b. Extract entities as structured JSON:
      ```json
      {
        "document": "path/to/doc.md",
        "title": "...",
        "summary": "...",
        "entities": [...],
        "relationships": [...]
      }
      ```
   c. Write JSON to temp file
   d. Run `lattice ingest --file /tmp/extract-{hash}.json`
3. Report results
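
Steps 2c and 2d can be sketched as follows. The command above does not say what `{hash}` is derived from; this sketch assumes a short SHA-256 prefix of the JSON text, and the function names are hypothetical.

```typescript
import { createHash } from "node:crypto";
import { readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Assumption: {hash} is derived from the JSON text itself, so identical
// extractions map to the same file name.
function extractionHash(json: string): string {
  return createHash("sha256").update(json).digest("hex").slice(0, 12);
}

// Step 2c: writes the extraction to <dir>/extract-<hash>.json and returns the path.
function writeExtractionFile(extraction: unknown, dir: string = tmpdir()): string {
  const json = JSON.stringify(extraction, null, 2);
  const path = join(dir, `extract-${extractionHash(json)}.json`);
  writeFileSync(path, json, "utf8");
  return path;
}

// What the consumer of step 2d would do first: read the file back as JSON.
function readExtractionFile(path: string): unknown {
  return JSON.parse(readFileSync(path, "utf8"));
}
```

Using `tmpdir()` instead of a hard-coded `/tmp` keeps the sketch portable across platforms.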

Phase 3: Deprecate Frontmatter

Keep frontmatter parsing for backward compatibility
Add migration command: lattice migrate --frontmatter-to-graph
Document new workflow as preferred

Schema Changes

Current: Frontmatter in Markdown

---
entities:
  - name: FalkorDB
    type: Technology
    description: Graph database
relationships:
  - source: this
    relation: REFERENCES
    target: FalkorDB
---

Proposed: JSON Extraction Format

{
  "document": "docs/local-knowledge-graph/architecture.md",
  "contentHash": "abc123",
  "extraction": {
    "title": "Knowledge Graph Architecture",
    "summary": "Technical overview of the knowledge graph implementation...",
    "entities": [
      {
        "name": "FalkorDB",
        "type": "Technology",
        "description": "Graph database with vector search"
      }
    ],
    "relationships": [
      {
        "source": "Lattice",
        "relation": "USES",
        "target": "FalkorDB"
      }
    ]
  },
  "metadata": {
    "extractedAt": "2025-11-29T09:30:00Z",
    "extractedBy": "claude-sonnet-4"
  }
}
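
A sketch of how this envelope might be assembled around Claude's output is shown below. The hash algorithm behind `contentHash` is not specified in this proposal, so SHA-256 of the document content is an assumption, as are the `buildEnvelope` and `needsReextraction` names.

```typescript
import { createHash } from "node:crypto";

interface Extraction {
  title: string;
  summary: string;
  entities: { name: string; type: string; description: string }[];
  relationships: { source: string; relation: string; target: string }[];
}

interface ExtractionEnvelope {
  document: string;
  contentHash: string;
  extraction: Extraction;
  metadata: { extractedAt: string; extractedBy: string };
}

// Assumption: contentHash is the SHA-256 hex digest of the document text.
function hashContent(content: string): string {
  return createHash("sha256").update(content).digest("hex");
}

// The clock is injectable so the result is reproducible in tests.
function buildEnvelope(
  document: string,
  content: string,
  extraction: Extraction,
  extractedBy: string,
  now: Date = new Date(),
): ExtractionEnvelope {
  return {
    document,
    contentHash: hashContent(content),
    extraction,
    // toISOString() includes milliseconds, e.g. 2025-11-29T09:30:00.000Z.
    metadata: { extractedAt: now.toISOString(), extractedBy },
  };
}

// A document is stale when its current content no longer matches the stored hash.
function needsReextraction(envelope: ExtractionEnvelope, currentContent: string): boolean {
  return envelope.contentHash !== hashContent(currentContent);
}
```

Storing the hash alongside the extraction gives the CLI a cheap way to decide which documents changed since the last sync.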

What About Summary/Title?

Currently, the summary field in frontmatter is the text used to generate embeddings. Options:

Option 1: Extract Summary with Entities

Include in JSON extraction (recommended):

{
  "document": "...",
  "title": "Extracted from # heading or first line",
  "summary": "AI-generated summary for embeddings",
  "entities": [...]
}

Option 2: Minimal Frontmatter

Keep only non-entity fields:

---
created: 2025-11-29
updated: 2025-11-29
status: complete
topic: local-knowledge-graph
# No entities or relationships
---

Option 3: No Frontmatter at All

Derive everything from content:

Title: First # heading
Created/Updated: Git history
Topic: Directory name
Summary: AI-generated during extraction
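
The title and topic rules above are simple enough to sketch directly. The helper names are hypothetical, paths are assumed to use `/` separators, and the Git-derived dates and AI-generated summary are left out because they need external tools.

```typescript
// Returns the text of the first level-1 heading, ignoring fenced code blocks
// so that shell comments such as "# install" are not mistaken for a title.
function deriveTitle(markdown: string): string | null {
  let inFence = false;
  for (const line of markdown.split(/\r?\n/)) {
    if (/^\s*(```|~~~)/.test(line)) {
      inFence = !inFence;
      continue;
    }
    if (inFence) continue;
    const match = /^#\s+(.+?)\s*$/.exec(line);
    if (match) return match[1] ?? null;
  }
  return null;
}

// Returns the name of the directory that directly contains the document,
// or null when the path has no parent directory.
function deriveTopic(docPath: string): string | null {
  const parts = docPath.split("/").filter((p) => p.length > 0);
  if (parts.length < 2) return null;
  return parts[parts.length - 2] ?? null;
}
```

For the example path `docs/local-knowledge-graph/architecture.md`, `deriveTopic` yields `local-knowledge-graph`, which matches the `topic` value in the minimal frontmatter of Option 2.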

Migration Path

For Existing Users

# 1. Export entities from frontmatter to graph
lattice migrate --frontmatter-to-graph

# 2. Optionally strip entities from frontmatter
lattice migrate --strip-frontmatter-entities

# 3. Use new workflow going forward
/graph-sync  # Extracts and syncs directly
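
Step 2 could be implemented roughly as below. This is a line-based sketch rather than a real YAML parser: it assumes the frontmatter is the first block in the file and that `entities` and `relationships` are top-level keys whose children are indented or start with `-`. The function name is hypothetical.

```typescript
// Removes the given top-level keys (and their nested lines) from a leading
// YAML frontmatter block, leaving the rest of the file untouched.
function stripFrontmatterKeys(
  markdown: string,
  keys: string[] = ["entities", "relationships"],
): string {
  const lines = markdown.split("\n");
  if (lines[0]?.trim() !== "---") return markdown; // no frontmatter
  const end = lines.findIndex((line, i) => i > 0 && line.trim() === "---");
  if (end === -1) return markdown; // unterminated block: leave as-is

  const kept: string[] = [];
  let skipping = false;
  for (const line of lines.slice(1, end)) {
    // A new top-level key decides whether the following lines are dropped.
    const topLevel = /^([A-Za-z_][\w-]*):/.exec(line);
    if (topLevel) skipping = keys.includes(topLevel[1] ?? "");
    if (!skipping) kept.push(line);
  }

  const body = lines.slice(end + 1);
  // Drop the frontmatter block entirely when nothing is left in it.
  const hasContent = kept.some((l) => l.trim() !== "");
  const front = hasContent ? ["---", ...kept, "---"] : [];
  return [...front, ...body].join("\n");
}
```

Fields such as `created` or `status` survive, so the result lines up with the minimal frontmatter described in Option 2.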

Backward Compatibility

lattice sync continues to work with frontmatter
New lattice ingest for JSON input
/entity-extract deprecated but still works

Trade-offs

What We Lose

| Lost Capability | Impact | Mitigation |
|---|---|---|
| Edit before sync | Low (rarely used) | Re-extract if wrong |
| Git history of entities | Medium | Graph has its own history |
| Audit trail in files | Low | Use `lattice history` command |
| Offline entity inspection | Medium | `lattice export` to JSON |

What We Gain

| New Capability | Impact |
|---|---|
| Single command | High - simpler workflow |
| Cleaner markdown | Medium - no YAML bloat |
| Faster sync | Medium - no file I/O |
| Graph as truth | High - single source |

Open Questions

Re-extraction on changes: If document content changes, should we re-extract automatically or require an explicit command?
Conflict resolution: If an entity already exists in the graph but the document has changed, should we update it or warn?
Bulk operations: How do we handle 100+ docs efficiently? Parallel Claude calls?
Embedding generation: Should this still happen in the CLI during ingest, or move to the extraction phase?
Validation: Where does schema validation happen, in the CLI or in Claude?
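
For the bulk-operations question, one candidate answer is to run extractions in parallel with a fixed concurrency cap. The helper below is a generic sketch of that idea, not a decision: `mapWithConcurrency` is a hypothetical name, and a sensible limit would depend on rate limits that this proposal does not cover.

```typescript
// Runs `worker` over `items` with at most `limit` calls in flight,
// returning results in the same order as the input.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new Error("limit must be a positive integer");
  }
  const results = new Array<R>(items.length);
  let next = 0;

  // Each lane pulls the next unclaimed index until the list is exhausted.
  async function runLane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T, index);
    }
  }

  const lanes = Array.from({ length: Math.min(limit, items.length) }, () => runLane());
  await Promise.all(lanes);
  return results;
}
```

As written, the first rejected call rejects the whole batch; a production version would more likely collect per-document failures and continue.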

Next Steps

Implement lattice ingest command
Create new /graph-sync slash command
Test with 10-20 documents
Add migration tooling
Update documentation
Define a deprecation timeline for the frontmatter workflow

Conclusion

The frontmatter-free architecture simplifies Lattice by:

Removing the intermediate checkpoint that users don’t actually use
Making the graph the source of truth for entities
Enabling single-command workflow (/graph-sync does everything)
Keeping markdown files clean of entity metadata

The key insight: frontmatter was an implementation detail for Claude Code output, not a user feature. By using JSON as the transport format, we preserve Claude Code integration while eliminating unnecessary complexity.

Related Documents

Architecture - Current Lattice architecture
Implementation Comparison - Comparison with LightRAG, GraphRAG