# Implementation Comparison
A comprehensive comparison of our custom @research/graph solution against alternative knowledge graph frameworks and tools.
## Overview
Building a knowledge graph from documents typically involves:
- Entity Extraction - Identify entities and relationships from text
- Graph Storage - Store in a graph database
- Semantic Search - Enable vector-based similarity search
- Query Interface - Natural language or Cypher queries
Different solutions make different trade-offs across these components.
## Our Implementation: @research/graph

### Architecture
```
                     @research/graph Stack

┌──────────────┐     ┌──────────────┐     ┌────────────────┐
│   Markdown   │────▶│    Claude    │────▶│      YAML      │
│  Documents   │     │   /entity-   │     │  Frontmatter   │
│   ./docs/    │     │   extract    │     │   (entities)   │
└──────────────┘     └──────────────┘     └───────┬────────┘
                                                  │
┌─────────────────────────────────────────────────▼────────┐
│                       Sync Service                       │
│  - Manifest-based change detection                       │
│  - Entity deduplication in code                          │
│  - Incremental updates (add/update/delete)               │
└─────────────────────────────────────────────────┬────────┘
                                                  │
┌─────────────────────────────────────────────────▼────────┐
│                     FalkorDB (Docker)                    │
│  - Property graph (Cypher)                               │
│  - Vector indices (HNSW)                                 │
│  - Cross-entity semantic search                          │
└─────────────────────────────────────────────────┬────────┘
                                                  │
                         Voyage AI Embeddings ◀───┘
```

### Key Design Decisions
| Decision | Choice | Rationale |
|---|---|---|
| Language | TypeScript/Bun | Match existing tooling, fast execution |
| Framework | NestJS | Dependency injection, modular architecture |
| Entity extraction | Human-initiated, AI-powered | Control + transparency over results |
| Schema storage | YAML frontmatter | Version-controlled, auditable, human-readable |
| Graph database | FalkorDB | Free, fast, built-in vector search |
| Embeddings | Voyage AI | High quality, cost-effective |
| Sync strategy | Incremental | Only process changed files |
### Features

- 41 TypeScript files in `packages/graph/`
- 10 CLI commands: sync, search, stats, validate, migrate, etc.
- Entity types: Topic, Technology, Concept, Tool, Process, Person, Organization
- Relationship types: REFERENCES (simplified for validation)
- Cross-entity semantic search: searches all node types with unified results
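The entity and relationship taxonomy above can be sketched as TypeScript types. The type names come from the feature list; the interface shapes are illustrative assumptions, not the package's actual definitions:

```typescript
// Node taxonomy from the feature list above.
type EntityType =
  | "Topic" | "Technology" | "Concept" | "Tool"
  | "Process" | "Person" | "Organization";

interface Entity {
  name: string;
  type: EntityType;
}

// Relationships are simplified to a single REFERENCES kind for validation.
interface Relationship {
  from: string; // source entity name
  to: string;   // target entity name
  type: "REFERENCES";
}

const falkor: Entity = { name: "FalkorDB", type: "Technology" };
const edge: Relationship = { from: "FalkorDB", to: "HNSW", type: "REFERENCES" };
```

Collapsing all edge kinds into one `REFERENCES` type makes validation trivial: any edge whose `type` isn't the single allowed literal is rejected at compile time.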
### Custom Slash Commands (Automation)
We’ve built custom Claude Code slash commands to streamline the workflow:
| Command | Purpose | Model |
|---|---|---|
| `/research [topic]` | Smart research: searches existing docs first, asks before new | Sonnet |
| `/entity-extract [file]` | Extract entities/relationships from a document | Haiku |
| `/graph-sync` | Batch extract + sync modified docs to graph | Sonnet |
| `/update-related [file]` | Update related documents after changes | - |
| `/research-readme [topic]` | Generate README for existing topic | - |
Workflow automation:

```
/research "topic"          # 1. Search existing docs first
        ↓
Creates/updates docs       # 2. Write research if needed
        ↓
/graph-sync                # 3. Extract entities + sync
        ↓
Knowledge graph updated    # 4. Semantic search available
```

Parallel execution: `/graph-sync` uses Task subagents to extract entities from multiple documents in parallel, significantly speeding up batch operations.
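The parallel fan-out described above can be sketched with `Promise.all`; `extractEntities` here is a hypothetical stand-in for the real Task-subagent invocation:

```typescript
// Fan extraction out across documents concurrently rather than one at a time.
// extractEntities is a hypothetical stand-in for the Task-subagent call.
async function extractAll<T>(
  paths: string[],
  extractEntities: (path: string) => Promise<T>,
): Promise<T[]> {
  // Promise.all runs every extraction concurrently and preserves input order.
  return Promise.all(paths.map((p) => extractEntities(p)));
}

// Usage with a fake extractor:
extractAll(["a.md", "b.md"], async (p) => `entities for ${p}`).then((results) => {
  console.log(results); // ["entities for a.md", "entities for b.md"]
});
```

Because `Promise.all` preserves input order, results can be zipped back to their source paths without extra bookkeeping.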
## Alternative Solutions

### 1. LightRAG
What it is: A lightweight alternative to Microsoft’s GraphRAG, optimized for speed and cost.
Architecture: Graph-enhanced text indexing with dual-level retrieval (low-level entities + high-level themes).
| Aspect | Details |
|---|---|
| Language | Python |
| Performance | ~20-30ms faster response, “10x faster queries” |
| Cost | Claims 90% lower than GraphRAG |
| Entity extraction | Automatic (LLM-based) |
| Incremental updates | Union new docs into existing graph (~50% faster) |
| Accuracy | 10% less relational fidelity vs GraphRAG |
Best for: Teams prioritizing speed and cost over deep relationship analysis.
GitHub: HKUDS/LightRAG
### 2. Microsoft GraphRAG
What it is: Microsoft’s production-grade framework with community hierarchy extraction.
Architecture: Builds community summaries for global queries via Leiden algorithm clustering.
| Aspect | Details |
|---|---|
| Language | Python |
| Performance | Slower than LightRAG, more thorough |
| Cost | Higher due to community restructuring on updates |
| Entity extraction | Automatic (LLM-based) |
| Incremental updates | Expensive (requires community rebuild) |
| Features | Global summarization, hierarchical communities |
Best for: Enterprises needing comprehensive knowledge synthesis.
GitHub: microsoft/graphrag
### 3. FalkorDB GraphRAG-SDK
What it is: Official SDK from FalkorDB for quick knowledge graph construction.
Architecture: Auto-ontology detection with LiteLLM integration.
| Aspect | Details |
|---|---|
| Language | Python |
| Setup | ~20 lines of code |
| Entity extraction | Automatic with ontology detection |
| Graph database | FalkorDB (same as ours) |
| LLM support | OpenAI, Anthropic, Google, Ollama |
| Multi-agent | Built-in orchestration |
Best for: Quick prototypes, Python teams.
Why we ejected: SDK bugs, limited customization, and our need for TypeScript.
GitHub: FalkorDB/GraphRAG-SDK
### 4. LlamaIndex Property Graph
What it is: Modular framework for building knowledge graphs with multiple extraction strategies.
Architecture: Schema-guided, free-form, or implicit extraction methods.
| Aspect | Details |
|---|---|
| Language | Python |
| Extraction modes | Schema-guided, free-form, implicit |
| Entity deduplication | Text embeddings + word distance |
| Graph stores | In-memory, disk, Neo4j |
| Customization | High (modular extractors/retrievers) |
Best for: Teams needing fine-grained control over extraction.
Docs: LlamaIndex Property Graph
### 5. nano-graphrag
What it is: Minimalist alternative with clean, readable code.
Architecture: Essential GraphRAG functionality without overhead.
| Aspect | Details |
|---|---|
| Language | Python |
| Complexity | Very low |
| Query modes | Naive, Local, Global |
| Code quality | Clean, readable, maintainable |
Best for: Learning GraphRAG concepts, simple use cases.
GitHub: gusye1234/nano-graphrag
### 6. LangChain Knowledge Graph RAG
What it is: Integrations for constructing and querying knowledge graphs within LangChain.
| Aspect | Details |
|---|---|
| Language | Python |
| Integration | Works with existing LangChain chains |
| Hybrid retrieval | Vector + graph re-ranking |
| Graph stores | Neo4j, FalkorDB, others |
Best for: Teams already using LangChain.
## Feature Comparison Matrix

| Feature | @research/graph | LightRAG | GraphRAG | GraphRAG-SDK | LlamaIndex |
|---|---|---|---|---|---|
| Language | TypeScript | Python | Python | Python | Python |
| Extraction trigger | Human-initiated | Pipeline-auto | Pipeline-auto | Pipeline-auto | Pipeline-auto |
| Extraction engine | Claude (AI) | LLM API | LLM API | LLM API | LLM API |
| Schema control | Full (YAML) | Limited | Limited | Moderate | Schema-guided |
| Output transparency | Editable YAML | Internal | Internal | Internal | Configurable |
| Incremental sync | Yes | Yes | Expensive | Varies | Varies |
| Graph DB | FalkorDB | Multiple | Neo4j | FalkorDB | Multiple |
| Vector search | Yes | Yes | Yes | Yes | Yes |
| Cross-entity search | Yes | Yes | Limited | Yes | Yes |
| Version control | Full (YAML) | Code | Code | Code | Code |
| Human review | Built-in | No | No | No | No |
| Setup complexity | Medium | Low | High | Low | Medium |
| Customization | High | Medium | High | Low | High |
## Performance Benchmarks

### From Public Research
| Metric | LightRAG | GraphRAG | Notes |
|---|---|---|---|
| Response time | ~20-30ms faster | Baseline | Per-query |
| Query cost | 90% lower | Baseline | API calls |
| Update cost | ~50% lower | Baseline | Incremental |
| Relational accuracy | 90% | 100% | Complex relationships |
### Our Implementation (Estimated)
| Metric | Value | Notes |
|---|---|---|
| Sync time | ~1-2s for 130 docs | Incremental only |
| Embedding cost | ~$0.02 per full sync | Voyage AI |
| Search latency | <100ms | Cross-entity semantic |
| Entity extraction | Manual | Higher accuracy, slower |
## When to Use What

### Use @research/graph When:
- ✅ You need TypeScript/Bun ecosystem
- ✅ Human review of entities is important
- ✅ Version-controlled schema is required
- ✅ You want full control over the implementation
- ✅ Your team already knows NestJS/TypeScript
### Use LightRAG When:
- ✅ Speed and cost are primary concerns
- ✅ You’re comfortable with Python
- ✅ You need automatic entity extraction
- ✅ Complex relationship fidelity isn’t critical
### Use Microsoft GraphRAG When:
- ✅ You need global summarization across documents
- ✅ Community hierarchy analysis is valuable
- ✅ You have budget for higher compute costs
- ✅ Enterprise-grade features are required
### Use FalkorDB GraphRAG-SDK When:
- ✅ You want quick prototype in Python
- ✅ You’re already using FalkorDB
- ✅ Auto-ontology detection is sufficient
- ✅ You don’t need deep customization
### Use LlamaIndex When:
- ✅ You need schema-guided extraction
- ✅ You want modular, swappable components
- ✅ You’re building a larger LlamaIndex application
- ✅ Entity disambiguation is critical
## Our Unique Advantages

### 1. TypeScript Ecosystem
Most knowledge graph tools are Python-only. Our solution:
- Runs on Bun (faster than Node.js)
- Uses NestJS (enterprise patterns)
- Integrates with existing TypeScript tooling
### 2. Human-Initiated, AI-Powered Extraction
Both approaches use AI for extraction. The difference is control:
| Aspect | Our Approach | Pipeline-Auto |
|---|---|---|
| Trigger | /entity-extract command | Automatic on ingestion |
| Engine | Claude (Sonnet/Haiku) | SDK’s chosen LLM |
| Output | YAML in markdown (visible) | Internal graph (opaque) |
| Review | Edit before sync | Trust the pipeline |
| Quality | High (Claude + review) | Varies by SDK |
The extraction is AI-powered (Claude does the work), but human-initiated (you decide when) with transparent output (editable YAML).
### 3. YAML Frontmatter Schema
Entities and relationships live in markdown:
- Version-controlled with documents
- Human-readable and editable
- Auditable changes over time
- Can be corrected before hitting the graph
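As a sketch, the frontmatter might look like the fragment below. The field names (`entities`, `relationships`) are illustrative assumptions, not the package's actual schema; the entity and relationship types are the ones listed under Features:

```yaml
---
# Hypothetical frontmatter shape; field names are assumptions.
entities:
  - name: FalkorDB
    type: Technology
  - name: HNSW
    type: Concept
relationships:
  - from: FalkorDB
    to: HNSW
    type: REFERENCES
---
```

Because this block lives at the top of the markdown file, a reviewer can correct an entity name or delete a spurious relationship in an ordinary diff before the sync runs.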
### 4. Incremental Sync with Manifest
Our manifest-based change detection:
- Content hash + frontmatter hash tracking
- Only syncs changed documents
- ~1-2s sync time for 130+ docs
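A minimal sketch of the hash-based classification, assuming a manifest keyed by file path; the helper names (`splitDoc`, `classify`) and the naive frontmatter regex are invented for illustration:

```typescript
import { createHash } from "node:crypto";

interface ManifestEntry {
  contentHash: string;
  frontmatterHash: string;
}
type Manifest = Record<string, ManifestEntry>;

const sha256 = (s: string) => createHash("sha256").update(s).digest("hex");

// Naive frontmatter split: assumes a leading "---\n...\n---" fence.
function splitDoc(doc: string): { frontmatter: string; body: string } {
  const m = doc.match(/^---\n([\s\S]*?)\n---\n?([\s\S]*)$/);
  return m ? { frontmatter: m[1], body: m[2] } : { frontmatter: "", body: doc };
}

type Change = "added" | "updated" | "unchanged";

// Compare both hashes against the manifest; only "added"/"updated" docs sync.
function classify(path: string, doc: string, manifest: Manifest): Change {
  const { frontmatter, body } = splitDoc(doc);
  const entry = manifest[path];
  if (!entry) return "added";
  return entry.contentHash === sha256(body) &&
    entry.frontmatterHash === sha256(frontmatter)
    ? "unchanged"
    : "updated";
}
```

Hashing content and frontmatter separately means an edit to the prose and an edit to the extracted entities are both caught, while untouched files cost only two hash comparisons.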
### 5. Cross-Entity Semantic Search
Unified search across all node types:
- Documents, Concepts, Tools, etc.
- Distance normalization for fair ranking
- Single query returns mixed results
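The fair-ranking idea can be sketched as min-max normalizing vector distances within each node type before merging; this is one plausible normalization, not necessarily the formula the package uses:

```typescript
interface Hit {
  id: string;
  label: string;    // node type: Document, Concept, Tool, ...
  distance: number; // raw vector distance from the query embedding
}

// Normalize distances within each per-label result group, then merge into a
// single ranked list so no node type dominates purely by distance scale.
function mergeRanked(groups: Hit[][]): Hit[] {
  const normalized = groups.flatMap((group) => {
    if (group.length === 0) return [];
    const ds = group.map((h) => h.distance);
    const min = Math.min(...ds);
    const span = Math.max(...ds) - min || 1; // avoid divide-by-zero
    return group.map((h) => ({ ...h, distance: (h.distance - min) / span }));
  });
  return normalized.sort((a, b) => a.distance - b.distance);
}
```

Without per-label normalization, a node type whose embeddings happen to sit closer to the query space would always crowd out the others in a merged result list.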
## Potential Improvements

### From Alternative Solutions
| Feature | From | Potential Value |
|---|---|---|
| Pipeline-auto option | LightRAG/GraphRAG | Faster for bulk/low-stakes docs |
| Community hierarchy | GraphRAG | Better global summarization |
| Dual-level retrieval | LightRAG | Low-level + high-level queries |
| Entity disambiguation | LlamaIndex | Reduce duplicates |
### Identified Gaps
- No pipeline-auto mode - Could add optional auto-extraction on file save
- Single graph DB - FalkorDB only (could add adapters)
- No global summarization - Per-document only
- Limited visualization - FalkorDB UI is basic
## Conclusion
Our @research/graph implementation occupies a unique position:
| Dimension | Position |
|---|---|
| Control | Maximum (human-initiated, custom code) |
| Transparency | High (YAML output, editable) |
| Quality | High (Claude + human review option) |
| Cost | Low (self-hosted FalkorDB) |
| Ecosystem | TypeScript (rare in this space) |
The key insight: all solutions use AI for extraction. The difference is workflow control:
- Pipeline-auto (LightRAG, GraphRAG): Extract automatically, trust the output
- Human-initiated (ours): Extract on command, review before sync
For research documentation where quality matters, having transparent, editable output is valuable. The TypeScript stack provides integration benefits with other tooling.
Recommendation: Continue with current implementation. Consider adding optional pipeline-auto mode for bulk operations where review isn’t needed.