A comprehensive comparison of our custom @research/graph solution against alternative knowledge graph frameworks and tools.

Overview

Building a knowledge graph from documents typically involves:

  1. Entity Extraction - Identify entities and relationships from text
  2. Graph Storage - Store in a graph database
  3. Semantic Search - Enable vector-based similarity search
  4. Query Interface - Natural language or Cypher queries

Different solutions make different trade-offs across these components.
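
The four steps above can be sketched as a small typed pipeline. A minimal TypeScript illustration — all names are hypothetical stand-ins, not the actual @research/graph API, and the extractor is a toy (in practice an LLM does this step):

```typescript
// Hypothetical types for the pipeline: extract -> store -> index -> query.
interface Entity {
  name: string;
  type: string;
}

// 1. Entity extraction (toy stand-in: treats capitalized words as entities).
function extractEntities(text: string): Entity[] {
  const words = text.match(/\b[A-Z][A-Za-z]+\b/g) ?? [];
  return [...new Set(words)].map((name) => ({ name, type: "Concept" }));
}

// 2. Graph storage (in-memory stand-in for a graph database).
class InMemoryGraph {
  nodes = new Map<string, Entity>();
  upsert(entities: Entity[]): void {
    for (const e of entities) this.nodes.set(e.name, e);
  }
}

// 3./4. Semantic search and a query interface would sit on top of the store.
const graph = new InMemoryGraph();
graph.upsert(extractEntities("FalkorDB stores entities; Claude extracts them."));
console.log(graph.nodes.size); // prints 2
```

The real trade-offs between tools show up in how steps 1 and 2 are wired together: who triggers extraction, and whether the output is reviewable before it reaches the store.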


Our Implementation: @research/graph

Architecture

┌─────────────────────────────────────────────────────────────────┐
│ @research/graph Stack │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌────────────────┐ │
│ │ Markdown │────▶│ Claude │────▶│ YAML │ │
│ │ Documents │ │ /entity- │ │ Frontmatter │ │
│ │ ./docs/ │ │ extract │ │ (entities) │ │
│ └──────────────┘ └──────────────┘ └───────┬────────┘ │
│ │ │
│ ┌──────────────────────────────────────────────────▼────────┐ │
│ │ Sync Service │ │
│ │ - Manifest-based change detection │ │
│ │ - Entity deduplication in code │ │
│ │ - Incremental updates (add/update/delete) │ │
│ └──────────────────────────────────────────────────┬────────┘ │
│ │ │
│ ┌──────────────────────────────────────────────────▼────────┐ │
│ │ FalkorDB (Docker) │ │
│ │ - Property graph (Cypher) │ │
│ │ - Vector indices (HNSW) │ │
│ │ - Cross-entity semantic search │ │
│ └──────────────────────────────────────────────────┬────────┘ │
│ │ │
│ Voyage AI Embeddings ◀┘ │
│ │
└─────────────────────────────────────────────────────────────────┘

Key Design Decisions

| Decision | Choice | Rationale |
|---|---|---|
| Language | TypeScript/Bun | Match existing tooling, fast execution |
| Framework | NestJS | Dependency injection, modular architecture |
| Entity extraction | Human-initiated, AI-powered | Control + transparency over results |
| Schema storage | YAML frontmatter | Version-controlled, auditable, human-readable |
| Graph database | FalkorDB | Free, fast, built-in vector search |
| Embeddings | Voyage AI | High quality, cost-effective |
| Sync strategy | Incremental | Only process changed files |

Features

  • 41 TypeScript files in packages/graph/
  • 10 CLI commands: sync, search, stats, validate, migrate, etc.
  • Entity types: Topic, Technology, Concept, Tool, Process, Person, Organization
  • Relationship types: REFERENCES (simplified for validation)
  • Cross-entity semantic search: Searches all node types with unified results

Custom Slash Commands (Automation)

We’ve built custom Claude Code slash commands to streamline the workflow:

| Command | Purpose | Model |
|---|---|---|
| /research [topic] | Smart research: searches existing docs first, asks before creating new | Sonnet |
| /entity-extract [file] | Extract entities/relationships from a document | Haiku |
| /graph-sync | Batch extract + sync modified docs to graph | Sonnet |
| /update-related [file] | Update related documents after changes | - |
| /research-readme [topic] | Generate README for existing topic | - |

Workflow automation:

/research "topic"           # 1. Search existing docs first
(create/update docs)        # 2. Write research if needed
/graph-sync                 # 3. Extract entities + sync
(knowledge graph updated)   # 4. Semantic search available

Parallel execution: /graph-sync uses Task subagents to extract entities from multiple documents in parallel, significantly speeding up batch operations.


Alternative Solutions

1. LightRAG

What it is: A lightweight alternative to Microsoft’s GraphRAG, optimized for speed and cost.

Architecture: Graph-enhanced text indexing with dual-level retrieval (low-level entities + high-level themes).

| Aspect | Details |
|---|---|
| Language | Python |
| Performance | ~20-30ms faster response, "10x faster queries" |
| Cost | Claims 90% lower than GraphRAG |
| Entity extraction | Automatic (LLM-based) |
| Incremental updates | Union new docs into existing graph (~50% faster) |
| Accuracy | ~10% less relational fidelity vs GraphRAG |

Best for: Teams prioritizing speed and cost over deep relationship analysis.

GitHub: HKUDS/LightRAG

2. Microsoft GraphRAG

What it is: Microsoft’s production-grade framework with community hierarchy extraction.

Architecture: Builds community summaries for global queries via Leiden algorithm clustering.

| Aspect | Details |
|---|---|
| Language | Python |
| Performance | Slower than LightRAG, more thorough |
| Cost | Higher due to community restructuring on updates |
| Entity extraction | Automatic (LLM-based) |
| Incremental updates | Expensive (requires community rebuild) |
| Features | Global summarization, hierarchical communities |

Best for: Enterprises needing comprehensive knowledge synthesis.

GitHub: microsoft/graphrag

3. FalkorDB GraphRAG-SDK

What it is: Official SDK from FalkorDB for quick knowledge graph construction.

Architecture: Auto-ontology detection with LiteLLM integration.

| Aspect | Details |
|---|---|
| Language | Python |
| Setup | ~20 lines of code |
| Entity extraction | Automatic with ontology detection |
| Graph database | FalkorDB (same as ours) |
| LLM support | OpenAI, Anthropic, Google, Ollama |
| Multi-agent | Built-in orchestration |

Best for: Quick prototypes, Python teams.

Why we ejected: SDK bugs, limited customization, needed TypeScript.

GitHub: FalkorDB/GraphRAG-SDK

4. LlamaIndex Property Graph

What it is: Modular framework for building knowledge graphs with multiple extraction strategies.

Architecture: Schema-guided, free-form, or implicit extraction methods.

| Aspect | Details |
|---|---|
| Language | Python |
| Extraction modes | Schema-guided, free-form, implicit |
| Entity deduplication | Text embeddings + word distance |
| Graph stores | In-memory, disk, Neo4j |
| Customization | High (modular extractors/retrievers) |

Best for: Teams needing fine-grained control over extraction.

Docs: LlamaIndex Property Graph

5. nano-graphrag

What it is: Minimalist alternative with clean, readable code.

Architecture: Essential GraphRAG functionality without overhead.

| Aspect | Details |
|---|---|
| Language | Python |
| Complexity | Very low |
| Query modes | Naive, Local, Global |
| Code quality | Clean, readable, maintainable |

Best for: Learning GraphRAG concepts, simple use cases.

GitHub: gusye1234/nano-graphrag

6. LangChain Knowledge Graph RAG

What it is: Integrations for constructing and querying knowledge graphs within LangChain.

| Aspect | Details |
|---|---|
| Language | Python |
| Integration | Works with existing LangChain chains |
| Hybrid retrieval | Vector + graph re-ranking |
| Graph stores | Neo4j, FalkorDB, others |

Best for: Teams already using LangChain.


Feature Comparison Matrix

| Feature | @research/graph | LightRAG | GraphRAG | FalkorDB SDK | LlamaIndex |
|---|---|---|---|---|---|
| Language | TypeScript | Python | Python | Python | Python |
| Extraction trigger | Human-initiated | Pipeline-auto | Pipeline-auto | Pipeline-auto | Pipeline-auto |
| Extraction engine | Claude (AI) | LLM API | LLM API | LLM API | LLM API |
| Schema control | Full (YAML) | Limited | Limited | Moderate | Schema-guided |
| Output transparency | Editable YAML | Internal | Internal | Internal | Configurable |
| Incremental sync | Yes | Yes | Expensive | Varies | Varies |
| Graph DB | FalkorDB | Multiple | Neo4j | FalkorDB | Multiple |
| Vector search | Yes | Yes | Yes | Yes | Yes |
| Cross-entity search | Yes | Yes | Limited | Yes | Yes |
| Version control | Full (YAML) | Code | Code | Code | Code |
| Human review | Built-in | No | No | No | No |
| Setup complexity | Medium | Low | High | Low | Medium |
| Customization | High | Medium | High | Low | High |

Performance Benchmarks

From Public Research

| Metric | LightRAG | GraphRAG | Notes |
|---|---|---|---|
| Response time | ~20-30ms faster | Baseline | Per-query |
| Query cost | 90% lower | Baseline | API calls |
| Update cost | ~50% lower | Baseline | Incremental |
| Relational accuracy | 90% | 100% | Complex relationships |

Our Implementation (Estimated)

| Metric | Value | Notes |
|---|---|---|
| Sync time | ~1-2s for 130 docs | Incremental only |
| Embedding cost | ~$0.02 per full sync | Voyage AI |
| Search latency | <100ms | Cross-entity semantic |
| Entity extraction | Manual | Higher accuracy, slower |

When to Use What

Use @research/graph When:

  • ✅ You need TypeScript/Bun ecosystem
  • ✅ Human review of entities is important
  • ✅ Version-controlled schema is required
  • ✅ You want full control over the implementation
  • ✅ Your team already knows NestJS/TypeScript

Use LightRAG When:

  • ✅ Speed and cost are primary concerns
  • ✅ You’re comfortable with Python
  • ✅ You need automatic entity extraction
  • ✅ Complex relationship fidelity isn’t critical

Use Microsoft GraphRAG When:

  • ✅ You need global summarization across documents
  • ✅ Community hierarchy analysis is valuable
  • ✅ You have budget for higher compute costs
  • ✅ Enterprise-grade features are required

Use FalkorDB GraphRAG-SDK When:

  • ✅ You want quick prototype in Python
  • ✅ You’re already using FalkorDB
  • ✅ Auto-ontology detection is sufficient
  • ✅ You don’t need deep customization

Use LlamaIndex When:

  • ✅ You need schema-guided extraction
  • ✅ You want modular, swappable components
  • ✅ You’re building a larger LlamaIndex application
  • ✅ Entity disambiguation is critical

Our Unique Advantages

1. TypeScript Ecosystem

Most knowledge graph tools are Python-only. Our solution:

  • Runs on Bun (generally faster startup and script execution than Node.js)
  • Uses NestJS (enterprise patterns)
  • Integrates with existing TypeScript tooling

2. Human-Initiated, AI-Powered Extraction

Both approaches use AI for extraction. The difference is control:

| Aspect | Our Approach | Pipeline-Auto |
|---|---|---|
| Trigger | /entity-extract command | Automatic on ingestion |
| Engine | Claude (Sonnet/Haiku) | SDK's chosen LLM |
| Output | YAML in markdown (visible) | Internal graph (opaque) |
| Review | Edit before sync | Trust the pipeline |
| Quality | High (Claude + review) | Varies by SDK |

The extraction is AI-powered (Claude does the work), but human-initiated (you decide when) with transparent output (editable YAML).

3. YAML Frontmatter Schema

Entities and relationships live in markdown:

  • Version-controlled with documents
  • Human-readable and editable
  • Auditable changes over time
  • Can be corrected before hitting the graph
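
For illustration, a document's frontmatter might look like this (field names are hypothetical, not the exact @research/graph schema):

```yaml
---
title: FalkorDB Vector Search
entities:
  - name: FalkorDB
    type: Technology
  - name: HNSW
    type: Concept
relationships:
  - from: FalkorDB
    to: HNSW
    kind: REFERENCES
---
```

Because this block lives in the markdown file itself, a reviewer can fix a mis-typed entity in a normal code review before the sync service ever writes it to the graph.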

4. Incremental Sync with Manifest

Our manifest-based change detection:

  • Content hash + frontmatter hash tracking
  • Only syncs changed documents
  • ~1-2s sync time for 130+ docs
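
The change-detection step can be sketched as follows — a minimal TypeScript version, assuming a manifest that maps file paths to content and frontmatter hashes (the real implementation's field names may differ):

```typescript
import { createHash } from "node:crypto";

interface ManifestEntry {
  contentHash: string;
  frontmatterHash: string;
}
type Manifest = Record<string, ManifestEntry>;

const sha256 = (s: string): string =>
  createHash("sha256").update(s).digest("hex");

// Classify each document as added, updated, or deleted by comparing
// current hashes against the manifest saved by the previous sync.
function diffManifest(
  prev: Manifest,
  docs: Record<string, { content: string; frontmatter: string }>,
): { added: string[]; updated: string[]; deleted: string[] } {
  const added: string[] = [];
  const updated: string[] = [];
  for (const [path, doc] of Object.entries(docs)) {
    const entry = prev[path];
    if (!entry) {
      added.push(path);
      continue;
    }
    if (
      entry.contentHash !== sha256(doc.content) ||
      entry.frontmatterHash !== sha256(doc.frontmatter)
    ) {
      updated.push(path);
    }
  }
  const deleted = Object.keys(prev).filter((p) => !(p in docs));
  return { added, updated, deleted };
}
```

Only files in `added` and `updated` need re-embedding and graph writes; `deleted` entries trigger node removal. Hashing frontmatter separately means an entity edit is picked up even when the document body is unchanged.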

5. Cross-Entity Semantic Search

Unified search across all node types:

  • Documents, Concepts, Tools, etc.
  • Distance normalization for fair ranking
  • Single query returns mixed results
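
Distance normalization can be sketched like this — a simplified TypeScript illustration of merging per-label vector results into one ranked list (the actual ranking logic may differ):

```typescript
interface Hit {
  label: string; // node type, e.g. "Document" or "Concept"
  name: string;
  distance: number; // raw vector distance from that label's index
}

// Raw distances from different vector indices aren't directly comparable,
// so min-max normalize within each label before merging and sorting.
function mergeRanked(resultsByLabel: Record<string, Hit[]>): Hit[] {
  const normalized: Hit[] = [];
  for (const hits of Object.values(resultsByLabel)) {
    if (hits.length === 0) continue;
    const ds = hits.map((h) => h.distance);
    const min = Math.min(...ds);
    const range = Math.max(...ds) - min || 1; // avoid divide-by-zero
    for (const h of hits) {
      normalized.push({ ...h, distance: (h.distance - min) / range });
    }
  }
  return normalized.sort((a, b) => a.distance - b.distance);
}
```

Without this step, a node type whose index happens to produce larger raw distances (different embedding scale or metric) would always sink to the bottom of mixed results.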

Potential Improvements

From Alternative Solutions

| Feature | From | Potential Value |
|---|---|---|
| Pipeline-auto option | LightRAG/GraphRAG | Faster for bulk/low-stakes docs |
| Community hierarchy | GraphRAG | Better global summarization |
| Dual-level retrieval | LightRAG | Low-level + high-level queries |
| Entity disambiguation | LlamaIndex | Reduce duplicates |

Identified Gaps

  1. No pipeline-auto mode - Could add optional auto-extraction on file save
  2. Single graph DB - FalkorDB only (could add adapters)
  3. No global summarization - Per-document only
  4. Limited visualization - FalkorDB UI is basic

Conclusion

Our @research/graph implementation occupies a unique position:

| Dimension | Position |
|---|---|
| Control | Maximum (human-initiated, custom code) |
| Transparency | High (YAML output, editable) |
| Quality | High (Claude + human review option) |
| Cost | Low (self-hosted FalkorDB) |
| Ecosystem | TypeScript (rare in this space) |

The key insight: all solutions use AI for extraction. The difference is workflow control:

  • Pipeline-auto (LightRAG, GraphRAG): Extract automatically, trust the output
  • Human-initiated (ours): Extract on command, review before sync

For research documentation where quality matters, having transparent, editable output is valuable. The TypeScript stack provides integration benefits with other tooling.

Recommendation: Continue with current implementation. Consider adding optional pipeline-auto mode for bulk operations where review isn’t needed.


Sources