# Implementation Comparison
A comprehensive comparison of our custom @research/graph solution against alternative knowledge graph frameworks and tools.
## Overview
Building a knowledge graph from documents typically involves:
- Entity Extraction - Identify entities and relationships from text
- Graph Storage - Store in a graph database
- Semantic Search - Enable vector-based similarity search
- Query Interface - Natural language or Cypher queries
Different solutions make different trade-offs across these components.
## Our Implementation: @research/graph

### Architecture
```
                     @research/graph Stack

┌──────────────┐     ┌──────────────┐     ┌────────────────┐
│   Markdown   │────▶│    Claude    │────▶│      YAML      │
│  Documents   │     │   /entity-   │     │  Frontmatter   │
│   ./docs/    │     │   extract    │     │   (entities)   │
└──────────────┘     └──────────────┘     └───────┬────────┘
                                                  │
┌─────────────────────────────────────────────────▼────────┐
│                       Sync Service                       │
│  - Manifest-based change detection                       │
│  - Entity deduplication in code                          │
│  - Incremental updates (add/update/delete)               │
└─────────────────────────────────────────────────┬────────┘
                                                  │
┌─────────────────────────────────────────────────▼────────┐
│                     FalkorDB (Docker)                    │
│  - Property graph (Cypher)                               │
│  - Vector indices (HNSW)                                 │
│  - Cross-entity semantic search                          │
└─────────────────────────────────────────────────┬────────┘
                                                  │
                         Voyage AI Embeddings ◀───┘
```

### Key Design Decisions
| Decision | Choice | Rationale |
|---|---|---|
| Language | TypeScript/Bun | Match existing tooling, fast execution |
| Framework | NestJS | Dependency injection, modular architecture |
| Entity extraction | Human-initiated, AI-powered | Control + transparency over results |
| Schema storage | YAML frontmatter | Version-controlled, auditable, human-readable |
| Graph database | FalkorDB | Free, fast, built-in vector search |
| Embeddings | Voyage AI | High quality, cost-effective |
| Sync strategy | Incremental | Only process changed files |
### Features

- 41 TypeScript files in `packages/graph/`
- 10 CLI commands: sync, search, stats, validate, migrate, etc.
- Entity types: Topic, Technology, Concept, Tool, Process, Person, Organization
- Relationship types: REFERENCES (simplified for validation)
- Cross-entity semantic search: searches all node types with unified results
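The entity and relationship taxonomy above can be sketched as TypeScript types. The type names come from the feature list; the interface shapes are illustrative assumptions, not the package's actual definitions:

```typescript
// Node taxonomy from the feature list above.
type EntityType =
  | "Topic" | "Technology" | "Concept" | "Tool"
  | "Process" | "Person" | "Organization";

interface Entity {
  name: string;
  type: EntityType;
}

// Relationships are simplified to a single REFERENCES kind for validation.
interface Relationship {
  from: string; // source entity name
  to: string;   // target entity name
  type: "REFERENCES";
}

const falkor: Entity = { name: "FalkorDB", type: "Technology" };
const edge: Relationship = { from: "FalkorDB", to: "HNSW", type: "REFERENCES" };
```

Collapsing all edge kinds into one `REFERENCES` type makes validation trivial: any edge whose `type` isn't the single allowed literal is rejected at compile time.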
### Custom Slash Commands (Automation)
We’ve built custom Claude Code slash commands to streamline the workflow:
| Command | Purpose | Model |
|---|---|---|
| `/research [topic]` | Smart research: searches existing docs first, asks before new | Sonnet |
| `/entity-extract [file]` | Extract entities/relationships from a document | Haiku |
| `/graph-sync` | Batch extract + sync modified docs to graph | Sonnet |
| `/update-related [file]` | Update related documents after changes | - |
| `/research-readme [topic]` | Generate README for existing topic | - |
Workflow automation:

```
/research "topic"          # 1. Search existing docs first
        ↓
Creates/updates docs       # 2. Write research if needed
        ↓
/graph-sync                # 3. Extract entities + sync
        ↓
Knowledge graph updated    # 4. Semantic search available
```

Parallel execution: `/graph-sync` uses Task subagents to extract entities from multiple documents in parallel, significantly speeding up batch operations.
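The parallel fan-out described above can be sketched with `Promise.all`; `extractEntities` here is a hypothetical stand-in for the real Task-subagent invocation:

```typescript
// Fan extraction out across documents concurrently rather than one at a time.
// extractEntities is a hypothetical stand-in for the Task-subagent call.
async function extractAll<T>(
  paths: string[],
  extractEntities: (path: string) => Promise<T>,
): Promise<T[]> {
  // Promise.all runs every extraction concurrently and preserves input order.
  return Promise.all(paths.map((p) => extractEntities(p)));
}

// Usage with a fake extractor:
extractAll(["a.md", "b.md"], async (p) => `entities for ${p}`).then((results) => {
  console.log(results); // ["entities for a.md", "entities for b.md"]
});
```

Because `Promise.all` preserves input order, results can be zipped back to their source paths without extra bookkeeping.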
## Alternative Solutions

### 1. LightRAG
What it is: A lightweight alternative to Microsoft’s GraphRAG, optimized for speed and cost.
Architecture: Graph-enhanced text indexing with dual-level retrieval (low-level entities + high-level themes).
| Aspect | Details |
|---|---|
| Language | Python |
| Performance | ~20-30ms faster response, “10x faster queries” |
| Cost | Claims 90% lower than GraphRAG |
| Entity extraction | Automatic (LLM-based) |
| Incremental updates | Union new docs into existing graph (~50% faster) |
| Accuracy | 10% less relational fidelity vs GraphRAG |
Best for: Teams prioritizing speed and cost over deep relationship analysis.
GitHub: HKUDS/LightRAG
### 2. Microsoft GraphRAG
What it is: Microsoft’s production-grade framework with community hierarchy extraction.
Architecture: Builds community summaries for global queries via Leiden algorithm clustering.
| Aspect | Details |
|---|---|
| Language | Python |
| Performance | Slower than LightRAG, more thorough |
| Cost | Higher due to community restructuring on updates |
| Entity extraction | Automatic (LLM-based) |
| Incremental updates | Expensive (requires community rebuild) |
| Features | Global summarization, hierarchical communities |
Best for: Enterprises needing comprehensive knowledge synthesis.
GitHub: microsoft/graphrag
### 3. FalkorDB GraphRAG-SDK
What it is: Official SDK from FalkorDB for quick knowledge graph construction.
Architecture: Auto-ontology detection with LiteLLM integration.
| Aspect | Details |
|---|---|
| Language | Python |
| Setup | ~20 lines of code |
| Entity extraction | Automatic with ontology detection |
| Graph database | FalkorDB (same as ours) |
| LLM support | OpenAI, Anthropic, Google, Ollama |
| Multi-agent | Built-in orchestration |
Best for: Quick prototypes, Python teams.
Why we ejected: SDK bugs, limited customization, and our need for TypeScript.
GitHub: FalkorDB/GraphRAG-SDK
### 4. LlamaIndex Property Graph
What it is: Modular framework for building knowledge graphs with multiple extraction strategies.
Architecture: Schema-guided, free-form, or implicit extraction methods.
| Aspect | Details |
|---|---|
| Language | Python |
| Extraction modes | Schema-guided, free-form, implicit |
| Entity deduplication | Text embeddings + word distance |
| Graph stores | In-memory, disk, Neo4j |
| Customization | High (modular extractors/retrievers) |
Best for: Teams needing fine-grained control over extraction.
Docs: LlamaIndex Property Graph
### 5. nano-graphrag
What it is: Minimalist alternative with clean, readable code.
Architecture: Essential GraphRAG functionality without overhead.
| Aspect | Details |
|---|---|
| Language | Python |
| Complexity | Very low |
| Query modes | Naive, Local, Global |
| Code quality | Clean, readable, maintainable |
Best for: Learning GraphRAG concepts, simple use cases.
GitHub: gusye1234/nano-graphrag
### 6. LangChain Knowledge Graph RAG
What it is: Integrations for constructing and querying knowledge graphs within LangChain.
| Aspect | Details |
|---|---|
| Language | Python |
| Integration | Works with existing LangChain chains |
| Hybrid retrieval | Vector + graph re-ranking |
| Graph stores | Neo4j, FalkorDB, others |
Best for: Teams already using LangChain.
## Feature Comparison Matrix

| Feature | @research/graph | LightRAG | GraphRAG | GraphRAG-SDK | LlamaIndex |
|---|---|---|---|---|---|
| Language | TypeScript | Python | Python | Python | Python |
| Extraction trigger | Human-initiated | Pipeline-auto | Pipeline-auto | Pipeline-auto | Pipeline-auto |
| Extraction engine | Claude (AI) | LLM API | LLM API | LLM API | LLM API |
| Schema control | Full (YAML) | Limited | Limited | Moderate | Schema-guided |
| Output transparency | Editable YAML | Internal | Internal | Internal | Configurable |
| Incremental sync | Yes | Yes | Expensive | Varies | Varies |
| Graph DB | FalkorDB | Multiple | Neo4j | FalkorDB | Multiple |
| Vector search | Yes | Yes | Yes | Yes | Yes |
| Cross-entity search | Yes | Yes | Limited | Yes | Yes |
| Version control | Full (YAML) | Code | Code | Code | Code |
| Human review | Built-in | No | No | No | No |
| Setup complexity | Medium | Low | High | Low | Medium |
| Customization | High | Medium | High | Low | High |
## Performance Benchmarks

### From Public Research
| Metric | LightRAG | GraphRAG | Notes |
|---|---|---|---|
| Response time | ~20-30ms faster | Baseline | Per-query |
| Query cost | 90% lower | Baseline | API calls |
| Update cost | ~50% lower | Baseline | Incremental |
| Relational accuracy | 90% | 100% | Complex relationships |
### Our Implementation (Estimated)
| Metric | Value | Notes |
|---|---|---|
| Sync time | ~1-2s for 130 docs | Incremental only |
| Embedding cost | ~$0.02 per full sync | Voyage AI |
| Search latency | <100ms | Cross-entity semantic |
| Entity extraction | Manual | Higher accuracy, slower |
## When to Use What

### Use @research/graph When:
- ✅ You need TypeScript/Bun ecosystem
- ✅ Human review of entities is important
- ✅ Version-controlled schema is required
- ✅ You want full control over the implementation
- ✅ Your team already knows NestJS/TypeScript
### Use LightRAG When:
- ✅ Speed and cost are primary concerns
- ✅ You’re comfortable with Python
- ✅ You need automatic entity extraction
- ✅ Complex relationship fidelity isn’t critical
### Use Microsoft GraphRAG When:
- ✅ You need global summarization across documents
- ✅ Community hierarchy analysis is valuable
- ✅ You have budget for higher compute costs
- ✅ Enterprise-grade features are required
### Use FalkorDB GraphRAG-SDK When:
- ✅ You want quick prototype in Python
- ✅ You’re already using FalkorDB
- ✅ Auto-ontology detection is sufficient
- ✅ You don’t need deep customization
### Use LlamaIndex When:
- ✅ You need schema-guided extraction
- ✅ You want modular, swappable components
- ✅ You’re building a larger LlamaIndex application
- ✅ Entity disambiguation is critical
## Our Unique Advantages

### 1. TypeScript Ecosystem
Most knowledge graph tools are Python-only. Our solution:
- Runs on Bun (faster than Node.js)
- Uses NestJS (enterprise patterns)
- Integrates with existing TypeScript tooling
### 2. Human-Initiated, AI-Powered Extraction
Both approaches use AI for extraction. The difference is control:
| Aspect | Our Approach | Pipeline-Auto |
|---|---|---|
| Trigger | /entity-extract command | Automatic on ingestion |
| Engine | Claude (Sonnet/Haiku) | SDK’s chosen LLM |
| Output | YAML in markdown (visible) | Internal graph (opaque) |
| Review | Edit before sync | Trust the pipeline |
| Quality | High (Claude + review) | Varies by SDK |
The extraction is AI-powered (Claude does the work), but human-initiated (you decide when) with transparent output (editable YAML).
### 3. YAML Frontmatter Schema
Entities and relationships live in markdown:
- Version-controlled with documents
- Human-readable and editable
- Auditable changes over time
- Can be corrected before hitting the graph
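As a sketch, the frontmatter might look like the fragment below. The field names (`entities`, `relationships`) are illustrative assumptions, not the package's actual schema; the entity and relationship types are the ones listed under Features:

```yaml
---
# Hypothetical frontmatter shape; field names are assumptions.
entities:
  - name: FalkorDB
    type: Technology
  - name: HNSW
    type: Concept
relationships:
  - from: FalkorDB
    to: HNSW
    type: REFERENCES
---
```

Because this block lives at the top of the markdown file, a reviewer can correct an entity name or delete a spurious relationship in an ordinary diff before the sync runs.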
### 4. Incremental Sync with Manifest
Our manifest-based change detection:
- Content hash + frontmatter hash tracking
- Only syncs changed documents
- ~1-2s sync time for 130+ docs
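A minimal sketch of the hash-based classification, assuming a manifest keyed by file path; the helper names (`splitDoc`, `classify`) and the naive frontmatter regex are invented for illustration:

```typescript
import { createHash } from "node:crypto";

interface ManifestEntry {
  contentHash: string;
  frontmatterHash: string;
}
type Manifest = Record<string, ManifestEntry>;

const sha256 = (s: string) => createHash("sha256").update(s).digest("hex");

// Naive frontmatter split: assumes a leading "---\n...\n---" fence.
function splitDoc(doc: string): { frontmatter: string; body: string } {
  const m = doc.match(/^---\n([\s\S]*?)\n---\n?([\s\S]*)$/);
  return m ? { frontmatter: m[1], body: m[2] } : { frontmatter: "", body: doc };
}

type Change = "added" | "updated" | "unchanged";

// Compare both hashes against the manifest; only "added"/"updated" docs sync.
function classify(path: string, doc: string, manifest: Manifest): Change {
  const { frontmatter, body } = splitDoc(doc);
  const entry = manifest[path];
  if (!entry) return "added";
  return entry.contentHash === sha256(body) &&
    entry.frontmatterHash === sha256(frontmatter)
    ? "unchanged"
    : "updated";
}
```

Hashing content and frontmatter separately means an edit to the prose and an edit to the extracted entities are both caught, while untouched files cost only two hash comparisons.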
### 5. Cross-Entity Semantic Search
Unified search across all node types:
- Documents, Concepts, Tools, etc.
- Distance normalization for fair ranking
- Single query returns mixed results
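The fair-ranking idea can be sketched as min-max normalizing vector distances within each node type before merging; this is one plausible normalization, not necessarily the formula the package uses:

```typescript
interface Hit {
  id: string;
  label: string;    // node type: Document, Concept, Tool, ...
  distance: number; // raw vector distance from the query embedding
}

// Normalize distances within each per-label result group, then merge into a
// single ranked list so no node type dominates purely by distance scale.
function mergeRanked(groups: Hit[][]): Hit[] {
  const normalized = groups.flatMap((group) => {
    if (group.length === 0) return [];
    const ds = group.map((h) => h.distance);
    const min = Math.min(...ds);
    const span = Math.max(...ds) - min || 1; // avoid divide-by-zero
    return group.map((h) => ({ ...h, distance: (h.distance - min) / span }));
  });
  return normalized.sort((a, b) => a.distance - b.distance);
}
```

Without per-label normalization, a node type whose embeddings happen to sit closer to the query space would always crowd out the others in a merged result list.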
## Potential Improvements

### From Alternative Solutions
| Feature | From | Potential Value |
|---|---|---|
| Pipeline-auto option | LightRAG/GraphRAG | Faster for bulk/low-stakes docs |
| Community hierarchy | GraphRAG | Better global summarization |
| Dual-level retrieval | LightRAG | Low-level + high-level queries |
| Entity disambiguation | LlamaIndex | Reduce duplicates |
### Identified Gaps
- No pipeline-auto mode - Could add optional auto-extraction on file save
- Single graph DB - FalkorDB only (could add adapters)
- No global summarization - Per-document only
- Limited visualization - FalkorDB UI is basic
## Conclusion
Our @research/graph implementation occupies a unique position:
| Dimension | Position |
|---|---|
| Control | Maximum (human-initiated, custom code) |
| Transparency | High (YAML output, editable) |
| Quality | High (Claude + human review option) |
| Cost | Low (self-hosted FalkorDB) |
| Ecosystem | TypeScript (rare in this space) |
The key insight: all solutions use AI for extraction. The difference is workflow control:
- Pipeline-auto (LightRAG, GraphRAG): Extract automatically, trust the output
- Human-initiated (ours): Extract on command, review before sync
For research documentation where quality matters, having transparent, editable output is valuable. The TypeScript stack provides integration benefits with other tooling.
Recommendation: Continue with current implementation. Consider adding optional pipeline-auto mode for bulk operations where review isn’t needed.