embedded-duckdb-alternative
Analysis of replacing FalkorDB with embedded DuckDB VSS+PGQ for Lattice’s local knowledge graph use case.
The Problem
Current: Lattice requires users to:
- Install Docker
- Run FalkorDB container (docker run -p 6379:6379 falkordb/falkordb)
- Manage container lifecycle
- Understand Redis port binding
Barrier to entry: Docker + container management = friction for local knowledge graphs.
Proposed: Embedded DuckDB
Zero-setup alternative:

    bun add -g @zabaca/lattice  # That's it - no Docker!
    lattice init
    lattice sync

No Docker. No containers. No Redis. Just a single .duckdb file.
Architecture Comparison
| Aspect | FalkorDB (Current) | DuckDB Embedded |
|---|---|---|
| Setup | Docker required | Zero dependencies |
| Deployment | Container + port binding | Single binary |
| Data storage | Redis in-memory + RDB snapshots | Single .duckdb file |
| Vector search | Custom Cypher extension | VSS extension (HNSW) |
| Graph queries | Native (GraphBLAS) | DuckPGQ extension (SQL/PGQ) |
| Performance | Sub-ms traversals | Tens of ms |
| Scale limit | RAM size (~100M edges) | Disk space (unbounded) |
Ease-of-Use Analysis
Current Setup (FalkorDB)

    # User needs to:
    1. Install Docker Desktop
    2. Run container
       docker run -d -p 6379:6379 falkordb/falkordb
    3. Ensure container starts on boot
    4. Install lattice
       bun add -g @zabaca/lattice
    5. Configure connection
       export FALKORDB_HOST=localhost
       export FALKORDB_PORT=6379

Steps: 5 | External deps: 1 (Docker)
Proposed Setup (DuckDB)

    # User needs to:
    1. Install lattice (DuckDB bundled)
       bun add -g @zabaca/lattice

Steps: 1 | External deps: 0
DuckDB Embedded: Technical Details
Node.js Integration
Using the DuckDB Neo API (@duckdb/node-api, new in 2024):

    import { DuckDBInstance } from '@duckdb/node-api';

    // Create (or open) the embedded database file and connect to it
    const instance = await DuckDBInstance.create('./lattice-kb.duckdb');
    const db = await instance.connect();

    // Load extensions: vss is a core extension, duckpgq is a community extension
    await db.run('INSTALL vss');
    await db.run('LOAD vss');
    await db.run('INSTALL duckpgq FROM community');
    await db.run('LOAD duckpgq');

    // Enable persistence for HNSW indexes (DuckDB 1.0.0+)
    await db.run('SET GLOBAL hnsw_enable_experimental_persistence = true');

Key insight: The DuckDB Neo client replaces the deprecated callback-based duckdb package and provides native TypeScript support. The deprecated package is not expected to receive releases for DuckDB 1.5.x (~early 2026), so new code should target Neo.
HNSW Vector Indexes

    // Create table with embeddings
    await db.run(`
      CREATE TABLE documents (
        id INTEGER PRIMARY KEY,
        content VARCHAR,
        embedding FLOAT[512]  -- Voyage voyage-3-lite
      )
    `);

    // Create HNSW index
    await db.run(`
      CREATE INDEX doc_embedding_idx
      ON documents USING HNSW (embedding)
      WITH (metric = 'cosine')
    `);

Persistence: With hnsw_enable_experimental_persistence = true, indexes are saved to the .duckdb file, so no rebuild is needed on restart.
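
To make the cosine metric concrete, the sketch below computes cosine distance (1 minus cosine similarity) and an exact brute-force ranking in plain TypeScript. It only illustrates the quantity an HNSW index approximates; it is not how the VSS extension is implemented, and the names `cosineDistance`, `EmbeddedDoc`, and `nearest` are made up for this example.

```typescript
// Cosine distance = 1 - cosine similarity.
// 0 means the vectors point the same way, 1 means they are orthogonal.
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error('cosine distance is undefined for a zero vector');
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical in-memory stand-in for a row of the documents table.
interface EmbeddedDoc {
  id: number;
  embedding: number[];
}

// Exact k-nearest search by scanning every row and sorting by distance.
// An HNSW index avoids the full scan at the cost of approximate results.
function nearest(
  docs: EmbeddedDoc[],
  query: number[],
  k: number,
): { id: number; distance: number }[] {
  return docs
    .map((doc) => ({ id: doc.id, distance: cosineDistance(doc.embedding, query) }))
    .sort((x, y) => x.distance - y.distance)
    .slice(0, k);
}
```

At a few hundred to a thousand rows, a brute-force scan like this may already be fast enough, which is one reason a rebuildable index is a tolerable risk at this scale.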
Property Graph Queries

    // Create property graph over existing tables
    await db.run(`
      CREATE PROPERTY GRAPH kg
      VERTEX TABLES (documents, entities)
      EDGE TABLES (
        doc_mentions_entity
          SOURCE KEY (doc_id) REFERENCES documents (id)
          DESTINATION KEY (entity_id) REFERENCES entities (id)
          LABEL MENTIONS
      )
    `);

    // Query with SQL/PGQ syntax (the Neo client reads rows through a result reader)
    const reader = await db.runAndReadAll(`
      SELECT *
      FROM GRAPH_TABLE (kg
        MATCH (d:documents)-[m:MENTIONS]->(e:entities)
        WHERE d.id = 1
        COLUMNS (d.id, d.content, e.name)
      )
    `);
    const results = reader.getRowObjects();

Performance Trade-offs
Validated from packages/duckpgq-vss (2025-12-06)
| Operation | FalkorDB | DuckDB VSS+PGQ |
|---|---|---|
| 2-3 hop traversals | Sub-ms (GraphBLAS) | ~10-50ms |
| Vector search (HNSW) | Custom impl | ~5-10ms |
| Hybrid query (separate) | ~5-10ms | ~100ms |
| Hybrid query (single CTE) | N/A | ❌ Crashes (DuckPGQ bug #276) |
For Lattice scale (~500-1000 entities):
- FalkorDB: ~5ms
- DuckDB: ~100ms
Reality check: For a local research knowledge base, the difference between ~100ms and ~5ms per query is unlikely to be noticeable in interactive use.
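
Figures like these are easiest to trust when they can be re-measured on the target machine. The following is a minimal timing sketch, assuming the query under test is wrapped in an async function; `summarize`, `measure`, and `LatencySummary` are hypothetical helper names, not part of Lattice.

```typescript
interface LatencySummary {
  runs: number;
  medianMs: number;
  p95Ms: number;
}

// Nearest-rank percentile over an array that is already sorted ascending.
function percentile(sorted: number[], p: number): number {
  if (sorted.length === 0) throw new Error('no samples');
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

// Reduces raw per-run timings to a median and a 95th percentile.
function summarize(samplesMs: number[]): LatencySummary {
  const sorted = [...samplesMs].sort((a, b) => a - b);
  return {
    runs: sorted.length,
    medianMs: percentile(sorted, 50),
    p95Ms: percentile(sorted, 95),
  };
}

// Times `runs` sequential invocations of an async query function.
// Date.now() has millisecond resolution, which is coarse for sub-ms queries.
async function measure(
  query: () => Promise<unknown>,
  runs: number,
): Promise<LatencySummary> {
  const samples: number[] = [];
  for (let i = 0; i < runs; i++) {
    const start = Date.now();
    await query();
    samples.push(Date.now() - start);
  }
  return summarize(samples);
}
```

Reporting a median alongside a high percentile keeps a single slow cold-start run from dominating the comparison.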
Implementation Complexity
Refactoring GraphService
Current (graph.service.ts):

    @Injectable()
    export class GraphService {
      private redis: Redis;

      async query(cypher: string): Promise<CypherResult> {
        return await this.redis.call('GRAPH.QUERY', graphName, cypher);
      }
    }

Proposed (DuckDB):
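
    import { DuckDBConnection, listValue } from '@duckdb/node-api';

    @Injectable()
    export class GraphService {
      private db: DuckDBConnection;

      async query(sql: string): Promise<QueryResult> {
        const reader = await this.db.runAndReadAll(sql);
        return reader.getRowObjects();
      }

      async vectorSearch(query: string, limit: number): Promise<Document[]> {
        const embedding = await this.embeddingService.embed(query);
        // Named parameters; the list is cast to the fixed-size array type in SQL
        const reader = await this.db.runAndReadAll(`
          SELECT id, content,
                 array_cosine_distance(embedding, $embedding::FLOAT[512]) AS distance
          FROM documents
          ORDER BY distance
          LIMIT CAST($limit AS INTEGER)
        `, { embedding: listValue(embedding), limit });
        return reader.getRowObjects() as Document[];
      }

      async graphExpand(docIds: number[]): Promise<RelatedDoc[]> {
        // Filter on the GRAPH_TABLE output columns (id, content, name);
        // a single "?" placeholder cannot expand into an IN list
        const reader = await this.db.runAndReadAll(`
          SELECT id, content, name AS via_entity
          FROM GRAPH_TABLE (kg
            MATCH (d:documents)-[:MENTIONS]->(e:entities)
            COLUMNS (d.id, d.content, e.name)
          )
          WHERE list_contains($ids::INTEGER[], id)
        `, { ids: listValue(docIds) });
        return reader.getRowObjects() as RelatedDoc[];
      }
    }

Refactor scope: ~200-300 LOC in GraphService + query methods.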
Deployment Advantages
Serverless-Ready
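DuckDB runs in-process, so it can be bundled into serverless functions such as AWS Lambda or Google Cloud Functions. Edge runtimes that do not load native Node.js modules, such as Vercel Edge, would need separate validation.

    import { DuckDBInstance } from '@duckdb/node-api';

    // No Docker! Just bundle the .duckdb file
    export async function handler(event) {
      const instance = await DuckDBInstance.create('./lattice-kb.duckdb');
      const db = await instance.connect();
      const reader = await db.runAndReadAll('SELECT * FROM documents');
      // getRowObjectsJson() converts values to JSON-safe forms
      return { statusCode: 200, body: JSON.stringify(reader.getRowObjectsJson()) };
    }

FalkorDB: Requires container infrastructure (ECS, Cloud Run, etc.)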
Single-File Distribution

    # Ship your knowledge base
    lattice export my-research.duckdb

    # Share with others
    # They get: data + embeddings + graph + indexes in ONE file

FalkorDB: Requires Redis RDB export + import, coordination with Redis instance.
Limitations to Consider
1. DuckPGQ Maturity
| Issue | Impact | Mitigation |
|---|---|---|
| CTE crash bug #276 | Can’t do multi-GRAPH CTEs | Use separate queries (~100ms) |
| DuckDB version pinned to 1.3.1 | DuckPGQ not available for 1.4.x | Wait for upstream build |
| Limited pathfinding | No GraphBLAS-style matrix ops | Sufficient for Lattice scale |
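
The "separate queries" mitigation amounts to running the vector search first and then seeding the graph query with the returned ids. A minimal sketch of that orchestration follows, with the two database calls injected as functions so it stays independent of any client library; `HybridDeps`, `hybridSearch`, and the row shapes are assumptions made for illustration.

```typescript
interface VectorHit {
  id: number;
  distance: number;
}

interface Mention {
  id: number;
  viaEntity: string;
}

// The two queries are injected so the sketch does not depend on a DB client.
interface HybridDeps {
  vectorSearch(query: string, limit: number): Promise<VectorHit[]>;
  graphExpand(docIds: number[]): Promise<Mention[]>;
}

interface HybridResult {
  hits: VectorHit[];
  mentions: Mention[];
}

// Step 1: vector search. Step 2: graph expansion seeded with the hit ids.
// This costs two round trips instead of one CTE that combines both.
async function hybridSearch(
  deps: HybridDeps,
  query: string,
  limit: number,
): Promise<HybridResult> {
  const hits = await deps.vectorSearch(query, limit);
  if (hits.length === 0) {
    // Nothing to expand, so skip the graph query entirely.
    return { hits, mentions: [] };
  }
  const mentions = await deps.graphExpand(hits.map((hit) => hit.id));
  return { hits, mentions };
}
```

Keeping the two steps behind one function means callers would not need to change if the single-CTE form becomes usable later.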
2. Memory vs Disk
| FalkorDB | DuckDB |
|---|---|
| All-in-RAM | Hybrid (spills to disk) |
| Faster for hot data | Slower for cold data |
| RAM limit = hard limit | Disk limit = soft limit |
For Lattice: DuckDB’s hybrid approach is actually beneficial, since indexes stay in RAM while bulk data lives on disk.
3. HNSW Index Persistence (Experimental)
From DuckDB VSS docs:
"With the hnsw_enable_experimental_persistence option enabled, the index will be persisted… However, this is an experimental feature and may not be stable."
Risk: Index persistence could break in future DuckDB versions.
Mitigation: Lattice can rebuild indexes on first run if persistence fails (adds ~10s startup cost).
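
A rebuild fallback of this kind could look like the sketch below. It assumes two injected helpers: `run` executes a SQL statement and `indexExists` reports whether the index is present (for instance by querying `duckdb_indexes()`). Both helpers and the `ensureHnswIndex` name are hypothetical, and what a persistence failure actually looks like in practice would need to be confirmed by testing.

```typescript
type RunSql = (sql: string) => Promise<void>;
type IndexExists = (name: string) => Promise<boolean>;

const CREATE_INDEX_SQL =
  "CREATE INDEX doc_embedding_idx ON documents USING HNSW (embedding) WITH (metric = 'cosine')";

// Returns 'reused' when the persisted index is present, 'rebuilt' otherwise.
async function ensureHnswIndex(
  run: RunSql,
  indexExists: IndexExists,
): Promise<'reused' | 'rebuilt'> {
  try {
    if (await indexExists('doc_embedding_idx')) {
      return 'reused';
    }
  } catch {
    // Assumption: a failed check is treated like a missing index,
    // so execution falls through to the rebuild path.
  }
  await run('DROP INDEX IF EXISTS doc_embedding_idx');
  await run(CREATE_INDEX_SQL);
  return 'rebuilt';
}
```

Returning which path was taken lets the caller log a rebuild, which is useful evidence when judging how stable persistence is over time.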
Migration Path
Phase 1: Proof of Concept
Goal: Validate DuckDB works for Lattice’s queries
- Create new GraphServiceDuckDB implementation
- Run test suite against both FalkorDB and DuckDB
- Compare query results and performance
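
Because the two backends speak different query languages, a result comparison needs a Cypher and a SQL form of each test case. The sketch below shows one possible order-insensitive parity check; `Backend`, `ParityCase`, and `checkParity` are hypothetical names, and it assumes row values are JSON-serializable (values such as bigint would need normalizing first).

```typescript
type Row = Record<string, unknown>;

interface Backend {
  name: string;
  query(text: string): Promise<Row[]>;
}

interface ParityCase {
  label: string;
  falkorQuery: string; // Cypher
  duckdbQuery: string; // SQL/PGQ
}

// Order-insensitive form: each row becomes a JSON string of its
// key-sorted [key, value] pairs, then the rows themselves are sorted.
function canonical(rows: Row[]): string[] {
  return rows
    .map((row) => JSON.stringify(Object.keys(row).sort().map((key) => [key, row[key]])))
    .sort();
}

// Runs every case on both backends and returns the labels that disagree.
async function checkParity(
  falkor: Backend,
  duck: Backend,
  cases: ParityCase[],
): Promise<string[]> {
  const mismatches: string[] = [];
  for (const testCase of cases) {
    const [falkorRows, duckRows] = await Promise.all([
      falkor.query(testCase.falkorQuery),
      duck.query(testCase.duckdbQuery),
    ]);
    if (JSON.stringify(canonical(falkorRows)) !== JSON.stringify(canonical(duckRows))) {
      mismatches.push(testCase.label);
    }
  }
  return mismatches;
}
```

An empty return value means every case produced the same set of rows on both backends, regardless of row or column order.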
Phase 2: Dual-Backend Support
Goal: Let users choose backend

    {
      "backend": "duckdb", // or "falkordb"
      "duckdb": { "path": "./lattice-kb.duckdb" },
      "falkordb": { "host": "localhost", "port": 6379 }
    }

Phase 3: Default to Embedded
Goal: Make DuckDB the default, keep FalkorDB as opt-in for power users
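
In code, "default to embedded" could be as small as a resolver that falls back to DuckDB whenever no backend is named. The sketch below assumes a config shape with `backend`, `duckdb`, and `falkordb` keys and the default values `./lattice-kb.duckdb`, `localhost`, and 6379; the type and function names are illustrative, not Lattice's actual API.

```typescript
type BackendName = 'duckdb' | 'falkordb';

// Every field is optional so an empty config is valid.
interface LatticeConfig {
  backend?: BackendName;
  duckdb?: { path?: string };
  falkordb?: { host?: string; port?: number };
}

type ResolvedBackend =
  | { backend: 'duckdb'; path: string }
  | { backend: 'falkordb'; host: string; port: number };

// FalkorDB is used only when explicitly requested; anything else
// (including a missing config) resolves to the embedded DuckDB file.
function resolveBackend(config: LatticeConfig = {}): ResolvedBackend {
  if (config.backend === 'falkordb') {
    return {
      backend: 'falkordb',
      host: config.falkordb?.host ?? 'localhost',
      port: config.falkordb?.port ?? 6379,
    };
  }
  return {
    backend: 'duckdb',
    path: config.duckdb?.path ?? './lattice-kb.duckdb',
  };
}
```

Making the return type a discriminated union forces callers to check `backend` before reading backend-specific fields.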

    # Default: zero setup
    lattice init
    # → Creates lattice-kb.duckdb

    # Advanced: use FalkorDB
    lattice init --backend=falkordb
    # → Prompts to start Docker container

Recommendation for Lattice
For Local Knowledge Graphs: DuckDB Wins
| Factor | Weight | FalkorDB | DuckDB |
|---|---|---|---|
| Setup simplicity | ⭐⭐⭐⭐⭐ | ❌ Docker | ✅ Zero deps |
| Performance | ⭐⭐⭐ | ✅ Sub-ms | ⚠️ ~100ms |
| Portability | ⭐⭐⭐⭐ | ❌ Container | ✅ Single file |
| Scale limit | ⭐⭐ | ⚠️ RAM | ✅ Disk |
| Serverless support | ⭐⭐⭐⭐ | ❌ No | ✅ Yes |
Conclusion: For Lattice’s target audience (developers building local knowledge bases), DuckDB’s ease-of-use outweighs FalkorDB’s performance advantage.
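
The table can be turned into a rough weighted score. The sketch below assumes each star counts as one unit of weight and maps ✅, ⚠️, and ❌ to 1, 0.5, and 0; that numeric mapping is an assumption introduced here, not something the table defines.

```typescript
type Rating = 'yes' | 'partial' | 'no';

// Assumed numeric values for the ✅ / ⚠️ / ❌ marks.
const RATING_VALUE: Record<Rating, number> = { yes: 1, partial: 0.5, no: 0 };

interface Factor {
  name: string;
  weight: number; // number of stars in the table
  falkordb: Rating;
  duckdb: Rating;
}

const factors: Factor[] = [
  { name: 'Setup simplicity', weight: 5, falkordb: 'no', duckdb: 'yes' },
  { name: 'Performance', weight: 3, falkordb: 'yes', duckdb: 'partial' },
  { name: 'Portability', weight: 4, falkordb: 'no', duckdb: 'yes' },
  { name: 'Scale limit', weight: 2, falkordb: 'partial', duckdb: 'yes' },
  { name: 'Serverless support', weight: 4, falkordb: 'no', duckdb: 'yes' },
];

// Sum of weight × rating value over all factors for one backend.
function weightedScore(rows: Factor[], backend: 'falkordb' | 'duckdb'): number {
  return rows.reduce((sum, row) => sum + row.weight * RATING_VALUE[row[backend]], 0);
}

const falkorScore = weightedScore(factors, 'falkordb');
const duckScore = weightedScore(factors, 'duckdb');
```

Under these assumed values FalkorDB scores 4 and DuckDB scores 16.5 out of a possible 18, which matches the direction of the conclusion; changing the weights or the value of a ⚠️ would shift the margin.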
When to Recommend FalkorDB
Keep FalkorDB as an option for:
- Production GraphRAG services (latency-critical)
- Large teams (already have Docker infra)
- Real-time applications (<10ms response time required)
Next Steps
- Prototype DuckDB backend in feature branch
- Benchmark against FalkorDB with Lattice’s actual queries
- Test HNSW persistence stability over time
- Document migration guide for existing Lattice users
- Release dual-backend support (let community validate)