Comparison of approaches for adding graph query capabilities to different database foundations.

The Core Question

Why did FalkorDB choose Redis instead of SQLite or DuckDB?

Redis was chosen for specific architectural reasons:

| Requirement | Redis | SQLite | DuckDB |
| --- | --- | --- | --- |
| Memory-first | Native in-memory | Disk-first (WAL helps) | Hybrid (can spill) |
| Extension API | Redis Module API (C) | Virtual tables, loadable extensions | Community extensions |
| Low latency | Sub-millisecond | Slower for traversals | OLAP-optimized |
| Parallelism | OpenMP for GraphBLAS | Single-writer lock | Vectorized, parallel |
| Built-in networking | Client protocol included | Embedded only | Embedded only |

From FalkorDB Design Docs:

“FalkorDB is going to target cases which require low latency and as such is going to be in memory.”


How Each Adds Graph Capabilities

FalkorDB (Redis)

Native implementation via Redis Module API:

  • Registers custom commands (GRAPH.QUERY, GRAPH.DELETE)
  • GraphBLAS sparse matrix operations run in C inside the Redis process
  • CSC (Compressed Sparse Column) format for adjacency matrices
  • Not a wrapper: graph algorithms execute natively

Client → Redis Protocol → Redis Module API → FalkorDB (GraphBLAS sparse matrix math)
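
A minimal sketch of issuing queries against the module from Python, assuming a local FalkorDB instance and the falkordb client (graph name and data are illustrative):

```python
# Sketch: FalkorDB's custom commands via the falkordb Python client.
# The Cypher query executes inside the Redis process, on the module's
# GraphBLAS engine -- the client just speaks the Redis protocol.
from falkordb import FalkorDB

db = FalkorDB(host="localhost", port=6379)
g = db.select_graph("demo")  # illustrative graph name

g.query("CREATE (:Person {name: 'Alice'})-[:KNOWS]->(:Person {name: 'Bob'})")
result = g.query("MATCH (a:Person)-[:KNOWS]->(b:Person) RETURN a.name, b.name")
print(result.result_set)  # [['Alice', 'Bob']]
```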

DuckDB (DuckPGQ)

Community extension implementing the SQL/PGQ standard:

```sql
INSTALL duckpgq FROM community;
LOAD duckpgq;

-- Create a property graph over existing tables
CREATE PROPERTY GRAPH ...

-- Query with SQL/PGQ (Cypher-like) syntax
SELECT * FROM GRAPH_TABLE (pg
  MATCH (a:Person)-[e:KNOWS]->(b:Person)
  WHERE a.name = 'Alice'
  COLUMNS (b.name AS friend)
);
```
  • SIMD-friendly bulk path-finding
  • CSR (Compressed Sparse Row) in-memory representation
  • Runs on DuckDB’s vectorized engine
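
The same flow from Python, as a sketch: it assumes the duckpgq community extension installs cleanly, and the table, key, and graph names (Person, Knows, pg) are illustrative:

```python
# Sketch: DuckPGQ from Python. Assumes the community extension is
# available; table/column/graph names are illustrative.
import duckdb

con = duckdb.connect()  # in-memory DuckDB
con.execute("INSTALL duckpgq FROM community")
con.execute("LOAD duckpgq")

con.execute("CREATE TABLE Person (id BIGINT, name VARCHAR)")
con.execute("CREATE TABLE Knows (src BIGINT, dst BIGINT)")
con.execute("INSERT INTO Person VALUES (1, 'Alice'), (2, 'Bob')")
con.execute("INSERT INTO Knows VALUES (1, 2)")

# Define the property graph over the existing tables...
con.execute("""
    CREATE PROPERTY GRAPH pg
    VERTEX TABLES (Person)
    EDGE TABLES (Knows
        SOURCE KEY (src) REFERENCES Person (id)
        DESTINATION KEY (dst) REFERENCES Person (id))
""")

# ...then query it with SQL/PGQ
print(con.execute("""
    SELECT * FROM GRAPH_TABLE (pg
        MATCH (a:Person)-[e:Knows]->(b:Person)
        WHERE a.name = 'Alice'
        COLUMNS (b.name AS friend))
""").fetchall())  # [('Bob',)]
```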

SQLite (Various)

Recursive CTEs (built-in):

```sql
WITH RECURSIVE path AS (
  SELECT id, name, 0 AS depth FROM nodes WHERE id = 1
  UNION ALL
  SELECT n.id, n.name, p.depth + 1
  FROM nodes n
  JOIN edges e ON n.id = e.target_id
  JOIN path p ON e.source_id = p.id
  WHERE p.depth < 5
)
SELECT * FROM path;
```
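
For a runnable version, here is a minimal sketch with Python's built-in sqlite3 module (schema and sample data are illustrative):

```python
# Sketch: the recursive-CTE traversal above, end to end in sqlite3.
# A tiny chain graph 1 -> 2 -> 3 stands in for real data.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE nodes (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE edges (source_id INTEGER, target_id INTEGER);
    INSERT INTO nodes VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO edges VALUES (1, 2), (2, 3);
""")

rows = con.execute("""
    WITH RECURSIVE path AS (
      SELECT id, name, 0 AS depth FROM nodes WHERE id = 1
      UNION ALL
      SELECT n.id, n.name, p.depth + 1
      FROM nodes n
      JOIN edges e ON n.id = e.target_id
      JOIN path p ON e.source_id = p.id
      WHERE p.depth < 5
    )
    SELECT * FROM path
""").fetchall()
print(rows)  # [(1, 'a', 0), (2, 'b', 1), (3, 'c', 2)]
```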

Limitation: “Very slow for larger graphs because the same nodes are visited multiple times”

Extensions: community projects such as simple-graph (see Sources) layer a graph model over SQLite by storing nodes and edges as JSON documents in ordinary tables.


Performance Comparison

FalkorDB Strengths

| Aspect | Performance | Why |
| --- | --- | --- |
| 2-3 hop traversals | Sub-millisecond | GraphBLAS matrix multiplication |
| GraphRAG queries | 280x faster than Neo4j | Optimized for this workload |
| Concurrent reads | High throughput | Redis networking layer |

DuckDB/DuckPGQ Strengths

| Aspect | Performance | Why |
| --- | --- | --- |
| Batch pathfinding | Fast | SIMD multi-source algorithms |
| Aggregations | Excellent | OLAP-native, columnar |
| Large datasets | Unbounded | Spills to disk |
| Disk performance | 8x faster than in-memory (compressed) | Columnar compression |

From DuckDB Memory Management:

“Counter-intuitively, using a disk-based DuckDB instance can be faster than an in-memory instance due to compression.”


Scale Limits

FalkorDB

| Constraint | Reality |
| --- | --- |
| Hard limit | RAM size (Redis constraint) |
| Practical limit | Memory fragmentation at scale |
| Observed issue | OOM with 18.49x fragmentation |
| Team statement | “Scaling without memory bloat is the big problem we’re working on” |

Memory optimizations (v4.8-4.10):

  • 42% memory reduction
  • String interning for deduplication
  • 7x more efficient than competitors

DuckDB

| Constraint | Reality |
| --- | --- |
| Hard limit | Disk space |
| Memory behavior | Spills to disk when exceeding memory_limit (default: 80% of RAM) |
| Performance | Slower when spilling to disk, but still works |

When to Use Each

Approximate Scale Guidelines

| Graph Size | Recommendation | Reasoning |
| --- | --- | --- |
| < 10M edges | FalkorDB | Sub-ms latency wins |
| 10M-100M edges | Depends on RAM & pattern | Test both |
| 100M+ edges | DuckPGQ or distributed | FalkorDB may not fit in RAM |
| Larger-than-RAM | DuckPGQ | No choice; FalkorDB OOMs |

Workload Pattern Guidelines

| Query Type | Winner | Why |
| --- | --- | --- |
| Real-time traversals (2-3 hops) | FalkorDB | GraphBLAS optimization |
| Batch pathfinding (all shortest paths) | DuckPGQ | SIMD bulk algorithms |
| Graph + analytics (PageRank → aggregate → join) | DuckDB | OLAP-native |
| GraphRAG for LLMs (latency-critical) | FalkorDB | Designed for this |
| Ad-hoc exploration (large datasets) | DuckDB | Disk spillover |

Use Case Matrix

| Scenario | FalkorDB | DuckPGQ | SQLite |
| --- | --- | --- | --- |
| Knowledge graph for RAG | Optimal | Overkill | Too slow |
| Social network analytics | Good (if it fits in RAM) | Better at scale | No |
| Fraud detection (real-time) | Optimal | Too slow | No |
| Fraud detection (batch) | OK | Optimal | No |
| Small embedded graph | Overkill | Overkill | Simple |

Lattice Recommendation

For Lattice (research documentation knowledge graph):

Current Profile

| Metric | Value |
| --- | --- |
| Documents | ~150 markdown files |
| Total size | ~1.5 MB |
| Entity count | ~500-1,000 |
| Relationship count | ~1,000-3,000 |
| Growth projection | Maybe 500-1,000 docs |

Why FalkorDB is Correct

| Requirement | FalkorDB | DuckPGQ |
| --- | --- | --- |
| Scale | 1,000 docs = trivial | Massive overkill |
| Query pattern | Real-time Q&A | Would work, but slower |
| GraphRAG optimization | Native | Not designed for this |
| Latency | Sub-ms | Tens of ms |
| LangChain integration | Direct | Would need custom code |
| SDK | GraphRAG-SDK ready | No equivalent |

Verdict: FalkorDB is the clear choice for Lattice.

DuckPGQ would only make sense if:

  • Corpus grew to millions of documents
  • You needed batch analytics over the graph
  • Real-time response wasn’t required

When to Reconsider

| Trigger | Action |
| --- | --- |
| RAM > 50% used by FalkorDB | Monitor fragmentation |
| OOM errors | Consider sharding or DuckPGQ |
| Need for batch analytics | Export to DuckDB for analysis |
| > 1M entities | Evaluate distributed options |

For a research knowledge graph with hundreds to low thousands of documents, FalkorDB will remain optimal for years.


Memory Gotchas (Lessons from Lattice)

Relationship Type Count is a Hidden Memory Multiplier

Problem: FalkorDB pre-allocates a sparse matrix per relationship type, so memory scales with the number of types, not just the number of edges.

```
Memory ≈ NODE_CREATION_BUFFER × relationship_types × matrix_overhead
       ≈ 16,384 × N types × overhead
```

| Relationship Types | Pre-allocated Slots | Observed Impact |
| --- | --- | --- |
| 15-20 | 327,680 | OOM at 2 GB limit |
| 2 | 32,768 | Fits in ~200 MB |
Real-world case: Lattice hit OOM with only 200 documents when using many relationship types (REFERENCES, MENTIONS, AUTHORED_BY, CITES, etc.). Reducing to 2 types resolved the issue.

Design Pattern: Coarse Types + Properties

Instead of:

```
(doc)-[:REFERENCES]->(entity)
(doc)-[:MENTIONS]->(entity)
(doc)-[:CITES]->(entity)
(doc)-[:DEPENDS_ON]->(entity)
```

Use:

```
(doc)-[:RELATES_TO {type: "reference"}]->(entity)
(doc)-[:RELATES_TO {type: "mention"}]->(entity)
```

Trade-off: Slightly more verbose queries, but 10x less memory.
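
As a sketch, here is the "more verbose" side of that trade-off via the falkordb Python client (the graph name and labels — lattice, Document, Entity — are illustrative, not Lattice's actual schema):

```python
# Sketch: querying the coarse-typed pattern. The property filter on r
# replaces what a dedicated relationship type (:REFERENCES) expressed.
from falkordb import FalkorDB

g = FalkorDB(host="localhost", port=6379).select_graph("lattice")  # illustrative

result = g.query("""
    MATCH (d:Document)-[r:RELATES_TO {type: 'reference'}]->(e:Entity)
    RETURN e.name
""")
print(result.result_set)
```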

Other Memory Considerations

| Factor | Impact | Mitigation |
| --- | --- | --- |
| NODE_CREATION_BUFFER | 16,384 by default | Reduce to 1,024-2,048 for small graphs |
| Full-text indices | Can be 70% of total memory | Only index what you query |
| Vector dimensions | 1536-dim ≈ 6 KB/vector | Use 512-dim models if possible |
| Redis fragmentation | Up to 18x reported | Monitor mem_fragmentation_ratio |
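
A small sketch of that last mitigation using redis-py (connection details are illustrative); Redis exposes the ratio via INFO MEMORY:

```python
# Sketch: watch Redis memory fragmentation from Python. A ratio well
# above ~1.5 is a warning sign; the 18x reported above means the
# allocator held far more RSS than live data.
import redis

r = redis.Redis(host="localhost", port=6379)
info = r.info("memory")

ratio = info["mem_fragmentation_ratio"]
used_mb = info["used_memory"] / (1024 * 1024)
print(f"used: {used_mb:.1f} MB, fragmentation ratio: {ratio:.2f}")

if ratio > 1.5:
    print("warning: high fragmentation")
```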

Technical Deep Dive

FalkorDB: GraphBLAS Performance

GraphBLAS uses linear algebra for graph operations:

```
Adjacency matrix A:

      1  2  3  4
    ┌────────────┐
  1 │ 0  1  0  1 │   (node 1 → nodes 2, 4)
  2 │ 1  0  1  0 │   (node 2 → nodes 1, 3)
  3 │ 0  1  0  1 │
  4 │ 1  0  1  0 │
    └────────────┘

2-hop neighbors = A × A     (matrix multiplication)
3-hop neighbors = A × A × A
```

Why this is fast:

  • Matrix ops are highly optimized (BLAS libraries)
  • Sparse matrices only store non-zero entries (CSC format)
  • OpenMP parallelization
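
The idea is easy to reproduce with SciPy's CSC sparse matrices; the following is a sketch of the principle, not FalkorDB's actual code path:

```python
# Sketch: "traversal as matrix multiplication" with SciPy's CSC format,
# the same sparse layout GraphBLAS uses. Matrix from the diagram above.
import numpy as np
from scipy.sparse import csc_matrix

A = csc_matrix(np.array([
    [0, 1, 0, 1],   # node 1 -> nodes 2, 4
    [1, 0, 1, 0],   # node 2 -> nodes 1, 3
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]))

two_hop = A @ A  # entry (i, j) counts the 2-hop walks from i to j
print(two_hop.toarray())
# Any nonzero (i, j) means node j is reachable from node i in 2 hops.
```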

DuckPGQ: Vectorized Execution

DuckDB processes data in vectors (1024-2048 items):

```
Traditional (row-by-row):          Vectorized:
  for each row:                      for each vector of 1024 rows:
      process row                        process all at once (SIMD)

  Overhead: N function calls         Overhead: N/1024 function calls
```

Why this matters for graphs:

  • Bulk path-finding across many start nodes simultaneously
  • Better CPU cache utilization (L1 cache fits vectors)
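
The effect is easy to feel even from Python, where NumPy stands in for the vectorized engine (a sketch of the principle; DuckDB's actual engine is C++):

```python
# Sketch: row-at-a-time vs. batched processing. The vectorized path
# pays interpreter/dispatch overhead once per batch, not once per row.
import time
import numpy as np

values = np.random.rand(1_000_000)

start = time.perf_counter()
total = 0.0
for v in values:                  # one interpreter step per row
    total += v * 2.0
loop_s = time.perf_counter() - start

start = time.perf_counter()
total_vec = (values * 2.0).sum()  # one call over the whole batch
vec_s = time.perf_counter() - start

print(f"loop: {loop_s:.3f}s  vectorized: {vec_s:.4f}s")
```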

Sources

  1. FalkorDB Design Documentation
  2. DuckPGQ Documentation
  3. DuckDB Memory Management
  4. FalkorDB v4.8 Memory Improvements
  5. FalkorDB GitHub Issue - Memory fragmentation
  6. SQLite WITH Clause
  7. simple-graph GitHub
  8. DuckDB Blog: Graph Queries
  9. Memgraph Storage Usage