voyage-reranker
Purpose
Guide to using Voyage AI’s reranker API (rerank-2.5 series) for improving retrieval quality and implementing multi-label search strategies across knowledge graph entity types.
Voyage AI Reranker (August 2025)
Latest Models
Voyage AI’s rerank-2.5 series introduces instruction-following capabilities and a 32K-token context window, 8x longer than Cohere Rerank v3.5:
| Model | Context | Best For | Performance vs Cohere v3.5 |
|---|---|---|---|
| rerank-2.5 | 32K tokens | Quality-critical retrieval | +7.94% (standard), +12.70% (MAIR) |
| rerank-2.5-lite | 32K tokens | Latency-sensitive applications | +7.16% (standard), +10.36% (MAIR) |
| rerank-2 | 16K tokens | Legacy | — |
| rerank-2-lite | 8K tokens | Legacy | — |
Context Advantage: 32K tokens = 8x Cohere Rerank v3.5, 2x rerank-2, enabling accurate retrieval across longer documents.
How It Works
Voyage’s reranker is a cross-encoder that:
- Takes a query and list of candidate documents
- Jointly processes each query-document pair
- Outputs relevance scores for precise ranking
- Refines results from fast embedding-based retrieval
Unlike bi-encoders (embeddings), cross-encoders see both query and document together, providing higher accuracy at the cost of higher latency.
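A minimal TypeScript sketch of the contrast (illustrative only; `scorePair` is a hypothetical stand-in for one reranker scoring call, not a real SDK function):

```typescript
// Bi-encoder (embeddings): documents are embedded once and indexed;
// at query time, relevance is a cheap vector similarity.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Cross-encoder (reranker): each query-document pair is scored jointly by the model,
// so document representations cannot be precomputed the way embeddings can.
async function crossEncoderRank(
  query: string,
  documents: string[],
  scorePair: (query: string, doc: string) => Promise<number>
): Promise<{ index: number; score: number }[]> {
  const scored = await Promise.all(
    documents.map(async (doc, index) => ({ index, score: await scorePair(query, doc) }))
  );
  return scored.sort((a, b) => b.score - a.score);
}
```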
API Usage
Python SDK:

```python
import voyageai

client = voyageai.Client(api_key="your-api-key")

query = "bun link command"
documents = [
    "Tool: bun link. Bun's package linking command for local development workflows",
    "Tool: Pluribus. Multi-player poker AI system developed by Facebook AI",
    "Technology: Bun Package Manager. Fast package manager included in Bun runtime",
]

result = client.rerank(
    query=query,
    documents=documents,
    model="rerank-2.5-lite",
    top_k=3,
)

for doc in result.results:
    print(f"{doc.index}: {doc.relevance_score:.4f}")
```

REST API:

```bash
curl https://api.voyageai.com/v1/rerank \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $VOYAGE_API_KEY" \
  -d '{
    "query": "bun link command",
    "documents": ["...", "...", "..."],
    "model": "rerank-2.5-lite",
    "top_k": 3
  }'
```

API Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | Search query (max 8K tokens for rerank-2.5) |
| documents | List[str] | Yes | Up to 1,000 documents to rerank |
| model | string | Yes | rerank-2.5, rerank-2.5-lite, etc. |
| top_k | int | No | Return only the top-k most relevant results |
| truncation | bool | No | Auto-truncate oversized inputs (default: true) |
Token Limits:
- Total tokens: (query tokens × document count) + sum(document tokens)
- rerank-2.5 / rerank-2.5-lite: 600K max
- rerank-1 / rerank-1-lite: 300K max
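Since Lattice itself is TypeScript, a minimal sketch of the same call against the REST endpoint above may be useful. The function name and the `RerankResult` shape are assumptions, the response field name should be verified against the API reference (the Python SDK exposes ranked entries as `results`), and `VOYAGE_API_KEY` is read from the environment:

```typescript
interface RerankResult {
  index: number;
  relevance_score: number;
}

// Hypothetical helper: send a rerank request to the documented REST endpoint.
async function voyageRerank(
  query: string,
  documents: string[],
  topK: number,
  model: string = "rerank-2.5-lite"
): Promise<RerankResult[]> {
  const response = await fetch("https://api.voyageai.com/v1/rerank", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.VOYAGE_API_KEY}`,
    },
    body: JSON.stringify({ query, documents, model, top_k: topK, truncation: true }),
  });
  if (!response.ok) {
    throw new Error(`Voyage rerank failed: ${response.status} ${await response.text()}`);
  }
  const payload = await response.json();
  // Field name assumed: the Python SDK exposes `results`; the raw REST body may use `data`.
  return (payload.results ?? payload.data) as RerankResult[];
}
```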
Instruction-Following Capabilities
New in rerank-2.5: Steer relevance scoring with natural language:
```python
# Domain-specific relevance
query = "Instruction: Prioritize recent research papers over blog posts. Query: transformers architecture"

# Custom criteria
query = "Instruction: Focus on code examples with TypeScript. Query: react hooks"
```

This allows fine-grained control over what the reranker considers “relevant” without retraining.
Multi-Label Querying Strategy
The Challenge
Knowledge graphs contain multiple entity types (labels):
- Document: Full research documents with comprehensive content
- Technology: Technical tools/frameworks with descriptions
- Tool: Specific utilities and commands
- Concept: Abstract ideas and patterns
Searching across all labels requires a strategy to combine and rank heterogeneous results.
Architecture: Two-Stage Retrieval + Reranking
Stage 1: Vector Search Across Labels
```typescript
// Lattice's current approach (src/graph/graph.service.ts)
async vectorSearchAll(
  queryEmbedding: number[],
  limit: number = 20
): Promise<SearchResult[]> {
  const labels = ["Document", "Technology", "Tool", "Concept", "Person", ...];

  // Query each label separately
  const labelPromises = labels.map(label =>
    this.vectorSearch(label, queryEmbedding, limit)
  );

  const allResults = await Promise.all(labelPromises);

  // Flatten and sort by similarity score
  return allResults
    .flat()
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
```

Problem: Similarity scores aren't directly comparable across entity types with different text lengths and structures.
Stage 2: Cross-Encoder Reranking
Apply Voyage reranker to refine combined results:
```typescript
async searchWithReranking(
  query: string,
  limit: number = 20
): Promise<SearchResult[]> {
  // Stage 1: Fast vector search (top 100)
  const queryEmbedding = await this.embedding.generateQueryEmbedding(query);
  const candidates = await this.graph.vectorSearchAll(queryEmbedding, 100);

  // Stage 2: Precise reranking (top 20)
  const documents = candidates.map(c =>
    `${c.label}: ${c.name}. ${c.description || ''}`
  );

  const reranked = await voyageClient.rerank({
    query: query,
    documents: documents,
    model: "rerank-2.5-lite",
    top_k: limit
  });

  // Map back to original entities
  return reranked.results.map(r => ({
    ...candidates[r.index],
    rerankScore: r.relevance_score
  }));
}
```

When to Use Reranking
| Scenario | Approach | Rationale |
|---|---|---|
| Simple entity search | Vector search only | Fast, good enough for single-label queries |
| Multi-label search | Vector + reranking | Essential for comparing heterogeneous entities |
| Long documents | Vector + reranking | Cross-encoder handles context better |
| Domain-specific | Reranking with instructions | Steer relevance criteria |
| Latency-critical | Vector only or rerank-2.5-lite | Balance speed vs accuracy |
Example: Multi-Label Search in Practice
Before (vector search only):
Search: "bun link"1. Pluribus (poker AI) - 84.21% ❌ High score, wrong domain2. Libratus (poker AI) - 82.57% ❌ High score, wrong domain3. bun link - 82.07% ✅ Correct but buriedAfter (query embeddings):
1. bun link - 69.90% ✅ Correct2. bun-link.md - 65.06% ✅ Correct3. Bun Package Manager - 57.25% ✅ CorrectWith reranking (theoretical):
1. bun link - 0.95 ✅ Cross-encoder confirms relevance2. Bun Package Manager - 0.89 ✅ Semantic connection3. bun-link.md - 0.87 ✅ Document with full contextImplementation Considerations
Cost vs Benefit
Voyage reranker pricing (as of 2025):
- rerank-2.5: ~$2.00 per 1M tokens
- rerank-2.5-lite: ~$0.50 per 1M tokens
For Lattice with ~1K queries/day:
- Vector search only: $0 (local computation)
- Vector + reranking (lite): ~$5-10/month
- Worth it? Yes for production, optional for personal use
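Actual spend depends on how many candidates are reranked per query and how long each document snippet is. A back-of-the-envelope helper (a sketch only; it just applies the token-count formula and list prices above to numbers you supply):

```typescript
// Rough monthly reranking cost, using the documented token formula:
// total tokens per call = (query tokens × document count) + sum(document tokens)
function estimateMonthlyRerankCost(opts: {
  queriesPerDay: number;
  candidatesPerQuery: number;     // e.g. 100 in the two-stage pipeline
  avgDocTokens: number;           // short entity snippets vs. full documents
  avgQueryTokens: number;
  pricePerMillionTokens: number;  // ~$0.50 (rerank-2.5-lite) or ~$2.00 (rerank-2.5)
}): number {
  const tokensPerCall =
    opts.avgQueryTokens * opts.candidatesPerQuery +
    opts.avgDocTokens * opts.candidatesPerQuery;
  const tokensPerMonth = tokensPerCall * opts.queriesPerDay * 30;
  return (tokensPerMonth / 1_000_000) * opts.pricePerMillionTokens;
}
```

Plugging in your own candidate count and snippet length makes it easy to see when it pays to rerank fewer candidates or truncate descriptions.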
Latency Impact
| Stage | Latency |
|---|---|
| Vector search (100 candidates) | ~10-50ms |
| Reranking (20 results) | ~100-300ms |
| Total | ~150-350ms |
Still well within acceptable range for interactive search.
Integration with Lattice
Proposed enhancement to src/commands/query.command.ts:
```typescript
@Option({
  flags: "--rerank",
  description: "Use Voyage reranker for improved accuracy"
})
parseRerank(value: boolean): boolean {
  return value;
}

async run(inputs: string[], options: SearchCommandOptions): Promise<void> {
  const query = inputs[0];
  const queryEmbedding = await this.embeddingService.generateQueryEmbedding(query);

  let results: SearchResult[];
  if (options.rerank) {
    // Two-stage: vector + reranking
    const candidates = await this.graphService.vectorSearchAll(queryEmbedding, 100);
    results = await this.rerankResults(query, candidates, options.limit);
  } else {
    // Fast vector search only
    results = await this.graphService.vectorSearchAll(queryEmbedding, options.limit);
  }

  // Display results...
}
```
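The `rerankResults` helper referenced above is not part of the existing code; a minimal sketch, assuming the hypothetical `voyageRerank` helper from the API section and a `SearchResult` with `label`, `name`, and `description` fields:

```typescript
// Hypothetical helper: rerank candidates and map relevance scores back onto entities.
private async rerankResults(
  query: string,
  candidates: SearchResult[],
  limit: number
): Promise<SearchResult[]> {
  // One text snippet per candidate, with the entity label embedded as metadata.
  const documents = candidates.map(
    c => `${c.label}: ${c.name}. ${c.description || ""}`
  );
  const reranked = await voyageRerank(query, documents, limit);
  return reranked.map(r => ({
    ...candidates[r.index],
    rerankScore: r.relevance_score,
  }));
}
```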
Research Findings: Multi-Label KG Querying
Recent research (2025) on knowledge graph retrieval with reranking:
Knowledge Graph-Guided RAG
- Multi-path subgraph construction: Incorporate one-hop, multi-hop, and importance-based relations
- Query-aware attention: Reward models score subgraph triples by semantic relevance
- Key insight: embedding chunk metadata alongside the text outperforms relying on powerful rerankers alone
ReranKGC Framework
- Retrieve-and-rerank pipeline for multi-modal knowledge graph completion
- Uses KGC-CLIP to extract multi-modal knowledge for candidate re-ranking
- Published April 2025 in Neural Networks journal
AR-Align
- Unsupervised multi-view contrastive learning for entity alignment
- Attention-based reranking: Reranks hard entities by weighted similarity across different structures
- Improves precision for ambiguous entity matching
GraphRAG Best Practices
From Neo4j Advanced RAG Techniques (2025):
- Knowledge graphs unify scattered data (docs, tables, APIs)
- GraphRAG retrieves along entity connections
- Improves disambiguation and multi-hop answers
- Keeps sources traceable
Recommendation for Lattice: Implement relationship-aware reranking that considers entity connections, not just text similarity.
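A possible shape for this, as an illustrative sketch only: `countRelationships` is a hypothetical graph lookup, it assumes each `SearchResult` carries an `id`, and the 0.8/0.2 weighting is an arbitrary starting point to tune.

```typescript
// Blend the cross-encoder score with a simple graph-connectivity signal,
// so well-connected entities get a modest boost over isolated ones.
async function relationshipAwareRerank(
  query: string,
  candidates: SearchResult[],
  limit: number
): Promise<SearchResult[]> {
  const documents = candidates.map(
    c => `${c.label}: ${c.name}. ${c.description || ""}`
  );
  // Rerank everything first, then re-weight with graph structure.
  const reranked = await voyageRerank(query, documents, candidates.length);

  const scored = await Promise.all(
    reranked.map(async r => {
      const entity = candidates[r.index];
      // Hypothetical lookup: number of edges touching this entity in the graph.
      const degree = await countRelationships(entity.id);
      // Squash degree into [0, 1) so hub entities do not dominate.
      const connectivity = degree / (degree + 5);
      return { ...entity, rerankScore: 0.8 * r.relevance_score + 0.2 * connectivity };
    })
  );

  return scored.sort((a, b) => b.rerankScore - a.rerankScore).slice(0, limit);
}
```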
Best Practices
1. Use Query Input Type + Reranking
```typescript
// Best: Both optimizations
const queryEmbed = await embedding.generateQueryEmbedding(query); // input_type="query"
const candidates = await graph.vectorSearchAll(queryEmbed, 100);
const documents = candidates.map(c => `${c.label}: ${c.name}. ${c.description || ''}`);
const results = await voyageClient.rerank({ query, documents, model: "rerank-2.5-lite", top_k: 20 });
```

2. Batch Reranking for Efficiency
```typescript
// Efficient:   rerank 100 candidates → 20 results (1 API call)
// Inefficient: rerank each label separately (multiple API calls)
```

3. Cache Reranking Results
For common queries, cache reranking results to avoid redundant API calls.
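A minimal in-memory sketch, reusing the hypothetical `voyageRerank` and `RerankResult` from earlier; the key scheme and TTL are arbitrary, and a CLI like Lattice would need a persistent cache (e.g. on disk) to benefit across invocations:

```typescript
// In-memory cache for rerank calls, keyed by query + candidate set.
const rerankCache = new Map<string, { expires: number; results: RerankResult[] }>();
const TTL_MS = 10 * 60 * 1000; // 10 minutes -- arbitrary

async function cachedRerank(
  query: string,
  documents: string[],
  topK: number
): Promise<RerankResult[]> {
  const key = JSON.stringify([query, topK, documents]);
  const hit = rerankCache.get(key);
  if (hit && hit.expires > Date.now()) {
    return hit.results; // cache hit: skip the API call entirely
  }
  const results = await voyageRerank(query, documents, topK);
  rerankCache.set(key, { expires: Date.now() + TTL_MS, results });
  return results;
}
```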
4. Use Instructions for Domain-Specific Search
```typescript
const domainQuery = `Instruction: Prioritize technical documentation over blog posts. Query: ${userQuery}`;
```

5. Monitor Token Usage
```typescript
// Estimate tokens before calling rerank (~4 characters per token heuristic).
// Total = (query tokens × document count) + sum(document tokens)
const estimatedTokens =
  (query.length * documents.length + documents.reduce((sum, d) => sum + d.length, 0)) / 4;
if (estimatedTokens > 600_000) {
  // Reduce candidate count or truncate documents
}
```

Comparison: Rerankers (2025)
| Provider | Model | Context | Price/1M tokens | Performance |
|---|---|---|---|---|
| Voyage AI | rerank-2.5 | 32K | ~$2.00 | Best (MAIR +12.7%) |
| Voyage AI | rerank-2.5-lite | 32K | ~$0.50 | Good (+10.36%) |
| Cohere | Rerank v3.5 | 4K | ~$2.00 | Baseline |
| Jina AI | jina-reranker-v2 | 8K | $0.70 | Competitive |
Among these options, Voyage offers the longest context window and the strongest reported performance as of 2025.
Sources
- Voyage AI: Rerankers Documentation
- Voyage AI: Reranker API Reference
- MongoDB: rerank-2.5 and rerank-2.5-lite Announcement
- Voyage AI Blog: rerank-2.5 Release
- Voyage AI: Pricing
- LangChain: VoyageAI Reranker Integration
- Neo4j: Advanced RAG Techniques for 2025
- ReranKGC: Multi-modal Knowledge Graph Completion
- AR-Align: Entity Alignment with Reranking
- Knowledge Graph-Guided RAG Research
Last updated: 2025-12-07