Research on Google’s RAG (Retrieval-Augmented Generation) file search capabilities.

Documents

Document        Description
How It Works    Technical deep-dive into Gemini File Search and Vertex AI RAG Engine

Summary

Google offers two main RAG solutions:

  1. Gemini API File Search (Nov 2025) - Fully managed, developer-friendly

    • Automatic chunking, embedding, vector storage
    • Uses gemini-embedding-001 (3072-dim vectors)
    • Free storage, $0.15/1M tokens for indexing
    • Built-in citations
  2. Vertex AI RAG Engine - Enterprise-grade with flexibility

    • Choose your vector DB (Pinecone, Weaviate, etc.)
    • Multiple LLM options
    • VPC-SC/CMEK security
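As a rough illustration of the Gemini API File Search pricing quoted above ($0.15 per 1M tokens for indexing, free storage), the following sketch estimates a one-time indexing bill. The corpus size is a made-up example, and the function name `indexing_cost_usd` is hypothetical; actual billing may differ from this simple arithmetic.

```python
# Indexing price as quoted in this summary (USD per 1M tokens); storage is free.
PRICE_PER_MILLION_TOKENS_USD = 0.15


def indexing_cost_usd(total_tokens: int) -> float:
    """Estimate the one-time indexing cost for a corpus of `total_tokens` tokens."""
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS_USD


# Hypothetical corpus: 2,000 documents averaging 5,000 tokens each.
corpus_tokens = 2_000 * 5_000
print(f"{corpus_tokens:,} tokens -> ${indexing_cost_usd(corpus_tokens):.2f}")
```

Under these assumptions a 10M-token corpus costs about $1.50 to index once; re-indexing changed documents would add to that.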

Key Concepts

  • Chunking: Breaking documents into smaller pieces (configurable size/overlap)
  • Embedding: Converting text to 3072-dimensional semantic vectors
  • Vector Search: Finding similar chunks via cosine similarity
  • Grounding: Injecting retrieved context into LLM prompts
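The four concepts above can be sketched end to end in plain Python. This is a minimal illustration, not how Gemini File Search is implemented: the `embed` function is a toy bag-of-words stand-in for a real embedding model such as gemini-embedding-001, chunk sizes are counted in words rather than tokens, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, size: int = 12, overlap: int = 4) -> list[str]:
    """Split text into chunks of `size` words; consecutive chunks share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Vector search: rank chunks by similarity to the query and keep the best `top_k`."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine_similarity(query_vec, embed(c)), reverse=True)
    return ranked[:top_k]


def build_grounded_prompt(query: str, context_chunks: list[str]) -> str:
    """Grounding: place the retrieved chunks in the prompt ahead of the question."""
    context = "\n".join(f"[{i}] {c}" for i, c in enumerate(context_chunks, start=1))
    return (
        "Answer using only the context below and cite sources by number.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


document = (
    "File Search stores embeddings for each chunk of an uploaded file. "
    "At query time the question is embedded and compared with stored chunks. "
    "The most similar chunks are passed to the model as context for the answer."
)
chunks = chunk_text(document, size=12, overlap=4)
question = "How are similar chunks found at query time?"
print(build_grounded_prompt(question, retrieve(question, chunks)))
```

The numbered context entries in the prompt are what make citations possible: the model can refer back to `[1]` or `[2]`, and the caller can map those numbers to source chunks.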

Graph RAG

Gemini File Search does NOT support Graph RAG - it’s vector-only.

For Graph RAG, Google offers a separate architecture using Spanner Graph and Vertex AI:

  • Uses LangChain's LLMGraphTransformer to extract entities and relationships from documents
  • Combines vector search with graph traversal
  • Better for multi-hop reasoning and connected data
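To show why graph traversal helps with multi-hop questions, here is a small in-memory sketch. It does not use Spanner Graph, Vertex AI, or LLMGraphTransformer: the triples are hand-written stand-ins for what an extraction step would produce, the seed lookup is a simple name match standing in for vector search, and every name in the code is hypothetical.

```python
from collections import deque

# Hypothetical (subject, relation, object) triples, standing in for extracted entities/relationships.
TRIPLES = [
    ("Alice", "WORKS_AT", "Acme"),
    ("Acme", "ACQUIRED", "Widgets Inc"),
    ("Widgets Inc", "BASED_IN", "Berlin"),
    ("Bob", "WORKS_AT", "Globex"),
]


def build_adjacency(triples):
    """Index triples by subject so outgoing edges can be followed quickly."""
    adjacency = {}
    for subject, relation, obj in triples:
        adjacency.setdefault(subject, []).append((relation, obj))
        adjacency.setdefault(obj, [])  # make sure leaf entities are present too
    return adjacency


def seed_entities(query, adjacency):
    """Stand-in for vector search: pick entities whose name appears in the query."""
    lowered = query.lower()
    return [entity for entity in adjacency if entity.lower() in lowered]


def expand(seeds, adjacency, max_hops=2):
    """Breadth-first traversal collecting every triple within `max_hops` of the seeds."""
    seen = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    facts = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not follow edges beyond the hop limit
        for relation, neighbor in adjacency[node]:
            facts.append((node, relation, neighbor))
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return facts


adjacency = build_adjacency(TRIPLES)
question = "Where is the company acquired by the employer of Alice based?"
seeds = seed_entities(question, adjacency)
for fact in expand(seeds, adjacency, max_hops=3):
    print(fact)
```

A vector-only retriever would need a single chunk that mentions both Alice and Berlin; the traversal instead chains three separate facts, which is the multi-hop advantage the list above refers to.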