What is Retrieval Augmented Generation?

Retrieval Augmented Generation (RAG) combines a retrieval system with a large language model.

Instead of relying on the model's training data alone, a RAG system retrieves relevant documents at query time and passes them to the model as context.
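
As a rough illustration of "feeding documents to the model as context", the sketch below assembles retrieved passages and a question into a single prompt. The function name build_prompt, the prompt wording, and the sample passages are all assumptions made for this example; real systems vary in how they format context.

```python
def build_prompt(question: str, retrieved: list[str]) -> str:
    """Place retrieved passages ahead of the question as numbered context."""
    # Number each passage so the model (and the reader) can refer back to it.
    context = "\n".join(
        f"[{i}] {text}" for i, text in enumerate(retrieved, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )


# Hypothetical passages standing in for the output of a retrieval step.
passages = [
    "RAG retrieves documents before generation.",
    "Embeddings map text to vectors.",
]
prompt = build_prompt("What does RAG do first?", passages)
print(prompt)
```

The resulting string is what would be sent to the language model; the model itself is left out of the sketch.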

Core Components of RAG

A typical RAG system contains three parts:

  1. Document store
  2. Vector database
  3. Language model

The document store holds the source knowledge, which is split into chunks and converted into embeddings for retrieval.
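
Chunking can be done in many ways. The following is a minimal sketch of one common approach, fixed-size word windows with overlap. The function name chunk_words and the default sizes are assumptions chosen for illustration, not values taken from any particular system.

```python
def chunk_words(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into chunks of `size` words, each sharing `overlap` words
    with the previous chunk."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the window already reached the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(12))
for chunk in chunk_words(sample, size=5, overlap=2):
    print(chunk)
```

The overlap keeps a sentence that straddles a boundary from being cut off from its surroundings in both chunks. Each chunk would then be passed to an embedding model, which is outside the scope of this sketch.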

Vector Databases

Vector databases store embeddings representing the semantic meaning of text or other data.

When content is embedded into vectors, similar concepts produce vectors that appear close together in high-dimensional space. A vector database indexes these embeddings and allows fast similarity searches.

Instead of matching keywords, the system compares vectors and retrieves content that is semantically related to a query.
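
To make vector comparison concrete, here is a small sketch of similarity search using cosine similarity over a brute-force scan. The three-dimensional vectors are made up for illustration (real embeddings come from an embedding model and have hundreds or thousands of dimensions), and the names cosine and top_k are hypothetical. Production vector databases typically use approximate nearest-neighbor indexes rather than scanning every vector.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: near 1.0 for vectors pointing the same way,
    near 0.0 for unrelated (orthogonal) vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list[float], store: dict[str, list[float]], k: int = 2):
    """Return the k (score, text) pairs most similar to the query vector."""
    scored = [(cosine(query, vec), text) for text, vec in store.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy "embeddings": the first two texts are deliberately given nearby vectors.
store = {
    "How to reset a password": [0.9, 0.1, 0.0],
    "Recovering account access": [0.8, 0.2, 0.1],
    "Quarterly sales figures": [0.0, 0.1, 0.9],
}
query_vec = [0.85, 0.15, 0.05]  # stand-in for an embedded query about logins

results = top_k(query_vec, store, k=2)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

Neither of the two returned texts needs to share a keyword with the query; they are selected only because their vectors lie close to the query vector.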

Examples of vector databases and vector search libraries include:

  • FAISS
  • Pinecone
  • Weaviate
  • Qdrant