Naive RAG

What are the main components of a Naive RAG pipeline?

A Naive RAG (Retrieval-Augmented Generation) pipeline improves the accuracy and relevance of a large language model's (LLM's) responses by first retrieving information from an external knowledge base and supplying it to the model as context. It typically consists of three stages.

1. Indexing/Data Preparation

This initial phase prepares the external knowledge base from which information will be retrieved. Documents are loaded from various sources (e.g., PDFs, web pages, databases) and split into smaller chunks, so that each chunk fits the embedding model's input limit and several retrieved chunks can later fit together in the LLM's context window. Each chunk is then converted into a numerical vector embedding. The embeddings are stored in a vector database alongside the original text chunks, which makes the chunks searchable by semantic similarity.
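The indexing steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: `chunk_text`, `toy_embed`, and `build_index` are hypothetical names, the hashed bag-of-words vector is only a stand-in for a real embedding model, and a plain in-memory list stands in for a vector database.

```python
import math
import re
import zlib


def chunk_text(text: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into chunks of `chunk_size` words, each sharing `overlap` words with the previous one."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def toy_embed(text: str, dim: int = 64) -> list[float]:
    """ASSUMPTION: stand-in for a real embedding model (hashed bag-of-words, L2-normalised)."""
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        # crc32 is deterministic across runs, unlike Python's built-in hash() for strings
        vec[zlib.crc32(token.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def build_index(documents: list[str], chunk_size: int = 40, overlap: int = 10) -> list[dict]:
    """Chunk every document and store each chunk's text next to its embedding."""
    index = []
    for doc_id, doc in enumerate(documents):
        for chunk in chunk_text(doc, chunk_size, overlap):
            index.append({"doc_id": doc_id, "text": chunk, "embedding": toy_embed(chunk)})
    return index


documents = [
    "Retrieval-augmented generation retrieves text chunks before the model answers.",
    "Vector databases store embeddings so that chunks can be searched by similarity.",
]
index = build_index(documents, chunk_size=8, overlap=2)
print(len(index), "chunks indexed")
```

The overlap between neighbouring chunks is a common way to avoid cutting a sentence's context in half at a chunk boundary; the sizes used here are arbitrary and would normally be tuned to the embedding model and the documents.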

2. Retrieval

When a user submits a query, it is first converted into a vector embedding using the same embedding model that was used during indexing, so that query and chunk vectors live in the same space. This query embedding is then used to run a similarity search (commonly cosine similarity) against the vector database. The goal is to retrieve the top-k text chunks that are semantically closest to the query and therefore most likely to contain the answer.
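A rough sketch of top-k retrieval follows. It assumes the index is a list of `(text, vector)` pairs and that `embed` is whatever function produced those vectors; the vocabulary-count `toy_embed` below is a hypothetical placeholder for a real embedding model, and the brute-force scan stands in for the approximate nearest-neighbour search a vector database would typically perform.

```python
import math
from typing import Callable, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(
    query: str,
    index: list[tuple[str, list[float]]],
    embed: Callable[[str], list[float]],
    k: int = 2,
) -> list[tuple[float, str]]:
    """Embed the query with the SAME function used for the index and return the k best (score, text) pairs."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# ASSUMPTION: a tiny fixed vocabulary replaces a real embedding model for illustration only.
VOCAB = ["cat", "dog", "engine", "car"]


def toy_embed(text: str) -> list[float]:
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]


chunks = [
    "the cat chased the dog",
    "the car engine stalled",
    "a dog barked at the car",
]
index = [(chunk, toy_embed(chunk)) for chunk in chunks]

results = retrieve("why did the engine of the car stall", index, toy_embed, k=2)
for score, text in results:
    print(f"{score:.2f}  {text}")
```

Passing `embed` in as a parameter makes the "same model for indexing and querying" requirement explicit: mixing vectors from two different models would make the similarity scores meaningless.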

3. Generation (LLM Invocation)

In the final stage, the retrieved text chunks are combined with the original user query to construct an augmented prompt. This prompt typically instructs the LLM to answer the question using the provided context. The augmented prompt is then sent to the LLM, which uses both the query and the retrieved context to generate an answer grounded in the knowledge base. This lets the answer include specific details from the source documents and reduces, though does not eliminate, the likelihood of hallucinations.
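Prompt augmentation can be sketched as simple string templating. The template wording, `build_prompt`, and `generate_answer` are illustrative assumptions rather than a standard; `llm` is any callable that maps a prompt string to a completion string, and `fake_llm` exists only so the example runs without a real model.

```python
from typing import Callable

# ASSUMPTION: one possible instruction template; real systems word this differently.
PROMPT_TEMPLATE = """Answer the question using only the context below.
If the context does not contain the answer, say you do not know.

Context:
{context}

Question: {question}
Answer:"""


def build_prompt(question: str, chunks: list[str]) -> str:
    """Number the retrieved chunks and place them, with the question, into the template."""
    context = "\n\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return PROMPT_TEMPLATE.format(context=context, question=question)


def generate_answer(question: str, chunks: list[str], llm: Callable[[str], str]) -> str:
    """Send the augmented prompt to the model and return its completion."""
    return llm(build_prompt(question, chunks))


def fake_llm(prompt: str) -> str:
    """Hypothetical placeholder for a real LLM call; it only reports the prompt size."""
    return f"(model output for a {len(prompt)}-character prompt)"


retrieved = ["the car engine stalled", "a dog barked at the car"]
print(build_prompt("Why did the car stop?", retrieved))
print(generate_answer("Why did the car stop?", retrieved, fake_llm))
```

Numbering the chunks is optional, but it gives the model something to cite and makes it easier to trace an answer back to the chunk it came from.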