RAG (Retrieval-Augmented Generation)

A technique that enhances AI responses by retrieving relevant information from a knowledge base before generating answers.


The Problem RAG Solves

AI models have knowledge cutoffs. They don't know about events after their training data ends, and they have never seen your private data. They can also "hallucinate," stating things confidently that are not grounded in anything they learned.

RAG mitigates these problems by giving the AI relevant information at query time. Instead of relying solely on training data, the AI gets to see actual documents before answering.

How RAG Works

  1. You ask a question
  2. The system searches your documents for relevant information
  3. Relevant chunks are passed to the AI along with your question
  4. The AI generates an answer using both its training and the retrieved documents

It's like giving the AI a reference book to consult while answering.
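
The four steps above can be sketched in a few lines of Python. This is a toy illustration, not a production design: it assumes a simple word-count similarity in place of a real search index or embedding model, and the names `retrieve` and `build_prompt` as well as the sample documents are hypothetical. Step 4 (generation) is left out because it depends on whichever model you call; the sketch stops at the prompt that would be sent.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Step 2: return up to k chunks most similar to the question."""
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(c))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop chunks that share no words with the question at all.
    return [c for score, c in scored[:k] if score > 0]


def build_prompt(question, retrieved):
    """Step 3: put the retrieved chunks and the question into one prompt."""
    context = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(retrieved))
    return (
        "Answer using only the context below. Cite sources by number.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Hypothetical knowledge base, already split into chunks.
chunks = [
    "The refund window is 30 days from the date of purchase.",
    "Support is available by email on weekdays.",
    "Shipping to Canada takes 5 to 7 business days.",
]

question = "How long is the refund window?"        # Step 1
top = retrieve(question, chunks, k=1)              # Step 2
prompt = build_prompt(question, top)               # Step 3
print(prompt)  # Step 4 would send this prompt to a model of your choice.
```

Numbering the chunks in the prompt is one simple way to let the model point back to its sources.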

RAG vs Fine-tuning

RAG advantages:

  • No retraining needed
  • Easy to update (just add documents)
  • Works with any model that accepts text context in its prompt
  • Can cite the retrieved sources

Fine-tuning advantages:

  • Faster at query time (no retrieval step)
  • Better for style/format changes
  • Can lower per-query costs (shorter prompts, no retrieval step to run)

Many production systems favor RAG because it's more flexible and easier to maintain.

RAG Quality Depends On

  • How well your documents are chunked
  • The quality of your search/retrieval
  • Whether relevant information actually exists in your corpus
  • How well the AI synthesizes retrieved information
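
To make the first point concrete, here is a minimal sketch of one common chunking approach: fixed-size word windows with overlap, so that a sentence cut at a boundary still appears whole in a neighboring chunk. The function name `chunk_words` and the default sizes are assumptions for illustration; real systems often split on sentences, headings, or tokens instead.

```python
def chunk_words(text, size=50, overlap=10):
    """Split text into chunks of `size` words, each sharing `overlap`
    words with the previous chunk."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        # Stop once this window has reached the end of the text.
        if start + size >= len(words):
            break
    return chunks


sample = " ".join(str(i) for i in range(10))
for chunk in chunk_words(sample, size=4, overlap=1):
    print(chunk)
```

Larger chunks carry more context per hit but dilute the match; smaller chunks match more precisely but may lose the surrounding explanation.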

Reviews of RAG-based tools should address retrieval quality, not just generation quality.
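
One way to measure retrieval quality separately from generation is recall@k: of the documents known to be relevant to a question, what fraction shows up in the top k results? The sketch below assumes you have a small hand-labeled set of relevant document IDs per question; the function name and the IDs are hypothetical.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant documents found in the top-k results."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("relevant_ids must not be empty")
    top_k = set(retrieved_ids[:k])
    return len(top_k & relevant) / len(relevant)


# Ranked search results for one question, best match first.
retrieved = ["d3", "d1", "d7"]
# Hand-labeled documents that actually answer the question.
relevant = {"d1", "d2"}

score = recall_at_k(retrieved, relevant, k=2)
print(score)
```

A low score here means the model never saw the right material, so no amount of prompt tuning on the generation side will fix the answer.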
