RAG
Retrieval-Augmented Generation — a technique that gives an AI model access to external documents before it answers, so it can cite real, up-to-date sources.
In plain English
RAG improves the accuracy of an AI model's answers by fetching documents relevant to the question and placing them in the model's prompt (its context) before it generates a response.
How it works:
- The user asks a question
- The system searches a document store (your docs, a database, the web) for relevant content
- That content is added to the prompt sent to the LLM (large language model)
- The LLM answers using both its training knowledge and the retrieved content
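The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the document store is a hypothetical in-memory dictionary with made-up sample text, retrieval is a simple word-count cosine similarity standing in for a real search index or embedding model, and `llm` is a placeholder for whatever function calls your model. The names `DOCUMENTS`, `retrieve`, `build_prompt`, and `answer` are all assumptions made for this example.

```python
import math
import re
from collections import Counter

# Hypothetical document store with made-up sample content.
# A real system would use a search index or a vector database.
DOCUMENTS = {
    "refund-policy": "Refunds are available within 30 days of purchase with a receipt.",
    "shipping": "Standard shipping takes 5 business days. Express shipping takes 2 days.",
    "support-hours": "Support is open Monday to Friday from 9am to 5pm.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, documents, k=2):
    """Step 2: return ids of up to k documents that share words with the question."""
    q = Counter(tokenize(question))
    scored = [
        (cosine(q, Counter(tokenize(text))), doc_id)
        for doc_id, text in documents.items()
    ]
    scored.sort(reverse=True)
    return [doc_id for score, doc_id in scored[:k] if score > 0]


def build_prompt(question, documents, doc_ids):
    """Step 3: add the retrieved content to the prompt, labeled by id for citation."""
    context = "\n".join(f"[{doc_id}] {documents[doc_id]}" for doc_id in doc_ids)
    return (
        "Answer using the sources below and cite them by id.\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}"
    )


def answer(question, documents, llm):
    """Steps 1 to 4: take a question, retrieve, build the prompt, call the model."""
    doc_ids = retrieve(question, documents)
    prompt = build_prompt(question, documents, doc_ids)
    return llm(prompt)  # llm is any callable that maps a prompt string to a reply


if __name__ == "__main__":
    # Stand-in "model" that just echoes the prompt, so the example runs offline.
    print(answer("How long does express shipping take?", DOCUMENTS, llm=lambda p: p))
```

With the sample data, only the shipping document shares words with the question, so it is the only source placed in the prompt. Swapping the word-count scoring for embeddings, or the echo function for a real model call, changes the quality of the results but not the shape of the pipeline.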
Why use RAG instead of fine-tuning?
- No retraining required — update your documents, not the model
- Answers are grounded in retrieved text that the model can cite, which helps reduce hallucination and lets readers check claims
- Works with private or frequently-changing data
RAG is one of the most common ways to build AI tools that "know" your company's information.