Knowledge Base
A structured collection of documents an AI system can search and quote — the source-of-truth corpus that grounds RAG and many AI agents.
In plain English
A knowledge base, in AI tooling, is the indexed body of documents an LLM is allowed to draw from when answering. It's the "ground truth" half of retrieval-augmented generation (RAG): the model is told to base its answer on what's in the knowledge base rather than on its training data.
What goes in a knowledge base:
- Internal docs (Notion, Confluence, Google Drive, SharePoint)
- Product documentation and help articles
- Support tickets and resolved cases
- Policy and procedure manuals
- Public web pages a company controls
How it gets indexed:
- Documents are chunked into passages (typically 200–800 words)
- Each chunk is converted to an embedding vector and stored in a vector database
- At query time, the query is embedded the same way and the system retrieves the top-K most similar chunks
- Chunks are passed to the LLM as context for the answer
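The indexing and retrieval steps above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular product works: the function names (`chunk_words`, `embed`, `build_index`, `retrieve`, `build_prompt`) are hypothetical, the "embedding" is a toy bag-of-words count standing in for a learned embedding model, and a plain Python list stands in for a vector database.

```python
import math
import re
from collections import Counter


def chunk_words(text, max_words=200):
    """Split text into passages of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy stand-in for an embedding model (assumption): term counts per word."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents, max_words=200):
    """Chunk every document and store (chunk, vector) pairs.

    The list plays the role of the vector database in this sketch.
    """
    index = []
    for doc in documents:
        for chunk in chunk_words(doc, max_words):
            index.append((chunk, embed(chunk)))
    return index


def retrieve(index, query, k=3):
    """Embed the query and return the k chunks most similar to it."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(query, chunks):
    """Assemble the retrieved chunks into context for the LLM."""
    context = "\n\n".join(chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


docs = [
    "Refunds are issued within 14 days of a return request.",
    "The office is closed on public holidays.",
]
index = build_index(docs)
question = "How many days do refunds take?"
top = retrieve(index, question, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

Here the refund passage is retrieved because it shares the words "refunds" and "days" with the question. A real system would swap `embed` for a learned embedding model, which matches on meaning rather than exact word overlap, and would send `prompt` to an LLM.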
Why it matters: Knowledge bases are how organisations safely use AI on private data: the model never trains on the docs; it only reads them at query time. Tools like Glean, Notion AI, Microsoft Copilot, and many enterprise chatbots are knowledge-base-driven.