RAG (Retrieval-Augmented Generation)
RAG is the technique behind most AI answer engines: the system first retrieves relevant documents from the live web or an index, and the model then generates an answer grounded in what was found.
Also known as: Retrieval-Augmented Generation
RAG is how an AI answer engine looks something up before it answers. Instead of relying only on what a language model memorized during training, a retrieval-augmented system first fetches relevant passages — from the live web, a search index, or a private knowledge base — and feeds them to the model as context so the generated answer is grounded in real, current sources.
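To make the retrieve-then-generate shape concrete, here is a minimal sketch. Everything in it is illustrative: `PASSAGES`, `retrieve`, and `build_prompt` are hypothetical names, the word-overlap scoring is a stand-in for whatever ranking a real engine uses, and the final call to a language model is left out.

```python
import re

# Hypothetical mini-corpus standing in for a search index or knowledge base.
PASSAGES = [
    "RAG retrieves relevant passages before the model generates an answer.",
    "Chunking splits a page into smaller passages before indexing.",
    "Bananas are rich in potassium.",
]

def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, passages, k=2):
    """Step 1: rank passages by word overlap with the query (toy scoring)."""
    q = tokenize(query)
    ranked = sorted(passages, key=lambda p: len(q & tokenize(p)), reverse=True)
    return ranked[:k]

def build_prompt(query, retrieved):
    """Step 2 input: place the retrieved passages ahead of the question."""
    sources = "\n".join(f"[{i}] {p}" for i, p in enumerate(retrieved, start=1))
    return f"Answer using only these sources:\n{sources}\n\nQuestion: {query}"

question = "How does RAG retrieve passages?"
top = retrieve(question, PASSAGES)
prompt = build_prompt(question, top)
print(prompt)  # this string is what a model would receive as context
```

Only the passages that survive `retrieve` ever reach the prompt, which is the point: a passage that is not retrieved cannot be cited.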
This two-step shape (retrieve, then generate) is why AEO (answer engine optimization) works at all. The retrieval step decides which pages are even eligible to be quoted, and it operates on passages, not whole pages, so being extractable (clean, self-contained, answer-first paragraphs) is what gets you pulled into the context window. If a crawler can't access your content as text, you're invisible to the retrieval step no matter how good your writing is.
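A rough way to see the "accessible as text" point is to extract the text a crawler would find in the raw HTML. The sketch below assumes a crawler that does not execute JavaScript (some do render pages, so treat this as a worst case); `TextExtractor`, `visible_text`, and the two sample pages are made up for illustration.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect text nodes, ignoring anything inside <script> or <style>."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())

def visible_text(html):
    """Return the text present in the raw HTML, without running any scripts."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)

# Hypothetical pages: one ships its content as HTML, one builds it in the browser.
STATIC_PAGE = "<html><body><p>RAG retrieves passages before answering.</p></body></html>"
JS_ONLY_PAGE = (
    "<html><body><div id='app'></div>"
    "<script>render('RAG retrieves passages before answering.');</script>"
    "</body></html>"
)

static_text = visible_text(STATIC_PAGE)
js_only_text = visible_text(JS_ONLY_PAGE)
```

Under that assumption, the static page yields a passage that can be indexed, while the script-rendered page yields an empty string and so has nothing to retrieve.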
Example. When you ask Perplexity a question, it runs a search, pulls the top matching passages from several sites, and writes an answer that cites them inline. Your paragraph competes to be one of those retrieved passages — that competition is the whole game.
Related terms
- Grounding: the practice of tying an AI model's answer to specific retrieved sources, so the response reflects real documents rather than the model's unverified internal memory.
- Embeddings: numerical representations of text that capture its meaning, letting AI systems find passages that are semantically related to a query even when they share no exact keywords.
- Vector Search: a retrieval method that finds passages by meaning rather than keywords, comparing the numeric embedding of a query against the embeddings of indexed content to surface the closest matches.
- Chunking: how a retrieval system splits your page into smaller passages before indexing it, so AI engines retrieve and cite chunks of a page rather than the whole document.
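
The last three terms fit together mechanically, and a toy sketch can show how. Big caveat: the `embed` function here just counts words, so it only matches on shared vocabulary; real embeddings come from a trained model and can match passages with no words in common. The sentence-based `chunk` function is likewise an assumption, since real systems choose chunk boundaries in many different ways. All names are hypothetical.

```python
import math
import re
from collections import Counter

def chunk(page):
    """Chunking (toy version): split a page into one passage per sentence."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", page) if s.strip()]

def embed(text):
    """Stand-in 'embedding': a word-count vector, NOT a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def search(query, chunks, k=1):
    """Vector search: return the k chunks whose vectors are closest to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

PAGE = (
    "Chunking splits a page into smaller passages. "
    "Vector search compares query and passage embeddings. "
    "Bananas are rich in potassium."
)

chunks = chunk(PAGE)
best = search("how does vector search compare embeddings", chunks)
```

Note that the search returns a single chunk, not the page: that is why each paragraph needs to make sense on its own.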