Attention
Attention is the mechanism that lets a language model weigh which words in the input matter most for understanding each part, and it tends to concentrate on prominent, early, and clearly-related text.
Attention is how a model decides what to focus on. Within a transformer, the attention mechanism scores how relevant every word is to every other, so the model can emphasize the parts that carry meaning for the task — connecting a pronoun to its subject, a claim to its evidence, an answer to its question.
The practical AEO takeaway is that attention isn't evenly spread. Clear, prominent, well-structured statements draw more of it than buried or tangential text, which reinforces position bias and the case for answer-first, extractable writing. Make the sentence you want quoted prominent and unambiguous, and you make it the part the model attends to and reuses.
Example. In a paragraph that opens with a crisp answer and then elaborates, the model's attention lands on that opening claim — exactly the sentence you'd want an engine to lift and cite.
Relevant pillar
Related terms
- TransformerThe transformer is the neural-network architecture behind modern language models, which uses an attention mechanism to weigh how words relate, enabling fluent understanding and generation of text.
- Position BiasPosition bias is the tendency of retrieval and language models to weight content near the start of a page or passage more heavily, making where you place an answer matter as much as the answer itself.
- Large Language Model (LLM)A large language model is an AI system trained on vast amounts of text to predict and generate language, and is the engine that writes the answers in AI search.