Cross-Encoder

A cross-encoder reads the question and your passage side by side. Unlike the fast embedding models that score query and passage separately, a cross-encoder processes them together, which makes it far more precise at judging true relevance — at the cost of being slower. That's why it's used in reranking: applied to a small candidate set, not the whole index.

Its precision is exactly what rewards genuinely on-point writing. A cross-encoder can tell the difference between a passage that merely mentions the topic and one that directly, completely answers the specific question — so the extractability discipline of writing a real, self-contained answer is what wins the rerank. Loosely-related filler that slipped through fast retrieval gets caught and demoted here.

Example. Two passages both mention "passport renewal," but only one states the actual processing time. A cross-encoder, reading each against the query "how long does passport renewal take," promotes the one that truly answers it.

Relevant pillar

Related terms