What is RAG

Retrieval-Augmented Generation (RAG) is an AI framework that improves the accuracy and reliability of large language models (LLMs) by integrating external knowledge sources.

RAG combines the strengths of information retrieval and text generation. It first retrieves relevant information from a knowledge base or external data source based on the user's query. The retrieved information is then added to the LLM's prompt, giving the model context and facts from which to form a better-grounded, more accurate answer. This reduces reliance on what is stored in the LLM's parameters and lowers the risk of incorrect or hallucinated output, especially for specialized or recent knowledge. The process typically involves indexing the knowledge base, querying it with the user's question, retrieving the relevant documents, and passing those documents to the LLM together with the original question.
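The retrieve-then-augment flow can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the sample knowledge base, the word-overlap scoring (a stand-in for a real retriever), and the prompt wording. The call to an actual LLM is left out, since it depends on the model provider.

```python
import re

def tokenize(text):
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(query, knowledge_base, k=2):
    """Rank documents by how many tokens they share with the query; keep the top k."""
    query_tokens = tokenize(query)
    ranked = sorted(
        knowledge_base,
        key=lambda doc: len(query_tokens & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query, documents):
    """Place the retrieved documents in front of the user's question."""
    context = "\n".join(f"- {doc}" for doc in documents)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

# Hypothetical knowledge base, invented for this example.
knowledge_base = [
    "The warranty period for the X100 router is 24 months.",
    "The X100 router supports Wi-Fi 6 and WPA3.",
    "Our office is closed on public holidays.",
]

query = "How long is the warranty on the X100 router?"
retrieved = retrieve(query, knowledge_base, k=2)
prompt = build_prompt(query, retrieved)
print(prompt)  # this augmented prompt would be sent to the LLM instead of the bare question
```

The model never sees the unrelated office-hours document, only the two passages that overlap most with the question.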

🧒 In simple terms

Imagine you're asking a super-smart robot a question, but it only knows things it learned a long time ago. RAG is like giving the robot a quick peek at a library book (the knowledge base) before it answers, so it can tell you the latest and most correct information!

🤓 Expert Deep Dive

RAG architectures fundamentally decouple knowledge acquisition from model inference by augmenting a generative Large Language Model (LLM) with an external, dynamically queried knowledge source. The core components typically include:

Document Indexing: A corpus of documents (e.g., articles, PDFs, web pages) is split into chunks and embedded into a vector space using a pre-trained embedding model (e.g., Sentence-BERT, OpenAI's text-embedding-ada-002). These embeddings are stored in a vector database or index (e.g., Pinecone, Weaviate, or the FAISS library) for efficient similarity search.
Retrieval: When a user query arrives, it is embedded into the same vector space. A similarity search (typically Approximate Nearest Neighbor, ANN) is run against the vector database to identify the k most relevant document chunks (or passages) by cosine similarity or dot product.
Augmentation & Generation: The original user query and the retrieved document chunks are concatenated into a single prompt. This augmented prompt is then fed into the LLM. The LLM uses this context to generate a response that is grounded in the retrieved information.
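The indexing and retrieval components above can be sketched in a few lines. This is a toy illustration under loud assumptions: `embed` is a hashed bag-of-words stand-in for a real embedding model, `VectorIndex` is a hypothetical in-memory class that does exact brute-force search rather than ANN, and the chunk size is arbitrary.

```python
import math
import re
import zlib

DIM = 64  # toy embedding size; real embedding models use far more dimensions

def embed(text):
    """Toy stand-in for an embedding model: hashed bag-of-words, L2-normalised."""
    vec = [0.0] * DIM
    for token in re.findall(r"\w+", text.lower()):
        # crc32 is deterministic across runs, unlike Python's built-in hash() for strings
        vec[zlib.crc32(token.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def chunk(document, max_words=12):
    """Split a document into consecutive chunks of at most max_words words."""
    words = document.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

class VectorIndex:
    """Hypothetical in-memory stand-in for a vector database (exact search, no ANN)."""

    def __init__(self):
        self.entries = []  # list of (embedding, chunk_text) pairs

    def add_document(self, document):
        for piece in chunk(document):
            self.entries.append((embed(piece), piece))

    def search(self, query, k=2):
        q = embed(query)
        # vectors are unit length, so the dot product equals cosine similarity
        scored = [(sum(a * b for a, b in zip(q, vec)), text) for vec, text in self.entries]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]

index = VectorIndex()
index.add_document(
    "Retrieval-augmented generation grounds a language model in external documents. "
    "The retriever embeds the query and looks up the closest chunks in a vector index. "
    "The generator then writes an answer conditioned on those chunks."
)
for score, text in index.search("How does the retriever find the closest chunks?", k=2):
    print(f"{score:.2f}  {text}")
```

In a production setting the brute-force loop in `search` would be replaced by an ANN structure, which trades a little recall for much faster lookups on large corpora.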

Mathematically, for the single best match ($k = 1$) the retrieval step finds the document embedding $d_i$ whose similarity to the query embedding $q$ is maximal; for general $k$, the $k$ highest-scoring embeddings are kept:

$i^* = \arg\max_i \text{sim}(q, d_i)$

where $\text{sim}(u, v)$ is a similarity function like cosine similarity: $\frac{u \cdot v}{\|u\| \cdot \|v\|}$.
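As a small worked example of the formula, the sketch below computes cosine similarity in plain Python and returns the indices of the best-scoring documents. The function names and the two-dimensional vectors are made up for illustration, and returning 0.0 for a zero vector is a convention chosen here, since the formula is undefined in that case.

```python
import math

def cosine_similarity(u, v):
    """sim(u, v) = (u . v) / (||u|| * ||v||)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumed convention for zero vectors
    return dot / (norm_u * norm_v)

def top_k(q, doc_embeddings, k=1):
    """Return the indices of the k embeddings most similar to q, best first."""
    scores = [cosine_similarity(q, d) for d in doc_embeddings]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]

q = [1.0, 0.0]
docs = [[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]]

print(top_k(q, docs, k=1))  # [1]: the argmax, i* = 1
print(top_k(q, docs, k=2))  # [1, 2]
```

Document 1 points in the same direction as the query, so it scores 1.0 despite having twice the length; cosine similarity ignores vector magnitude.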

This approach mitigates the 'knowledge cut-off' problem inherent in LLMs and can reduce hallucination by grounding answers in retrieved facts. Advanced RAG techniques include re-ranking retrieved documents, query expansion, training a dedicated dense retriever (e.g., DPR), and training the retriever and the language model jointly (e.g., REALM).
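Two of these techniques, query expansion and re-ranking, can be sketched as follows. This is only an illustration under stated assumptions: the `SYNONYMS` table is hypothetical, and the token-coverage score is a simple stand-in for the learned re-ranking models typically used in practice.

```python
import re

def tokens(text):
    """Lowercase a string and return its word tokens in order."""
    return re.findall(r"\w+", text.lower())

# Hypothetical synonym table, invented for this example.
SYNONYMS = {"car": ["automobile", "vehicle"], "price": ["cost"]}

def expand_query(query):
    """Query expansion: append known synonyms of each query word."""
    expanded = tokens(query)
    for word in tokens(query):
        expanded.extend(SYNONYMS.get(word, []))
    return expanded

def rerank(query_tokens, candidates, k=2):
    """Re-ranking: re-score first-stage candidates by the fraction of query tokens they contain."""
    wanted = set(query_tokens)

    def score(passage):
        return len(wanted & set(tokens(passage))) / len(wanted) if wanted else 0.0

    return sorted(candidates, key=score, reverse=True)[:k]

# Pretend these passages came back from a first-stage retriever.
candidates = [
    "Parking rules for visitors.",
    "The cost of an automobile depends on its age.",
    "Car price lists are updated monthly.",
]

plain = rerank(tokens("car price"), candidates, k=2)
expanded = rerank(expand_query("car price"), candidates, k=2)
print(plain)
print(expanded)
```

Without expansion, the passage about "the cost of an automobile" shares no words with the query and is dropped; with expansion it is kept alongside the literal match.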

🔗 Related terms

Prerequisites:

logic

📚 Sources

1. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

2. What is Retrieval-Augmented Generation (RAG)?