Large Language Model (LLM)

A large language model (LLM) is a deep learning algorithm that uses a neural network, trained on very large datasets, to understand and generate human-like text.

A large language model (LLM) is a type of artificial intelligence model, specifically a deep learning algorithm, designed to understand, generate, and manipulate human language. LLMs are built on deep neural network architectures, most commonly the Transformer, which uses self-attention mechanisms to weigh the importance of different words in a sequence. They are trained on massive datasets of text and code, often comprising billions or even trillions of words, which allows them to learn complex patterns, grammar, context, and factual knowledge. Training typically involves unsupervised (more precisely, self-supervised) learning, in which the model predicts missing words or the next word in a sequence. This pha…

Concept graph for Large Language Model (LLM):
Prerequisites: artificial-intelligence (/terms/artificial-intelligence), machine-learning (/terms/machine-learning)
Related: ai-agent (/terms/ai-agent), llm (/terms/llm), artificial-intelligence (/terms/artificial-intelligence)

🧒 In Simple Terms

Imagine a super-smart parrot that has read every book in the world. It does not "think" the way a person does, but it is so good at guessing the next word that it can write fairy tales, write code, and answer questions.
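To make the "guessing the next word" idea concrete, here is a deliberately tiny sketch. It only counts which word most often follows another in a made-up sentence. Real LLMs use neural networks rather than simple counts, so treat this as an illustration of the prediction task only; the corpus and the function names `train_bigram_counts` and `predict_next` are invented for this example.

```python
from collections import Counter, defaultdict


def train_bigram_counts(text):
    """Count how often each word follows each other word in the text."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        counts[current_word][next_word] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical toy corpus; a real LLM trains on billions of words or more.
corpus = "the cat sat on the mat and the cat slept on the sofa"
model = train_bigram_counts(corpus)

print(predict_next(model, "the"))  # "cat" follows "the" more often than any other word
print(predict_next(model, "dog"))  # the toy model never saw "dog", so it has no guess
```

The parrot analogy maps onto this directly: the model has no understanding of cats or mats, it has only seen which words tend to come next.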

🤓 Expert Deep Dive

The [Transformer architecture](/uk/terms/transformer-architecture), with its self-attention mechanism, is foundational to modern LLMs. Self-attention lets the model compute each token's representation dynamically from its relationships to the other tokens in the input sequence, which overcomes the difficulty recurrent neural networks (RNNs) have with long-range dependencies. The scale of LLMs, measured by parameter count (e.g., GPT-3 with 175 billion parameters) and by training-corpus size (e.g., web-scale crawls such as Common Crawl), is associated with emergent capabilities. Training optimizes a loss function (e.g., cross-entropy) over vast corpora and typically requires significant computational resources (TPUs/GPUs). Key challenges include mitigating biases present in the training data, controlling hallucinations (generated text that is factually incorrect), ensuring safety and ethical alignment, and managing the computational cost of inference. Techniques such as quantization and knowledge distillation are used to create smaller, more efficient models.
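The self-attention step described above can be sketched in a few lines of plain Python. This is a minimal illustration of scaled dot-product attention for a single sequence, not a production implementation: the 2-dimensional token vectors are hypothetical, and as a simplifying assumption the same vectors serve as queries, keys, and values, whereas real Transformers first apply learned projection matrices and use multiple attention heads.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention over one sequence.

    Each argument is a list of vectors, one per token.
    Returns the output vectors and the attention weights.
    """
    dim = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this token's query to every key, scaled by sqrt(dim).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        mixed = [sum(w * v[j] for w, v in zip(weights, values))
                 for j in range(len(values[0]))]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical 2-dimensional embeddings for three tokens.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Assumption: embeddings are reused as Q, K, and V (no learned projections).
outputs, weights = self_attention(tokens, tokens, tokens)

for row in weights:
    print([round(w, 3) for w in row])
```

Each printed row shows how strongly one token attends to every token in the sequence. Because every token is compared with every other token in a single step, distant tokens interact as directly as adjacent ones, which is the property that helps with long-range dependencies.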

🔗 Related Terms

Prerequisites:

📚 Sources