Transformer Architecture

A neural network architecture based on self-attention.

Unlike earlier recurrent models (RNNs), which read a sequence one token at a time, Transformers process the entire sequence at once. Their self-attention mechanism lets every token weigh its relationship to every other token, so the model can relate distant words in a sentence regardless of how far apart they are.
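
To make the idea concrete, here is a minimal sketch of scaled dot-product self-attention in plain Python. The function names, the toy two-dimensional token vectors, and the identity projection matrices are all assumptions chosen for illustration; a trained model would learn its projection matrices and operate on much larger tensors.

```python
import math

def softmax(row):
    # Subtract the max for numerical stability before exponentiating.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    # Plain list-of-lists matrix product: (n x k) @ (k x m) -> (n x m).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def self_attention(x, w_q, w_k, w_v):
    # Project every token into query, key, and value vectors.
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = matmul(q, k_t)               # one score per (token, token) pair
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    # Each output row is a weighted mix of all value vectors.
    return matmul(weights, v), weights

# Hypothetical toy input: three tokens, where the first and last are identical.
identity = [[1.0, 0.0], [0.0, 1.0]]  # assumption: stands in for learned weights
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
output, weights = self_attention(tokens, identity, identity, identity)
```

In this toy input the first token gives the third token exactly the same weight it gives itself, even though they sit at opposite ends of the sequence: the scores depend on the content of the vectors, not on the distance between positions.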

Related terms: attention-mechanism (/terms/attention-mechanism), transformer (/terms/transformer), natural-language-processing (/terms/natural-language-processing)

🧒 Explain it like I'm 5

A revolutionary way for computers to read. Instead of reading word by word, it looks at the whole page at once to understand how every word relates to the others.

🤓 Expert Deep Dive

The Transformer introduced the multi-head attention mechanism, which runs several attention operations in parallel over the same sequence. It eliminates recurrence, allowing massive parallelization during training, and it is the backbone of BERT, GPT, and T5.
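
As a rough illustration of multi-head attention, the sketch below splits each token vector into equal slices, runs scaled dot-product attention on every slice independently, and concatenates the results. It is a simplified sketch under stated assumptions: the learned per-head projections and the final output projection of a real Transformer are omitted (equivalent to using identity matrices), and the names `attention` and `multi_head_attention` are hypothetical.

```python
import math

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def attention(q, k, v):
    # Scaled dot-product attention over the whole sequence in one pass.
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v)

def multi_head_attention(x, num_heads):
    d_model = len(x[0])
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    d_head = d_model // num_heads
    head_outputs = []
    for h in range(num_heads):
        # Assumption: slice the features directly instead of applying
        # learned query/key/value projections for this head.
        part = [row[h * d_head:(h + 1) * d_head] for row in x]
        head_outputs.append(attention(part, part, part))
    # Concatenate the head outputs token by token.
    return [sum((head[i] for head in head_outputs), []) for i in range(len(x))]

# Hypothetical toy input: three tokens with four features, split across two heads.
tokens = [[1.0, 0.0, 0.5, 0.5], [0.0, 1.0, 0.5, 0.5], [1.0, 0.0, 0.0, 1.0]]
mixed = multi_head_attention(tokens, num_heads=2)
```

No step here depends on the result for a previous token, and no head depends on another head, so all tokens and all heads could be computed at the same time. That independence is what makes the large-scale parallel training mentioned above possible.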
