Embeddings (Einbettungen)
Vektorrepräsentationen von Daten, die die semantische Bedeutung erfassen.
Embeddings are the foundation of modern Large Language Models (LLMs) and semantic search engines. By converting raw data into high-dimensional vectors, machines can perform arithmetic on meaning. A famous example is the calculation: Vector('King') - Vector('Man') + Vector('Woman') ≈ Vector('Queen'). This capability allows AI to handle synonyms, analogies, and complex relationships without explicit rules. Beyond text, 'Multi-modal Embeddings' (like those from OpenAI's CLIP) place an image of a dog and the word 'dog' in the same coordinate space, enabling search engines to find photos from text descriptions. Strategic use of embeddings is now central to RAG (Retrieval-Augmented Generation) architectures.
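The analogy arithmetic above can be sketched with toy vectors. The numbers below are invented for illustration (real models use hundreds of dimensions learned from data, not hand-picked values):

```python
import numpy as np

# Toy 4-dimensional embeddings (invented values, NOT from a real model).
# Dimensions loosely encode: [royalty, masculinity, femininity, person-ness]
vectors = {
    "king":  np.array([0.9, 0.8, 0.1, 0.7]),
    "man":   np.array([0.1, 0.9, 0.1, 0.8]),
    "woman": np.array([0.1, 0.1, 0.9, 0.8]),
    "queen": np.array([0.9, 0.1, 0.8, 0.7]),
    "pizza": np.array([0.0, 0.1, 0.1, 0.0]),
}

def cosine(a, b):
    """Cosine similarity: 1.0 = same direction, 0.0 = unrelated."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# King - Man + Woman lands closest to Queen.
result = vectors["king"] - vectors["man"] + vectors["woman"]
best = max((w for w in vectors if w not in {"king", "man", "woman"}),
           key=lambda w: cosine(result, vectors[w]))
print(best)  # -> queen
```

The same cosine measure is what places 'happy' near 'joyful' and 'pizza' far from 'bicycle'.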
```mermaid
graph LR
Center["Embeddings (Einbettungen)"]:::main
Pre_linear_algebra["linear-algebra"]:::pre --> Center
click Pre_linear_algebra "/terms/linear-algebra"
Pre_neural_network["neural-network"]:::pre --> Center
click Pre_neural_network "/terms/neural-network"
Center --> Child_vector_database["vector-database"]:::child
click Child_vector_database "/terms/vector-database"
Rel_semantic_search["semantic-search"]:::related -.-> Center
click Rel_semantic_search "/terms/semantic-search"
Rel_decryption["decryption"]:::related -.-> Center
click Rel_decryption "/terms/decryption"
classDef main fill:#7c3aed,stroke:#8b5cf6,stroke-width:2px,color:white,font-weight:bold,rx:5,ry:5;
classDef pre fill:#0f172a,stroke:#3b82f6,color:#94a3b8,rx:5,ry:5;
classDef child fill:#0f172a,stroke:#10b981,color:#94a3b8,rx:5,ry:5;
classDef related fill:#0f172a,stroke:#8b5cf6,stroke-dasharray: 5 5,color:#94a3b8,rx:5,ry:5;
linkStyle default stroke:#4b5563,stroke-width:2px;
```
🧒 Erkläre es wie einem 5-Jährigen
📍 Think of a giant room where every idea has a specific spot. Words that mean similar things, like 'happy' and 'joyful', stand right next to each other. Words that have nothing to do with each other, like 'pizza' and 'bicycle', stand in opposite corners. These 'spots' are called embeddings, and they help computers understand what words actually mean.
🤓 Expert Deep Dive
## The Geometry of Meaning
### Vector Databases
Because modern AI systems work with millions or billions of embeddings, specialized 'Vector Databases' (such as Pinecone, Milvus, or Weaviate) are used to store and query them. These databases typically rely on approximate-nearest-neighbor algorithms such as HNSW (Hierarchical Navigable Small Worlds) to find the closest vectors in milliseconds, trading a small amount of recall for speed.
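At its core, the query a vector database answers can be sketched as an exact brute-force search over a tiny in-memory index (the document names and vectors below are illustrative; real systems replace the linear scan with ANN structures like HNSW to avoid touching every vector):

```python
import numpy as np

# Tiny in-memory "index": document id -> embedding (invented toy vectors).
index = {
    "doc_cats":  np.array([0.9, 0.1, 0.0]),
    "doc_dogs":  np.array([0.8, 0.2, 0.1]),
    "doc_taxes": np.array([0.0, 0.1, 0.9]),
}

def top_k(query: np.ndarray, k: int = 2):
    """Exact k-nearest-neighbor search by cosine similarity.
    O(n) per query; HNSW-based databases make this sublinear."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    scored = sorted(index.items(), key=lambda kv: cos(query, kv[1]), reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

query = np.array([0.85, 0.15, 0.05])  # an embedded query about pets
print(top_k(query))  # the two pet documents rank first
```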
### Contextual vs. Static
Older embedding models (e.g., Word2Vec) were static—the word 'play' always had the same vector. Modern transformer-based embeddings are contextual—'play' as a verb ('to play guitar') and 'play' as a noun ('a Shakespeare play') get different vectors depending on the surrounding words.
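A crude sketch of the difference, using invented vectors and a simple context-mixing rule (real contextual embeddings come from transformer attention, not averaging—this only illustrates that the same word can yield different vectors):

```python
import numpy as np

# Invented 2-dimensional toy vectors; dimensions loosely mean [music, drama].
static = {
    "play":    np.array([0.5, 0.5]),   # ambiguous between two senses
    "guitar":  np.array([0.9, 0.1]),
    "theatre": np.array([0.1, 0.9]),
}

def contextual(word, context):
    """Mix a word's static vector with the mean of its context vectors.
    (Crude stand-in for what attention layers do far more subtly.)"""
    ctx = np.mean([static[w] for w in context], axis=0)
    return 0.5 * static[word] + 0.5 * ctx

v1 = contextual("play", ["guitar"])   # leans toward the music sense
v2 = contextual("play", ["theatre"])  # leans toward the drama sense
print(v1, v2)  # same word, two different vectors
```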
### Quantization
To save memory, vectors can be quantized (e.g., from 32-bit floats to 8-bit integers) or even reduced to binary. This allows search indexes to scale massively with only a small loss of semantic accuracy.
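A minimal sketch of symmetric int8 quantization: compress a float32 vector to a quarter of its size and check that cosine similarity survives (the 384-dimension size is just a typical example):

```python
import numpy as np

rng = np.random.default_rng(0)
vec = rng.standard_normal(384).astype(np.float32)  # a typical embedding size

# Symmetric linear quantization: map the float32 range onto int8 [-127, 127].
scale = np.max(np.abs(vec)) / 127.0
q = np.round(vec / scale).astype(np.int8)      # 4x less memory than float32
deq = q.astype(np.float32) * scale             # approximate reconstruction

def cos(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(vec.nbytes, q.nbytes)   # 1536 vs 384 bytes
print(cos(vec, deq))          # similarity stays very close to 1.0
```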
❓ Häufig gestellte Fragen
What is the difference between an embedding and a vector?
A vector is just a list of numbers. An embedding is a specific type of vector that is learned by an AI to represent the meaning of an object.
How are embeddings used in RAG?
In RAG, your documents are converted into embeddings and stored in a vector database. When a user asks a question, it is also converted into an embedding, and the most similar documents are retrieved to provide context to the AI.
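The RAG flow described above, sketched end-to-end with a deterministic toy `embed` function standing in for a real embedding model (in practice this would be an API or model call, and the vector store would be a real database):

```python
import numpy as np

# Toy stand-in for a real embedding model: counts words from a fixed vocabulary.
VOCAB = ["embeddings", "words", "vectors", "pizza", "flour", "yeast"]

def embed(text: str) -> np.ndarray:
    tokens = text.lower().replace("?", "").replace(".", "").replace(",", "").split()
    vec = np.array([float(tokens.count(w)) for w in VOCAB])
    n = np.linalg.norm(vec)
    return vec / n if n else vec

# 1. Indexing: embed each document and store the vectors.
docs = [
    "Embeddings map words to vectors of numbers.",
    "Pizza dough needs flour, water, and yeast.",
]
doc_vecs = [embed(d) for d in docs]

# 2. Query time: embed the question and retrieve the most similar document.
question = "How do embeddings represent words?"
q_vec = embed(question)
best = max(range(len(docs)), key=lambda i: float(np.dot(q_vec, doc_vecs[i])))

# 3. The retrieved text is prepended to the LLM prompt as context.
prompt = f"Context: {docs[best]}\n\nQuestion: {question}"
print(prompt)
```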