Fine-tuning
Fine-tuning is the process of taking a pre-trained machine learning model and training it further on a specific dataset to improve its performance on a particular task.
Fine-tuning is a transfer learning technique in which a pre-trained machine learning model, typically trained on a large, general dataset (e.g., ImageNet for vision, a large text corpus for NLP), is adapted to a specific downstream task using a smaller, task-specific dataset. The process takes the architecture and weights of the pre-trained model and continues training, usually with a lower learning rate, on the new dataset. Often, the final layers of the network are replaced or modified to match the output requirements of the new task (for example, swapping a 1000-class classifier for a 10-class classifier). Fine-tuning leverages the general features the model learned during pre-training, so the new task can typically be learned with far less data and compute than training from scratch.
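The head-replacement step described above can be sketched with plain NumPy. This is a toy stand-in, not a real training pipeline: the frozen "pre-trained" body is just a fixed random projection (an illustrative assumption), and only the newly attached classification head is trained, with a small learning rate.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Pre-trained" body: a fixed random projection standing in for the
# frozen layers of a real pre-trained network (illustrative assumption).
W_body = rng.normal(size=(4, 8))

def features(x):
    # Frozen forward pass: weights in W_body are never updated.
    return np.tanh(x @ W_body)

# New task head: replaces the original classifier with one sized for
# the new task (here, 2 classes instead of the original output size).
W_head = rng.normal(scale=0.01, size=(8, 2))

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Tiny synthetic task-specific dataset: label depends on the first input.
X = rng.normal(size=(64, 4))
y = (X[:, 0] > 0).astype(int)

lr = 0.05  # deliberately small, as in fine-tuning
for _ in range(200):
    h = features(X)
    p = softmax(h @ W_head)
    # Cross-entropy gradient w.r.t. the head weights only.
    grad = h.T @ (p - np.eye(2)[y]) / len(X)
    W_head -= lr * grad  # the frozen body is left untouched

acc = (features(X) @ W_head).argmax(axis=1).__eq__(y).mean()
```

Only `W_head` receives gradient updates; in a real framework the same effect is achieved by freezing the body's parameters and attaching a fresh output layer.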
```mermaid
graph LR
Center["Fine-tuning"]:::main
Pre_machine_learning["machine-learning"]:::pre --> Center
click Pre_machine_learning "/terms/machine-learning"
Pre_large_language_model["large-language-model"]:::pre --> Center
click Pre_large_language_model "/terms/large-language-model"
Center --> Child_lora["lora"]:::child
click Child_lora "/terms/lora"
Center --> Child_rlhf["rlhf"]:::child
click Child_rlhf "/terms/rlhf"
Rel_front_running["front-running"]:::related -.-> Center
click Rel_front_running "/terms/front-running"
Rel_inference["inference"]:::related -.-> Center
click Rel_inference "/terms/inference"
classDef main fill:#7c3aed,stroke:#8b5cf6,stroke-width:2px,color:white,font-weight:bold,rx:5,ry:5;
classDef pre fill:#0f172a,stroke:#3b82f6,color:#94a3b8,rx:5,ry:5;
classDef child fill:#0f172a,stroke:#10b981,color:#94a3b8,rx:5,ry:5;
classDef related fill:#0f172a,stroke:#8b5cf6,stroke-dasharray: 5 5,color:#94a3b8,rx:5,ry:5;
linkStyle default stroke:#4b5563,stroke-width:2px;
```
🧠 Knowledge Check
🧒 Explain It Like I'm Five
It's like taking a chef who knows how to cook many things (pre-trained model) and teaching them your specific family recipes (new dataset) so they become great at cooking just your favorite dishes.
🤓 Expert Deep Dive
Fine-tuning operates on the principle that representations learned on large-scale, diverse datasets capture fundamental patterns applicable to related tasks. In deep learning, this typically involves adjusting the weights of a pre-trained network (e.g., ResNet, BERT) using backpropagation on a target dataset. The learning rate is often set significantly lower than during pre-training to avoid drastic weight updates that could disrupt the learned features. Layer freezing is a common strategy: earlier layers capturing low-level features (e.g., edges, textures in images; word embeddings in text) are often frozen, while later layers capturing more task-specific features are fine-tuned. Alternatively, adapter modules can be inserted between layers, allowing task-specific parameters to be learned while keeping the original model weights fixed. The effectiveness relies heavily on the similarity between the pre-training and fine-tuning data distributions and tasks. Domain shift can necessitate more extensive fine-tuning or different adaptation strategies. Overfitting remains a primary concern, especially with very small target datasets, often mitigated by regularization techniques or early stopping.
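The adapter strategy mentioned above can be sketched in NumPy. The layer sizes, the bottleneck width, and the near-zero initialization are illustrative assumptions, not any specific library's API: a small down-project/up-project module is added on a residual path around a frozen layer, so only the adapter's parameters would be trained.

```python
import numpy as np

rng = np.random.default_rng(1)

# Frozen pre-trained layer (illustrative stand-in for, e.g., a
# transformer feed-forward block whose weights stay fixed).
W = rng.normal(size=(16, 16))

# Bottleneck adapter: down-project to a small dimension, apply a
# nonlinearity, up-project back, and add the result residually.
# W_up starts at zero, so the adapted model initially computes
# exactly what the pre-trained model computes.
W_down = rng.normal(scale=0.01, size=(16, 4))
W_up = np.zeros((4, 16))

def layer_with_adapter(x):
    h = np.tanh(x @ W)                            # frozen computation
    return h + np.maximum(h @ W_down, 0.0) @ W_up  # trainable residual adapter

x = rng.normal(size=(2, 16))
out = layer_with_adapter(x)
base = np.tanh(x @ W)  # what the unmodified frozen layer would output
```

Because `W_up` is zero-initialized, `out` equals `base` before any training; fine-tuning then updates only `W_down` and `W_up`, leaving the original model weights fixed, which is what keeps adapter methods parameter-efficient.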