Inference Latency

The time a machine learning model takes to process a single input and produce a prediction.

Inference latency is a crucial factor for user experience in interactive applications. It is affected by model size, hardware, and network conditions. Optimization strategies include reducing numerical precision (quantization) and using more efficient model architectures (such as MobileNet for vision).
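Before optimizing, it helps to measure latency. The following is a minimal sketch, not a definitive benchmark harness: `measure_latency_ms`, `summarize`, and `dummy_predict` are hypothetical names introduced here, and `dummy_predict` is only a stand-in for a real model call. It times repeated calls after a short warm-up and reports the median and the nearest-rank 95th percentile, since tail latency often matters more to users than the average.

```python
import statistics
import time


def measure_latency_ms(predict, sample, warmup=5, runs=50):
    """Return a list of per-call latencies in milliseconds for predict(sample)."""
    # Warm-up calls are not timed; they absorb one-off costs such as caching.
    for _ in range(warmup):
        predict(sample)

    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        predict(sample)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def summarize(latencies):
    """Summarize latencies with the median, nearest-rank p95, and maximum."""
    ordered = sorted(latencies)
    # Nearest-rank percentile: rank = ceil(0.95 * n), done in integer arithmetic.
    p95_index = max(0, (95 * len(ordered) + 99) // 100 - 1)
    return {
        "median_ms": statistics.median(ordered),
        "p95_ms": ordered[p95_index],
        "max_ms": ordered[-1],
    }


def dummy_predict(features):
    """Hypothetical stand-in for a model: replace with a real inference call."""
    return sum(value * value for value in features)


if __name__ == "__main__":
    results = measure_latency_ms(dummy_predict, [1.0] * 1000)
    print(summarize(results))
```

Running the same measurement before and after a change such as quantization gives a like-for-like comparison, provided the hardware and input size stay the same.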

Related terms: network-latency (/terms/network-latency)

🧒 Explain Like I'm 5

Latency is like the delay when you call someone's name and wait for them to say 'Hello'. If they are right next to you, latency is low. If they are across a football field, the sound takes time to travel, so latency is higher.

🤓 Expert Deep Dive

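The network portion of latency is composed of several delays: Processing Delay (the time a router takes to examine and forward a packet), Queuing Delay (time spent waiting in line behind other packets), Transmission Delay (the time needed to push the packet's bits onto the wire), and Propagation Delay (the time the signal takes to travel the link at the speed of light in the medium). Every kilometer of fiber optic cable adds about 0.005 ms of propagation latency, or roughly 0.008 ms per mile.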
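As a rough illustration, the four delays can be added up for a single packet on a single link. This is a sketch under stated assumptions: the function names are hypothetical, the refractive index of 1.468 is an assumed typical value for optical fiber, and the processing and queuing delays are passed in as plain numbers because they depend on the equipment and the current load.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # speed of light in vacuum
FIBER_REFRACTIVE_INDEX = 1.468         # assumed typical value for optical fiber


def propagation_delay_ms(distance_km, refractive_index=FIBER_REFRACTIVE_INDEX):
    """Time for the signal to travel the link, in milliseconds."""
    speed_in_medium = SPEED_OF_LIGHT_KM_PER_S / refractive_index
    return distance_km / speed_in_medium * 1000.0


def transmission_delay_ms(payload_bytes, link_bits_per_s):
    """Time to push all of the packet's bits onto the wire, in milliseconds."""
    return payload_bytes * 8 / link_bits_per_s * 1000.0


def one_way_latency_ms(distance_km, payload_bytes, link_bits_per_s,
                       processing_ms=0.0, queuing_ms=0.0):
    """Sum of the four delay components for one packet over one link."""
    return (processing_ms
            + queuing_ms
            + transmission_delay_ms(payload_bytes, link_bits_per_s)
            + propagation_delay_ms(distance_km))


if __name__ == "__main__":
    # Example: a 1500-byte packet over 100 km of fiber on a 1 Gbit/s link.
    print(round(propagation_delay_ms(1.0), 4), "ms per km of fiber")
    print(round(one_way_latency_ms(100.0, 1500, 1e9,
                                   processing_ms=0.05, queuing_ms=0.1), 4), "ms one way")
```

Over long distances the propagation term dominates, which is why placing the model closer to the user can reduce end-to-end latency even when the model itself is unchanged.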
