latency
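The time a machine learning model takes to receive an input and return a result.
Latency is a key factor in the quality of real-time services. It generally grows as the model becomes more complex and as the amount of data to process increases. Ways to reduce it include model compression (such as quantization), dedicated AI accelerators (NPUs), and edge computing in place of the cloud.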
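As a rough illustration of how this latency can be measured, the sketch below times repeated calls to a model and reports the median, 95th percentile, and maximum. `fake_model`, `measure_latency_ms`, and `summarize` are hypothetical names introduced for this example; a real service would call its own inference function instead.

```python
import statistics
import time


def measure_latency_ms(fn, payload, runs=100, warmup=5):
    """Return a list of per-call latencies in milliseconds for fn(payload)."""
    # Warm-up calls are not recorded, so one-time setup costs do not skew results.
    for _ in range(warmup):
        fn(payload)

    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(payload)
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples):
    """Summarize latency samples; tail percentiles matter for real-time services."""
    ordered = sorted(samples)
    p95_index = min(len(ordered) - 1, int(0.95 * len(ordered)))
    return {
        "p50_ms": statistics.median(ordered),
        "p95_ms": ordered[p95_index],
        "max_ms": ordered[-1],
    }


def fake_model(features):
    # Hypothetical stand-in for a real model's inference call.
    return sum(value * value for value in features)


if __name__ == "__main__":
    stats = summarize(measure_latency_ms(fake_model, list(range(10_000))))
    for name, value in stats.items():
        print(f"{name}: {value:.3f}")
```

Reporting a tail percentile alongside the median is a deliberate choice here: a service can have a low typical latency and still feel slow if a small share of requests take much longer.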
Related terms: network-latency (/terms/network-latency)
🧒 Explain Like I'm 5
Latency is like the delay when you call someone's name and wait for them to say 'Hello'. If they are right next to you, latency is low. If they are across a football field, the sound takes time to travel, so latency is higher.
🤓 Expert Deep Dive
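Network latency is the sum of several delays: processing delay (the time a router or switch takes to inspect a packet and decide where to forward it), queuing delay (time spent waiting in a buffer behind other packets), transmission delay (the time to push the packet's bits onto the link, which is packet size divided by link bandwidth), and propagation delay (the time the signal takes to travel through the medium, which is distance divided by signal speed). Light travels through optical fiber at roughly two thirds of its speed in a vacuum, so every kilometer of fiber adds about 0.005 ms of one-way propagation latency (about 0.008 ms per mile).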
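The four components can be combined in a short calculation. The sketch below is illustrative only: it assumes a signal speed in fiber of about 200,000 km/s (roughly 0.005 ms per kilometer), and the names `LinkDelays`, `transmission_delay_ms`, `propagation_delay_ms`, and `one_way_delays` are hypothetical helpers, not part of any library. Processing and queuing delays depend on the device and its load, so they are passed in as estimates.

```python
from dataclasses import dataclass

# Assumption: light in optical fiber travels at about 2/3 of its vacuum speed.
FIBER_SIGNAL_SPEED_KM_PER_S = 200_000.0


@dataclass
class LinkDelays:
    processing_ms: float
    queuing_ms: float
    transmission_ms: float
    propagation_ms: float

    @property
    def total_ms(self) -> float:
        """One-way latency as the sum of the four components."""
        return (
            self.processing_ms
            + self.queuing_ms
            + self.transmission_ms
            + self.propagation_ms
        )


def transmission_delay_ms(packet_bytes: int, link_bits_per_s: float) -> float:
    """Time to push every bit of the packet onto the link."""
    return packet_bytes * 8 / link_bits_per_s * 1000.0


def propagation_delay_ms(
    distance_km: float,
    speed_km_per_s: float = FIBER_SIGNAL_SPEED_KM_PER_S,
) -> float:
    """Time for the signal to travel the length of the medium."""
    return distance_km / speed_km_per_s * 1000.0


def one_way_delays(
    distance_km: float,
    packet_bytes: int,
    link_bits_per_s: float,
    processing_ms: float = 0.0,
    queuing_ms: float = 0.0,
) -> LinkDelays:
    return LinkDelays(
        processing_ms=processing_ms,
        queuing_ms=queuing_ms,
        transmission_ms=transmission_delay_ms(packet_bytes, link_bits_per_s),
        propagation_ms=propagation_delay_ms(distance_km),
    )


if __name__ == "__main__":
    # A 1500-byte packet over 1000 km of fiber on a 1 Gbit/s link.
    delays = one_way_delays(1000.0, 1500, 1e9, processing_ms=0.05, queuing_ms=0.1)
    print(delays)
    print(f"total one-way latency: {delays.total_ms:.3f} ms")
```

Over long distances the propagation term tends to dominate, which is why moving computation closer to the user can reduce latency even when bandwidth is plentiful.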