Inference Latency Calculator
Calculate AI model inference latency and throughput.
Inference Configuration
Total Response Latency
3.00 s
100.0 tokens/sec (generation rate)
Time to First Token
1.00 s
Inter-token Latency
10.0 ms
Latency Breakdown
Prefill Time (Input): 1.00 s
Generation Time (Output): 2.00 s
Total Tokens: 700
Effective Throughput: 233.3 tok/s
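
The figures above are consistent with a simple additive model: total latency is the time to first token plus the number of output tokens times the inter-token latency, and effective throughput is total tokens divided by total latency. The sketch below reproduces that arithmetic. It is an illustration, not the calculator's actual code: the function and field names are hypothetical, and the 500 input / 200 output token split is an assumption inferred from the 700 total tokens and the 2.00 s generation time at 10.0 ms per token.

```python
from dataclasses import dataclass


@dataclass
class LatencyEstimate:
    prefill_s: float            # time spent processing the input (equals TTFT here)
    generation_s: float         # time spent producing output tokens
    total_s: float              # prefill + generation
    total_tokens: int           # input + output tokens
    decode_tok_per_s: float     # output tokens per second during generation
    effective_tok_per_s: float  # all tokens divided by total latency


def estimate_latency(input_tokens: int, output_tokens: int,
                     ttft_s: float, inter_token_latency_s: float) -> LatencyEstimate:
    """Additive latency model (assumed): total = TTFT + output_tokens * ITL."""
    if inter_token_latency_s <= 0:
        raise ValueError("inter_token_latency_s must be positive")
    generation_s = output_tokens * inter_token_latency_s
    total_s = ttft_s + generation_s
    total_tokens = input_tokens + output_tokens
    return LatencyEstimate(
        prefill_s=ttft_s,
        generation_s=generation_s,
        total_s=total_s,
        total_tokens=total_tokens,
        decode_tok_per_s=1.0 / inter_token_latency_s,
        effective_tok_per_s=total_tokens / total_s,
    )


# Assumed inputs: 500 input tokens, 200 output tokens, 1.00 s TTFT, 10.0 ms ITL.
est = estimate_latency(500, 200, ttft_s=1.0, inter_token_latency_s=0.010)
print(f"Total latency:        {est.total_s:.2f} s")                    # 3.00 s
print(f"Generation rate:      {est.decode_tok_per_s:.1f} tok/s")       # 100.0 tok/s
print(f"Effective throughput: {est.effective_tok_per_s:.1f} tok/s")    # 233.3 tok/s
```

Note that effective throughput counts input tokens as well, which is why it is higher than the 100.0 tokens/sec generation rate.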