Inference Latency Calculator

Calculate AI model inference latency and throughput.

Inference Configuration

Total Response Latency

3.00 s

100.0 tokens/sec (output generation rate)

Time to First Token: 1.00 s
Inter-token Latency: 10.0 ms

Latency Breakdown

Prefill Time (Input): 1.00 s
Generation Time (Output): 2.00 s
Total Tokens: 700
Effective Throughput: 233.3 tokens/sec
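
The figures above can be reproduced with a few lines of arithmetic. The sketch below is one plausible reading of how they fit together, not the calculator's actual code. It assumes that prefill time equals time to first token, that generation time is the output token count times the inter-token latency, and that effective throughput is total tokens divided by total latency. The 500 input / 200 output split is not shown on the page. It is inferred from the displayed values (2.00 s / 10 ms = 200 output tokens, and 700 - 200 = 500 input tokens). The function and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LatencyBreakdown:
    prefill_s: float            # time spent processing the input prompt
    generation_s: float         # time spent producing output tokens
    total_s: float              # prefill + generation
    total_tokens: int           # input + output tokens
    decode_tok_per_s: float     # output tokens per second during generation
    effective_tok_per_s: float  # total tokens divided by total latency


def latency_breakdown(input_tokens: int, output_tokens: int,
                      ttft_s: float, inter_token_ms: float) -> LatencyBreakdown:
    """Hypothetical helper; the formulas are assumptions, not the page's code."""
    prefill_s = ttft_s  # assumption: time to first token is all prefill
    generation_s = output_tokens * inter_token_ms / 1000.0
    total_s = prefill_s + generation_s
    total_tokens = input_tokens + output_tokens
    return LatencyBreakdown(
        prefill_s=prefill_s,
        generation_s=generation_s,
        total_s=total_s,
        total_tokens=total_tokens,
        decode_tok_per_s=1000.0 / inter_token_ms,
        effective_tok_per_s=total_tokens / total_s,
    )


# Inferred inputs: 500 input tokens and 200 output tokens (see the note above).
result = latency_breakdown(input_tokens=500, output_tokens=200,
                           ttft_s=1.00, inter_token_ms=10.0)
print(f"Total latency:        {result.total_s:.2f} s")
print(f"Generation rate:      {result.decode_tok_per_s:.1f} tokens/sec")
print(f"Effective throughput: {result.effective_tok_per_s:.1f} tokens/sec")
```

Under these assumptions the effective throughput (233.3 tokens/sec) exceeds the generation rate (100.0 tokens/sec) because it also counts the 500 input tokens, which are processed together during prefill rather than one at a time.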