
Ollama Cloud tokens per second — live benchmark

Real inference speed, measured continuously. Every row is a live Ollama Cloud model — sorted by tokens per second, benchmarked every ~10 minutes.
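
The page does not spell out how each figure is computed. As a rough sketch of one common approach, the Python below times a token stream and derives time to first token and decode throughput. The names `measure_stream` and `BenchmarkResult`, the injectable `clock`, and the choice to leave the first chunk out of the rate are all assumptions for illustration, not this benchmark's actual method.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class BenchmarkResult:
    tokens: int                # total tokens received
    ttft_s: Optional[float]    # time to first token, None if nothing arrived
    tokens_per_second: float   # decode rate measured after the first chunk


def measure_stream(
    chunks: Iterable[int],
    clock: Callable[[], float] = time.perf_counter,
) -> BenchmarkResult:
    """Time a stream where each item is the token count of one chunk.

    Hypothetical helper: a real client would yield these counts as
    chunks arrive from a streaming API response.
    """
    start = clock()
    first_at: Optional[float] = None
    first_tokens = 0
    last_at = start
    total = 0

    for n in chunks:
        now = clock()
        if n <= 0:
            continue  # ignore keep-alive or empty chunks
        if first_at is None:
            first_at = now
            first_tokens = n
        total += n
        last_at = now

    if first_at is None:
        return BenchmarkResult(tokens=0, ttft_s=None, tokens_per_second=0.0)

    decode_time = last_at - first_at
    # Tokens in the first chunk land at first_at, so they are excluded
    # from the rate (assumed definition; other definitions exist).
    decoded = total - first_tokens
    rate = decoded / decode_time if decode_time > 0 else 0.0
    return BenchmarkResult(total, first_at - start, rate)
```

Passing `clock` as a parameter keeps the function testable with a fake timer, and separating time to first token from the decode rate means a slow start does not drag down the throughput figure.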

| Model | Tier | Trend 24h | Tokens/s | Latency | Success rate | Intelligence Index | Last benchmark |
|---|---|---|---|---|---|---|---|
| Nemotron 3 Nano 30B (non-reasoning) | Pro | 219.4 | 304.0 | 418ms | 100% | 7.4 | 7m ago |
| Nemotron 3 Nano 30B (non-reasoning) | Free | 195.0 | 206.0 | 431ms | 100% | 7.4 | 41m ago |
| Qwen3 Coder 480B (non-reasoning) | Pro | 100.8 | 168.9 | 1.1s | 100% | 18 | 6m ago |
| Ministral 3 3B (non-reasoning) | Free | 149.7 | 137.1 | 844ms | 100% | 5.6 | 42m ago |
| GLM 5.2 | Pro | 108.3 | 136.1 | 910ms | 99% | 50.7 | 9m ago |
| GPT-OSS 120B | Free | 106.8 | 121.9 | 423ms | 100% | 23.8 | 45m ago |
| Ministral 3 8B (non-reasoning) | Pro | 90.5 | 120.7 | 466ms | 100% | 8.9 | 8m ago |
| Ministral 3 8B (non-reasoning) | Free | 83.1 | 109.6 | 516ms | 100% | 8.9 | 42m ago |
| Nemotron 3 Super | Free | 82.9 | 108.6 | 563ms | 100% | 25.4 | 41m ago |
| Gemini 3 Flash Preview | Pro | 103.1 | 108.4 | 1.8s | 100% | 37.8 | 15m ago |
| GPT-OSS 120B | Pro | 101.7 | 107.7 | 420ms | 100% | 23.8 | 9m ago |
| Ministral 3 14B (non-reasoning) | Free | 83.8 | 106.0 | 480ms | 100% | 10 | 42m ago |
| MiniMax M3 | Pro | 96.5 | 104.6 | 1.1s | 100% | 44.4 | 8m ago |
| Kimi K2.7 Code | Pro | 113.9 | 102.8 | 860ms | 100% | 41.9 | 9m ago |
| Ministral 3 3B (non-reasoning) | Pro | 155.4 | 101.2 | 526ms | 100% | 5.6 | 8m ago |
| Ministral 3 14B (non-reasoning) | Pro | 85.1 | 100.2 | 467ms | 100% | 10 | 8m ago |
| Qwen3 Coder Next (non-reasoning) | Free | 91.3 | 98.9 | 371ms | 100% | 21.2 | 41m ago |
| MiniMax M2.5 | Pro | 65.0 | 96.0 | 280ms | 99% | 33.7 | 8m ago |
| GPT-OSS 20B | Pro | 97.0 | 94.2 | 530ms | 100% | 14.9 | 9m ago |
| DeepSeek V4 Flash | Pro | 189.0 | 93.1 | 632ms | 100% | 40.3 | 15m ago |
| MiniMax M3 | Free | 97.8 | 91.4 | 2.0s | 100% | 44.4 | 42m ago |
| GLM 5 | Pro | 97.0 | 85.8 | 711ms | 100% | 39.5 | 10m ago |
| DeepSeek V4 Pro | Pro | 138.7 | 85.4 | 670ms | 100% | 44.3 | 15m ago |
| Kimi K2.5 | Pro | 140.4 | 82.9 | 692ms | 100% | 38.1 | 9m ago |
| MiniMax M2.1 | Free | 109.9 | 82.8 | | 100% | 31.4 | 43m ago |
| Devstral 2 123B (non-reasoning) | Pro | 40.4 | 80.3 | 555ms | 100% | 15.5 | 15m ago |
| Devstral Small 2 24B (non-reasoning) | Free | 38.9 | 76.4 | 656ms | 100% | 13.1 | 49m ago |
| Qwen3 Coder Next (non-reasoning) | Pro | 84.0 | 75.4 | 342ms | 100% | 21.2 | 6m ago |
| MiniMax M2.5 | Free | 67.7 | 70.9 | 302ms | 100% | 33.7 | 43m ago |
| Devstral Small 2 24B (non-reasoning) | Pro | 44.0 | 69.5 | 2.3s | 100% | 13.1 | 15m ago |
| Devstral 2 123B (non-reasoning) | Free | 35.6 | 68.2 | 572ms | 100% | 15.5 | 49m ago |
| GPT-OSS 20B | Free | 84.7 | 62.4 | 2.1s | 100% | 14.9 | 44m ago |
| Gemma3 12B (non-reasoning) | Pro | 39.7 | 61.7 | 516ms | 100% | 3.4 | 15m ago |
| Qwen3.5 397B | Pro | 78.3 | 56.5 | 1.5s | 100% | 33.7 | 6m ago |
| DeepSeek V3.2 | Pro | 32.7 | 56.4 | 4.1s | 99% | 33.4 | 16m ago |
| Kimi K2.6 | Pro | 72.5 | 56.3 | 2.5s | 99% | 42.8 | 9m ago |
| Gemma3 4B (non-reasoning) | Pro | 44.4 | 45.7 | 623ms | 100% | 1.1 | 14m ago |
| Mistral Large 3 675B (non-reasoning) | Pro | 48.4 | 42.5 | 1.9s | 100% | 16.2 | 8m ago |
| MiniMax M2.7 | Pro | 48.0 | 38.7 | 910ms | 100% | 38.1 | 8m ago |
| Gemma3 12B (non-reasoning) | Free | 39.7 | 37.8 | 527ms | 100% | 3.4 | 49m ago |
| Nemotron 3 Ultra | Free | 33.4 | 32.8 | 5.3s | 95% | 37.8 | 41m ago |
| MiniMax M2.1 | Pro | 108.4 | 32.6 | | 100% | 31.4 | 9m ago |
| GLM 4.7 | Pro | 80.1 | 32.3 | 1.8s | 100% | 33.8 | 11m ago |
| Nemotron 3 Super | Pro | 78.3 | 29.3 | 26.7s | 100% | 25.4 | 7m ago |
| Gemma3 4B (non-reasoning) | Free | 34.6 | 28.2 | 648ms | 100% | 1.1 | 46m ago |
| Gemma4 31B | Free | 79.0 | 24.0 | 593ms | 100% | 29.4 | 45m ago |
| GLM 5.1 | Pro | 101.9 | 23.7 | 924ms | 100% | 40.2 | 10m ago |
| GLM 4.7 | Free | 75.7 | 20.5 | 1.5s | 100% | 33.8 | 45m ago |
| Gemma3 27B (non-reasoning) | Free | 20.5 | 20.0 | 561ms | 100% | 4.8 | 47m ago |
| Gemma3 27B (non-reasoning) | Pro | 20.3 | 19.3 | 659ms | 100% | 4.8 | 15m ago |
| Qwen3 Coder 480B (non-reasoning) | Free | 98.1 | 17.3 | 644ms | 100% | 18 | 41m ago |
| Nemotron 3 Ultra | Pro | 44.3 | 14.6 | 12.8s | 96% | 37.8 | 7m ago |
| DeepSeek V3.1 671B (non-reasoning) | Pro | 13.5 | 9.7 | 1.0s | 100% | 21 | 16m ago |
| Gemma4 31B | Pro | 89.1 | 8.4 | 35.6s | 100% | 29.4 | 12m ago |

Intelligence Index scores from Artificial Analysis.

Ollama Free is sampled about hourly to avoid burning through the weekly free-tier balance.
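
The two cadences described on this page (roughly every 10 minutes in general, about hourly for Ollama Free) could be expressed as a per-tier interval check. This is only a sketch: the `INTERVALS` table and the `is_due` helper are hypothetical names, and the exact intervals are rounded from the approximate figures quoted here.

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumed intervals, taken from the approximate cadences stated on the page.
INTERVALS = {
    "Pro": timedelta(minutes=10),
    "Free": timedelta(hours=1),  # slower, to conserve the weekly free-tier balance
}


def is_due(tier: str, last_run: Optional[datetime], now: datetime) -> bool:
    """Return True when a model on `tier` should be benchmarked again."""
    if last_run is None:
        return True  # never benchmarked, so run immediately
    return now - last_run >= INTERVALS[tier]
```

A scheduler loop would call `is_due` for each model and tier pair and skip those still inside their interval, which would account for Free rows showing older "last benchmark" times than Pro rows.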

2 models unavailable or stale