A few benchmarks from running various models on my 2 x NVIDIA RTX 5060 Ti 16GB setup.
Results (Model: Total Duration, Response Processing in tokens per second)
Gemma:
- 1B (0m:4s, 212 t/s)
- 4B (0m:8s, 108 t/s)
- 12B (0m:18s, 44 t/s)
- 27B (0m:42s, 22 t/s)
DeepSeek R1 70B (7m:31s, 3.04 t/s)
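To make the numbers above easier to compare, here is a minimal Python sketch that converts the durations to seconds and expresses each model's response speed relative to the fastest one. It uses only the figures listed above. The names `RESULTS`, `duration_seconds`, and `slowdown_vs_fastest` are made up for this illustration and are not part of any benchmarking tool.

```python
import re

# Figures copied from the results above: (model, total duration, tokens/s).
RESULTS = [
    ("Gemma 1B", "0m:4s", 212.0),
    ("Gemma 4B", "0m:8s", 108.0),
    ("Gemma 12B", "0m:18s", 44.0),
    ("Gemma 27B", "0m:42s", 22.0),
    ("DeepSeek R1 70B", "7m:31s", 3.04),
]


def duration_seconds(text):
    """Convert a duration written as 'Xm:Ys' (e.g. '7m:31s') into seconds."""
    match = re.fullmatch(r"(\d+)m:(\d+)s", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + int(seconds)


def slowdown_vs_fastest(results):
    """Map each model to how many times slower it is than the fastest model."""
    fastest = max(rate for _, _, rate in results)
    return {model: fastest / rate for model, _, rate in results}


if __name__ == "__main__":
    slowdown = slowdown_vs_fastest(RESULTS)
    for model, duration, rate in RESULTS:
        secs = duration_seconds(duration)
        print(f"{model}: {secs} s total, {rate} t/s, {slowdown[model]:.1f}x slower than fastest")
```

Run this way, the 27B Gemma model comes out a little under 10 times slower than the 1B model, and DeepSeek R1 70B roughly 70 times slower. Note that total duration also depends on how long each response was, so the t/s column is the better basis for comparison.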
Dawid Ostrowski