Run Google's Gemma 2 models locally, from the lightweight 2B model to the 27B flagship. Optimized picks for every budget.
16 GB VRAM • Runs Gemma 2 27B at Q4 • 40+ tok/s on 9B
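For a local quick start, here is a minimal sketch using the Hugging Face transformers + bitsandbytes stack to load the 9B instruct model in 4-bit, roughly the Q4 column in the table below. The prompt and generation settings are illustrative, and the Gemma weights are gated on Hugging Face behind a license acceptance.

```python
# Minimal sketch: Gemma 2 9B at 4-bit on a ~16 GB GPU.
# Assumes transformers + bitsandbytes are installed and that you
# have accepted the Gemma license on Hugging Face (gated weights).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "google/gemma-2-9b-it"

# NF4 4-bit quantization, roughly matching the Q4 column below.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # place layers on the available GPU(s)
)

inputs = tokenizer("Explain quantization in one sentence.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```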
| Model | FP16 (full precision) | Q8 (8-bit) | Q4 (4-bit) |
|---|---|---|---|
| Gemma 2 2B | 4 GB | 2.5 GB | 1.5 GB |
| Gemma 2 9B | 18 GB | 10 GB | 6 GB |
| Gemma 2 27B | 54 GB | 29 GB | 16 GB |
* Add 1-2 GB of overhead for the KV cache (context window). Values are approximate.
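The table values follow from a simple estimate: nominal parameter count times bytes per weight. A back-of-the-envelope helper is sketched below; the effective bits-per-weight figures are assumptions, since real quant formats (e.g. GGUF Q4_K_M) carry scale metadata that pushes "Q4" to roughly 4.5 bits, and actual parameter counts sit slightly above the nominal sizes.

```python
# Rough VRAM estimator behind the table above. Effective
# bits-per-weight values are approximations, not exact format specs.
EFFECTIVE_BITS = {"fp16": 16.0, "q8": 8.5, "q4": 4.5}
NOMINAL_PARAMS_B = {"gemma-2-2b": 2, "gemma-2-9b": 9, "gemma-2-27b": 27}

def weights_vram_gb(model: str, quant: str) -> float:
    """Approximate VRAM for the weights alone, in GB.
    Add 1-2 GB on top for the KV cache / context window."""
    return NOMINAL_PARAMS_B[model] * EFFECTIVE_BITS[quant] / 8.0

for name in NOMINAL_PARAMS_B:
    print(name, {q: round(weights_vram_gb(name, q), 1) for q in EFFECTIVE_BITS})
```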
Check if your GPU can run specific Gemma models at every quantization level; a quick local check is also sketched below.
Open VRAM Calculator

Rent GPU compute from $0.39/hr. Compare 24+ providers with live pricing.
Browse Cloud GPUs
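If you want a quick local sanity check instead of the calculator, a sketch using PyTorch's CUDA API follows; it assumes an NVIDIA GPU, and the 16 GB threshold mirrors the 27B-at-Q4 row above (budget extra headroom for context per the table footnote).

```python
# Quick local stand-in for the VRAM calculator: compare total GPU
# memory against a rough requirement. Assumes PyTorch with CUDA.
import torch

def can_fit(required_gb: float) -> bool:
    """True if GPU 0 has at least required_gb of total VRAM."""
    if not torch.cuda.is_available():
        return False
    total_bytes = torch.cuda.get_device_properties(0).total_memory
    return total_bytes / 1024**3 >= required_gb

# Weights-only figure from the table; add 1-2 GB for context in practice.
print("Gemma 2 27B @ Q4 fits:", can_fit(16.0))
```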