Run DeepSeek R1 and V3 distilled models locally. GPU picks range from budget cards to multi-GPU setups for the full 671B model.
24GB VRAM • Runs R1 Distill 32B at Q4 • 70+ tok/s on 7B distill
| Model | Full Precision | Q8 (8-bit) | Q4 (4-bit) |
|---|---|---|---|
| R1 Distill 7B | 14 GB | 8 GB | 5 GB |
| R1 Distill 32B | 64 GB | 34 GB | 18 GB |
| R1 Distill 70B | 140 GB | 75 GB | 40 GB |
| R1 Full 671B (MoE) | 1.3 TB | 671 GB | ~350 GB |
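* The 671B MoE model activates ~37B parameters per token, but all of its weights must still be loaded, so the memory figures reflect the full parameter count. Add 1-2 GB of overhead for the context window.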
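The full-precision and Q8 columns follow from simple arithmetic: parameter count times bits per weight, divided by 8 to get bytes. The sketch below reproduces that estimate. The function names and the 2 GB default overhead are illustrative assumptions, not part of any tool referenced here. The Q4 figures in the table are somewhat higher than the raw 4-bit arithmetic, presumably because real quantization formats store extra metadata, so treat the result as a lower bound.

```python
def estimate_weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Weights-only memory in GB (10^9 bytes): params * bits / 8."""
    return params_billions * bits_per_weight / 8


def fits_in_vram(vram_gb: float, params_billions: float,
                 bits_per_weight: float, overhead_gb: float = 2.0) -> bool:
    """Rough check: weights plus an assumed context overhead must fit in VRAM."""
    needed = estimate_weights_gb(params_billions, bits_per_weight) + overhead_gb
    return needed <= vram_gb


if __name__ == "__main__":
    # Full precision here means 16 bits per weight, matching the table.
    for name, size in [("R1 Distill 7B", 7), ("R1 Distill 32B", 32),
                       ("R1 Distill 70B", 70), ("R1 Full 671B", 671)]:
        fp16 = estimate_weights_gb(size, 16)
        q8 = estimate_weights_gb(size, 8)
        q4 = estimate_weights_gb(size, 4)
        print(f"{name}: FP16 {fp16:.1f} GB, Q8 {q8:.1f} GB, Q4 {q4:.1f} GB (raw)")

    # A 24 GB card against the 32B and 70B distills at 4 bits per weight.
    print(fits_in_vram(24, 32, 4))
    print(fits_in_vram(24, 70, 4))
```

For a 24 GB card this check passes for the 32B distill at 4 bits and fails for the 70B distill, which agrees with the table.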
Check if your GPU can run specific DeepSeek models at every quantization level.
Open VRAM Calculator
Rent GPU compute from $0.39/hr. Compare 24+ providers with live pricing.
Browse Cloud GPUs