◈
HARDWARE
HQ
Home
Plan
Analyze
Market
Tools
Inference Lab
Simulate AI model performance and identify bottlenecks
AI Model
Select a model...
Llama 3.1 70B (Meta) - 70B params
Claude 3.5 Sonnet (Anthropic) - 10B params
Llama 3.1 8B (Meta) - 8.0B params
Claude 3.5 Haiku (Anthropic) - 3.0B params
Stable Diffusion XL (Stability AI) - 2.6B params
Gemini 1.5 Pro (Google) - 2.5B params
DALL-E 3 (OpenAI) - 2.5B params
GPT-4 (OpenAI) - 1.8B params
GPT-4 Turbo (OpenAI) - 1.3B params
GPT-3.5 Turbo (OpenAI) - 750M params
GPU Hardware
Select a GPU...
NVIDIA GB200 NVL72 - 14131.2GB HBM3E
NVIDIA DGX H100 - 640GB HBM3
AWS EC2 P5 Instance (8x H100) - 640GB HBM3
Microsoft Azure ND H100 v5 (8x H100) - 640GB HBM3
Google Cloud A3 (8x H100) - 640GB HBM3
Lambda Labs 1-Click Cluster (8x H100) - 640GB HBM3
NVIDIA DGX Station A100 - 320GB HBM2E
AMD Instinct MI325X - 256GB HBM3E
AMD Instinct MI300X - 192GB HBM3
NVIDIA B100 - 192GB HBM3E
NVIDIA B200 - 192GB HBM3E
NVIDIA H200 SXM - 141GB HBM3E
AMD Instinct MI250X - 128GB HBM2E
AMD Instinct MI300A - 128GB HBM3
NVIDIA DGX Spark - 128GB LPDDR5
NVIDIA GH200 Grace Hopper Superchip - 96GB HBM3
NVIDIA A100 80GB - 80GB HBM2E
NVIDIA H100 PCIe - 80GB HBM2E
NVIDIA H100 SXM - 80GB HBM3
CoreWeave H100 Instance - 80GB HBM3
NVIDIA A16 - 64GB GDDR6
NVIDIA Jetson AGX Orin 64GB - 64GB LPDDR5
AMD Radeon Pro W7900 - 48GB GDDR6
NVIDIA A40 - 48GB GDDR6
NVIDIA L40S - 48GB GDDR6
NVIDIA RTX 6000 Ada - 48GB GDDR6
NVIDIA RTX A6000 - 48GB GDDR6
NVIDIA A100 40GB PCIe - 40GB HBM2
AMD Radeon Pro W6800 - 32GB GDDR6
AMD Radeon Pro W7800 - 32GB GDDR6
NVIDIA Jetson AGX Orin - 32GB LPDDR5
NVIDIA RTX 5000 Ada - 32GB GDDR6
NVIDIA RTX 5090 - 32GB GDDR7
NVIDIA Tesla V100 32GB - 32GB HBM2
AMD Radeon RX 7900 XTX - 24GB GDDR6
NVIDIA A10 24GB - 24GB GDDR6
NVIDIA A30 - 24GB HBM2
NVIDIA RTX 3090 - 24GB GDDR6X
NVIDIA RTX 4090 - 24GB GDDR6X
NVIDIA RTX A5000 - 24GB GDDR6
NVIDIA Tesla P40 24GB - 24GB GDDR5
NVIDIA L4 Tensor Core GPU - 24GB GDDR6
AMD Radeon RX 7900 XT - 20GB GDDR6
AMD Radeon RX 6800 - 16GB GDDR6
AMD Radeon RX 6800 XT - 16GB GDDR6
AMD Radeon RX 6950 XT - 16GB GDDR6
AMD Radeon RX 7800 XT - 16GB GDDR6
Intel Arc A770 - 16GB GDDR6
NVIDIA Jetson Orin NX 16GB - 16GB LPDDR5
NVIDIA RTX 4060 Ti 16GB - 16GB GDDR6
NVIDIA RTX 4070 Ti Super - 16GB GDDR6X
NVIDIA RTX 4080 - 16GB GDDR6X
NVIDIA RTX 4080 Super - 16GB GDDR6X
NVIDIA RTX A4000 - 16GB GDDR6
NVIDIA RTX 5080 - 16GB GDDR7
AMD Radeon RX 6700 XT - 12GB GDDR6
AMD Radeon RX 7700 XT - 12GB GDDR6
NVIDIA RTX 3060 12GB - 12GB GDDR6
NVIDIA RTX 3080 12GB - 12GB GDDR6X
NVIDIA RTX 3080 Ti - 12GB GDDR6X
NVIDIA RTX 4070 - 12GB GDDR6X
NVIDIA RTX 4070 Ti - 12GB GDDR6X
AMD Radeon Pro W7600 - 8GB GDDR6
AMD Radeon RX 6600 - 8GB GDDR6
AMD Radeon RX 6600 XT - 8GB GDDR6
AMD Radeon RX 7600 - 8GB GDDR6
Intel Arc A580 - 8GB GDDR6
Intel Arc A750 - 8GB GDDR6
NVIDIA Jetson Orin Nano 8GB - 8GB LPDDR5
NVIDIA RTX 3050 - 8GB GDDR6
NVIDIA RTX 3060 Ti - 8GB GDDR6
NVIDIA RTX 3070 - 8GB GDDR6
NVIDIA RTX 3070 Ti - 8GB GDDR6X
NVIDIA RTX 4060 - 8GB GDDR6
NVIDIA RTX 4060 Ti 8GB - 8GB GDDR6
NVIDIA T1000 - 8GB GDDR6
Intel Arc A380 - 6GB GDDR6
AMD Radeon RX 6500 XT - 4GB GDDR6
NVIDIA T600 - 4GB GDDR6
Inference Settings
Quantization (15 options)
FP32 (Full Precision)
Memory: 200% | Speed: 50% | Quality: 100%
FP16 (Half Precision)
Memory: 100% | Speed: 100% | Quality: 99%
BF16 (Brain Float)
Memory: 100% | Speed: 100% | Quality: 99%
INT8
Memory: 50% | Speed: 150% | Quality: 95%
INT4
Memory: 25% | Speed: 200% | Quality: 88%
GGUF Q8_0
Memory: 50% | Speed: 140% | Quality: 96%
GGUF Q6_K
Memory: 38% | Speed: 160% | Quality: 94%
GGUF Q5_K_M
Memory: 33% | Speed: 170% | Quality: 92%
GGUF Q4_K_M
Memory: 28% | Speed: 180% | Quality: 90%
GGUF Q4_0
Memory: 25% | Speed: 200% | Quality: 85%
GGUF Q3_K_M
Memory: 20% | Speed: 220% | Quality: 80%
GGUF Q2_K
Memory: 15% | Speed: 250% | Quality: 70%
AWQ 4-bit
Memory: 25% | Speed: 200% | Quality: 92%
GPTQ 4-bit
Memory: 25% | Speed: 190% | Quality: 91%
EXL2 (Variable)
Memory: 30% | Speed: 220% | Quality: 93%
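The memory percentages above read as multipliers relative to FP16, which stores 2 bytes per parameter. A rough sketch of how they might translate into a weight-memory estimate and a fit check follows. The function names and the 20% overhead allowance are assumptions made for illustration, not the tool's actual logic.

```python
# Memory multipliers relative to FP16, transcribed from the quantization list above.
MEMORY_FACTOR = {
    "FP32": 2.00, "FP16": 1.00, "BF16": 1.00, "INT8": 0.50, "INT4": 0.25,
    "GGUF Q8_0": 0.50, "GGUF Q6_K": 0.38, "GGUF Q5_K_M": 0.33,
    "GGUF Q4_K_M": 0.28, "GGUF Q4_0": 0.25, "GGUF Q3_K_M": 0.20,
    "GGUF Q2_K": 0.15, "AWQ 4-bit": 0.25, "GPTQ 4-bit": 0.25,
    "EXL2 (Variable)": 0.30,
}

BYTES_PER_PARAM_FP16 = 2  # FP16 baseline: 2 bytes per parameter


def weight_memory_gb(params_billions: float, quant: str) -> float:
    """Estimate weight memory in GB (1 GB = 10^9 bytes) for a given quantization."""
    # params_billions * 1e9 params * 2 bytes / 1e9 bytes-per-GB simplifies to this.
    return params_billions * BYTES_PER_PARAM_FP16 * MEMORY_FACTOR[quant]


def fits(params_billions: float, quant: str, vram_gb: float,
         overhead: float = 1.2) -> bool:
    """Check whether the weights fit in VRAM.

    `overhead` is a hypothetical 20% allowance for activations and runtime
    buffers; real overhead depends on the runtime, context length, and batch size.
    """
    return weight_memory_gb(params_billions, quant) * overhead <= vram_gb


if __name__ == "__main__":
    print(weight_memory_gb(70, "FP16"))   # 140.0
    print(fits(70, "INT4", 48))           # True
    print(fits(70, "FP16", 80))           # False
```

Under these assumptions, Llama 3.1 70B needs about 140 GB at FP16 and about 35 GB at INT4, so it would fit on a single 48GB card only when quantized.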
Context Length: 2,048 (range: 512 to 8K)
Batch Size: 1 (range: 1 to 16)
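Context length and batch size matter mainly because they scale the key/value cache that transformer inference keeps in GPU memory. The sketch below uses the standard KV-cache size formula; whether the tool uses this formula is an assumption, and `kv_cache_bytes` is a hypothetical helper. The layer and head counts are the published Llama 3.1 8B configuration (32 layers, 8 KV heads, head dimension 128).

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   context_len: int, batch_size: int,
                   bytes_per_value: int = 2) -> int:
    """Size of the KV cache in bytes (default 2 bytes per value, i.e. FP16)."""
    # The leading 2 accounts for the separate key and value tensors per layer.
    return (2 * layers * kv_heads * head_dim * bytes_per_value
            * context_len * batch_size)


if __name__ == "__main__":
    # Llama 3.1 8B: 32 layers, 8 KV heads (grouped-query attention), head dim 128.
    default = kv_cache_bytes(32, 8, 128, context_len=2048, batch_size=1)
    maximum = kv_cache_bytes(32, 8, 128, context_len=8192, batch_size=16)
    print(default / 2**30)  # 0.25 (GiB)
    print(maximum / 2**30)  # 16.0 (GiB)
```

Moving both sliders from their defaults to their maximums grows the cache 64-fold in this example, which can make memory the bottleneck even when the weights alone fit.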
Run Inference