Model Finder
Find the perfect AI model for your task and hardware. Filter by use case, VRAM requirements, and licensing.
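The filtering described above can be sketched roughly as follows. This is a minimal illustration, not the site's actual implementation: the `ModelEntry` fields, the `find_models` helper, and every catalog entry and VRAM figure below are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class ModelEntry:
    """One catalog card. All field names here are assumptions."""
    name: str
    use_case: str                  # e.g. "chat", "image", "embedding"
    min_vram_gb: Optional[float]   # None for API-only models
    license: str


def find_models(
    catalog: Iterable[ModelEntry],
    use_case: Optional[str] = None,
    max_vram_gb: Optional[float] = None,
    licenses: Optional[Set[str]] = None,
) -> List[ModelEntry]:
    """Return the entries that match every filter that was supplied."""
    results = []
    for model in catalog:
        if use_case is not None and model.use_case != use_case:
            continue
        if max_vram_gb is not None:
            # API-only entries have no local VRAM figure,
            # so a VRAM budget excludes them.
            if model.min_vram_gb is None or model.min_vram_gb > max_vram_gb:
                continue
        if licenses is not None and model.license not in licenses:
            continue
        results.append(model)
    return results


# Placeholder entries with made-up numbers, for illustration only.
CATALOG = [
    ModelEntry("example-chat-small", "chat", 6.0, "apache-2.0"),
    ModelEntry("example-chat-large", "chat", 48.0, "custom"),
    ModelEntry("example-image", "image", 12.0, "apache-2.0"),
    ModelEntry("example-api-chat", "chat", None, "proprietary"),
]

if __name__ == "__main__":
    for model in find_models(CATALOG, use_case="chat", max_vram_gb=8):
        print(model.name)
```

Treating each filter as optional keeps the three criteria independent, so a visitor can narrow by hardware alone or combine it with use case and license.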
Llama 3.1 8B
Meta
Most popular local LLM - runs on consumer hardware with excellent performance
Claude 3.5 Sonnet
Anthropic
Anthropic's most intelligent model - excels at complex reasoning, coding, and analysis
GPT-4o
OpenAI
OpenAI's flagship omni model - text, vision, and audio in one
Whisper Large v3
OpenAI
Best open speech recognition - 99 languages supported
Llama 3.1 70B
Meta
Excellent balance of capability and efficiency for local deployment
Stable Diffusion XL 1.0
Stability AI
Industry-standard image generation model with excellent quality and flexibility
FLUX.1 [dev]
Black Forest Labs
State-of-the-art image generation with exceptional prompt following
Llama 3.3 70B
Meta
Latest Llama with improved tool use and multilingual capabilities
Llama 3.1 405B
Meta
Meta's largest and most capable open model with state-of-the-art performance
Mistral 7B
Mistral AI
The original Mistral - set new standards for 7B models
Gemini 1.5 Pro
Google
Google's flagship with 2M token context - can process entire codebases
GPT-4o Mini
OpenAI
Small, fast, and cheap GPT-4o for most tasks
Qwen2.5 Coder 32B
Alibaba
State-of-the-art coding model - rivals GPT-4 on coding benchmarks
Stable Diffusion 1.5
Stability AI
Classic SD model with a massive ecosystem of fine-tunes and LoRAs
FLUX.1 [dev]
Black Forest Labs
State-of-the-art image generation with exceptional quality
FLUX.1 [schnell]
Black Forest Labs
Fast FLUX with Apache license for commercial use
DALL·E 3
OpenAI
OpenAI's latest image generation model
BGE Large EN v1.5
BAAI
Best-in-class English embedding model for RAG applications
Whisper Large v3 Turbo
OpenAI
8x faster Whisper with minimal quality loss
ElevenLabs Multilingual v2
ElevenLabs
Industry-leading voice synthesis with cloning
Mixtral 8x7B
Mistral AI
Highly efficient MoE model - GPT-3.5 quality at a fraction of the compute
DeepSeek V3
DeepSeek
State-of-the-art MoE model rivaling GPT-4 at a fraction of the compute
Gemini 2.0 Flash
Google
Google's latest multimodal model with native tool use and real-time streaming
Qwen2.5 Coder 7B
Alibaba
Best coding model for consumer GPUs
FLUX.1 [schnell]
Black Forest Labs
Fast FLUX variant - 4-step generation with Apache 2.0 license
BGE-M3
BAAI
Multi-lingual, multi-functionality, multi-granularity embedding
All-MiniLM-L6-v2
Sentence Transformers
Classic lightweight embedding - runs on CPU
Llama 3.2 11B Vision
Meta
Efficient multimodal model for consumer hardware
Qwen 2.5 72B
Alibaba
Alibaba's flagship model - rivals GPT-4 on many benchmarks
Qwen 2.5 7B
Alibaba
Best-in-class 7B model with 128K context
Claude 3.5 Haiku
Anthropic
Fast and cost-effective Claude for high-volume applications
Gemini 1.5 Flash
Google
Fast Gemini with 1M context for high-volume applications
SDXL Turbo
Stability AI
Distilled SDXL for near-real-time generation (1-4 steps)
FLUX.1 [pro]
Black Forest Labs
Highest quality FLUX via API
BGE Base EN v1.5
BAAI
Efficient embedding model for resource-constrained environments
Faster Whisper Large v3
SYSTRAN
CTranslate2-optimized Whisper - 4x faster inference
Llama 3.2 3B
Meta
Compact model for edge deployment and low-resource environments
Mistral Nemo 12B
Mistral AI
Efficient 12B model with 128K context - great for consumer GPUs
Gemma 2 9B
Google
Efficient 9B model with strong instruction following
GPT-4 Turbo
OpenAI
GPT-4 with vision and 128K context
Qwen2.5 Coder 14B
Alibaba
Strong coding performance in an efficient package
Stable Diffusion 3.5 Large
Stability AI
Largest open SD3 model with best quality
FLUX.1 [pro]
Black Forest Labs
Highest quality FLUX - API access only
ControlNet SDXL
Various
Controlled generation for SDXL - pose, depth, canny, etc.
Imagen 3
Google
Google's highest quality image generation
E5 Large v2
Microsoft
Microsoft's high-quality embedding model
Nomic Embed Text v1.5
Nomic AI
Matryoshka embedding with variable dimension support
All-MPNet-Base-v2
Sentence Transformers
Higher quality classic embedding
XTTS v2
Coqui
Best open-source voice cloning with 17 languages
SD 3.5 Large Turbo
Stability AI
Fast version of SD3.5 Large (4-step generation)
Llama 3.2 90B Vision
Meta
Large multimodal model with image understanding capabilities
Mixtral 8x22B
Mistral AI
Large MoE model with excellent efficiency per active parameter
Qwen 2.5 32B
Alibaba
Sweet spot between capability and efficiency
Phi-3 Mini 3.8B
Microsoft
Remarkably capable 3.8B model for edge deployment
Claude 3 Opus
Anthropic
Most capable Claude 3 model for complex tasks
o1
OpenAI
OpenAI's reasoning model - thinks before answering for complex problems
Qwen2-VL 7B
Alibaba
Efficient vision-language model for consumer hardware
Zephyr 7B Beta
Hugging Face
DPO-tuned Mistral with excellent chat performance
Codestral 22B
Mistral AI
Mistral's flagship code model
DeepSeek Coder V2 236B
DeepSeek
Massive MoE coding model with excellent performance
Stable Diffusion 3 Medium
Stability AI
Latest SD architecture with improved text rendering and composition
AnimateDiff v3
AnimateDiff
Animation module for Stable Diffusion models
Ideogram 2.0
Ideogram
Best-in-class text rendering in images
BGE Small EN v1.5
BAAI
Smallest BGE for edge deployment
GTE-Qwen2 7B Instruct
Alibaba
State-of-the-art embedding with 32K context
Whisper Medium
OpenAI
Balanced speed and accuracy for most use cases
MusicGen Large
Meta
Generate music from text descriptions
Qwen 2.5 14B
Alibaba
Strong performance in a compact size
Gemma 2 27B
Google
Google's largest open model with strong reasoning
Qwen2-VL 72B
Alibaba
State-of-the-art open vision-language model
Hermes 3 8B
Nous Research
Efficient agentic model for consumer hardware
OpenHermes 2.5 7B
Teknium
Popular fine-tune with great function calling
LLaVA 1.6 7B
LLaVA Team
Compact vision-language model
DeepSeek Coder V2 16B
DeepSeek
Efficient MoE coding model for consumer hardware
Code Llama 7B
Meta
Compact Code Llama for edge deployment
Playground v2.5
Playground AI
Aesthetic-focused model optimized for pleasing images
Playground v2.5
Playground
Aesthetic-focused model rivaling Midjourney
E5 Base v2
Microsoft
Efficient E5 variant
GTE Large EN v1.5
Alibaba
High-quality English embedding with 8K context
Bark
Suno
Generate speech, music, and sound effects from text
Mistral Large 2
Mistral AI
Mistral's flagship model with strong reasoning and function calling
Gemma 2 2B
Google
Smallest Gemma 2 for edge deployment
DeepSeek V2.5
DeepSeek
Efficient MoE with excellent code and reasoning
Claude 3 Sonnet
Anthropic
Balanced Claude 3 model for most use cases
o1-mini
OpenAI
Faster, cheaper reasoning model for coding and STEM
Nemotron 70B
NVIDIA
NVIDIA's instruction-tuned model optimized for helpfulness
TinyLlama 1.1B
Zhang Peiyuan
Compact model trained on 3T tokens - surprisingly capable
LLaVA 1.6 34B
LLaVA Team
State-of-the-art open vision-language model
DeepSeek Coder 6.7B
DeepSeek
Efficient coding model for consumer GPUs
CogVideoX-5B
THUDM
State-of-the-art open video generation model
E5-Mistral 7B Instruct
Microsoft
LLM-based embedding with 32K context
Jina Embeddings v3
Jina AI
Task-specific multilingual embeddings
Whisper Small
OpenAI
Efficient Whisper for edge deployment
Fish Speech 1.4
Fish Audio
High-quality zero-shot TTS with voice cloning
SeamlessM4T v2 Large
Meta
Unified speech translation across 100+ languages
Qwen 2.5 3B
Alibaba
Compact model for resource-constrained environments
Phi-3 Small 7B
Microsoft
Efficient reasoning model with long context
Command R+
Cohere
Optimized for RAG and agentic workflows
Claude 3 Haiku
Anthropic
Fastest Claude 3 for simple tasks
Hermes 3 70B
Nous Research
Best open model for agentic tasks and function calling
Dolphin 2.9 8B
Cognitive Computations
Smaller uncensored Dolphin
LLaVA 1.6 13B
LLaVA Team
Efficient LLaVA for consumer GPUs
Moondream 2
Vikhyat
Tiny vision model - runs anywhere
Qwen2.5 Coder 3B
Alibaba
Compact coding model for resource-constrained environments
DeepSeek Coder 33B
DeepSeek
Strong coding model with good debugging capability
Code Llama 34B
Meta
Strong Code Llama with infilling support
Mochi 1 Preview
Genmo
High-quality open video generation with Apache license
GTE-Qwen2 1.5B Instruct
Alibaba
Efficient GTE with long context support
MusicGen Medium
Meta
Balanced music generation model
Stable Audio Open
Stability AI
Generate variable-length audio up to 47 seconds
Llama 3.2 1B
Meta
Smallest Llama for mobile and embedded deployment
Phi-3 Medium 14B
Microsoft
Strong reasoning and math in a compact form
Yi 1.5 34B
01.AI
Strong bilingual (English/Chinese) model
Command R
Cohere
Efficient RAG and tool-use model
Gemini 1.5 Flash 8B
Google
Smallest Gemini for cost-sensitive applications
Grok 2
xAI
xAI's flagship model with real-time information access
OpenChat 3.5 7B
OpenChat
High-quality 7B chat model trained with C-RLFT
StarCoder2 15B
BigCode
Strong open code model trained on The Stack v2
Vicuna 13B
LMSYS
Popular chat model for consumer GPUs
Dolphin 2.9 70B
Cognitive Computations
Uncensored model for research and creative use
MiniCPM-V 2.6
OpenBMB
Strong OCR and chart understanding
Mathstral 7B
Mistral AI
Mistral's math-specialized model
Code Llama 70B
Meta
Meta's largest coding model
Code Llama 13B
Meta
Efficient Code Llama for consumer hardware
CodeGemma 7B
Google
Google's coding model with infilling support
Stable Diffusion 2.1
Stability AI
SD 2.1 with improved quality at 768px
PixArt-Σ
PixArt-alpha
Efficient DiT model capable of 4K generation
Kolors
Kwai
Strong Chinese-English image generation
Voyage 3
Voyage AI
Premium embedding API with best-in-class retrieval
Parler TTS Large
Hugging Face
Natural-sounding TTS with style control via text descriptions
MeloTTS
MyShell
Fast multilingual TTS for edge deployment
SOLAR 10.7B
Upstage
Efficient 10.7B model with depth up-scaling
Vicuna 7B
LMSYS
Efficient chat model
StableLM Zephyr 3B
Stability AI
Fast chat model for edge deployment
Neural Chat 7B
Intel
Intel-optimized chat model
SmolLM 1.7B
Hugging Face
Small but capable model from Hugging Face
Jamba 1.5 Large
AI21 Labs
Novel Mamba-Transformer hybrid with 256K context
StarCoder2 15B
BigCode
Multi-language coding model trained on The Stack v2
WizardCoder 33B
WizardLM
Strong instruction-following coding model
PixArt-Σ
PixArt
Efficient 4K image generation with a small footprint
Parler TTS Mini
Hugging Face
Efficient Parler TTS for edge deployment
MusicGen Small
Meta
Efficient music generation for consumer hardware
VoiceCraft
Meta
Edit speech with natural voice cloning
Qwen 2.5 0.5B
Alibaba
Smallest Qwen for embedded and mobile
Yi 1.5 9B
01.AI
Efficient bilingual model
InternLM 2.5 20B
Shanghai AI Lab
Strong Chinese-English model from Shanghai AI Lab
Starling LM 7B
Berkeley
RLHF-tuned model with high MT-Bench scores
StarCoder2 7B
BigCode
Efficient code model for consumer hardware
Vicuna 33B
LMSYS
Strong chat model from LMSYS
Granite 34B Code
IBM
IBM's enterprise-grade code model
Qwen2.5 Coder 1.5B
Alibaba
Smallest Qwen coder for embedded/mobile
StarCoder2 7B
BigCode
Efficient StarCoder for consumer hardware
Kandinsky 3
Sber AI
Multilingual image generation with strong Russian support
HunyuanDiT
Tencent
Tencent's bilingual DiT model
Whisper Base
OpenAI
Smallest Whisper for real-time on CPU
Bark Small
Suno
Smaller Bark for faster generation
AudioGen Medium
Meta
Generate sound effects and ambient audio from text
MARS5 TTS
CAMB.AI
Novel TTS with fine-grained prosody control
Orca 2 13B
Microsoft
Microsoft's reasoning-focused model
StableLM 2 12B
Stability AI
Stability AI's flagship chat model
Granite 8B Code
IBM
Efficient IBM code model
Jamba 1.5 Mini
AI21 Labs
Smaller Jamba with same 256K context
CodeGemma 2B
Google
Tiny CodeGemma for embedded systems
WizardCoder 15B
WizardLM
Efficient WizardCoder for consumer hardware
Yi 1.5 6B
01.AI
Compact bilingual model for edge
Grok 2 Mini
xAI
Fast Grok for simple tasks
InternLM 2.5 7B
Shanghai AI Lab
Efficient Chinese-English model
WizardLM 2 8x22B
Microsoft
Microsoft's MoE model for complex reasoning
StarCoder2 3B
BigCode
Smallest StarCoder2 for edge deployment
Orca 2 7B
Microsoft
Smaller Orca for edge deployment
StableLM 2 1.6B
Stability AI
Smallest StableLM for mobile/edge
SmolLM 360M
Hugging Face
Tiny model for embedded systems
Amazon Titan Text Express
Amazon
Amazon's general-purpose LLM via Bedrock
DeepSeek Coder 1.3B
DeepSeek
Tiny coding model for embedded systems
StarCoder2 3B
BigCode
Compact StarCoder for edge deployment
MPT 7B Instruct
MosaicML
Efficient model with 65K context support
RWKV-6 World 7B
RWKV Foundation
Linear attention model with infinite context potential
Falcon 180B
TII
TII's largest open model
Baichuan 2 13B
Baichuan
Strong Chinese chat model
MPT 30B Chat
MosaicML
MosaicML's chat model with 8K context
BLOOM 176B
BigScience
Largest open multilingual model - 46 languages
Falcon 40B
TII
Strong open multilingual model
Dolly v2 12B
Databricks
First commercially usable instruction-tuned model
RedPajama INCITE 7B
Together
Open reproduction of LLaMA trained on RedPajama
BLOOMZ 7B1
BigScience
Instruction-tuned BLOOM
OpenLLaMA 13B
OpenLM
Fully open reproduction of LLaMA
Pythia 12B
EleutherAI
Research model with full training checkpoints
OpenLLaMA 7B
OpenLM
Smaller OpenLLaMA variant
Pythia 6.9B
EleutherAI
Smaller Pythia variant
Cerebras GPT 13B
Cerebras
Compute-optimal GPT trained by Cerebras