Best models for 24 GB VRAM

RTX 3090 / 4090 / 5090 / M-series Mac with 36 GB+

24 GB is the sweet spot for LLM enthusiasts: 30B-class models at Q4 quantization with 32K+ context, plus full FLUX.1 image generation.

  1. Qwen 2.5 32B (Alibaba): Premium 32B model with top-tier reasoning. Mac with 32GB+ RAM. 32B parameters, 18.99 GB.
  2. Gemma 3 27B (Google): Google's flagship open model; near GPT-4 quality. Needs 20GB+ RAM. 27B parameters, 15.91 GB.
  3. Mistral Small 22B (Mistral AI): Strong reasoning and multilingual. Needs 16GB+ RAM. 22B parameters, 12.93 GB.
  4. Qwen 2.5 Coder 14B (Alibaba): Powerful code model, excellent for complex programming tasks. 14B parameters, 8.87 GB.
  5. FLUX.1 Dev (GGUF) (Black Forest Labs): Highest-quality FLUX model; 20-50 steps. Mac with 24GB+ RAM. 12B parameters, 14 GB.
  6. Stable Diffusion XL (CoreML) (Stability AI): Higher-quality image generation, CoreML-optimized for iOS. Requires 6GB+ usable memory (iPad/Mac recommended). 3.5B parameters, 3.34 GB.
  7. Stable Diffusion 3 Medium (GGUF) (Stability AI): SD 3 with MMDiT architecture and superior text rendering. 2.5B parameters, 9.15 GB.
  8. Yi 1.5 9B Chat (01.AI): Bilingual model with strong reasoning. 9B parameters, 5.46 GB.
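As a rough sanity check on the sizes above, a Q4 GGUF's footprint is approximately parameters × bits-per-weight / 8, plus a KV cache that grows linearly with context length. The sketch below is a back-of-envelope estimate, not an exact formula: the ~4.5 bits/weight figure for Q4 and the layer/head geometry for a 32B-class model are assumptions, and real loaders add runtime buffers on top.

```python
# Back-of-envelope VRAM estimate for a quantized GGUF model.
# All figures approximate; real runtimes add compute buffers and overhead.

def weights_gb(params_b: float, bits_per_weight: float = 4.5) -> float:
    """Q4-class quants average roughly 4.5 bits/weight with scales (assumed)."""
    return params_b * bits_per_weight / 8  # params in billions -> GB

def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                ctx: int, bytes_per_value: int = 2) -> float:
    """KV cache: 2 tensors (K and V) * layers * kv_heads * head_dim * tokens."""
    return 2 * layers * kv_heads * head_dim * ctx * bytes_per_value / 1e9

# Assumed GQA geometry for a Qwen-2.5-32B-class model:
# 64 layers, 8 KV heads, head_dim 128, at 32K context.
w = weights_gb(32)                         # ~18 GB, close to the 18.99 GB above
kv = kv_cache_gb(64, 8, 128, 32768)        # ~8.6 GB at fp16
kv_q8 = kv_cache_gb(64, 8, 128, 32768, 1)  # ~4.3 GB with an 8-bit KV cache
print(f"weights {w:.1f} GB, KV fp16 {kv:.1f} GB, KV q8 {kv_q8:.1f} GB")
```

Note that with these assumed figures, fp16 KV at 32K context would overshoot 24 GB (~18 + ~8.6 GB); quantizing the cache to 8 bits brings the total back under budget, which is how long contexts typically fit on these cards.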
