Best models for 12 GB VRAM

RTX 3060 12 GB / RTX 4070 / M-series Mac with 24 GB

12 GB unlocks bigger 12–14B models with a comfortable context window. Strong sweet spot for code, vision, and reasoning.

  1. Phi-4 (Microsoft)

     Microsoft's 14B-parameter model. Punches well above its weight on reasoning.

     14B · 8.93 GB
  2. Qwen 2.5 14B (Alibaba)

     Strong 14B model with excellent coding and reasoning.

     14B · 8.87 GB
  3. Qwen 2.5 Coder 14B (Alibaba)

     Powerful 14B code model. Excellent for complex programming tasks.

     14B · 8.87 GB
  4. Gemma 3 12B (Google)

     High-quality 12B model. Runs well on Apple silicon Macs.

     12B · 7.3 GB
  5. Mistral Nemo 12B (Mistral AI)

     Mistral's 12B model with excellent instruction following.

     12B · 7.46 GB
  6. Phi-3.5 Vision (Microsoft)

     Vision-language model from Microsoft. Understands images and documents.

     4.2B · 3.2 GB
  7. LLaVA 1.6 7B (LLaVA)

     Multimodal vision-language model. Understands images and answers questions about them.

     7B · 5 GB
  8. FLUX.1 Schnell (GGUF) (Black Forest Labs)

     Image generation model with fast 1–4-step sampling and state-of-the-art quality for its speed. Needs 16 GB+ system RAM.

     12B · 14 GB
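The download sizes above track quantization level closely. As a rough sketch of the arithmetic, assuming roughly 4.85 bits per weight for a typical Q4_K_M-style GGUF quant and about 2 GB of headroom for the KV cache and runtime buffers (both constants are approximations, not exact figures for any file listed here):

```python
def quantized_size_gb(params_b: float, bits_per_weight: float = 4.85) -> float:
    """Approximate on-disk / in-VRAM size of a quantized model, in decimal GB.

    params_b: parameter count in billions (e.g. 14 for a 14B model).
    bits_per_weight: ~4.85 is a rough average for Q4_K_M-style quants.
    """
    return params_b * 1e9 * bits_per_weight / 8 / 1e9

def fits_in_vram(params_b: float, vram_gb: float = 12.0,
                 overhead_gb: float = 2.0) -> bool:
    # Leave headroom for the KV cache and runtime buffers (assumed ~2 GB).
    return quantized_size_gb(params_b) + overhead_gb <= vram_gb

print(round(quantized_size_gb(14), 2))  # ~8.49 GB, close to the 14B entries above
print(fits_in_vram(14))                 # True: 14B fits 12 GB with room for context
print(fits_in_vram(70))                 # False: 70B needs a much larger card
```

Longer context windows grow the KV cache, so the 2 GB overhead is a floor, not a ceiling; with a 14B model on 12 GB you have a comfortable but not unlimited budget for context.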
