
AI Models You Can Run with 20GB VRAM

108 compatible models across 9 categories

Compatible Models: 108
Largest Model: Qwen 2.5 32B (32B parameters)
Categories: 9
💬 Chat / LLM (46 models)

Model | Publisher | Parameters | VRAM | Quantization | Quality
Qwen 2.5 32B | Alibaba | 32B | 19.0 GB | Q4_K_M | 85%
Gemma 3 27B | Google | 27B | 15.9 GB | Q4_K_M | 85%
Mistral Small 22B | Mistral AI | 22B | 12.9 GB | Q4_K_M | 85%
Phi-4 | Microsoft | 14B | 15.0 GB | Q8_0 | 98%
Qwen 2.5 14B | Alibaba | 14B | 15.1 GB | Q8_0 | 98%
Gemma 3 12B | Google | 12B | 12.2 GB | Q8_0 | 98%
Mistral Nemo 12B | Mistral AI | 12B | 12.6 GB | Q8_0 | 98%
Solar 10.7B | Upstage | 10.7B | 11.1 GB | Q8_0 | 98%
Falcon 3 10B | TII | 10B | 10.7 GB | Q8_0 | 98%
Gemma 2 9B Instruct | Google | 9.2B | 9.7 GB | Q8_0 | 98%
Yi 1.5 9B Chat | 01.AI | 9B | 9.2 GB | Q8_0 | 98%
DeepSeek R1 Distill 8B | DeepSeek | 8B | 8.4 GB | Q8_0 | 98%
Llama 3.1 8B Instruct | Meta | 8B | 17.0 GB | FP16 | 100%
Granite 3.3 8B | IBM | 8B | 8.6 GB | Q8_0 | 98%
EXAONE 3.5 7.8B | LG AI | 7.8B | 8.2 GB | Q8_0 | 98%
InternLM 2.5 7B | Shanghai AI Lab | 7.7B | 8.2 GB | Q8_0 | 98%
Qwen 2.5 7B Instruct | Alibaba | 7.6B | 9.0 GB | Q8_0 | 98%
Mistral 7B Instruct v0.3 | Mistral AI | 7.3B | 15.5 GB | FP16 | 100%
Falcon 3 7B | TII | 7B | 8.3 GB | Q8_0 | 98%
OLMo 2 7B | Allen AI | 7B | 7.7 GB | Q8_0 | 98%
OpenChat 3.5 7B | OpenChat | 7B | 7.7 GB | Q8_0 | 98%
Yi 1.5 6B Chat | 01.AI | 6B | 6.5 GB | Q8_0 | 98%
Gemma 3 4B | Google | 4B | 4.3 GB | Q8_0 | 98%
Nemotron Mini 4B | NVIDIA | 4B | 4.7 GB | Q8_0 | 98%
Danube 3 4B | H2O.ai | 4B | 4.4 GB | Q8_0 | 98%
Phi-3.5 Mini 3.8B | Microsoft | 3.8B | 4.3 GB | Q8_0 | 98%
Phi-4 Mini 3.8B | Microsoft | 3.8B | 4.3 GB | Q8_0 | 98%
Llama 3.2 3B Instruct | Meta | 3.2B | 3.7 GB | Q8_0 | 98%
Qwen 2.5 3B | Alibaba | 3B | 3.9 GB | Q8_0 | 98%
Falcon 3 3B | TII | 3B | 3.8 GB | Q8_0 | 98%
StableLM Zephyr 3B | Stability AI | 3B | 3.3 GB | Q8_0 | 98%
Rocket 3B | Pansophic | 3B | 3.3 GB | Q8_0 | 98%
Gemma 2 2B | Google | 2.6B | 3.1 GB | Q8_0 | 98%
EXAONE 3.5 2.4B | LG AI | 2.4B | 3.1 GB | Q8_0 | 98%
Granite 3.3 2B | IBM | 2B | 3.0 GB | Q8_0 | 98%
SmolLM2 1.7B | HuggingFace | 1.7B | 2.2 GB | Q8_0 | 98%
Qwen 2.5 1.5B | Alibaba | 1.5B | 2.3 GB | Q8_0 | 98%
DeepSeek R1 Distill 1.5B | DeepSeek | 1.5B | 2.3 GB | Q8_0 | 98%
Llama 3.2 1B Instruct | Meta | 1.24B | 2.8 GB | FP16 | 100%
TinyLlama 1.1B | TinyLlama | 1.1B | 1.6 GB | Q8_0 | 98%
Gemma 3 1B | Google | 1B | 1.5 GB | Q8_0 | 98%
Falcon 3 1B | TII | 1B | 2.2 GB | Q8_0 | 98%
Qwen 2.5 0.5B | Alibaba | 0.5B | 1.1 GB | Q8_0 | 98%
Danube 3 500M | H2O.ai | 0.5B | 1.0 GB | Q8_0 | 98%
SmolLM2 360M | HuggingFace | 0.36B | 0.9 GB | Q8_0 | 98%
SmolLM2 135M | HuggingFace | 0.135B | 0.8 GB | FP16 | 100%
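
The VRAM figures above follow the usual back-of-the-envelope rule: weight memory is roughly parameters × bits-per-weight ÷ 8, with the quantization scheme setting the bits per weight. A minimal sketch of that estimate is below; the bits-per-weight values are approximations assumed for illustration (not RunThisModel's exact figures), and real usage also grows with context length because of the KV cache.

```python
# Back-of-the-envelope estimate of weight VRAM for a quantized LLM.
# Bits-per-weight values are assumed approximations; actual GGUF file sizes vary per model.
APPROX_BITS_PER_WEIGHT = {"Q4_K_M": 4.7, "Q8_0": 8.5, "FP16": 16.0}

def estimate_weight_vram_gb(params_billion: float, quant: str) -> float:
    """GB needed for the weights alone; excludes KV cache and runtime buffers."""
    return params_billion * APPROX_BITS_PER_WEIGHT[quant] / 8

for name, params, quant in [
    ("Qwen 2.5 32B", 32.0, "Q4_K_M"),  # table lists 19.0 GB
    ("Phi-4", 14.0, "Q8_0"),           # table lists 15.0 GB
    ("Llama 3.1 8B", 8.0, "FP16"),     # table lists 17.0 GB, which includes extra headroom
]:
    print(f"{name} ({quant}): ~{estimate_weight_vram_gb(params, quant):.1f} GB for weights")
```
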
💻 Coding (16 models)
👁 Multimodal (6 models)
🎨 Image Generation (9 models)
🎤 Speech Recognition (9 models)
🔊 Text to Speech (14 models)
🎵 Audio Generation (1 model)
🧩 Embedding (5 models)
🔄 Reranker (2 models)

Compatible GPUs (30 with 20GB+ VRAM)

Frequently Asked Questions

What is the best AI model I can run with 20GB VRAM?
The largest model you can run with 20GB VRAM is Qwen 2.5 32B (32B parameters) using Q4_K_M quantization. In total, 108 models are compatible.
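
If you want to try that top pick locally, a minimal sketch using llama-cpp-python is below. The GGUF file name is illustrative (download a Q4_K_M build of Qwen 2.5 32B from your preferred source), and n_gpu_layers=-1 offloads every layer to the GPU.

```python
# Minimal sketch: running a Q4_K_M GGUF build of Qwen 2.5 32B with llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="qwen2.5-32b-instruct-q4_k_m.gguf",  # hypothetical local file name
    n_gpu_layers=-1,  # offload all layers to the GPU; the weights need ~19 GB at Q4_K_M
    n_ctx=4096,       # a modest context window keeps the KV cache small enough to fit
)

result = llm("Summarize what Q4_K_M quantization trades away.", max_tokens=64)
print(result["choices"][0]["text"])
```
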
Can I run Llama with 20GB VRAM?
Yes! You can run 6 Llama-family models with 20GB VRAM: Code Llama 13B Instruct (Q4_K_M), Llama 3.1 8B Instruct (FP16), Code Llama 7B (Q8_0), Llama 3.2 3B Instruct (Q8_0), Llama 3.2 1B Instruct (FP16), and TinyLlama 1.1B (Q8_0).
What GPU has 20GB VRAM?
GPUs with 20GB or more VRAM include: AMD Radeon RX 7900 XT, Apple M3, Apple M2, AMD Radeon RX 7900 XTX, NVIDIA GeForce RTX 3090, and 25 more. Note that Apple silicon counts unified memory, which is shared with the CPU and depends on your configuration.
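
To see how much memory your own card reports, a quick sketch with PyTorch is below; it covers CUDA GPUs (ROCm builds of PyTorch expose AMD cards through the same torch.cuda API), while Apple silicon reports unified memory rather than dedicated VRAM.

```python
# Quick check of how much VRAM the locally visible GPU reports.
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 1024**3:.1f} GB VRAM")
else:
    print("No CUDA-visible GPU found (Apple silicon uses unified memory instead of dedicated VRAM).")
```
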
