
AI Models You Can Run with 192GB VRAM

145 models compatible across 9 categories

Compatible Models: 145
Largest Model: Qwen3 235B-A22B (235B)
Categories: 9
💬

Chat / LLM (74 models)

Qwen3 235B-A22B | Alibaba | 235B | 144.0GB VRAM | Q4_K_M | 85%
Mixtral 8x22B Instruct | Mistral AI | 141B | 88.0GB VRAM | Q4_K_M | 85%
Magnum v4 72B | Anthracite | 72B | 144.5GB VRAM | BF16 | 100%
Llama 3.1 70B Instruct | Meta | 70B | 142.0GB VRAM | FP16 | 100%
Euryale L3.3 70B v2.3 | Sao10K | 70B | 140.5GB VRAM | BF16 | 100%
Llama 3.1 70B (lorablated) | mlabonne | 70B | 140.5GB VRAM | BF16 | 100%
Mixtral 8x7B Instruct | Mistral AI | 46.7B | 30.5GB VRAM | Q5_K_M | 92%
Phi-3.5 MoE | Microsoft | 41.9B | 24.1GB VRAM | Q4_K_M | 85%
Qwen 2.5 32B | Alibaba | 32B | 19.0GB VRAM | Q4_K_M | 85%
Skyfall 31B v4.2 | TheDrummer | 31B | 62.5GB VRAM | BF16 | 100%
Qwen3 30B-A3B | Alibaba | 30.5B | 36.0GB VRAM | Q8_0 | 98%
Gemma 3 27B | Google | 27B | 15.9GB VRAM | Q4_K_M | 85%
Dolphin Mistral 24B (Venice Edition) | Cognitive Computations | 24B | 48.5GB VRAM | BF16 | 100%
Dolphin 3.0 R1 Mistral 24B | Cognitive Computations | 24B | 48.5GB VRAM | BF16 | 100%
Cydonia 24B v4.3 | TheDrummer | 24B | 48.5GB VRAM | BF16 | 100%
Mistral Small 22B | Mistral AI | 22B | 12.9GB VRAM | Q4_K_M | 85%
Magnum v4 22B | Anthracite | 22B | 44.5GB VRAM | BF16 | 100%
DeepSeek MoE 16B | DeepSeek | 16.4B | 11.0GB VRAM | Q4_K_M | 85%
Rocinante XL 16B v1 | TheDrummer | 16B | 32.5GB VRAM | BF16 | 100%
Phi-4 | Microsoft | 14B | 15.0GB VRAM | Q8_0 | 98%
Qwen 2.5 14B | Alibaba | 14B | 15.1GB VRAM | Q8_0 | 98%
Gemma 3 12B | Google | 12B | 12.2GB VRAM | Q8_0 | 98%
Mistral Nemo 12B | Mistral AI | 12B | 12.6GB VRAM | Q8_0 | 98%
Magnum v4 12B | Anthracite | 12B | 24.5GB VRAM | BF16 | 100%
Rocinante 12B v1.1 | TheDrummer | 12B | 24.5GB VRAM | BF16 | 100%
Mistral Nemo Base 12B | Mistral AI | 12B | 24.5GB VRAM | BF16 | 100%
Solar 10.7B | Upstage | 10.7B | 11.1GB VRAM | Q8_0 | 98%
Falcon 3 10B | TII | 10B | 10.7GB VRAM | Q8_0 | 98%
Gemma 2 9B Instruct | Google | 9.2B | 9.7GB VRAM | Q8_0 | 98%
Yi 1.5 9B Chat | 01.AI | 9B | 9.2GB VRAM | Q8_0 | 98%
Gemma 3 MoE 9B | Google | 9B | 7.0GB VRAM | Q4_K_M | 85%
DeepSeek R1 Distill 8B | DeepSeek | 8B | 8.4GB VRAM | Q8_0 | 98%
Llama 3.1 8B Instruct | Meta | 8B | 17.0GB VRAM | FP16 | 100%
Granite 3.3 8B | IBM | 8B | 8.6GB VRAM | Q8_0 | 98%
Dolphin 3.0 Llama 3.1 8B | Cognitive Computations | 8B | 16.5GB VRAM | BF16 | 100%
NeuralDaredevil 8B (abliterated) | mlabonne | 8B | 16.5GB VRAM | BF16 | 100%
Llama 3.1 8B Instruct (abliterated) | mlabonne | 8B | 16.5GB VRAM | BF16 | 100%
Stheno L3 8B v3.2 | Sao10K | 8B | 16.5GB VRAM | BF16 | 100%
Qwen3 8B Base | Alibaba | 8B | 16.5GB VRAM | BF16 | 100%
EXAONE 3.5 7.8B | LG AI | 7.8B | 8.2GB VRAM | Q8_0 | 98%
InternLM 2.5 7B | Shanghai AI Lab | 7.7B | 8.2GB VRAM | Q8_0 | 98%
Qwen 2.5 7B Instruct | Alibaba | 7.6B | 9.0GB VRAM | Q8_0 | 98%
Mistral 7B Instruct v0.3 | Mistral AI | 7.3B | 15.5GB VRAM | FP16 | 100%
Falcon 3 7B | TII | 7B | 8.3GB VRAM | Q8_0 | 98%
OLMo 2 7B | Allen AI | 7B | 7.7GB VRAM | Q8_0 | 98%
OpenChat 3.5 7B | OpenChat | 7B | 7.7GB VRAM | Q8_0 | 98%
OLMoE 1B-7B | AI2 | 6.9B | 7.3GB VRAM | Q8_0 | 98%
Yi 1.5 6B Chat | 01.AI | 6B | 6.5GB VRAM | Q8_0 | 98%
Gemma 3 4B | Google | 4B | 4.3GB VRAM | Q8_0 | 98%
Nemotron Mini 4B | NVIDIA | 4B | 4.7GB VRAM | Q8_0 | 98%
Danube 3 4B | H2O.ai | 4B | 4.4GB VRAM | Q8_0 | 98%
Phi-3.5 Mini 3.8B | Microsoft | 3.8B | 4.3GB VRAM | Q8_0 | 98%
Phi-4 Mini 3.8B | Microsoft | 3.8B | 4.3GB VRAM | Q8_0 | 98%
Granite 3.0 3B-A800M | IBM | 3.4B | 2.4GB VRAM | Q4_K_M | 85%
Llama 3.2 3B Instruct | Meta | 3.2B | 3.7GB VRAM | Q8_0 | 98%
Qwen 2.5 3B | Alibaba | 3B | 3.9GB VRAM | Q8_0 | 98%
Falcon 3 3B | TII | 3B | 3.8GB VRAM | Q8_0 | 98%
StableLM Zephyr 3B | Stability AI | 3B | 3.3GB VRAM | Q8_0 | 98%
Rocket 3B | Pansophic | 3B | 3.3GB VRAM | Q8_0 | 98%
Gemma 2 2B | Google | 2.6B | 3.1GB VRAM | Q8_0 | 98%
EXAONE 3.5 2.4B | LG AI | 2.4B | 3.1GB VRAM | Q8_0 | 98%
Granite 3.3 2B | IBM | 2B | 3.0GB VRAM | Q8_0 | 98%
SmolLM2 1.7B | HuggingFace | 1.7B | 2.2GB VRAM | Q8_0 | 98%
Qwen 2.5 1.5B | Alibaba | 1.5B | 2.3GB VRAM | Q8_0 | 98%
DeepSeek R1 Distill 1.5B | DeepSeek | 1.5B | 2.3GB VRAM | Q8_0 | 98%
Granite 3.0 1B-A400M | IBM | 1.3B | 1.3GB VRAM | Q4_K_M | 85%
Llama 3.2 1B Instruct | Meta | 1.24B | 2.8GB VRAM | FP16 | 100%
TinyLlama 1.1B | TinyLlama | 1.1B | 1.6GB VRAM | Q8_0 | 98%
Gemma 3 1B | Google | 1B | 1.5GB VRAM | Q8_0 | 98%
Falcon 3 1B | TII | 1B | 2.2GB VRAM | Q8_0 | 98%
Qwen 2.5 0.5B | Alibaba | 0.5B | 1.1GB VRAM | Q8_0 | 98%
Danube 3 500M | H2O.ai | 0.5B | 1.0GB VRAM | Q8_0 | 98%
SmolLM2 360M | HuggingFace | 0.36B | 0.9GB VRAM | Q8_0 | 98%
SmolLM2 135M | HuggingFace | 0.135B | 0.8GB VRAM | FP16 | 100%

💻

Coding (17 models)

👁

Multimodal (6 models)

🎨

Image Generation (9 models)

🎤

Speech Recognition (9 models)

🔊

Text to Speech (14 models)

🎵

Audio Generation (3 models)

🧩

Embedding (5 models)

🔄

Reranker (2 models)

Compatible GPUs (4 with 192GB+ VRAM)

Frequently Asked Questions

What is the best AI model I can run with 192GB VRAM?
The largest model you can run with 192GB VRAM is Qwen3 235B-A22B (235B parameters), which needs 144.0GB at Q4_K_M quantization. In total, 145 models are compatible.
Can I run Llama with 192GB VRAM?
Yes. You can run 10 Llama-family models with 192GB VRAM: Llama 3.1 70B Instruct (FP16), Llama 3.1 70B (lorablated) (BF16), Code Llama 13B Instruct (Q4_K_M), Llama 3.1 8B Instruct (FP16), Dolphin 3.0 Llama 3.1 8B (BF16), Llama 3.1 8B Instruct (abliterated) (BF16), Code Llama 7B (Q8_0), Llama 3.2 3B Instruct (Q8_0), Llama 3.2 1B Instruct (FP16), TinyLlama 1.1B (Q8_0).
What GPU has 192GB VRAM?
GPUs with 192GB or more VRAM include: Apple M4 Ultra, Apple M3 Ultra, Apple M2 Ultra, and AMD Instinct MI300X. On the Apple chips this is unified memory shared with the CPU.
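
The VRAM figures in the Chat / LLM list can be sanity-checked with a rough estimate: parameter count times bytes per weight, plus a small fixed overhead. The sketch below is only an approximation, not the site's actual method. The function names and the 0.5GB overhead are assumptions, the latter inferred from the BF16 entries (for example, 72B at 2 bytes per weight is 144GB, and the list shows 144.5GB). The FP16 and quantized entries on this page do not follow the same fixed overhead, so the sketch covers BF16 only.

```python
# Rough BF16 weight-memory estimate: parameters x 2 bytes, plus a fixed overhead.
# OVERHEAD_GB and both function names are assumptions for illustration only.

BF16_BYTES_PER_PARAM = 2.0  # a 16-bit format stores 2 bytes per weight
OVERHEAD_GB = 0.5           # assumed; matches the gap seen in the BF16 entries listed


def estimate_bf16_vram_gb(params_billion: float) -> float:
    """Estimate VRAM in GB for a model's weights at BF16.

    One billion parameters at 2 bytes each is 2GB (decimal gigabytes).
    """
    return params_billion * BF16_BYTES_PER_PARAM + OVERHEAD_GB


def fits_bf16(params_billion: float, budget_gb: float = 192.0) -> bool:
    """Return True if the BF16 estimate is within the VRAM budget."""
    return estimate_bf16_vram_gb(params_billion) <= budget_gb


if __name__ == "__main__":
    # Parameter counts (in billions) taken from the list above.
    for name, size in [("Magnum v4 72B", 72), ("Cydonia 24B v4.3", 24), ("Qwen3 235B-A22B", 235)]:
        estimate = estimate_bf16_vram_gb(size)
        print(f"{name}: {estimate:.1f}GB at BF16, fits in 192GB: {fits_bf16(size)}")
```

Under these assumptions, Qwen3 235B-A22B comes out far above 192GB at BF16, which is consistent with the page listing it at Q4_K_M instead.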
