Compatible Models: 109
Largest Model: Llama 3.1 70B Instruct (70B)
Categories: 9
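The VRAM figures in the listings below track parameter count times bits per weight fairly closely. As a rough sketch (the bits-per-weight values are our approximations for common GGUF quantization types, not figures from this page), the weight footprint can be estimated like this:

```python
# Approximate effective bits per weight for common GGUF quant types
# (assumption: real files mix tensor types and add metadata/overhead).
BITS_PER_WEIGHT = {
    "Q4_K_M": 4.8,
    "Q5_K_M": 5.7,
    "Q8_0": 8.5,
    "FP16": 16.0,
}

def estimate_vram_gb(params_billion: float, quant: str) -> float:
    """Estimate GB needed for the weights alone (KV cache and activations extra)."""
    return params_billion * BITS_PER_WEIGHT[quant] / 8

# 70B at Q5_K_M lands near the 50.0GB figure listed below for Llama 3.1 70B Instruct.
print(estimate_vram_gb(70, "Q5_K_M"))
```

This is only a lower bound: context length, batch size, and runtime overhead add several GB on top of the weight footprint.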
💬 Chat / LLM (47 models)
Llama 3.1 70B Instruct | Meta | 50.0GB VRAM | Q5_K_M | 90%
Qwen 2.5 32B | Alibaba | 19.0GB VRAM | Q4_K_M | 85%
Gemma 3 27B | Google | 15.9GB VRAM | Q4_K_M | 85%
Mistral Small 22B | Mistral AI | 12.9GB VRAM | Q4_K_M | 85%
Phi-4 | Microsoft | 15.0GB VRAM | Q8_0 | 98%
Qwen 2.5 14B | Alibaba | 15.1GB VRAM | Q8_0 | 98%
Gemma 3 12B | Google | 12.2GB VRAM | Q8_0 | 98%
Mistral Nemo 12B | Mistral AI | 12.6GB VRAM | Q8_0 | 98%
Solar 10.7B | Upstage | 11.1GB VRAM | Q8_0 | 98%
Falcon 3 10B | TII | 10.7GB VRAM | Q8_0 | 98%
Gemma 2 9B Instruct | Google | 9.7GB VRAM | Q8_0 | 98%
Yi 1.5 9B Chat | 01.AI | 9.2GB VRAM | Q8_0 | 98%
DeepSeek R1 Distill 8B | DeepSeek | 8.4GB VRAM | Q8_0 | 98%
Llama 3.1 8B Instruct | Meta | 17.0GB VRAM | FP16 | 100%
Granite 3.3 8B | IBM | 8.6GB VRAM | Q8_0 | 98%
EXAONE 3.5 7.8B | LG AI | 8.2GB VRAM | Q8_0 | 98%
InternLM 2.5 7B | Shanghai AI Lab | 8.2GB VRAM | Q8_0 | 98%
Qwen 2.5 7B Instruct | Alibaba | 9.0GB VRAM | Q8_0 | 98%
Mistral 7B Instruct v0.3 | Mistral AI | 15.5GB VRAM | FP16 | 100%
Falcon 3 7B | TII | 8.3GB VRAM | Q8_0 | 98%
OLMo 2 7B | Allen AI | 7.7GB VRAM | Q8_0 | 98%
OpenChat 3.5 7B | OpenChat | 7.7GB VRAM | Q8_0 | 98%
Yi 1.5 6B Chat | 01.AI | 6.5GB VRAM | Q8_0 | 98%
Gemma 3 4B | Google | 4.3GB VRAM | Q8_0 | 98%
Nemotron Mini 4B | NVIDIA | 4.7GB VRAM | Q8_0 | 98%
Danube 3 4B | H2O.ai | 4.4GB VRAM | Q8_0 | 98%
Phi-3.5 Mini 3.8B | Microsoft | 4.3GB VRAM | Q8_0 | 98%
Phi-4 Mini 3.8B | Microsoft | 4.3GB VRAM | Q8_0 | 98%
Llama 3.2 3B Instruct | Meta | 3.7GB VRAM | Q8_0 | 98%
Qwen 2.5 3B | Alibaba | 3.9GB VRAM | Q8_0 | 98%
Falcon 3 3B | TII | 3.8GB VRAM | Q8_0 | 98%
StableLM Zephyr 3B | Stability AI | 3.3GB VRAM | Q8_0 | 98%
Rocket 3B | Pansophic | 3.3GB VRAM | Q8_0 | 98%
Gemma 2 2B | Google | 3.1GB VRAM | Q8_0 | 98%
EXAONE 3.5 2.4B | LG AI | 3.1GB VRAM | Q8_0 | 98%
Granite 3.3 2B | IBM | 3.0GB VRAM | Q8_0 | 98%
SmolLM2 1.7B | HuggingFace | 2.2GB VRAM | Q8_0 | 98%
Qwen 2.5 1.5B | Alibaba | 2.3GB VRAM | Q8_0 | 98%
DeepSeek R1 Distill 1.5B | DeepSeek | 2.3GB VRAM | Q8_0 | 98%
Llama 3.2 1B Instruct | Meta | 2.8GB VRAM | FP16 | 100%
TinyLlama 1.1B | TinyLlama | 1.6GB VRAM | Q8_0 | 98%
Gemma 3 1B | Google | 1.5GB VRAM | Q8_0 | 98%
Falcon 3 1B | TII | 2.2GB VRAM | Q8_0 | 98%
Qwen 2.5 0.5B | Alibaba | 1.1GB VRAM | Q8_0 | 98%
Danube 3 500M | H2O.ai | 1.0GB VRAM | Q8_0 | 98%
SmolLM2 360M | HuggingFace | 0.9GB VRAM | Q8_0 | 98%
SmolLM2 135M | HuggingFace | 0.8GB VRAM | FP16 | 100%
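Filtering a list like the one above against a specific VRAM budget is a one-liner. A minimal sketch (the `Model` class and the four sample entries are ours, with VRAM values copied from the listing):

```python
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    vram_gb: float
    quant: str

# A handful of entries copied from the Chat / LLM list above.
CHAT_MODELS = [
    Model("Llama 3.1 70B Instruct", 50.0, "Q5_K_M"),
    Model("Qwen 2.5 32B", 19.0, "Q4_K_M"),
    Model("Llama 3.1 8B Instruct", 17.0, "FP16"),
    Model("Qwen 2.5 0.5B", 1.1, "Q8_0"),
]

def models_that_fit(budget_gb: float, models: list[Model]) -> list[Model]:
    """Return models whose listed VRAM requirement fits the budget, largest first."""
    return sorted(
        (m for m in models if m.vram_gb <= budget_gb),
        key=lambda m: m.vram_gb,
        reverse=True,
    )

for m in models_that_fit(24.0, CHAT_MODELS):
    print(f"{m.name}: {m.vram_gb}GB ({m.quant})")
```

With a 24GB budget, Qwen 2.5 32B at Q4_K_M is the largest of these sample entries that fits; the 70B model does not.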
💻 Coding (16 models)
Qwen 2.5 Coder 14B | Alibaba | 15.1GB VRAM | Q8_0 | 98%
Code Llama 13B Instruct | Meta | 7.8GB VRAM | Q4_K_M | 85%
Yi Coder 9B | 01.AI | 9.2GB VRAM | Q8_0 | 98%
CodeGemma 7B | Google | 8.9GB VRAM | Q8_0 | 98%
Qwen 2.5 Coder 7B | Alibaba | 8.0GB VRAM | Q8_0 | 98%
StarCoder2 7B | BigCode | 7.6GB VRAM | Q8_0 | 98%
Code Llama 7B | Meta | 7.2GB VRAM | Q8_0 | 98%
DeepSeek Coder 6.7B | DeepSeek | 7.2GB VRAM | Q8_0 | 98%
Qwen 2.5 Coder 3B | Alibaba | 3.9GB VRAM | Q8_0 | 98%
StarCoder2 3B | BigCode | 3.5GB VRAM | Q8_0 | 98%
Stable Code 3B | Stability AI | 3.3GB VRAM | Q8_0 | 98%
CodeGemma 2B | Google | 3.0GB VRAM | Q8_0 | 98%
Qwen 2.5 Coder 1.5B | Alibaba | 2.3GB VRAM | Q8_0 | 98%
Yi Coder 1.5B | 01.AI | 2.0GB VRAM | Q8_0 | 98%
DeepSeek Coder 1.3B | DeepSeek | 1.8GB VRAM | Q8_0 | 98%
Qwen 2.5 Coder 0.5B | Alibaba | 1.1GB VRAM | Q8_0 | 98%
👁 Multimodal (6 models)
🎨 Image Generation (9 models)
FLUX.1 Schnell (GGUF) | Black Forest Labs | 14.0GB VRAM | Q5_0 | 90%
FLUX.1 Dev (GGUF) | Black Forest Labs | 14.0GB VRAM | Q5_0 | 100%
Stable Diffusion XL (CoreML) | Stability AI | 3.3GB VRAM | CoreML | 100%
SDXL Turbo (GGUF) | Stability AI | 5.0GB VRAM | Q5_0 | 85%
Stable Diffusion 3 Medium (GGUF) | Stability AI | 9.2GB VRAM | Q8_0 | 95%
Stable Diffusion 2.1 Base (CoreML) | Stability AI / Apple | 1.6GB VRAM | CoreML-Palettized | 85%
Stable Diffusion 1.5 (CoreML) | Runway | 2.5GB VRAM | CoreML-Palettized | 90%
Stable Diffusion 1.5 (GGUF) | Runway / GPUStack | 2.3GB VRAM | Q8_0 | 95%
Stable Diffusion 2.1 (GGUF) | Stability AI | 2.7GB VRAM | Q8_0 | 95%
🎤 Speech Recognition (9 models)
Whisper Large v3 | OpenAI | 3.4GB VRAM | Q8_0 | 98%
Whisper Large v3 Turbo | OpenAI | 2.0GB VRAM | Q8_0 | 95%
Whisper Medium | OpenAI | 1.9GB VRAM | Q8_0 | 92%
Distil-Whisper Large v3 | HuggingFace | 1.9GB VRAM | Q8_0 | 96%
Whisper Small | OpenAI | 0.9GB VRAM | Q8_0 | 85%
Whisper Base | OpenAI | 0.3GB VRAM | Q8_0 | 80%
Whisper Base English | OpenAI | 0.3GB VRAM | Q8_0 | 82%
Whisper Tiny English (Quantized) | OpenAI | 0.1GB VRAM | Q5_1 | 65%
Whisper Tiny | OpenAI | 0.2GB VRAM | Q8_0 | 70%
🔊 Text to Speech (14 models)
Kokoro 82M TTS | Kokoro | 0.6GB VRAM | ONNX-Q8F16 | 95%
Piper TTS - Amy (English) | Rhasspy | 0.1GB VRAM | ONNX | 85%
Piper TTS - Lessac (English) | Rhasspy | 0.1GB VRAM | ONNX | 85%
Piper TTS - LibriTTS-R (English) | Rhasspy | 0.6GB VRAM | ONNX | 80%
Piper TTS - Spanish (MLS) | Rhasspy | 0.1GB VRAM | ONNX | 80%
Piper TTS - French (Siwis) | Rhasspy | 0.5GB VRAM | ONNX | 80%
Piper TTS - German (Thorsten) | Rhasspy | 0.1GB VRAM | ONNX | 80%
Piper TTS - Chinese (Huayan) | Rhasspy | 0.1GB VRAM | ONNX | 80%
Piper TTS - Japanese (Kokoro) | Rhasspy | 0.1GB VRAM | ONNX | 80%
Piper TTS - Korean | Rhasspy | 0.1GB VRAM | ONNX | 80%
Piper TTS - Russian (Irina) | Rhasspy | 0.1GB VRAM | ONNX | 80%
Piper TTS - Portuguese (Faber) | Rhasspy | 0.1GB VRAM | ONNX | 80%
Piper TTS - Italian (Riccardo) | Rhasspy | 0.5GB VRAM | ONNX | 80%
Piper TTS - Arabic (Kareem) | Rhasspy | 0.1GB VRAM | ONNX | 80%
🎵 Audio Generation (1 model)
🧩 Embedding (5 models)
🔄 Reranker (2 models)
Compatible GPUs (12 with 64GB+ VRAM)
Apple M1 Max | 64GB
NVIDIA A100 80GB | 80GB | $15,000
NVIDIA H100 | 80GB | $30,000
Apple M2 Max | 96GB
Apple M4 Max | 128GB
Apple M3 Max | 128GB
Apple M1 Ultra | 128GB
AMD Instinct MI250X | 128GB | $10,000
Apple M4 Ultra | 192GB
Apple M3 Ultra | 192GB
Apple M2 Ultra | 192GB
AMD Instinct MI300X | 192GB | $15,000
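When price matters, the same list can be queried for the cheapest card that clears a VRAM floor. A small sketch (the entries are copied from the GPU list above; the helper and the use of `None` for unpriced Apple parts are our own conventions):

```python
# Entries from the GPU list above. Apple parts carry no price in the listing,
# so their price field is None (assumption: treat them as unpriced).
GPUS = [
    ("Apple M1 Max", 64, None),
    ("NVIDIA A100 80GB", 80, 15_000),
    ("NVIDIA H100", 80, 30_000),
    ("AMD Instinct MI250X", 128, 10_000),
    ("AMD Instinct MI300X", 192, 15_000),
]

def cheapest_with_vram(min_gb: int):
    """Cheapest priced GPU meeting a minimum VRAM requirement, or None."""
    priced = [g for g in GPUS if g[1] >= min_gb and g[2] is not None]
    return min(priced, key=lambda g: g[2], default=None)

print(cheapest_with_vram(128))
```

Among the priced entries here, the MI250X is the cheapest option at 128GB, despite the MI300X offering more memory.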
Frequently Asked Questions
What is the best AI model I can run with 64GB VRAM?
The largest model you can run with 64GB of VRAM is Llama 3.1 70B Instruct (70B parameters) at Q5_K_M quantization. In total, 109 models are compatible.
Can I run Llama with 64GB VRAM?
Yes. Seven Llama-family models run within 64GB of VRAM: Llama 3.1 70B Instruct (Q5_K_M), Code Llama 13B Instruct (Q4_K_M), Llama 3.1 8B Instruct (FP16), Code Llama 7B (Q8_0), Llama 3.2 3B Instruct (Q8_0), Llama 3.2 1B Instruct (FP16), and TinyLlama 1.1B (Q8_0).
What GPU has 64GB VRAM?
GPUs with 64GB or more of VRAM include the Apple M1 Max, NVIDIA A100 80GB, NVIDIA H100, Apple M2 Max, Apple M4 Max, and 7 more listed above.