~/runthismodel
daemon ok · build 5a3c91d · 00:00:00Z
./rankings · top-of-list · 137 models across 8 categories · sorted by min VRAM asc · params desc as tie-break
top of list
best model in each category, ranked
rank #1 = the smallest min VRAM in the category that still ships quality. ties on min VRAM are broken by parameter count, larger first.
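The ordering rule above (min VRAM ascending, parameter count descending on ties) can be sketched as a two-part sort key. This is a minimal sketch, not the site's code: the `Model` type and `rank` function are hypothetical names, and the sample rows use the rounded values shown in the tables. Some rows that display the same rounded min VRAM are not listed in params-descending order, which may mean the site sorts on unrounded values; that is a guess.

```python
from typing import List, NamedTuple


class Model(NamedTuple):
    name: str
    params_b: float     # parameter count, in billions (as displayed)
    min_vram_gb: float  # min VRAM, in GB (as displayed, rounded to 0.1)


def rank(models: List[Model]) -> List[Model]:
    """Order by min VRAM ascending; break ties by parameter count descending."""
    return sorted(models, key=lambda m: (m.min_vram_gb, -m.params_b))


# Four rows copied from the chat & general table, deliberately shuffled.
sample = [
    Model("Gemma 2 2B", 2.6, 2.1),
    Model("SmolLM2 135M", 0.135, 0.6),
    Model("StableLM Zephyr 3B", 3.0, 2.1),
    Model("Qwen 2.5 0.5B", 0.5, 1.0),
]

for position, model in enumerate(rank(sample), start=1):
    print(f"{position} · {model.name} · {model.params_b}B · {model.min_vram_gb}GB")
```

Negating the parameter count inside the key keeps the whole rule in one `sorted` call, so the two 2.1GB rows come out with the 3B model ahead of the 2.6B one.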
chat & general 74 · coding 17 · image gen 9 · speech-to-text 9 · text-to-speech 14 · audio gen 3 · multimodal / vision 6 · embedding 5
chat & general · 74 models · llm
general-purpose language models for conversation, writing, and reasoning
rank · model · author · params · min vram
1 · SmolLM2 135M · HuggingFace · 0.135B · 0.6GB
    Tiny 135M model. Default LLM - guaranteed to run on any iPhone. Only 145MB download. Per
2 · SmolLM2 360M · HuggingFace · 0.36B · 0.8GB
    Compact 360M model. Good for basic tasks on very constrained devices.
3 · Danube 3 500M · H2O.ai · 0.5B · 0.8GB
    Ultra-tiny 500M model. Even smaller than SmolLM. Runs anywhere.
4 · Qwen 2.5 0.5B · Alibaba · 0.5B · 1.0GB
    Ultra-small 0.5B model from Alibaba. Minimal resource requirements.
5 · TinyLlama 1.1B · TinyLlama · 1.1B · 1.1GB
    Lightweight 1.1B chat model based on Llama architecture. Great for phones.
6 · Llama 3.2 1B Instruct · Meta · 1.24B · 1.3GB
    Ultra-compact 1B model. Runs on virtually any device including smartphones.
7 · Gemma 3 1B · Google · 1B · 1.3GB
    Google's latest tiny 1B model. Excellent quality for its size.
8 · Granite 3.0 1B-A400M · IBM · 1.3B · 1.3GB
    Tiny IBM MoE for edge and CPU inference. 1.3 B total, only 400 M active.
9 · SmolLM2 1.7B · HuggingFace · 1.7B · 1.5GB
    Capable 1.7B model from HuggingFace. Good balance for mobile devices.
10 · Falcon 3 1B · TII · 1B · 1.5GB
    Ultra-compact 1B model from Technology Innovation Institute.
11 · Qwen 2.5 1.5B · Alibaba · 1.5B · 1.5GB
    Compact 1.5B model with strong multilingual and coding abilities.
12 · DeepSeek R1 Distill 1.5B · DeepSeek · 1.5B · 1.5GB
    Compact reasoning model distilled from DeepSeek R1. Strong chain-of-thought in a tiny pa
13 · Granite 3.3 2B · IBM · 2B · 1.9GB
    IBM's compact 2B model. Good at following instructions.
14 · EXAONE 3.5 2.4B · LG AI · 2.4B · 2.0GB
    Compact model from LG. Optimized for Korean and English.
15 · StableLM Zephyr 3B · Stability AI · 3B · 2.1GB
    Compact 3B model from Stability AI. Good chat quality for its size.
16 · Rocket 3B · Pansophic · 3B · 2.1GB
    Fast 3B model tuned for helpful responses.
17 · Gemma 2 2B · Google · 2.6B · 2.1GB
    Google's compact 2.6B model. Efficient and capable for mobile use.
18 · Falcon 3 3B · TII · 3B · 2.4GB
    Compact 3B Falcon model with good performance.
19 · Llama 3.2 3B Instruct · Meta · 3.2B · 2.4GB
    Meta's compact 3B model designed for edge and mobile deployment.
20 · Granite 3.0 3B-A800M · IBM · 3.4B · 2.4GB
    IBM enterprise-grade small MoE. 3.4 B total, 800 M active. Long context, function-callin
21 · Qwen 2.5 3B · Alibaba · 3B · 2.5GB
    Versatile 3B model with strong reasoning and multilingual capabilities.
22 · Danube 3 4B · H2O.ai · 4B · 2.7GB
    Capable 4B model from H2O.ai. Good for phones.
23 · Phi-3.5 Mini 3.8B · Microsoft · 3.8B · 2.7GB
    Tiny but capable 3.8B model. Runs on almost any hardware including phones.
24 · Gemma 3 4B · Google · 4B · 2.8GB
    Balanced 4B model with strong reasoning. Great for iPhones.
25 · Phi-4 Mini 3.8B · Microsoft · 3.8B · 2.8GB
    Latest Phi mini with strong reasoning. Drop-in upgrade from Phi-3.5 Mini.
26 · Nemotron Mini 4B · NVIDIA · 4B · 3.0GB
    NVIDIA's compact 4B model optimized for edge deployment.
27 · Yi 1.5 6B Chat · 01.AI · 6B · 3.9GB
    Efficient 6B bilingual (English/Chinese) model.
28 · OLMoE 1B-7B · AI2 · 6.9B · 4.4GB
    Fully open MoE - 7 B total, only 1.3 B active per token. Tiny footprint, surprisingly ca
29 · Mistral 7B Instruct v0.3 · Mistral AI · 7.3B · 4.6GB
    Efficient 7B model from Mistral AI with strong performance for its size.
30 · OpenChat 3.5 7B · OpenChat · 7B · 4.6GB
    Fine-tuned Mistral 7B for chat. Strong instruction following.
31 · OLMo 2 7B · Allen AI · 7B · 4.7GB
    Fully open research model. Transparent training.
32 · InternLM 2.5 7B · Shanghai AI Lab · 7.7B · 4.9GB
    Strong 7B model from China. Good at tool use and math.
33 · EXAONE 3.5 7.8B · LG AI · 7.8B · 4.9GB
    7.8B model from LG. Strong bilingual Korean/English.
34 · Falcon 3 7B · TII · 7B · 5.0GB
    Full-size Falcon 3 with strong performance across benchmarks.
35 · DeepSeek R1 Distill 8B · DeepSeek · 8B · 5.1GB
    Compact reasoning model. Good reasoning capabilities in a small package.
36 · Llama 3.1 8B Instruct · Meta · 8B · 5.1GB
    Meta's 8B parameter instruction-tuned model. Great balance of performance and efficiency
37 · Dolphin 3.0 Llama 3.1 8B · Cognitive Computations · 8B · 5.1GB
    Eric Hartford's flagship uncensored fine-tune of Llama 3.1 8B. Steerable assistant with
38 · NeuralDaredevil 8B (abliterated) · mlabonne · 8B · 5.1GB
    Llama-3 8B with refusal direction ablated, then DPO-recovered to restore capability. Bes
39 · Llama 3.1 8B Instruct (abliterated) · mlabonne · 8B · 5.1GB
    Pure refusal-direction ablation of Llama-3.1-8B-Instruct. No retraining - keeps the offi
40 · Stheno L3 8B v3.2 · Sao10K · 8B · 5.1GB
    Long-running 8B roleplay reference. Trained for character voice consistency and long-for
41 · Granite 3.3 8B · IBM · 8B · 5.1GB
    IBM's 8B instruction model. Enterprise quality.
42 · Qwen3 8B Base · Alibaba · 8B · 5.3GB
    Official Qwen3 8B foundation model - pretrained only, no RLHF or refusal training. The '
43 · Qwen 2.5 7B Instruct · Alibaba · 7.6B · 5.3GB
    Efficient 7B model with strong coding and reasoning abilities.
44 · Yi 1.5 9B Chat · 01.AI · 9B · 5.5GB
    9B bilingual model with strong reasoning.
45 · Gemma 2 9B Instruct · Google · 9.2B · 5.9GB
    Google's efficient 9B model. Great performance-to-size ratio.
46 · Falcon 3 10B · TII · 10B · 6.4GB
    10B Falcon model. Good iPad model.
47 · Solar 10.7B · Upstage · 10.7B · 6.5GB
    Depth-upscaled 10.7B model. Strong reasoning.
48 · Gemma 3 MoE 9B · Google · 9B · 7.0GB
    Gemma 3 MoE variant. 9 B total, 2.5 B active. Strong fit for 12 GB cards.
49 · Gemma 3 12B · Google · 12B · 7.3GB
    High quality 12B model. Excellent for iPad Pro and Mac.
50 · Mistral Nemo 12B · Mistral AI · 12B · 7.5GB
    Mistral's 12B model with excellent instruction following.
51 · Magnum v4 12B · Anthracite · 12B · 7.5GB
    Mistral-Nemo-12B fine-tuned on curated Claude-style prose data. Built for long-form crea
52 · Rocinante 12B v1.1 · TheDrummer · 12B · 7.5GB
    Mistral-Nemo-12B roleplay fine-tune optimized for character chat. Stable workhorse for t
53 · Mistral Nemo Base 12B · Mistral AI · 12B · 7.7GB
    Official Mistral-Nemo 12B foundation model (NVIDIA collab) - pretrained only, no instruc
54 · Qwen 2.5 14B · Alibaba · 14B · 8.9GB
    Strong 14B model with excellent coding and reasoning. iPad Pro recommended.
55 · Phi-4 · Microsoft · 14B · 8.9GB
    Microsoft's 14B parameter model. Punches well above its weight on reasoning.
56 · Rocinante XL 16B v1 · TheDrummer · 16B · 9.6GB
    Newest Rocinante release - 16B upscaled Mistral-Nemo for richer prose at the 12-16GB tie
57 · DeepSeek MoE 16B · DeepSeek · 16.4B · 11.0GB
    DeepSeek first MoE - 16.4 B total, 2.8 B active. The original consumer-runnable open MoE
58 · Mistral Small 22B · Mistral AI · 22B · 12.9GB
    22B parameter model. Strong reasoning and multilingual. Needs 16GB+ RAM.
59 · Magnum v4 22B · Anthracite · 22B · 12.9GB
    Mistral-Small-22B base, Anthracite's Claude-style prose training. Sits between 12B and 7
60 · Dolphin 3.0 R1 Mistral 24B · Cognitive Computations · 24B · 13.8GB
    Only widely-available uncensored R1-style reasoning model. Mistral-Small-24B base with c
61 · Cydonia 24B v4.3 · TheDrummer · 24B · 13.8GB
    Top-of-line 24B roleplay model, Mistral-Small-3.2-24B base. Active development cycle - T
62 · Dolphin Mistral 24B (Venice Edition) · Cognitive Computations · 24B · 14.9GB
    Headline 24B uncensored pick - top community engagement among uncensored models on HF. S
63 · Gemma 3 27B · Google · 27B · 15.9GB
    Google's flagship open model. Near GPT-4 quality. Needs 20GB+ RAM.
64 · Skyfall 31B v4.2 · TheDrummer · 31B · 18.2GB
    31B creative-writing model - sweet spot between 24B and 70B. Built on Mistral-Small-3.1
65 · Qwen 2.5 32B · Alibaba · 32B · 19.0GB
    Premium 32B model. Top-tier reasoning. Mac with 32GB+ RAM.
66 · Qwen3 30B-A3B · Alibaba · 30.5B · 20.0GB
    Mixture-of-Experts model with 30 B total parameters but only 3 B active per token. Runs
67 · Phi-3.5 MoE · Microsoft · 41.9B · 24.1GB
    Microsoft MoE - 16 experts of 3.8 B, 6.6 B active per token. Strong reasoning at modest
68 · Mixtral 8x7B Instruct · Mistral AI · 46.7B · 25.1GB
    The OG public MoE - 8 experts, 2 active per token, 47 B total / 13 B active. Apache-2.0.
69 · Llama 3.1 70B Instruct · Meta · 70B · 40.1GB
    Meta's flagship 70B parameter model. Excellent performance rivaling GPT-4 on many benchm
70 · Euryale L3.3 70B v2.3 · Sao10K · 70B · 40.1GB
    Canonical 70B creative-writing and roleplay model. Llama-3.3-70B base with extended trai
71 · Llama 3.1 70B (lorablated) · mlabonne · 70B · 40.1GB
    Llama-3.1-70B-Instruct with abliteration applied via LoRA merge. Cleanest 70B refusal-re
72 · Magnum v4 72B · Anthracite · 72B · 44.7GB
    Qwen2.5-72B fine-tuned on Claude-Opus-style literary data. Highest-quality long-form pro
73 · Mixtral 8x22B Instruct · Mistral AI · 141B · 88.0GB
    141 B total / 39 B active MoE. Larger Mixtral; needs serious hardware.
74 · Qwen3 235B-A22B · Alibaba · 235B · 144.0GB
    Flagship MoE - 235 B total parameters, 22 B active. Frontier quality but needs 80 GB+ VR
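Several rows in this table are MoE models whose descriptions give both total and active parameter counts. A small sketch can put those two numbers next to the listed min VRAM. The figures below are copied from the descriptions and the min vram column; the `MoeRow` type is a hypothetical name, and the page does not say how min VRAM is computed, so this only summarizes the listed values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MoeRow:
    name: str
    total_b: float      # total parameters in billions, from the description
    active_b: float     # active parameters per token in billions, from the description
    min_vram_gb: float  # value from the min vram column

    @property
    def active_fraction(self) -> float:
        """Share of the parameters used per token."""
        return self.active_b / self.total_b


ROWS = [
    MoeRow("Granite 3.0 1B-A400M", 1.3, 0.4, 1.3),
    MoeRow("Granite 3.0 3B-A800M", 3.4, 0.8, 2.4),
    MoeRow("OLMoE 1B-7B", 7.0, 1.3, 4.4),
    MoeRow("Gemma 3 MoE 9B", 9.0, 2.5, 7.0),
    MoeRow("DeepSeek MoE 16B", 16.4, 2.8, 11.0),
    MoeRow("Qwen3 30B-A3B", 30.0, 3.0, 20.0),
    MoeRow("Mixtral 8x7B Instruct", 47.0, 13.0, 25.1),
    MoeRow("Mixtral 8x22B Instruct", 141.0, 39.0, 88.0),
    MoeRow("Qwen3 235B-A22B", 235.0, 22.0, 144.0),
]

for row in ROWS:
    print(
        f"{row.name:24s} total {row.total_b:6.1f}B  "
        f"active {row.active_fraction:5.1%}  min VRAM {row.min_vram_gb:6.1f}GB"
    )
```

In these rows the listed min VRAM rises with total parameters, not active ones: Qwen3 235B-A22B has fewer active parameters than Mixtral 8x22B (22 B against 39 B) but a larger min VRAM (144.0GB against 88.0GB).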
coding · 17 models · code
specialized models for code generation, completion, and debugging
rank · model · author · params · min vram
1 · Qwen 2.5 Coder 0.5B · Alibaba · 0.5B · 1.1GB
    Smallest code model. Default code assistant - runs on any iPhone. Great for code complet
2 · DeepSeek Coder 1.3B · DeepSeek · 1.3B · 1.3GB
    Compact code model with strong coding capabilities. Great for mobile coding assistants.
3 · Yi Coder 1.5B · 01.AI · 1.5B · 1.4GB
    Tiny code model. Great for phones. Fast completions.
4 · Qwen 2.5 Coder 1.5B · Alibaba · 1.5B · 1.5GB
    Compact code model with solid code generation and understanding abilities.
5 · CodeGemma 2B · Google · 2B · 2.0GB
    Lightweight code completion model from Google. Fast on-device code suggestions.
6 · Stable Code 3B · Stability AI · 3B · 2.1GB
    Compact code model with good completion quality.
7 · StarCoder2 3B · BigCode · 3B · 2.3GB
    Code completion model trained on The Stack v2. 600+ languages.
8 · Qwen 2.5 Coder 3B · Alibaba · 3B · 2.5GB
    Capable 3B code model. Good balance of coding ability and resource usage.
9 · Code Llama 7B · Meta · 7B · 4.3GB
    Meta's code-specialized Llama model. Good at code completion.
10 · DeepSeek Coder 6.7B · DeepSeek · 6.7B · 4.3GB
    Powerful 6.7B code model with excellent code generation across many languages.
11 · StarCoder2 7B · BigCode · 7B · 4.7GB
    Larger code model with better completions.
12 · Qwen 2.5 Coder 7B · Alibaba · 7.6B · 4.9GB
    Strong 7B code model rivaling larger coding models. Excellent for local development.
13 · Yi Coder 9B · 01.AI · 9B · 5.5GB
    Strong 9B code model with good reasoning.
14 · CodeGemma 7B · Google · 8.5B · 5.5GB
    Google's instruction-tuned code model. Strong code generation and understanding.
15 · Code Llama 13B Instruct · Meta · 13B · 7.8GB
    13B code model for complex tasks. iPad Pro recommended.
16 · Qwen 2.5 Coder 14B · Alibaba · 14B · 8.9GB
    Powerful 14B code model. Excellent for complex programming tasks.
17 · Codestral 22B (abliterated) · failspy · 22B · 12.9GB
    Mistral Codestral with refusal direction ablated. Code-specialized model without the 'I
image gen · 9 models · image
text-to-image models for art, photos, and design
rank · model · author · params · min vram
1 · Stable Diffusion 2.1 Base (CoreML) · Stability AI / Apple · 0.86B · 1.6GB
    Smallest CoreML image generation model. Palettized for minimal size (1.14GB). Runs on an
2 · Stable Diffusion 1.5 (GGUF) · Runway / GPUStack · 0.86B · 2.1GB
    SD 1.5 in single-file GGUF format. Alternative to CoreML. Uses stable-diffusion.cpp with
3 · Stable Diffusion 1.5 (CoreML) · Runway · 0.86B · 2.5GB
    Classic image generation model. Pre-converted to CoreML for iOS/Mac. Downloads as zip, a
4 · Stable Diffusion 2.1 (GGUF) · Stability AI · 0.86B · 2.7GB
    SD 2.1 in GGUF format. Better quality than 1.5.
5 · Stable Diffusion XL (CoreML) · Stability AI · 3.5B · 3.3GB
    Higher quality image generation. CoreML optimized for iOS. Requires 6GB+ usable memory (
6 · SDXL Turbo (GGUF) · Stability AI · 3.5B · 5.0GB
    Single-step SDXL. Near-instant image generation.
7 · Stable Diffusion 3 Medium (GGUF) · Stability AI · 2.5B · 9.2GB
    SD 3 with MMDiT architecture. Superior text rendering.
8 · FLUX.1 Schnell (GGUF) · Black Forest Labs · 12B · 14.0GB
    Fast 1-4 step generation. State-of-the-art quality. Needs 16GB+ RAM.
9 · FLUX.1 Dev (GGUF) · Black Forest Labs · 12B · 14.0GB
    Highest quality FLUX model. 20-50 steps. Mac with 24GB+ RAM.
speech-to-text · 9 models · stt
transcription and speech recognition models
rank · model · author · params · min vram
1 · Whisper Tiny English (Quantized) · OpenAI · 0.039B · 0.1GB
    Smallest possible speech recognition model. Only 32MB. English only. Default speech mode
2 · Whisper Tiny · OpenAI · 0.039B · 0.2GB
    Tiny multilingual speech recognition. Only 75MB. Supports 99 languages. Runs on any devi
3 · Whisper Base · OpenAI · 0.074B · 0.3GB
    Base whisper model. Good balance of speed and accuracy. 142MB.
4 · Whisper Base English · OpenAI · 0.074B · 0.3GB
    English-only base model. Faster and more accurate for English.
5 · Whisper Small · OpenAI · 0.24B · 0.9GB
    Compact Whisper model. Good accuracy for everyday transcription tasks.
6 · Distil-Whisper Large v3 · HuggingFace · 0.76B · 1.9GB
    Distilled Whisper. 6x faster than large-v3 with 1% accuracy loss.
7 · Whisper Medium · OpenAI · 0.77B · 1.9GB
    Mid-size Whisper model. Strong multilingual speech recognition.
8 · Whisper Large v3 Turbo · OpenAI · 0.81B · 2.0GB
    Optimized large Whisper model. Near-best accuracy with faster inference.
9 · Whisper Large v3 · OpenAI · 1.55B · 3.4GB
    Largest Whisper model. Best accuracy across all languages and accents.
text-to-speech · 14 models · tts
voice synthesis and text-to-speech models
rank · model · author · params · min vram
1 · Piper TTS - Amy (English) · Rhasspy · 0.02B · 0.1GB
    Lightweight TTS voice. High quality English speech synthesis. Default TTS model - runs o
2 · Piper TTS - Lessac (English) · Rhasspy · 0.02B · 0.1GB
    High quality English male voice. 63MB download. Runs on any device.
3 · Piper TTS - Spanish (MLS) · Rhasspy · 0.02B · 0.1GB
    Spanish female voice. Natural prosody.
4 · Piper TTS - German (Thorsten) · Rhasspy · 0.02B · 0.1GB
    German male voice.
5 · Piper TTS - Chinese (Huayan) · Rhasspy · 0.02B · 0.1GB
    Chinese Mandarin voice.
6 · Piper TTS - Japanese (Kokoro) · Rhasspy · 0.02B · 0.1GB
    Japanese voice.
7 · Piper TTS - Korean · Rhasspy · 0.02B · 0.1GB
    Korean voice.
8 · Piper TTS - Russian (Irina) · Rhasspy · 0.02B · 0.1GB
    Russian female voice.
9 · Piper TTS - Portuguese (Faber) · Rhasspy · 0.02B · 0.1GB
    Portuguese voice.
10 · Piper TTS - Arabic (Kareem) · Rhasspy · 0.02B · 0.1GB
    Arabic voice.
11 · Piper TTS - French (Siwis) · Rhasspy · 0.02B · 0.5GB
    French female voice.
12 · Piper TTS - Italian (Riccardo) · Rhasspy · 0.02B · 0.5GB
    Italian male voice.
13 · Piper TTS - LibriTTS-R (English) · Rhasspy · 0.02B · 0.6GB
    Medium quality English voice with natural prosody. 63MB download.
14 · Kokoro 82M TTS · Kokoro · 0.082B · 0.6GB
    High quality 82M parameter TTS model. Excellent speech synthesis with multiple voice opt
audio gen · 3 models · audio
AI music and audio creation
rank · model · author · params · min vram
1 · MusicGen Small · Meta · 0.3B · 0.8GB
    Music generation from text prompts. Requires multiple ONNX files (~435MB total). Experim
2 · Stable Audio Open · Stability AI · 1B · 6.0GB
    47-second variable-length audio generation. Sound effects and short loops.
3 · ACE-Step 1.5XL · ACE Studio · 1.5B · 8.0GB
    Music generation rivaling Suno. Generates structured songs with vocals from a text promp
multimodal / vision · 6 models · vlm
models that understand both images and text
rank · model · author · params · min vram
1 · Qwen2-VL 2B · Alibaba · 2.2B · 1.4GB
    Compact vision-language model. Default multimodal model. Can understand images and answe
2 · Moondream 2 · Moondream · 1.8B · 1.5GB
    Ultra-compact vision model. Only 1GB. Answers questions about images.
3 · MiniCPM-V 2.6 · OpenBMB · 2B · 2.1GB
    Efficient multimodal model with strong image understanding. Optimized for edge devices.
4 · PaliGemma 3B · Google · 3B · 2.5GB
    Google's vision model. Strong at visual QA, captioning, and OCR.
5 · Phi-3.5 Vision · Microsoft · 4.2B · 3.2GB
    Vision-language model from Microsoft. Can understand images and documents.
6 · LLaVA 1.6 7B · LLaVA · 7B · 5.0GB
    Multimodal vision-language model. Understands images and answers questions about them.
embedding · 5 models · embed
text embedding models for search and retrieval
rank · model · author · params · min vram
1 · BGE Small EN v1.5 · BAAI · 0.033B · 0.1GB
    Compact English embedding model. Good for basic semantic search.
2 · Snowflake Arctic Embed S · Snowflake · 0.033B · 0.1GB
    Compact embedding model from Snowflake. Good multilingual support.
3 · all-MiniLM-L6-v2 · Sentence Transformers · 0.023B · 0.1GB
    Tiny embedding model. Only 23MB. Perfect for on-device search.
4 · Nomic Embed Text v1.5 · Nomic AI · 0.137B · 0.3GB
    High quality text embedding model. 137M params. Good for RAG and search.
5 · BGE Large EN v1.5 · BAAI · 0.335B · 0.8GB
    High quality English embedding model. Best accuracy for English search.
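Every table lists both params and min vram, so dividing one by the other gives a rough GB-per-billion-parameters figure for any row. This is only arithmetic on the listed numbers: the page does not state the quantization, context length, or runtime overhead behind min vram, so the ratio should not be read as a weight precision. The function name and the choice of rows are illustrative.

```python
def gb_per_billion_params(params_b: float, min_vram_gb: float) -> float:
    """Listed min VRAM divided by listed parameter count (GB per billion params)."""
    if params_b <= 0:
        raise ValueError("params_b must be positive")
    return min_vram_gb / params_b


# (model, params in billions, min VRAM in GB), copied from the tables above.
ROWS = [
    ("SmolLM2 135M", 0.135, 0.6),
    ("Llama 3.1 70B Instruct", 70.0, 40.1),
    ("Whisper Large v3", 1.55, 3.4),
    ("BGE Large EN v1.5", 0.335, 0.8),
]

for name, params_b, min_vram_gb in ROWS:
    ratio = gb_per_billion_params(params_b, min_vram_gb)
    print(f"{name:24s} {ratio:5.2f} GB per billion params")
```

The ratio varies a lot between rows and categories, which is one reason to compare models on the min vram column directly instead of estimating it from parameter count.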
cloud://gpu · escape hatch
can't run the model you want?
cloud GPUs give you instant access to any model, any size.
runpod · from $0.25/hr · vast.ai · from $0.15/hr · best gpu buyer guide
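The local-or-cloud decision above comes down to two checks: does the listed min VRAM fit the memory you have, and if not, what is the lowest the rental could cost. A minimal sketch follows, with hypothetical function names and a made-up 24GB local budget. The hourly rates are the "from" prices listed above, so the result is a floor, and the cheapest listed GPU may not have enough VRAM for a large model.

```python
# "from" prices listed on the page, in USD per hour.
FROM_RATES_USD_PER_HOUR = {"runpod": 0.25, "vast.ai": 0.15}


def fits_locally(min_vram_gb: float, available_gb: float) -> bool:
    """True when the listed min VRAM is within the memory you have."""
    return min_vram_gb <= available_gb


def cloud_cost_floor(hours: float, provider: str) -> float:
    """Lower bound on rental cost, using the provider's listed 'from' rate."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return hours * FROM_RATES_USD_PER_HOUR[provider]


# Example: Qwen3 235B-A22B lists 144.0GB min VRAM; 24GB is an assumed local budget.
wanted_min_vram_gb = 144.0
available_gb = 24.0

if fits_locally(wanted_min_vram_gb, available_gb):
    print("runs locally")
else:
    for provider in FROM_RATES_USD_PER_HOUR:
        floor = cloud_cost_floor(10, provider)
        print(f"{provider}: at least ${floor:.2f} for 10 hours")
```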