
Whisper Model Sizes: Which One Should You Use?

RunThisModel Research · April 9, 2026

OpenAI's Whisper is one of the most widely used open-source models for speech-to-text. It comes in six sizes, and the right choice depends on your hardware, speed requirements, and accuracy needs.

Model Lineup

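| Model    | Parameters | VRAM (approx.) | Speed vs. Large-v3 | Accuracy   |
|----------|------------|----------------|--------------------|------------|
| Tiny     | 39M        | ~1GB           | 10x faster         | Basic      |
| Base     | 74M        | ~1GB           | 7x faster          | Good       |
| Small    | 244M       | ~2GB           | 4x faster          | Very Good  |
| Medium   | 769M       | ~5GB           | 2x faster          | Excellent  |
| Turbo    | 809M       | ~6GB           | 8x faster          | Near-Large |
| Large-v3 | 1.55B      | ~10GB          | Baseline           | Best       |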
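
As a rough illustration, the lineup above can be encoded as data and queried for the most accurate model that fits a given VRAM budget. This is a minimal sketch: the `WhisperModel` type and the `most_accurate_fit` helper are hypothetical names introduced here, and the VRAM figures are the approximate values from the table, not measured limits.

```python
from typing import NamedTuple, Optional


class WhisperModel(NamedTuple):
    name: str
    params_millions: int
    vram_gb: int           # approximate VRAM need, taken from the table
    speedup_vs_large: int  # relative speed, with Large-v3 = 1


# Rows are kept in ascending order of accuracy, as the table lists them.
MODELS = [
    WhisperModel("tiny", 39, 1, 10),
    WhisperModel("base", 74, 1, 7),
    WhisperModel("small", 244, 2, 4),
    WhisperModel("medium", 769, 5, 2),
    WhisperModel("turbo", 809, 6, 8),
    WhisperModel("large-v3", 1550, 10, 1),
]


def most_accurate_fit(vram_gb: float) -> Optional[WhisperModel]:
    """Return the most accurate model whose approximate VRAM need fits the budget."""
    fitting = [m for m in MODELS if m.vram_gb <= vram_gb]
    # The list is ordered by accuracy, so the last fitting entry is the best one.
    return fitting[-1] if fitting else None


if __name__ == "__main__":
    for budget in (1, 4, 8, 12):
        choice = most_accurate_fit(budget)
        print(budget, "GB ->", choice.name if choice else "no model fits")
```

Treat the result as a starting point only: real memory use also depends on the runtime, precision, and audio length.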

The Turbo Sweet Spot

Whisper Turbo is the standout option for most users. It delivers roughly 95% of Large-v3's accuracy at about 8x the speed, using only about 6GB of VRAM. Unless you're producing critical transcriptions where every word matters, Turbo is the recommended choice.
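
To make the speed difference concrete, here is a small worked calculation using the relative speeds from the Model Lineup table. The `estimated_minutes` helper and the 40-minute Large-v3 job are hypothetical, chosen only for illustration; actual runtimes vary with hardware and audio.

```python
# Relative speed of each model compared with Large-v3 (from the Model Lineup table).
SPEEDUP_VS_LARGE = {
    "tiny": 10,
    "base": 7,
    "small": 4,
    "medium": 2,
    "turbo": 8,
    "large-v3": 1,
}


def estimated_minutes(large_v3_minutes: float, model: str) -> float:
    """Scale a known Large-v3 runtime by the model's relative speedup."""
    return large_v3_minutes / SPEEDUP_VS_LARGE[model]


if __name__ == "__main__":
    # Hypothetical job that takes 40 minutes on Large-v3.
    print(estimated_minutes(40, "turbo"))   # 40 / 8 = 5.0 minutes
    print(estimated_minutes(40, "medium"))  # 40 / 2 = 20.0 minutes
```

By this estimate Turbo finishes the same job in a quarter of Medium's time while also sitting higher in the table's accuracy column.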

Hardware Recommendations

Smartphones & Low-end devices: Tiny or Base — runs on CPU efficiently.

Laptops with integrated graphics: Small — 2GB VRAM is widely available.

Desktop with dedicated GPU: Turbo — the best balance of speed and quality.

Workstation / Server: Large-v3 — maximum accuracy for production pipelines.
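
The recommendations above can be sketched as a lookup followed by a transcription call. This example assumes the `openai-whisper` Python package and `ffmpeg` are installed. The device-class keys and the `recommended_models` helper are hypothetical labels introduced here, and `meeting.mp3` is a placeholder file name.

```python
from typing import List

# Hypothetical device-class labels mapped to the models recommended above.
RECOMMENDATIONS = {
    "smartphone": ["tiny", "base"],
    "laptop_integrated_gpu": ["small"],
    "desktop_dedicated_gpu": ["turbo"],
    "workstation_or_server": ["large-v3"],
}


def recommended_models(device_class: str) -> List[str]:
    """Return the recommended model names for a device class."""
    return RECOMMENDATIONS[device_class]


def transcribe(audio_path: str, model_name: str) -> str:
    """Transcribe a file with the given model (assumes openai-whisper is installed)."""
    import whisper  # imported lazily so the lookup works without the package

    model = whisper.load_model(model_name)  # downloads weights on first use
    result = model.transcribe(audio_path)
    return result["text"]


if __name__ == "__main__":
    name = recommended_models("desktop_dedicated_gpu")[0]
    print("Using model:", name)
    # Uncomment once openai-whisper and an audio file are available:
    # print(transcribe("meeting.mp3", name))
```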

All Whisper models can also run on CPU using whisper.cpp, which is especially useful for the smaller models. Check compatibility for your device on our model checker.
