Whisper Model Sizes: Which One Should You Use?

OpenAI's Whisper is one of the most widely used open-source speech-to-text models. It comes in six sizes, and the right one depends on your hardware, speed requirements, and accuracy needs.

Model Lineup

Model	Parameters	VRAM (approx.)	Speed vs Large-v3	Accuracy
Tiny	39M	~1GB	10x faster	Basic
Base	74M	~1GB	7x faster	Good
Small	244M	~2GB	4x faster	Very Good
Medium	769M	~5GB	2x faster	Excellent
Turbo	809M	~6GB	8x faster	Near-Large
Large-v3	1.55B	~10GB	Baseline	Best
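
To make the speed column concrete, the sketch below turns it into a small lookup and estimates how long each model would take on a job whose Large-v3 runtime is known. The relative speeds are the approximate figures from the table. The function name `estimated_minutes` and the 60-minute baseline are illustrative assumptions, and real throughput will vary with hardware and audio.

```python
# Approximate relative speeds from the table above (Large-v3 = 1x).
RELATIVE_SPEED = {
    "tiny": 10,
    "base": 7,
    "small": 4,
    "medium": 2,
    "turbo": 8,
    "large-v3": 1,
}


def estimated_minutes(model: str, large_minutes: float) -> float:
    """Estimate processing time, given how long Large-v3 takes on the same job."""
    return large_minutes / RELATIVE_SPEED[model]


if __name__ == "__main__":
    # Hypothetical job that takes 60 minutes with Large-v3.
    for name in RELATIVE_SPEED:
        print(f"{name:>9}: ~{estimated_minutes(name, 60):.1f} min")
```

With these figures, a job that takes an hour on Large-v3 would take roughly 7.5 minutes on Turbo and 30 minutes on Medium.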

The Turbo Sweet Spot

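Whisper Turbo is the standout option for most users. It is a version of Large-v3 with a much smaller decoder, which is why it runs faster than Medium despite a similar parameter count. It delivers roughly 95% of Large-v3's accuracy at about 8x the speed, using only about 6GB of VRAM. Unless you're processing critical transcriptions where every word matters, Turbo is the recommended choice.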
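
As a minimal sketch of what using Turbo looks like, assuming the `openai-whisper` Python package and ffmpeg are installed: `whisper.load_model` and `transcribe` are the package's own calls, while the helper name `transcribe_file` and the path in the comment are placeholders for illustration.

```python
def transcribe_file(path: str, model_name: str = "turbo") -> str:
    """Transcribe one audio file and return the recognized text."""
    # Imported lazily so this module still loads where openai-whisper is absent.
    import whisper

    model = whisper.load_model(model_name)  # downloads the weights on first use
    result = model.transcribe(path)         # dict with "text", "segments", "language"
    return result["text"].strip()


# Example call (placeholder path):
#     text = transcribe_file("meeting.mp3")
```

Swapping `model_name` for `"large-v3"` or `"small"` is enough to compare sizes on the same file.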

Hardware Recommendations

Smartphones & low-end devices: Tiny or Base. Both run efficiently on CPU.

Laptops with integrated graphics: Small. Its roughly 2GB memory footprint fits most machines.

Desktop with dedicated GPU: Turbo. It offers the best balance of speed and quality.

Workstation / Server: Large-v3. It gives maximum accuracy for production pipelines.
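
These recommendations roughly amount to picking the most accurate model that fits in the memory you have. A rough sketch of that rule follows, using the approximate VRAM figures from the Model Lineup table. The function `pick_model` and the accuracy ordering (Turbo ranked above Medium) are illustrative assumptions, not part of Whisper itself.

```python
# (model, approximate VRAM in GB), ordered from most to least accurate.
# The ordering is an assumption based on the Model Lineup table.
MODELS_BY_ACCURACY = [
    ("large-v3", 10),
    ("turbo", 6),
    ("medium", 5),
    ("small", 2),
    ("base", 1),
    ("tiny", 1),
]


def pick_model(vram_gb: float) -> str:
    """Return the most accurate model whose approximate VRAM need fits."""
    for name, needed_gb in MODELS_BY_ACCURACY:
        if vram_gb >= needed_gb:
            return name
    # Below about 1 GB of GPU memory, fall back to the smallest model on CPU.
    return "tiny"


if __name__ == "__main__":
    for budget in (0.5, 2, 8, 12):
        print(f"{budget} GB -> {pick_model(budget)}")
```

The rule deliberately ignores speed. On a card with exactly 5GB it returns Medium, even though a slightly larger card would make Turbo both faster and, by the table's ranking, more accurate.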

All Whisper models can also run on CPU using whisper.cpp, which is especially useful for the smaller models. Check compatibility for your device on our model checker.
