Whisper Medium is an automatic speech recognition (ASR) model developed by OpenAI, with 0.77 billion parameters. It transcribes audio with high accuracy, which makes it a solid choice for converting speech to text, creating subtitles, or transcribing meetings and lectures. It balances accuracy against computational cost, which matters for local deployment where resources are often limited.
Within its size class, Whisper Medium performs well. Although it has fewer parameters than Whisper Large, its accuracy is often comparable, which makes it a good option for users who need robust ASR without heavy computational overhead. The available Q8_0 quantization reduces the memory footprint and can improve inference speed, so the model fits on devices with about 1.93 GB of VRAM.
Whisper Medium suits developers and professionals who want to integrate ASR into applications running on mid-range hardware, such as laptops or edge devices. It is particularly useful for projects that require real-time or near-real-time transcription, where latency and resource consumption are critical. On more powerful hardware, its efficiency leaves headroom to handle multiple streams or larger datasets at once.
| Quantization | Bits | File Size | VRAM Needed | RAM Needed | Quality |
|---|---|---|---|---|---|
| Q8_0 | 8 | 1.428 GB | 1.93 GB | 2.43 GB | 92% |
How to run Whisper Medium
Runtime: whisper.cpp, a plain C/C++ reimplementation of Whisper with Core ML, Metal, and CUDA support. Setup is a single build command, and the commands below are already filled in for this model.

1. Build
   `git clone https://github.com/ggerganov/whisper.cpp && cd whisper.cpp && make`
2. Get the model
   `bash ./models/download-ggml-model.sh medium`
3. Transcribe
   `./main -m models/ggml-medium.bin -f input.wav`
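
The transcribe step expects a WAV file, and whisper.cpp's example binary works on 16 kHz, 16-bit audio, so other formats need converting first. The following is a minimal sketch of that conversion followed by transcription. It assumes `ffmpeg` is installed and that it runs from inside the whisper.cpp checkout. The function names and the `WHISPER_BIN` and `WHISPER_MODEL` variables are illustrative, not part of whisper.cpp. Newer checkouts may build the binary under a different name, in which case set `WHISPER_BIN` to match.

```shell
# Sketch: convert an audio file to 16 kHz mono 16-bit WAV, then transcribe it.
# Assumptions: ffmpeg is on PATH; the binary and model paths match the steps above.
WHISPER_BIN="${WHISPER_BIN:-./main}"
WHISPER_MODEL="${WHISPER_MODEL:-models/ggml-medium.bin}"

# Print the WAV path used for a given input file (hypothetical helper).
# Example: lecture.mp3 -> ./lecture.16k.wav
wav_path_for() {
  local input="$1" dir base
  dir="$(dirname -- "$input")"
  base="$(basename -- "$input")"
  printf '%s/%s.16k.wav\n' "$dir" "${base%.*}"
}

# Convert the input with ffmpeg and print the path of the resulting WAV file.
to_wav16k() {
  local input="$1" output
  output="$(wav_path_for "$input")"
  ffmpeg -y -loglevel error -i "$input" -ar 16000 -ac 1 -c:a pcm_s16le "$output" || return 1
  printf '%s\n' "$output"
}

# Convert, then run the whisper.cpp binary on the converted file.
transcribe() {
  local wav
  wav="$(to_wav16k "$1")" || return 1
  "$WHISPER_BIN" -m "$WHISPER_MODEL" -f "$wav"
}
```

After sourcing these definitions, `transcribe lecture.mp3` would write `./lecture.16k.wav` next to the input and print the transcript. For subtitles, whisper.cpp's `-osrt` option can be appended to the final command to write an SRT file as well.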
How much VRAM do I need to run Whisper Medium?
Whisper Medium requires at least 1.93 GB of VRAM with Q8_0 quantization, the only quantization listed for it. No separate full-precision figure is given; the 1.93 GB value applies to Q8_0.
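
As a rough way to check a machine against that figure, the sketch below compares free GPU memory with the Q8_0 requirement. It assumes an NVIDIA GPU with `nvidia-smi` available, and it treats 1.93 GB as 1.93 GiB (about 1977 MiB) to err on the safe side. Both function names are hypothetical helpers.

```shell
# Sketch: check whether free VRAM (in MiB) covers the Q8_0 requirement.
# Assumption: 1.93 GB is read as GiB, so 1.93 * 1024 rounds up to 1977 MiB.
REQUIRED_MIB=1977

# fits_vram FREE_MIB [REQUIRED_MIB]: succeeds when FREE_MIB is large enough.
fits_vram() {
  local free_mib="$1" required_mib="${2:-$REQUIRED_MIB}"
  [ "$free_mib" -ge "$required_mib" ]
}

# Print the free memory of the first NVIDIA GPU in MiB (requires nvidia-smi).
free_vram_mib() {
  nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits | head -n 1
}
```

A typical call would be `fits_vram "$(free_vram_mib)" && echo "Q8_0 should fit"`. Free memory changes as other processes use the GPU, so treat the result as a snapshot.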
Which quant should I pick?
Q8_0 is the only quantization listed for Whisper Medium, so it is the one to pick; the table above rates it at 92% quality. Q4_K_M is often the best quality/VRAM balance where it is offered, but it is not available for this model.