ggml-medium.bin
The ggml-medium.bin file is the medium-sized, pre-trained version of OpenAI’s Whisper automatic speech recognition (ASR) and translation model. It has been converted into the GGML format so that it runs efficiently on CPU and GPU hardware using the whisper.cpp engine.
OpenAI released Whisper as a Python-based PyTorch model. While powerful, it originally required a heavy Python environment and significant GPU resources to run smoothly.
The Transformation (GGML): Georgi Gerganov developed the GGML tensor library and whisper.cpp, a C/C++ port of Whisper. ggml-medium.bin is the medium model converted for that runtime.
If your transcriptions are running slower than real-time, apply these optimizations:
On modern systems, it typically transcribes audio at several times real-time speed. For example, some users report processing 20 minutes of audio in under 20 seconds on capable hardware.
File Variants: ggml-medium.bin is the standard multilingual model.
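To make the speed figure above concrete, the short sketch below computes a speed-up factor relative to real time. The function name `realtime_speedup` is hypothetical, and the inputs are simply the 20 minutes and 20 seconds quoted in the user report, taken as an upper bound on processing time.

```python
def realtime_speedup(audio_seconds: float, processing_seconds: float) -> float:
    """Return how many seconds of audio are processed per second of compute."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


# 20 minutes of audio processed in 20 seconds (figures from the report above).
speedup = realtime_speedup(20 * 60, 20)
print(f"{speedup:.0f}x real-time")  # prints "60x real-time"
```

A value below 1.0 would mean the transcription is running slower than real time.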
ggml-medium.bin is widely considered the "sweet spot" for local transcription using whisper.cpp. It balances high-fidelity results with manageable RAM requirements, making it well suited to on-device applications.
Common Use Cases: local Zoom meeting summarization and automated video subtitling.
It performs remarkably well on Apple Silicon (via Metal) and runs reasonably fast on modern x86 CPU architectures.
How to Use ggml-medium.bin
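As a minimal sketch of a typical invocation, the Python snippet below assembles and runs a whisper.cpp command line. It assumes a locally built `whisper-cli` binary on the PATH (older builds name the binary `main`), a model stored at `models/ggml-medium.bin`, and a 16 kHz, 16-bit WAV input file. The helper names `build_whisper_command` and `transcribe` are hypothetical, and the flags used here (`-m`, `-f`, `-l`, `-t`, `-osrt`) should be checked against the `--help` output of your build.

```python
import shutil
import subprocess


def build_whisper_command(model_path, audio_path, binary="whisper-cli",
                          language="en", threads=4, srt=False):
    """Assemble the argument list for a whisper.cpp transcription run."""
    cmd = [
        binary,
        "-m", str(model_path),   # path to the GGML model file
        "-f", str(audio_path),   # input audio (16 kHz, 16-bit WAV assumed)
        "-l", language,          # spoken language code
        "-t", str(threads),      # number of CPU threads
    ]
    if srt:
        cmd.append("-osrt")      # also write an .srt subtitle file
    return cmd


def transcribe(model_path, audio_path, **options):
    """Run whisper.cpp and return whatever it prints to standard output."""
    cmd = build_whisper_command(model_path, audio_path, **options)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} not found on PATH; build whisper.cpp first")
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Paths are assumptions; adjust them to your local checkout.
    print(" ".join(build_whisper_command("models/ggml-medium.bin", "samples/jfk.wav")))
```

The model file itself can usually be fetched with the `models/download-ggml-model.sh medium` script in the whisper.cpp repository. Keeping command construction separate from execution makes the arguments easy to inspect before a long run.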
Accuracy, evaluation, and limitations
While ggml-medium.bin is optimized, it still requires decent hardware. Recommended hardware by model:
ggml-small.bin: Low-end CPUs, Raspberry Pi
ggml-medium.bin (~1.53 GB): Consumer PCs, Mac M1/M2/M3
ggml-large-v3.bin: High-end GPUs/Workstations