WhisperSubTranslate

WhisperSubTranslate is a free, portable desktop application that generates SRT subtitles from video and audio files using whisper.cpp. Transcription runs locally on the user's machine, so no cloud upload or account is needed. Generated subtitles can optionally be translated through MyMemory, DeepL, OpenAI, or Gemini. The application offers CUDA GPU acceleration with automatic fallback to the CPU, a batch queue with progress tracking, and automatic model downloads. It is built with Electron and Node.js and packaged as a portable .exe file for Windows.

Rating: 5.0 (1 rating)

File size: 734 MB

Latest version: 1.5.1

Operating system: Windows

Languages: English

Developer: Blue-B

Price: $0.00 USD

Local transcription engine. The core functionality relies on whisper.cpp, a C++ implementation of OpenAI's Whisper speech recognition model. Transcription runs entirely offline, so video and audio data stay on the user's computer and no internet connection is needed for this step. The engine runs on the CPU or on CUDA-compatible graphics cards.
SRT subtitle file generation. The application converts detected speech from video files into standard SubRip (SRT) subtitle format. These files contain precise timestamps and text segments, making them compatible with virtually all media players, video editing software, and online platforms for adding captions to content.
Integrated subtitle translation. After generating subtitles in the original language, WhisperSubTranslate can translate them into other target languages. It connects to several translation providers: MyMemory (free tier available), DeepL, OpenAI (GPT models), and Google's Gemini. Users select their preferred service and configure the necessary API credentials. Because these are online services, the translation step sends subtitle text to the selected provider and requires an internet connection.
Batch processing queue system. The interface includes a queue manager to which users can add multiple video files at once. The program then processes the files one after another, so subtitles for an entire video collection or series can be generated without starting each job manually.
Automatic GPU/CPU fallback. The software detects whether an NVIDIA CUDA-compatible GPU is present. When one is available, processing is accelerated on the graphics card. If no compatible GPU is found, or if CUDA initialization fails, the program falls back to CPU processing so that the job still completes.
Automatic Whisper model downloads. WhisperSubTranslate downloads the required Whisper models (tiny, base, small, medium, large) directly through its interface. Users select a model based on their accuracy and speed requirements, and the application fetches and stores it locally, so there is no need to locate and download model files by hand.
Portable Windows executable. The tool is packaged as a single .exe file that requires no installation. It can be run from any folder, including USB drives, without leaving traces in the Windows registry or requiring administrative privileges, which makes it suitable for use on multiple or restricted systems.
Multi-format media support. The application accepts a wide range of common video and audio formats, including MP4, MKV, AVI, MOV, MP3, M4A, and WAV. This broad compatibility eliminates the need for users to convert their media files to a specific format before processing them for subtitles.
Real-time progress tracking. During file processing, the interface displays current progress through status bars and numerical indicators. Users can monitor which phase is active (audio extraction, transcription, translation) and estimate the remaining time for each job in the queue.
Source and target language configuration. The program allows users to specify the original language of the video's audio to improve transcription accuracy. When utilizing the translation feature, users can also select the desired output language for the resulting subtitles, with settings adjustable per individual job in the queue.
Electron and Node.js architecture. The application is built on Electron, a framework for creating cross-platform desktop applications using web technologies. Node.js handles the core processing logic, file system operations, and communication with translation APIs, providing a robust backend for the subtitle generation workflow.
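
To make the SRT output described in the feature list more concrete, the following is a minimal Python sketch of how timestamped speech segments can be written out as SubRip cues (a running index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, the text, and a blank line). It is not taken from WhisperSubTranslate's own code, which is written in JavaScript; the `Segment` class and the function names are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One recognized speech segment (hypothetical structure for this sketch)."""
    start: float  # seconds from the start of the media
    end: float    # seconds from the start of the media
    text: str


def format_timestamp(seconds: float) -> str:
    """Render a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Join segments into SRT cues: index, time range, text, blank line."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        time_range = f"{format_timestamp(seg.start)} --> {format_timestamp(seg.end)}"
        cues.append(f"{index}\n{time_range}\n{seg.text.strip()}\n")
    return "\n".join(cues)


if __name__ == "__main__":
    demo = [Segment(0.0, 2.5, "Hello there."), Segment(2.5, 5.0, "Welcome back.")]
    print(to_srt(demo))
```

Working in whole milliseconds before splitting into hours, minutes, and seconds avoids rounding artifacts such as a millisecond field of 1000.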

Development of WhisperSubTranslate was initiated by GitHub user Blue-B. The first version was released in 2023, in response to the need for a straightforward, private tool for transcribing video without depending on cloud-based services. The project is written primarily in JavaScript and uses the Node.js runtime and the Electron framework to build the graphical interface and manage backend operations. It integrates whisper.cpp, a C++ implementation of the Whisper model, either through bindings or by invoking it as an external process.
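
As a rough illustration of the external-process route and of the GPU-to-CPU fallback described in the feature list, the Python sketch below builds a whisper.cpp command line and retries on the CPU if the first attempt fails. It rests on several assumptions: that a whisper.cpp command-line build named `whisper-cli` is available, and that the application would pass arguments similar to these. The flags shown (`-m`, `-f`, `-l`, `-osrt`, `-ng`) are whisper.cpp CLI options, but the listing does not say which arguments WhisperSubTranslate actually uses, and the function names are hypothetical.

```python
import subprocess
from typing import Callable, Sequence

# Assumption: name of a whisper.cpp command-line binary reachable on PATH.
WHISPER_BIN = "whisper-cli"

# A runner takes a command line and returns the process exit code.
Runner = Callable[[Sequence[str]], int]


def build_command(audio_path: str, model_path: str, language: str, use_gpu: bool) -> list[str]:
    """Build a whisper.cpp command that writes an .srt file for the input."""
    cmd = [WHISPER_BIN, "-m", model_path, "-f", audio_path, "-l", language, "-osrt"]
    if not use_gpu:
        cmd.append("-ng")  # whisper.cpp's --no-gpu switch
    return cmd


def run_process(cmd: Sequence[str]) -> int:
    """Run the command as a child process and return its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


def transcribe_with_fallback(
    audio_path: str,
    model_path: str,
    language: str = "auto",
    runner: Runner = run_process,
) -> str:
    """Try the GPU first and retry on the CPU; return the device that succeeded."""
    for device, use_gpu in (("gpu", True), ("cpu", False)):
        if runner(build_command(audio_path, model_path, language, use_gpu)) == 0:
            return device
    raise RuntimeError(f"transcription failed on both GPU and CPU: {audio_path}")
```

Passing the runner as a parameter keeps the fallback logic testable without a GPU or a whisper.cpp binary: a stub that returns a non-zero exit code for the first call is enough to exercise the CPU retry.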