StarWhisper

StarWhisper is a speech-to-text application that works without an internet connection. It processes audio directly on the user's Windows device and uses GPU acceleration for real-time transcription. All audio and text data remain on the local computer and are never transmitted to external servers. The software is built on the optimized Whisper.cpp engine, which provides accurate speech recognition with support for multiple languages and models of various sizes.

Rating: ★★★★★ 5.0 (1 rating)

File size: 2.4 MB

Latest version: 1.3.105

Operating system: Windows

Languages: English

Developer: StarWhisper

Price: $0.00 USD

Real-time transcription. This function converts speech into text immediately as the user speaks. The engine processes the audio stream with minimal latency, displaying the text in the main interface. This feature is designed for continuous dictation in word processors, email clients, or any text input field within the Windows system.
Fully offline operation. All speech recognition processing runs locally on the user's machine. No internet connection is required for core functionality. Language models are stored on the hard drive and loaded into system memory or VRAM during use.
GPU acceleration. The application offloads most of the computational workload to the graphics processing unit when one is available. Offloading reduces CPU usage and enables real-time performance, even with the largest and most accurate recognition models.
Configurable language models. The user can select from different Whisper models, ranging from 'tiny' to 'large'. Smaller models offer higher transcription speed, while larger models provide superior accuracy, especially for complex audio or specific accents.
Push-to-dictate mode. Transcription is active only while the user holds down a configurable key. This mode is suitable for inserting short phrases or commands without manually toggling continuous dictation on and off.
Minimalist floating window. The user interface consists of a transparent window that always stays on top of other applications. It displays the transcription status, recently converted text, and basic controls without visual distractions.
Customizable keyboard shortcuts. Users can define key combinations to start and stop recording, activate push-to-dictate mode, pause recognition, or show/hide the application window. Shortcuts function globally within the system.
Automatic formatting and punctuation. The engine not only transcribes words but also inserts punctuation marks such as periods, commas, and question marks, and capitalizes the beginning of sentences. This post-processing improves the readability of the generated text.
Audio file transcription. The user can load existing audio files in common formats (WAV, MP3, FLAC) and generate a complete text transcription. The function processes the entire file and saves the result in an editable text document.
Manual and automatic language selection. The user can set the input language to improve accuracy, or let the model detect it automatically. Automatic language detection analyzes the first few seconds of audio to determine the most likely language.
CPU compatibility mode. A fallback mechanism that activates automatically on systems without a dedicated GPU or with problematic drivers. In this mode, all neural network calculations run on the central processor, maintaining full offline functionality.
Visual status indicators. The interface features icons and color changes that inform the user about the current status: standby, recording, processing, or paused. These indicators provide immediate feedback on system activity.
Transcription history. A log that automatically saves recent dictation sessions. Users can review, copy, or export previously transcribed texts from a dedicated section of the application.
Basic noise reduction. The input audio is pre-processed with filters that minimize constant ambient noise before the signal reaches the recognition model. This processing improves results in non-ideal environments.
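
The push-to-dictate mode and the four status indicators listed above (standby, recording, processing, paused) can be pictured as a small state machine. The following Python sketch is purely illustrative: the class name, the event names (key_down, key_up, text_ready, pause, resume), and the transition table are assumptions made for this example and do not describe StarWhisper's actual internals.

```python
from enum import Enum


class Status(Enum):
    """The four statuses named in the feature list."""
    STANDBY = "standby"
    RECORDING = "recording"
    PROCESSING = "processing"
    PAUSED = "paused"


# Hypothetical events and transitions, chosen only for illustration.
TRANSITIONS = {
    (Status.STANDBY, "key_down"): Status.RECORDING,     # key held: start capturing
    (Status.RECORDING, "key_up"): Status.PROCESSING,    # key released: transcribe
    (Status.PROCESSING, "text_ready"): Status.STANDBY,  # text delivered
    (Status.STANDBY, "pause"): Status.PAUSED,
    (Status.PAUSED, "resume"): Status.STANDBY,
}


class PushToDictate:
    """Tracks the current status and the sequence of statuses visited."""

    def __init__(self):
        self.status = Status.STANDBY
        self.history = [Status.STANDBY]

    def handle(self, event):
        """Apply an event; events not valid in the current status are ignored."""
        next_status = TRANSITIONS.get((self.status, event))
        if next_status is not None:
            self.status = next_status
            self.history.append(next_status)
        return self.status


def run_demo():
    """Simulate one dictation, then a pause during which the key is ignored."""
    session = PushToDictate()
    for event in ["key_down", "key_up", "text_ready", "pause", "key_down", "resume"]:
        session.handle(event)
    return [status.value for status in session.history]


if __name__ == "__main__":
    print(run_demo())
```

In this sketch, pressing the dictation key while paused has no effect, which is one plausible way for a global shortcut to behave; the real application may handle that case differently.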

Development of StarWhisper began in 2023 as a native Windows implementation of the open-source project Whisper.cpp, which is itself a C++ port of OpenAI's Whisper model. The developers are an independent team focused on creating productivity tools with built-in privacy. The processing core is written primarily in C++, and the graphical user interface uses the Qt framework. C++ allows close-to-the-metal performance and efficient resource consumption, while Qt provides a cross-platform foundation for potential future development on other operating systems.

Alternatives to StarWhisper:

Glimp (real-time AI assistant for job interviews)

Speakey

RocketWhisper

VoiceOS

Typeless

BB Recorder

Vowen

VoiceInk

OpenWispr

Pipit

Whispering Tiger