Overview

VoiceTypr supports multiple AI transcription engines with various model sizes, so you can balance speed and accuracy for your needs. Whisper and Parakeet models run locally on your device for complete privacy; Soniox is an optional cloud engine that sends audio to Soniox servers.

Supported Engines

Whisper

OpenAI’s Whisper models provide excellent accuracy across 99+ languages with multiple size options.

Parakeet

NVIDIA Parakeet models optimized for Apple Silicon, offering fast transcription using the Neural Engine.

Soniox

Cloud-based speech recognition API offering fast, accurate transcription without local model downloads.

Model Types

VoiceTypr distinguishes between local and cloud models:
type SpeechModelEngine = 'whisper' | 'parakeet' | 'soniox';
type ModelKind = 'local' | 'cloud';

interface LocalModelInfo {
  kind: 'local';
  name: string;              // Internal identifier
  display_name: string;      // User-friendly name
  engine: SpeechModelEngine; // Which engine runs this model
  size: number;              // Download size in bytes
  speed_score: number;       // Speed rating (1-10)
  accuracy_score: number;    // Accuracy rating (1-10)
  recommended: boolean;      // Is this a recommended model?
  downloaded: boolean;       // Is it downloaded locally?
  requires_setup: boolean;   // Does it need configuration?
  url: string;               // Download URL
  sha256: string;            // Checksum for verification
}
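The kind field lets TypeScript tell local and cloud entries apart. The sketch below is illustrative only: CloudModelSummary is a hypothetical shape (this page does not show the cloud interface), LocalModelSummary is a trimmed copy of LocalModelInfo, and describeModel is an invented helper name.

```typescript
type SpeechModelEngine = 'whisper' | 'parakeet' | 'soniox';

// Trimmed to the fields this example needs.
interface LocalModelSummary {
  kind: 'local';
  name: string;
  engine: SpeechModelEngine;
  size: number; // download size in bytes
  downloaded: boolean;
}

// Hypothetical: the cloud shape is an assumption made for illustration.
interface CloudModelSummary {
  kind: 'cloud';
  name: string;
  engine: SpeechModelEngine;
  requires_setup: boolean;
}

type ModelSummary = LocalModelSummary | CloudModelSummary;

// Switching on `kind` narrows the union, so each branch only sees its own fields.
function describeModel(model: ModelSummary): string {
  switch (model.kind) {
    case 'local':
      return model.downloaded
        ? `${model.name}: ready (local)`
        : `${model.name}: download required (${model.size} bytes)`;
    case 'cloud':
      return model.requires_setup
        ? `${model.name}: API key required (cloud)`
        : `${model.name}: ready (cloud)`;
  }
}
```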

Whisper Models

Available Sizes

Whisper models come in multiple sizes, each with different speed and accuracy tradeoffs:
Whisper Tiny (~75 MB)
  • Speed Score: 10/10 ⚡
  • Accuracy Score: 6/10
  • Best for: Quick drafts, testing, low-power devices
  • Languages: Multilingual
The smallest and fastest model. Good for quick notes when accuracy isn’t critical.

English-Only Models

Whisper also offers English-only variants (.en suffix) that are optimized for English:
"tiny.en"   // Tiny English-only
"base.en"   // Base English-only  
"small.en"  // Small English-only
"medium.en" // Medium English-only
English-only models are the same size as their multilingual counterparts but tend to be more accurate for English, especially tiny.en and base.en; the gap narrows for small.en and medium.en.
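Because the English-only variants are identified purely by the .en suffix, code can separate them with a string check. This is a minimal sketch; isEnglishOnly and multilingualOnly are hypothetical helper names, not part of VoiceTypr.

```typescript
// Returns true for English-only variants such as "tiny.en" or "base.en".
function isEnglishOnly(modelName: string): boolean {
  return modelName.endsWith('.en');
}

// Keeps only models that can transcribe languages other than English.
function multilingualOnly(modelNames: string[]): string[] {
  return modelNames.filter((name) => !isEnglishOnly(name));
}
```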

Parakeet Models

Parakeet models are available on macOS only and leverage Apple’s Neural Engine for hardware acceleration.

Available Models

Parakeet 1.1B (~1.3 GB)
  • Speed Score: 8/10
  • Accuracy Score: 8/10
  • Languages: Multilingual (100+ languages)
  • Hardware: Apple Neural Engine
Multilingual support with good performance on Apple Silicon.
Parakeet models are macOS-exclusive and require an Apple Silicon Mac (M1, M2, M3, or newer). They will not run on Intel Macs or other platforms.
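The platform requirement can be expressed as a simple guard. This sketch assumes platform and architecture strings in the style of Node's process.platform and process.arch, where 'darwin' with 'arm64' means an Apple Silicon Mac; supportsParakeet is a hypothetical name, and VoiceTypr's own check may differ.

```typescript
// Assumption: inputs follow Node's process.platform / process.arch values.
// 'darwin' + 'arm64' corresponds to an Apple Silicon Mac.
function supportsParakeet(platform: string, arch: string): boolean {
  return platform === 'darwin' && arch === 'arm64';
}
```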

Soniox Cloud Models

Soniox is a cloud-based speech recognition service that provides fast, accurate transcription without requiring local model downloads.

Overview

Unlike Whisper and Parakeet, which run entirely on your device, Soniox processes audio in the cloud:
  • No downloads required: No disk space needed for models
  • Fast transcription: Cloud processing with optimized infrastructure
  • Requires internet: Audio is sent to Soniox API for processing
  • API key required: You need a Soniox account and API key

Setup

To use Soniox models:
  1. Get API Key: Sign up at soniox.com and obtain an API key
  2. Configure in VoiceTypr: Add your API key in Settings → Models → Soniox Configuration
  3. Select Soniox Model: Choose a Soniox model from the Models tab

Available Models

Soniox offers several optimized models:
  • stt-async-v3: Latest asynchronous model with best accuracy
  • stt-streaming: Real-time streaming transcription
  • stt-multilingual: Support for multiple languages
Check the Soniox documentation for the latest available models and language support.

Privacy Considerations

When using Soniox, audio is sent to Soniox servers for processing. This differs from Whisper and Parakeet, which process audio entirely offline on your device. Only use Soniox if you’re comfortable with cloud-based processing of your audio.

Performance

Soniox typically provides:
  • Speed: Very fast, limited by network latency
  • Accuracy: High accuracy comparable to Whisper Large
  • Cost: Based on Soniox API pricing

API Key Storage

Soniox API keys are stored securely in your system keychain:
  • macOS: Keychain Access
  • Windows: Credential Manager
The key is never stored in plain text.

Validation

VoiceTypr validates your Soniox API key with the validate_and_cache_soniox_key command:
await invoke('validate_and_cache_soniox_key', {
  apiKey: 'your-soniox-api-key'
});
The validation checks against Soniox’s /v1/models endpoint to verify the key is active.
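The command's internals are not shown here, but the kind of check described can be sketched as a single authenticated request. Everything beyond the /v1/models path is an assumption: the https://api.soniox.com base URL, the Bearer authorization header, and the isSonioxKeyActive and FetchLike names are illustrative, so confirm them against the Soniox documentation.

```typescript
// Minimal shape of the fetch function this sketch needs, so it can be mocked.
type FetchLike = (
  url: string,
  init: { headers: Record<string, string> },
) => Promise<{ ok: boolean; status: number }>;

// Assumed base URL; only the /v1/models path comes from the text above.
const SONIOX_MODELS_URL = 'https://api.soniox.com/v1/models';

async function isSonioxKeyActive(apiKey: string, fetchFn: FetchLike): Promise<boolean> {
  if (apiKey.trim() === '') return false; // nothing to validate
  try {
    const res = await fetchFn(SONIOX_MODELS_URL, {
      headers: { Authorization: `Bearer ${apiKey}` }, // assumed auth scheme
    });
    return res.ok; // a 2xx response means the key was accepted
  } catch {
    return false; // network failure: treat the key as unverified
  }
}
```

Passing the fetch function in as a parameter keeps the check testable without network access.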

Hardware Acceleration

VoiceTypr automatically uses hardware acceleration when available for maximum performance.

macOS

  • Whisper: Uses Metal GPU acceleration via Apple’s Metal Performance Shaders
  • Parakeet: Uses Apple Neural Engine for ultra-fast inference
  • Requirements: macOS 13.0+ (Ventura or later)

Windows

  • Whisper: Supports GPU acceleration via DirectML
  • Compatible GPUs: NVIDIA, AMD, and Intel GPUs
  • Fallback: Automatically uses CPU if GPU unavailable or drivers missing
On Windows, keep your graphics drivers up to date so GPU acceleration stays available; it can make transcription 5-10x faster than CPU-only processing.

Model Management

Downloading Models

  1. Open Models Tab: Click the VoiceTypr menubar icon and go to the Models tab.
  2. Browse Available Models: Models are organized into two sections:
       • Available to Use: Already downloaded and ready
       • Available to Setup: Need to be downloaded first
  3. Download a Model: Click the Download button on any model. Progress is shown in real time.
  4. Verify and Activate: After download, the model is verified using its SHA-256 checksum and automatically becomes available for use.
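The two sections in the Models tab correspond to a split on the downloaded flag. A minimal sketch, assuming a trimmed-down ModelEntry shape and a hypothetical groupModels helper:

```typescript
interface ModelEntry {
  name: string;
  downloaded: boolean;
}

// Splits models into the two sections shown in the Models tab.
function groupModels(models: ModelEntry[]): {
  availableToUse: ModelEntry[];
  availableToSetup: ModelEntry[];
} {
  return {
    availableToUse: models.filter((m) => m.downloaded),
    availableToSetup: models.filter((m) => !m.downloaded),
  };
}
```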

Download Progress Tracking

Download progress is tracked in real time:
downloadProgress: Record<string, number>  // modelName -> percentage (0-100)
You can cancel an in-progress download at any time:
cancelDownload(modelName: string)
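To make the progress map and cancellation concrete, here is a minimal in-memory sketch. Only the downloadProgress shape and the cancelDownload name come from the text above; the DownloadTracker class, its update method, and the boolean return value are assumptions for illustration.

```typescript
// Hypothetical in-memory store for download state.
class DownloadTracker {
  downloadProgress: Record<string, number> = {}; // modelName -> percentage (0-100)

  // Records progress for a model, clamped to the 0-100 range.
  update(modelName: string, percent: number): void {
    this.downloadProgress[modelName] = Math.min(100, Math.max(0, percent));
  }

  // Drops the entry so the UI stops showing a progress bar.
  // Returns false when no download was in flight for that model.
  cancelDownload(modelName: string): boolean {
    if (!(modelName in this.downloadProgress)) return false;
    delete this.downloadProgress[modelName];
    return true;
  }
}
```

A real implementation would also need to abort the underlying transfer; this sketch covers only the bookkeeping.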

Model Verification

All downloaded models are verified using SHA-256 checksums to ensure integrity:
verifyingModels: Set<string>  // Models currently being verified
Model verification happens automatically after download. If verification fails, the download is considered corrupted and must be retried.
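A checksum comparison of this kind can be sketched with Node's built-in crypto module. This is not VoiceTypr's implementation; sha256Hex and verifyChecksum are hypothetical helpers, and the sketch hashes an in-memory buffer for brevity.

```typescript
import { createHash } from 'node:crypto';

// Hex-encoded SHA-256 digest of a byte buffer.
function sha256Hex(data: Uint8Array): string {
  return createHash('sha256').update(data).digest('hex');
}

// True when the computed digest matches the expected checksum (case-insensitive).
function verifyChecksum(data: Uint8Array, expectedSha256: string): boolean {
  return sha256Hex(data) === expectedSha256.trim().toLowerCase();
}
```

For multi-gigabyte model files, feeding the hash from a read stream in chunks would avoid loading the whole file into memory.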

Deleting Models

To free up disk space, you can delete models you no longer need:
  1. Navigate to Models: Open the Models tab in VoiceTypr.
  2. Find Downloaded Model: Locate the model in the “Available to Use” section.
  3. Delete: Click the delete/trash icon on the model card.
If you delete the currently active model, VoiceTypr will clear your model selection. You’ll need to select and download a new model before transcription will work.
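The rule about deleting the active model can be sketched as a pure function over the saved settings. The current_model and current_model_engine fields appear in this document; representing "no selection" as an empty string and the settingsAfterDelete name are assumptions.

```typescript
interface ModelSettings {
  current_model: string; // assumption: empty string means "no model selected"
  current_model_engine: 'whisper' | 'parakeet' | 'soniox';
}

// Returns the settings to persist after deleting the model named `deleted`.
function settingsAfterDelete(settings: ModelSettings, deleted: string): ModelSettings {
  if (settings.current_model !== deleted) return settings; // selection unaffected
  return { ...settings, current_model: '' }; // active model removed: clear selection
}
```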

Model Selection

To switch between downloaded models:
  1. Go to the Models tab
  2. Click on any downloaded model to select it
  3. The selected model is saved in settings:
current_model: string;           // Model name
current_model_engine: 'whisper' | 'parakeet' | 'soniox';

Choosing the Right Model

Use this guide to select the best model for your needs:
For maximum speed
Recommended: Whisper Tiny or Tiny.en
  • Fastest inference times
  • Good for quick notes and drafts
  • Trade-off: Lower accuracy
For maximum accuracy
Recommended: Whisper Large or Medium
  • Highest accuracy scores
  • Best for professional transcription
  • Trade-off: Slower processing
Consider enabling AI Enhancement to further improve output quality.
For Apple Silicon Macs
Recommended: Parakeet 1.1B v2 (English) or Parakeet 1.1B (Multilingual)
  • Optimized for Apple Neural Engine
  • Faster than Whisper on M-series chips
  • Great accuracy
For languages other than English
Recommended: Whisper Small, Medium, or Large (multilingual variants)
  • Support for 99+ languages
  • Avoid .en suffix models (English-only)
  • Parakeet 1.1B also supports 100+ languages
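The same trade-off can be made programmatically from the speed_score and accuracy_score fields. This is a sketch under the assumption that a simple "highest score wins" rule is good enough; pickModel and ScoredModel are hypothetical names.

```typescript
interface ScoredModel {
  name: string;
  speed_score: number;    // 1-10
  accuracy_score: number; // 1-10
}

// Picks the model with the highest score for the given priority.
// Ties keep the earlier entry; an empty list yields undefined.
function pickModel(
  models: ScoredModel[],
  priority: 'speed' | 'accuracy',
): ScoredModel | undefined {
  const score = (m: ScoredModel): number =>
    priority === 'speed' ? m.speed_score : m.accuracy_score;
  return models.reduce<ScoredModel | undefined>(
    (best, m) => (best === undefined || score(m) > score(best) ? m : best),
    undefined,
  );
}
```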

Disk Space Requirements

Ensure you have enough free disk space before downloading models:
Model             Size
Whisper Tiny      ~75 MB
Whisper Base      ~150 MB
Whisper Small     ~500 MB
Whisper Medium    ~1.5 GB
Whisper Large     ~3 GB
Parakeet 1.1B     ~1.3 GB
You only need one model to use VoiceTypr. The recommended Whisper Small model requires just 500 MB.
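When planning several downloads, the approximate sizes above can simply be added up. The sketch below hard-codes those figures in megabytes, treating 1 GB as 1000 MB; MODEL_SIZE_MB, requiredMb, and fitsOnDisk are hypothetical names, and real sizes may differ slightly.

```typescript
// Approximate sizes from the table above, in megabytes (1 GB taken as 1000 MB).
const MODEL_SIZE_MB: Record<string, number> = {
  'Whisper Tiny': 75,
  'Whisper Base': 150,
  'Whisper Small': 500,
  'Whisper Medium': 1500,
  'Whisper Large': 3000,
  'Parakeet 1.1B': 1300,
};

// Total space needed for a set of models; unknown names throw.
function requiredMb(models: string[]): number {
  return models.reduce((sum, name) => {
    const size = MODEL_SIZE_MB[name];
    if (size === undefined) throw new Error(`Unknown model: ${name}`);
    return sum + size;
  }, 0);
}

// True when the selected models fit into the given free space.
function fitsOnDisk(models: string[], freeMb: number): boolean {
  return requiredMb(models) <= freeMb;
}
```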

Model Storage Location

Downloaded models are stored in:
  • macOS: ~/Library/Application Support/com.voicetypr.app/models/
  • Windows: %APPDATA%\com.voicetypr.app\models\
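For scripts that need to inspect the folder, the two locations can be built from the user's home directory or the %APPDATA% value. A minimal sketch; modelsDir is a hypothetical helper, and the base directories are passed in rather than read from the environment so the function stays pure.

```typescript
// Builds the models directory from the documented locations.
// `home` is the user's home directory (macOS); `appData` is the %APPDATA% value (Windows).
function modelsDir(platform: 'macos' | 'windows', home: string, appData: string): string {
  return platform === 'macos'
    ? `${home}/Library/Application Support/com.voicetypr.app/models/`
    : `${appData}\\com.voicetypr.app\\models\\`;
}
```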

Troubleshooting

Download Failed

  1. Check your internet connection
  2. Ensure you have enough free disk space
  3. Try canceling and restarting the download
  4. Check firewall/antivirus isn’t blocking the download

Verification Failed

If SHA-256 verification fails:
  1. Delete the corrupted model
  2. Retry the download
  3. Check for disk errors if it continues failing

Model Not Appearing

If a downloaded model doesn’t appear:
  1. Click “Refresh Models” in the Models tab
  2. Restart VoiceTypr
  3. Check the model storage location manually

Slow Transcription

If transcription is slower than expected:
  • macOS: Ensure you’re using a model compatible with Metal acceleration
  • Windows: Update your GPU drivers for hardware acceleration
  • Try a smaller/faster model (Tiny or Base)
  • Close resource-intensive applications