1. Upgrade Whisper model
Set WHISPER_MODEL=medium in .env. WER drops from ~4.2% to ~2.9%.
Runs on existing server. No code changes. One env var.
Trivial effort
Works now
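As a rough illustration of the "one env var" switch, the sketch below reads WHISPER_MODEL from the process environment or a .env file and hands it to a loader. It assumes the server uses faster-whisper; the helper names (read_env_file, resolve_model, load_model) and the CPU/int8 settings are hypothetical and not taken from the existing code.

```python
import os
from pathlib import Path


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for line in env_path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_model(env_file=".env"):
    """Return the configured model name, or None if it is not set.

    The process environment takes precedence over the .env file.
    """
    return os.environ.get("WHISPER_MODEL") or read_env_file(env_file).get("WHISPER_MODEL")


def load_model(name):
    """Load the model (assumption: the server runs faster-whisper on CPU)."""
    from faster_whisper import WhisperModel  # imported lazily; optional dependency

    # device and compute_type are illustrative choices for a CPU-only host.
    return WhisperModel(name, device="cpu", compute_type="int8")
```

With WHISPER_MODEL=medium in .env, resolve_model() returns "medium", and load_model(resolve_model()) would load that model on the next service restart.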
2. Gateway + Mac worker
Run the stt-local gateway on Debian and connect an Apple Silicon Mac as a remote worker via ngrok.
Best accuracy (1.93% WER) but requires dedicated Mac hardware.
Needs Mac hardware
Medium effort
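The WER figures quoted for these options can be spot-checked on your own recordings before committing to new hardware. The sketch below is a minimal word-level edit-distance WER; it assumes plain lowercase/whitespace normalization, which may differ from how the quoted numbers were measured, and the function name is illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

Running this over the same transcripts for each candidate setup gives a like-for-like comparison, which matters more than the absolute numbers.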
3. Parakeet via ONNX on CPU
Export Parakeet to ONNX, run via onnxruntime on x86 CPU.
Untested path; likely slower than faster-whisper on 2 cores without a GPU.
High effort
Risky
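Since the speed of this path is untested, one way to reduce the risk is to measure it against the current setup before investing in the export. The sketch below times any transcription callable and reports a real-time factor (processing time divided by audio duration). The function name and parameters are hypothetical; the callable is whatever wraps the ONNX or faster-whisper pipeline.

```python
import time
from typing import Callable


def real_time_factor(
    transcribe: Callable[[str], object],
    audio_path: str,
    audio_seconds: float,
    runs: int = 3,
) -> float:
    """Return the best-of-N processing time divided by the audio duration.

    Values below 1.0 mean faster than real time. The callable must fully
    consume its output (for example, iterate a lazy segment generator),
    otherwise the measured time will be too low.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    if runs < 1:
        raise ValueError("runs must be at least 1")

    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        transcribe(audio_path)
        best = min(best, time.perf_counter() - start)
    return best / audio_seconds
```

Comparing the two backends on the same clip, on the same 2-core machine, would show whether the ONNX route is competitive at all.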
4. Add GPU to server
Upgrade to a VPS with an NVIDIA GPU. Enables both faster-whisper large-v3 and
Parakeet via NeMo/TensorRT. Best long-term path for quality and speed.
Hosting change
All options open