Acknowledgements
Ascolta is proprietary software, but it stands on excellent open-source work. The transcription models run on your Mac and are downloaded directly from their publishers on first use — we don't bundle or redistribute them. Here's everything we build on, with thanks.
On-device speech recognition
- WhisperKit by Argmax — on-device Whisper inference. Used under the MIT license. github.com/argmaxinc/WhisperKit
- FluidAudio by FluidInference — on-device Parakeet inference and model download. Used under the Apache-2.0 license. github.com/FluidInference/FluidAudio
Speech models
The transcription models are downloaded to your Mac from Hugging Face on first use; after that, they run entirely offline. Ascolta does not redistribute the model weights.
- Parakeet TDT 0.6B (v2 and v3) by NVIDIA, packaged for Core ML by FluidInference — used under CC-BY-4.0, with attribution to NVIDIA. FluidInference/parakeet-tdt-0.6b-v3-coreml
- Whisper by OpenAI and Distil-Whisper by the Hugging Face team. Both are used under the MIT license.
App frameworks
- Sparkle — secure app updates. Used under the MIT license. sparkle-project.org
- Swift packages from Apple (swift-crypto, swift-collections, swift-argument-parser, swift-asn1) and Hugging Face (swift-transformers, swift-jinja) — used under the MIT and Apache-2.0 licenses.
Optional cloud transcription
If you choose to connect a cloud provider with your own API key, audio is sent to that provider for transcription under that provider's terms. This is off by default. Providers: OpenAI (gpt-4o-mini-transcribe) and Groq (whisper-large-v3-turbo).
Thanks
To everyone who builds and maintains the software above, and to the writers who think out loud. Questions about licensing? Email [email protected].