
github.com
February 27, 2026
Summary
Frikallo/parakeet.cpp is a fast, portable C++ implementation of NVIDIA's Parakeet models for on-device speech recognition. On Apple Silicon GPUs, encoder inference for 10 seconds of audio takes approximately 27 ms, about 96 times faster than running on the CPU. It uses the Axiom tensor library for automatic Metal GPU acceleration without heavy dependencies.
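To put those two figures in perspective, here is a rough back-of-the-envelope calculation. It assumes the 96x ratio compares the same encoder pass over the same 10 seconds of audio; the CPU latency and real-time factors below are derived from the quoted numbers, not reported by the project.

```latex
% Implied CPU encoder latency (derived, assuming the 96x ratio applies to the same 10 s clip)
t_{\text{CPU}} \approx 96 \times 27\,\text{ms} = 2592\,\text{ms} \approx 2.6\,\text{s}

% Real-time factor of the encoder = audio duration / processing time
\text{RTF}_{\text{GPU}} \approx \frac{10\,\text{s}}{0.027\,\text{s}} \approx 370
\qquad
\text{RTF}_{\text{CPU}} \approx \frac{10\,\text{s}}{2.592\,\text{s}} \approx 3.9
```

These numbers cover the encoder only, so end-to-end transcription time would be somewhat higher on either device.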