Themata.AI


© 2026 Themata.AI • All Rights Reserved


Filtering by tag: speech-recognition
NVIDIA PersonaPlex 7B on Apple Silicon: Full-Duplex Speech-to-Speech in Native Swift with MLX

Tags: nvidia, speech-recognition, apple-silicon, developer-tools
Type: Tool

NVIDIA PersonaPlex 7B enables full-duplex speech-to-speech communication on Apple Silicon, listening and speaking at the same time. The qwen3-asr-swift library processes audio in real time and streams generated audio chunks as they are produced, rather than running a multi-step pipeline.

blog.ivan.digital

🔥🔥🔥🔥🔥

5 min

3/5/2026

Parakeet.cpp – Parakeet ASR inference in pure C++ with Metal GPU acceleration

Frikallo/parakeet.cpp provides a fast, portable C++ implementation of NVIDIA's Parakeet models for on-device speech recognition. Encoder inference for 10 seconds of audio takes approximately 27 ms on Apple Silicon GPUs, about 96 times faster than on the CPU. It uses the Axiom tensor library for automatic Metal GPU acceleration without heavy dependencies.

github.com

🔥🔥🔥🔥🔥

5 min

2/27/2026

Rust implementation of Mistral's Voxtral Mini 4B Realtime runs in your browser

Voxtral Mini Realtime is a streaming speech recognition model, implemented here in pure Rust with the Burn ML framework. It runs in the browser via WASM and WebGPU, and a Q4 GGUF quantized version is available for client-side execution.

github.com

🔥🔥🔥🔥🔥

4 min

2/10/2026

Pure C, CPU-only inference with Mistral Voxtral Realtime 4B speech to text model

The GitHub repository provides a pure C implementation of the inference pipeline for Mistral AI's Voxtral Realtime 4B speech-to-text model, requiring only the C standard library. It features fast MPS inference and a chunked audio-processing encoder to manage memory usage, and it supports audio input from stdin or live microphone capture.

github.com

🔥🔥🔥🔥🔥

9 min

2/10/2026
