Themata.AI | AI news without the noise


Filtering by tag: speech-recognition

ai-agents speech-recognition telecommunications ethical-ai

News

Telus Uses AI to Alter Call-Agent Accents

Telus is using AI technology from Tomato.ai to modify call-centre agents' accents in real time, aiming to reduce "accent-related friction." Labour groups have criticized the practice as deceptive and are calling for mandatory disclosure.

letsdatascience.com

🔥🔥🔥🔥🔥

3 min

5/6/2026

GitHub - microsoft/VibeVoice: Open-Source Frontier Voice AI

speech-recognition open-source ai-models developer-tools

Tool

Microsoft VibeVoice: Open-Source Frontier Voice AI

VibeVoice ASR is an open-source speech-to-text model that processes 60-minute long-form audio in a single pass, producing structured transcriptions with speaker identification, timestamps, and content. It is now integrated into the Hugging Face Transformers library, so it can be used directly from existing Transformers-based projects.

github.com

🔥🔥🔥🔥🔥

4 min

4/28/2026

Cohere Transcribe: state-of-the-art speech recognition

speech-recognition automatic-speech-recognition ai-models developer-tools

Tool

Cohere Transcribe: Speech Recognition

Cohere has launched Transcribe, an open-source automatic speech recognition (ASR) model designed for high accuracy in practical conditions. The model supports various applications, including meeting transcription, speech analytics, and real-time customer support.

cohere.com

🔥🔥🔥🔥🔥

5 min

3/31/2026

NVIDIA PersonaPlex 7B on Apple Silicon: Full-Duplex Speech-to-Speech in Native Swift with MLX

nvidia speech-recognition apple-silicon developer-tools

Tool

NVIDIA PersonaPlex 7B on Apple Silicon: Full-Duplex Speech-to-Speech in Swift

NVIDIA PersonaPlex 7B enables full-duplex speech-to-speech communication on Apple Silicon, listening and speaking at the same time. The qwen3-asr-swift library processes audio in real time and streams generated audio chunks without a multi-step pipeline.

blog.ivan.digital

🔥🔥🔥🔥🔥

5 min

3/5/2026

GitHub - Frikallo/parakeet.cpp: Ultra fast and portable Parakeet implementation for on-device inference in C++ using Axiom with MPS+Unified Memory and Cuda support

speech-recognition nvidia developer-tools

Tool

Parakeet.cpp – Parakeet ASR inference in pure C++ with Metal GPU acceleration

Frikallo/parakeet.cpp is a fast, portable C++ implementation of NVIDIA's Parakeet models for on-device speech recognition. Encoder inference for 10 seconds of audio takes approximately 27 ms on Apple Silicon GPUs, about 96 times faster than on CPU. It uses the Axiom tensor library for automatic Metal GPU acceleration without heavy dependencies.

github.com

🔥🔥🔥🔥🔥

5 min

2/27/2026

GitHub - TrevorS/voxtral-mini-realtime-rs

ai-models rust webgpu speech-recognition

Tool

Rust implementation of Mistral's Voxtral Mini 4B Realtime runs in your browser

This repository is a pure Rust implementation of Mistral's Voxtral Mini 4B Realtime streaming speech recognition model, built on the Burn ML framework. It runs in the browser via WASM and WebGPU, with a Q4 GGUF quantized version available for client-side execution.

github.com

🔥🔥🔥🔥🔥

4 min

2/10/2026

GitHub - antirez/voxtral.c: Pure C inference of Mistral Voxtral Realtime 4B speech to text model

speech-recognition mistral-ai audio-processing developer-tools

Tool

Pure C, CPU-only inference with Mistral Voxtral Realtime 4B speech to text model

The repository provides a pure C implementation of the inference pipeline for Mistral AI's Voxtral Realtime 4B speech-to-text model, requiring only the C standard library. It features fast MPS inference, an encoder that processes audio in chunks to limit memory usage, and audio input from stdin or live microphone capture.

github.com

🔥🔥🔥🔥🔥

9 min

2/10/2026

Voxtral transcribes at the speed of sound. | Mistral AI

transcription-technology speech-recognition mistral-ai real-time-applications

Voxtral Transcribe 2

Voxtral Transcribe 2 comprises two speech-to-text models: Voxtral Mini Transcribe V2 for batch transcription and Voxtral Realtime for live applications. Mistral describes them as offering state-of-the-art transcription quality and ultra-low latency. Voxtral Realtime is available as open weights under the Apache 2.0 license.

mistral.ai

🔥🔥🔥🔥🔥

5 min

2/4/2026
