
hanrok commented Nov 18, 2025

No description provided.

This commit adds support for mlx-whisper as a backend option, optimized
for Apple Silicon (M1/M2/M3) Macs. MLX leverages Apple's Metal GPU and
unified memory for hardware-accelerated inference.

Changes:
- Add MLX_WHISPER to the BackendType enum with an is_mlx_whisper() method
- Create the WhisperMLX transcriber wrapper (transcriber_mlx.py; sketched
  after this list)
  * Maps standard model sizes to MLX community models
  * Implements transcribe() with language-detection support
  * Returns segments compatible with the WhisperLive base interface
- Create the ServeClientMLXWhisper backend class (mlx_whisper_backend.py;
  skeleton below)
  * Extends ServeClientBase with an MLX-specific implementation
  * Supports single-model mode for memory efficiency
  * Thread-safe model access with locking
  * Graceful fallback to faster_whisper if MLX is unavailable
- Update server.py initialize_client() to instantiate the MLX backend
- Update the run_server.py CLI to include mlx_whisper in the backend options
- Add the mlx-whisper dependency to requirements/server.txt
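
A minimal sketch of what the WhisperMLX wrapper could look like, assuming
the documented mlx_whisper.transcribe() API; the mlx-community repo names
in the mapping are assumptions, and the real table in transcriber_mlx.py
may differ:

  import mlx_whisper

  class WhisperMLX:
      # Map standard model sizes to MLX community repos (assumed names;
      # see https://huggingface.co/mlx-community for the published models).
      MODEL_MAP = {
          "tiny": "mlx-community/whisper-tiny",
          "small.en": "mlx-community/whisper-small.en-mlx",
          "large-v3": "mlx-community/whisper-large-v3-mlx",
      }

      def __init__(self, model_size):
          # Accept a known size, or pass a full HF repo path straight through.
          self.model_path = self.MODEL_MAP.get(model_size, model_size)

      def transcribe(self, audio, language=None):
          # language=None lets mlx-whisper auto-detect; the result dict
          # carries "segments" and the detected "language".
          result = mlx_whisper.transcribe(
              audio, path_or_hf_repo=self.model_path, language=language
          )
          return result["segments"], result["language"]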

The backend follows the same pluggable architecture as other backends
(faster_whisper, tensorrt, openvino) and implements the required
transcribe_audio() and handle_transcription_output() methods.
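
A hedged skeleton of how the locking and those two required methods could
fit together; the import path is illustrative and the helper names in
handle_transcription_output() follow the faster_whisper backend's pattern,
so verify both against the actual code:

  import threading

  from whisper_live.serve_client_base import ServeClientBase  # assumed path

  class ServeClientMLXWhisper(ServeClientBase):
      # One shared model instance for single-model mode, guarded by a lock
      # so concurrent clients never decode on it at the same time.
      SINGLE_MODEL = None
      SINGLE_MODEL_LOCK = threading.Lock()

      def transcribe_audio(self, input_sample):
          # Serialize access to the shared MLX model while decoding.
          with self.SINGLE_MODEL_LOCK:
              segments, _language = self.transcriber.transcribe(
                  input_sample, language=self.language
              )
          return segments

      def handle_transcription_output(self, result, duration):
          # Fold new segments into the running transcript and push them to
          # the websocket client, mirroring the faster_whisper backend.
          if result:
              last_segment = self.update_segments(result, duration)
              segments = self.prepare_segments(last_segment)
              if segments:
                  self.send_transcription_to_client(segments)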

Usage:
  python run_server.py --backend mlx_whisper --model small.en

A follow-up commit adds:

1. MLX Model Path CLI Argument:
   - Add a --mlx_model_path/-mlx argument to run_server.py
   - The server can now specify the MLX model, overriding the client's choice
   - Supports model sizes (e.g. small.en) and HF repos (e.g. mlx-community/whisper-large-v3-turbo)
   - Integrated into single_model mode support
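
   The wiring for that flag might look like the following; the argument
   name comes from this description, while the surrounding parser setup is
   assumed from run_server.py:

     import argparse

     parser = argparse.ArgumentParser()  # the existing run_server.py parser
     parser.add_argument(
         "--mlx_model_path", "-mlx",
         type=str,
         default=None,
         help="MLX model for the server to load: a size such as 'small.en' "
              "or a HF repo such as 'mlx-community/whisper-large-v3-turbo'. "
              "Overrides the model requested by the client.",
     )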

2. Server-side MLX Model Configuration:
   - Update server.run() to accept an mlx_model_path parameter
   - Update recv_audio(), handle_new_connection(), and initialize_client()
   - The MLX backend now uses the server-specified model when provided
   - Falls back to the client-specified model when it is not
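
   The precedence rule reduces to a single expression; the names below are
   illustrative rather than the actual server.py identifiers:

     def resolve_mlx_model(server_model_path, client_model):
         # --mlx_model_path wins when set; otherwise honor the model the
         # client requested in its initial websocket options.
         return server_model_path if server_model_path else client_model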

3. Microphone Test Script (test_mlx_microphone.py):
   - Real-time transcription test with microphone input
   - Command-line args for host, port, model, language, translate
   - User-friendly interface with status messages
   - Saves output to SRT file
   - Proper error handling and cleanup
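
   Such a script can be built on the existing WhisperLive client; the
   keyword arguments below follow the whisper_live.client.TranscriptionClient
   API as shown in the project README, so treat them as assumptions to
   verify against your version:

     from whisper_live.client import TranscriptionClient

     client = TranscriptionClient(
         "localhost", 9090,   # host and port of the running server
         lang="en",
         translate=False,
         model="small.en",
     )
     client()  # with no arguments, the client streams from the microphone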

4. GPU Verification Tool (verify_mlx_gpu.py):
   - Checks if running on Apple Silicon (M1/M2/M3)
   - Verifies MLX and mlx-whisper installation
   - Tests GPU access with a sample computation
   - Optional MLX Whisper model loading test
   - Provides instructions for monitoring GPU usage:
     * Activity Monitor (GUI)
     * powermetrics (Terminal)
     * asitop (Third-party)
   - Comprehensive summary with actionable recommendations
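
   The platform check and the sample computation need only the standard
   library plus documented MLX calls; a minimal sketch:

     import platform

     def verify_mlx_gpu():
         # Apple Silicon Macs report Darwin/arm64.
         if platform.system() != "Darwin" or platform.machine() != "arm64":
             print("Not on Apple Silicon; MLX acceleration is unavailable.")
             return

         try:
             import mlx.core as mx
         except ImportError:
             print("MLX not installed: pip install mlx mlx-whisper")
             return

         # Sample computation: a matmul evaluated on the default device,
         # which is the GPU on Apple Silicon.
         a = mx.random.normal((1024, 1024))
         mx.eval(a @ a)
         print(f"MLX default device: {mx.default_device()}")

     if __name__ == "__main__":
         verify_mlx_gpu()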

Usage Examples:
  # Server with specific MLX model
  python run_server.py --backend mlx_whisper --mlx_model_path small.en

  # Verify GPU functionality
  python verify_mlx_gpu.py

  # Test with microphone
  python test_mlx_microphone.py --model small.en --lang en
zoq (Contributor) commented Nov 24, 2025

This is huge; I will test the backend this week.
