Speech Mcp

What is Speech Mcp?

Speech Mcp is an MCP (Model Context Protocol) client that integrates OpenAI’s speech-to-text and text-to-speech engines and provides a PyQt-based user interface with audio visualization. It requires Python 3.10 or later and is aimed at developers and users who want voice-driven interactions with AI.

How to use Speech Mcp?

To install, clone the repository and run the provided install_speech_mcp.sh script, which sets up a virtual environment, installs dependencies, and creates a global speech-mcp command. To configure, edit the .env file and set your OpenAI API key and preferred TTS/STT settings. To launch, run speech-mcp, ./run.sh, or ./speech-mcp-bin.
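As an illustration of the configuration step, the sketch below reads an API key from a .env file using only the Python standard library. It is not taken from the Speech Mcp code base: the functions parse_env and load_api_key are hypothetical helpers, and OPENAI_API_KEY is the only variable name confirmed by this page.

```python
import os
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one layer of matching single or double quotes.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def load_api_key(env_path: str = ".env") -> str:
    """Return OPENAI_API_KEY from the process environment or the .env file."""
    key = os.environ.get("OPENAI_API_KEY")
    if key:
        return key
    path = Path(env_path)
    if path.is_file():
        key = parse_env(path.read_text(encoding="utf-8")).get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return key
```

Checking the process environment first lets a key exported in the shell take precedence over the file, which is a common convention but an assumption here.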

Key features of Speech Mcp

Modern PyQt-based UI with dark theme and audio visualization
Voice input capture and transcription using OpenAI STT
Voice output with multiple OpenAI voice options
Multi-speaker narration for stories and dialogues
Single-voice text-to-speech conversion from text or file
Audio/video transcription from various media formats
Voice persistence: remembers preferred voice across sessions
Continuous conversation mode with automatic silence detection

Use cases of Speech Mcp

Engage in natural voice conversations with an AI assistant
Generate narrated audio files for stories or dialogues with multiple voices
Transcribe speech from audio or video files
Convert written text or documents into spoken audio

FAQ from Speech Mcp

What API key is required?

Speech Mcp requires an OpenAI API key for both text-to-speech and speech-to-text. Configure it in the .env file as OPENAI_API_KEY.

Which voices are supported?

Speech Mcp supports bm_daniel (default) along with the OpenAI voices alloy, echo, fable, onyx, nova, and shimmer.
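The voice persistence feature could work along the following lines. This is a minimal sketch under stated assumptions, not the project's actual storage format: the JSON layout and the save_voice and load_voice helpers are hypothetical, and only the voice names and the default come from this page.

```python
import json
from pathlib import Path

# Voice names as listed in the FAQ answer above; bm_daniel is the stated default.
VOICES = ("bm_daniel", "alloy", "echo", "fable", "onyx", "nova", "shimmer")
DEFAULT_VOICE = "bm_daniel"


def save_voice(voice: str, path: Path) -> None:
    """Persist the chosen voice, rejecting names outside the known list."""
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    path.write_text(json.dumps({"voice": voice}), encoding="utf-8")


def load_voice(path: Path) -> str:
    """Return the saved voice, falling back to the default on any problem."""
    try:
        voice = json.loads(path.read_text(encoding="utf-8")).get("voice")
    except (OSError, ValueError, AttributeError):
        # Missing file, invalid JSON, or JSON that is not an object.
        return DEFAULT_VOICE
    return voice if voice in VOICES else DEFAULT_VOICE
```

Falling back to the default instead of raising keeps a corrupted preference file from blocking startup.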

On which platforms does Speech Mcp run?

Speech Mcp requires Python 3.10 or higher and runs on any operating system that supports the dependencies (PyQt5, PyAudio, NumPy, etc.).

Is Speech Mcp free and open source?

Yes, Speech Mcp is released under the MIT License. You can freely use, modify, and distribute it.

What are the known limits?

Speech Mcp depends on the OpenAI API, so usage is billed according to OpenAI’s pricing. Recordings are capped at 30 seconds by default; the cap is adjustable. Silence detection parameters are configurable via environment variables.
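To make the interaction between the recording cap and silence detection concrete, here is a rough sketch of an amplitude-based stop rule. It is an assumed approach, not Speech Mcp's implementation: the RMS threshold and the trailing-silence duration are hypothetical values, and the page does not name the environment variables that control them. Only the 30-second default comes from the answer above.

```python
import math
from typing import Iterable, List, Sequence

MAX_SECONDS = 30.0      # default recording cap stated in the FAQ answer above
SILENCE_RMS = 500.0     # hypothetical threshold for 16-bit PCM samples
SILENCE_SECONDS = 1.5   # hypothetical trailing silence that ends a turn


def rms(chunk: Sequence[int]) -> float:
    """Root-mean-square amplitude of one chunk of PCM samples."""
    if not chunk:
        return 0.0
    return math.sqrt(sum(s * s for s in chunk) / len(chunk))


def record_until_silence(
    chunks: Iterable[Sequence[int]], chunk_seconds: float
) -> List[Sequence[int]]:
    """Collect chunks until sustained silence follows speech or the cap is hit."""
    kept: List[Sequence[int]] = []
    elapsed = 0.0
    quiet = 0.0
    heard_speech = False
    for chunk in chunks:
        # Stop before exceeding the maximum recording duration.
        if elapsed + chunk_seconds > MAX_SECONDS:
            break
        kept.append(chunk)
        elapsed += chunk_seconds
        if rms(chunk) >= SILENCE_RMS:
            heard_speech = True
            quiet = 0.0
        else:
            quiet += chunk_seconds
        # Leading silence is ignored; only silence after speech ends the turn.
        if heard_speech and quiet >= SILENCE_SECONDS:
            break
    return kept
```

In a real client the chunks would come from the microphone stream; here any iterable of sample sequences works, which keeps the rule easy to test in isolation.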
