MCP Server Whisper

@arcaputo3

a year ago

An MCP Server for audio transcription using OpenAI

Overview

What is MCP Server Whisper?

MCP Server Whisper is a Model Context Protocol (MCP) server designed for advanced audio transcription and processing using OpenAI's Whisper and GPT-4o models.

How to use MCP Server Whisper?

To use MCP Server Whisper, clone the repository, set up your environment with the required API key and audio file path, and start the server using the provided commands. You can then interact with the server to manage and transcribe audio files.
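As a rough illustration of the environment setup described above, the sketch below reads an API key and an audio directory from environment variables and fails early if either is missing. The variable names OPENAI_API_KEY and AUDIO_FILES_PATH, and the helper load_config, are assumptions made for this example; check the repository for the names the server actually expects.

```python
import os
from pathlib import Path
from typing import Mapping, NamedTuple


class ServerConfig(NamedTuple):
    api_key: str
    audio_dir: Path


def load_config(env: Mapping[str, str] = os.environ) -> ServerConfig:
    """Read both settings from `env` and raise ValueError if one is unusable."""
    # Assumption: these variable names are illustrative, not confirmed by the page.
    api_key = env.get("OPENAI_API_KEY", "").strip()
    audio_dir = env.get("AUDIO_FILES_PATH", "").strip()
    if not api_key:
        raise ValueError("OPENAI_API_KEY is not set")
    if not audio_dir:
        raise ValueError("AUDIO_FILES_PATH is not set")
    path = Path(audio_dir).expanduser()
    if not path.is_dir():
        raise ValueError(f"audio directory does not exist: {path}")
    return ServerConfig(api_key=api_key, audio_dir=path)
```

Passing a plain dictionary instead of os.environ makes the check easy to try out without touching the real environment.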

Key features of MCP Server Whisper

Advanced file searching with regex patterns and metadata filtering.
Parallel batch processing for multiple audio files.
Format conversion between supported audio types.
Automatic compression for oversized files.
Enhanced transcription with specialized prompts.
Comprehensive metadata support including duration and file size.
High-performance caching for repeated operations.
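To make the first feature more concrete, here is a minimal sketch of searching a directory for audio files by regex pattern and filtering by file size. It is not the server's implementation; the function name find_audio_files, its parameters, and the suffix set are hypothetical and cover only the two formats this page names.

```python
import re
from pathlib import Path
from typing import Iterator, Optional

# Assumption: only the formats named on this page; the real server may accept more.
AUDIO_SUFFIXES = {".mp3", ".wav"}


def find_audio_files(
    root: Path,
    pattern: Optional[str] = None,
    min_bytes: Optional[int] = None,
    max_bytes: Optional[int] = None,
) -> Iterator[Path]:
    """Yield audio files under `root` whose name matches `pattern` and whose
    size falls inside the optional [min_bytes, max_bytes] range."""
    regex = re.compile(pattern) if pattern else None
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in AUDIO_SUFFIXES:
            continue
        if regex and not regex.search(path.name):
            continue
        size = path.stat().st_size
        if min_bytes is not None and size < min_bytes:
            continue
        if max_bytes is not None and size > max_bytes:
            continue
        yield path
```

A call such as find_audio_files(Path("recordings"), pattern=r"^interview_\d+", max_bytes=1_000_000) would yield only matching files of at most one million bytes. Filtering on duration would need an audio library and is left out of this sketch.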

Use cases of MCP Server Whisper

Transcribing interviews and meetings for documentation.
Converting audio files to different formats for compatibility.
Batch processing multiple audio files for efficiency.
Extracting detailed insights from audio recordings using enhanced transcription.
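The batch-processing use case can be sketched with asyncio: several files are transcribed concurrently, with a cap on how many requests run at once. This is an illustrative pattern under stated assumptions, not the server's code. The name transcribe_batch and the transcribe_one callback (which would wrap the actual API call) are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Sequence, Union


async def transcribe_batch(
    paths: Sequence[str],
    transcribe_one: Callable[[str], Awaitable[str]],
    max_concurrency: int = 4,
) -> Dict[str, Union[str, Exception]]:
    """Run `transcribe_one` over `paths` concurrently, at most
    `max_concurrency` at a time. A failure on one file is stored as that
    file's result instead of cancelling the rest of the batch."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(path: str):
        async with semaphore:
            try:
                return path, await transcribe_one(path)
            except Exception as exc:  # keep going; report the error per file
                return path, exc

    pairs = await asyncio.gather(*(worker(p) for p in paths))
    return dict(pairs)
```

Returning exceptions alongside successful transcripts lets the caller retry only the files that failed.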

FAQ about MCP Server Whisper

What audio formats are supported?

Supported formats include mp3, wav, and more, depending on the model used.

Is there a limit on audio file size?

Yes, files larger than 25 MB are automatically compressed to meet API limits.

Can I use this server for real-time transcription?

The server is designed for batch processing and may not support real-time transcription.
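The file size answer above implies a check before upload. A minimal sketch of that check follows, assuming the 25 MB limit is counted as 25 * 1024 * 1024 bytes; the page does not say which convention applies, and needs_compression is a hypothetical helper, not part of the server.

```python
from pathlib import Path

# Assumption: "25 MB" is read as 25 MiB; the exact threshold may differ.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def needs_compression(path: Path, limit: int = MAX_UPLOAD_BYTES) -> bool:
    """Return True when the file at `path` is larger than `limit` bytes."""
    return Path(path).stat().st_size > limit
```

The compression step itself depends on the audio tooling in use and is not shown here.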
