research-and-data
A lightweight MCP server that loads PDF files, extracts and chunks their text, builds a semantic vector index, and returns relevant passages to Claude or other AI agents for document-based question answering.
Overview
What is mcp-docs-reader?
mcp-docs-reader is a lightweight MCP (Model Context Protocol) server that loads PDF files, extracts and chunks their text, builds a semantic vector index, and returns the most relevant passages for document-based question answering.
How to use mcp-docs-reader?
To use mcp-docs-reader:
- Install Claude Desktop.
- Download the mcp-docs-reader project.
- Set up the uv environment for the project.
- Configure Claude Desktop with the mcp-docs-reader server settings.
- Run Claude Desktop and ask questions about your PDF documents.
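The configuration step can be sketched as follows. Claude Desktop reads its MCP servers from the `mcpServers` key of `claude_desktop_config.json`. The project path, the `server.py` entry-point name, and launching through `uv run` are assumptions made for illustration; check the project's own instructions for the real values.

```python
import json


def build_server_entry(project_dir: str, entry_script: str = "server.py") -> dict:
    """Return an `mcpServers` block for claude_desktop_config.json.

    project_dir and entry_script are hypothetical placeholders: substitute
    the folder where mcp-docs-reader was downloaded and its actual entry point.
    """
    return {
        "mcpServers": {
            "mcp-docs-reader": {
                # Assumes the server is started with uv from the project folder.
                "command": "uv",
                "args": ["--directory", project_dir, "run", entry_script],
            }
        }
    }


config = build_server_entry("/path/to/mcp-docs-reader")
print(json.dumps(config, indent=2))
```

The printed JSON can be merged into the existing configuration file; Claude Desktop needs a restart before it picks up the change.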
What are the key features of mcp-docs-reader?
- Loads and processes PDF documents from a local folder.
- Extracts text and splits it into semantic chunks.
- Generates vector embeddings using SentenceTransformer.
- Builds a FAISS-based vector index for semantic search.
- Retrieves top-k relevant chunks based on user queries.
- Constructs prompts with relevant passages and questions for Claude.
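The chunking and prompt-construction features can be illustrated with a minimal sketch. It is not the project's code: fixed-size word windows with overlap stand in for whatever "semantic" splitting the server performs, and the function names, chunk sizes, and prompt wording are all hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into windows of chunk_size words that share `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


def build_prompt(question: str, passages: list[str]) -> str:
    """Combine retrieved passages and the user question into one prompt."""
    context = "\n\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer the question using only the passages below.\n\n"
        f"Passages:\n{context}\n\n"
        f"Question: {question}"
    )


sample = " ".join(f"word{i}" for i in range(120))
pieces = chunk_text(sample)
print(len(pieces), "chunks")
print(build_prompt("What does the document say?", pieces[:2]))
```

Overlapping windows reduce the chance that a sentence relevant to a query is cut in half at a chunk boundary, at the cost of indexing some text twice.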
What are the use cases of mcp-docs-reader?
- Answering questions based on local PDF documents.
- Summarizing key points from academic papers.
- Assisting in research by providing relevant document excerpts.
FAQ about mcp-docs-reader
- Can mcp-docs-reader process any PDF file?
Yes, as long as the file is located in the specified local folder and contains extractable text.
- Is there a specific setup required for Claude Desktop?
Yes, Claude Desktop must be configured to register mcp-docs-reader as an MCP server.
- What is the purpose of the semantic vector index?
The semantic vector index allows for efficient retrieval of relevant text passages based on user queries.
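To make the retrieval idea concrete, here is a small sketch of top-k search by cosine similarity. It is a brute-force stand-in written with the standard library only: the three-dimensional vectors are hand-made toy values rather than SentenceTransformer embeddings, and a library such as FAISS would run the same nearest-neighbor search over far more vectors. The function names are hypothetical.

```python
import math


def normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def top_k(query: list[float], vectors: list[list[float]], k: int = 2) -> list[tuple[int, float]]:
    """Return (index, cosine similarity) for the k vectors closest to the query."""
    q = normalize(query)
    scored = []
    for idx, vec in enumerate(vectors):
        # The dot product of unit vectors equals their cosine similarity.
        score = sum(a * b for a, b in zip(q, normalize(vec)))
        scored.append((idx, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "embeddings" for three chunks and one query.
chunk_vectors = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
query_vector = [1.0, 0.0, 0.0]

for idx, score in top_k(query_vector, chunk_vectors):
    print(f"chunk {idx}: similarity {score:.3f}")
```

The indices returned by the search map back to the stored text chunks, which are the passages handed to the model.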