InferBench

@JoniMartin27

4 hours ago

InferBench's MCP server lets coding agents run, serve, and benchmark local LLMs (text and image, via llama.cpp and Stable Diffusion) on your own hardware, on demand. It measures real tokens per second, picks the optimal quantization for your GPU, and exposes a 124-model catalog. It is local-first, with no cloud required.

Server Config

{
  "mcpServers": {
    "inferbench": {
      "command": "C:\\Users\\<user>\\AppData\\Local\\Programs\\InferBench\\resources\\sidecar\\inferbench-backend.exe",
      "args": [
        "--mcp"
      ]
    }
  }
}
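
The `<user>` placeholder in the config above has to be replaced with a real Windows account name before an MCP client can launch the server. As a minimal sketch, assuming the install path follows exactly the layout shown above, the entry could be generated like this. The function name `inferbench_mcp_config` and the user name `alice` are hypothetical and used only for illustration; they are not part of InferBench.

```python
import json
from pathlib import PureWindowsPath


def inferbench_mcp_config(user: str) -> dict:
    """Return the mcpServers entry with <user> replaced by a Windows user name."""
    # Path layout copied from the config above; only <user> is substituted.
    # Assumption: InferBench is installed in this per-user location.
    exe = PureWindowsPath(
        "C:/Users", user, "AppData", "Local", "Programs", "InferBench",
        "resources", "sidecar", "inferbench-backend.exe",
    )
    return {
        "mcpServers": {
            "inferbench": {
                "command": str(exe),  # rendered with backslashes
                "args": ["--mcp"],
            }
        }
    }


if __name__ == "__main__":
    # "alice" is a placeholder user name for illustration.
    print(json.dumps(inferbench_mcp_config("alice"), indent=2))
```

Serializing with `json.dumps` escapes the backslashes automatically, so the output can be pasted into a client's configuration file as-is.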
