Conkurrence

@AlligatorC0der

2 months ago

# inter-rater-reliability

# fleiss-kappa

Conkurrence measures whether multiple AI models produce consistent outputs on your evaluation tasks. It shows which items the models agree on and which need human review, using Fleiss' κ, Kendall's W, and bootstrap confidence intervals, the same psychometric methods trusted in clinical research.
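
To make the agreement statistic concrete, here is a minimal sketch of the textbook Fleiss' κ computation in plain Python. It is not Conkurrence's own code; the function names `fleiss_kappa` and `items_needing_review` and the 0.6 review threshold are assumptions made for illustration. Each row of the input is one evaluation item, and each column counts how many models assigned that item to a given category.

```python
from typing import List


def per_item_agreement(counts: List[List[int]]) -> List[float]:
    """Fraction of agreeing rater pairs for each item (P_i in Fleiss' notation)."""
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("need at least two raters per item")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must be rated by the same number of raters")
    pairs = n_raters * (n_raters - 1)
    return [(sum(c * c for c in row) - n_raters) / pairs for row in counts]


def fleiss_kappa(counts: List[List[int]]) -> float:
    """Textbook Fleiss' kappa for an items x categories count matrix."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Observed agreement: mean of the per-item agreement scores.
    p_bar = sum(per_item_agreement(counts)) / n_items

    # Expected agreement by chance, from the overall category proportions.
    p_j = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_e = sum(p * p for p in p_j)

    if p_e == 1.0:
        # All ratings fall in one category, so kappa is undefined.
        return float("nan")
    return (p_bar - p_e) / (1.0 - p_e)


def items_needing_review(counts: List[List[int]], threshold: float = 0.6) -> List[int]:
    """Hypothetical helper: indices of items whose agreement is below threshold."""
    return [i for i, p in enumerate(per_item_agreement(counts)) if p < threshold]


if __name__ == "__main__":
    # Three models label four items as category 0 or category 1.
    ratings = [[3, 0], [0, 3], [2, 1], [3, 0]]
    print("kappa:", fleiss_kappa(ratings))
    print("review:", items_needing_review(ratings))
```

In this sketch, an item is flagged when fewer than 60% of model pairs agree on it. That cutoff is arbitrary and would need tuning for a real evaluation set.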


Server Config

{
  "mcpServers": {
    "conkurrence": {
      "command": "npx",
      "args": [
        "-y",
        "conkurrence",
        "mcp"
      ]
    }
  }
}
