AI Evaluation Framework
skillsmp • testing • 👍 820 upvotes
Methodologies for evaluating AI agent performance with rubrics, benchmarks, and automated testing.
Tags: evaluation, benchmarks, testing, metrics
📦 Installation
Copy evaluation.md to .claude/skills/
📖 About This Skill
Methodologies for evaluating AI agent performance with rubrics, benchmarks, and automated testing.
Features
- Works with claude-code, codex
- Part of the skillsmp skills collection
- Category: testing
Getting Started
After installing the skill, mention it in your conversation or use the relevant command. The AI then applies the skill's instructions to the task at hand.
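For anyone who prefers to script the installation step (copying evaluation.md into .claude/skills/), a minimal Python sketch follows. The function name `install_skill` is hypothetical. The sketch also assumes that `.claude/skills/` sits under the project root, which this page does not state.

```python
from pathlib import Path
import shutil


def install_skill(skill_file: str, project_root: str = ".") -> Path:
    """Copy a skill file into <project_root>/.claude/skills/ and return the new path.

    Hypothetical helper for illustration only; it is not part of the skill itself.
    """
    source = Path(skill_file)
    if not source.is_file():
        raise FileNotFoundError(f"Skill file not found: {source}")

    # Assumption: skills live under .claude/skills/ relative to the project root.
    skills_dir = Path(project_root) / ".claude" / "skills"
    skills_dir.mkdir(parents=True, exist_ok=True)

    # copy2 preserves file metadata and returns the destination path.
    return Path(shutil.copy2(source, skills_dir / source.name))
```

Calling `install_skill("evaluation.md")` from the project root would place the file at `.claude/skills/evaluation.md`, creating the directory if it does not exist yet.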
