AI Evaluation Framework

skillsmp · testing · 👍 820 upvotes

Methodologies for evaluating AI agent performance with rubrics, benchmarks, and automated testing.

evaluation, benchmarks, testing, metrics

📦 Installation

Copy evaluation.md to .claude/skills/
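
The copy step above can also be scripted. The following is a minimal sketch, not part of the skill itself: the helper name `install_skill` and its `project_root` parameter are hypothetical, and it assumes `evaluation.md` has already been downloaded locally and that the skills directory lives under the project root.

```python
import shutil
from pathlib import Path


def install_skill(skill_file: str, project_root: str = ".") -> Path:
    """Copy a skill file into <project_root>/.claude/skills/ and return the new path.

    Hypothetical helper for illustration; the listing only says to copy the file.
    """
    source = Path(skill_file)
    if not source.is_file():
        raise FileNotFoundError(f"Skill file not found: {source}")

    # Create .claude/skills/ (and any missing parents) if it does not exist yet.
    target_dir = Path(project_root) / ".claude" / "skills"
    target_dir.mkdir(parents=True, exist_ok=True)

    # copy2 copies the file contents and preserves metadata such as timestamps.
    return Path(shutil.copy2(source, target_dir / source.name))
```

Calling `install_skill("evaluation.md")` from the project root should place the file at `.claude/skills/evaluation.md`; an existing file with the same name is overwritten.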

📖 About This Skill

Features

  • Works with claude-code, codex
  • Part of the skillsmp skills collection
  • Category: testing

Getting Started

After installing the skill, mention it in your conversation or use the relevant command. The AI then applies the skill's instructions to the task at hand.

Supported Platforms

  • claude code
  • codex

Source

skillsmp Skills Collection