iframe-proxy | Sunbelt Computer Software

Setup & Installation

npx skills add https://github.com/datadog-labs/agent-skills --skill dd-llmo-experiment-analyzer

Alternatively, paste the link below into your coding assistant and ask it to install the skill:

https://github.com/datadog-labs/agent-skills/tree/main/dd-llmo/experiment-analyzer

What This Skill Does

Analyzes LLM experiment results from Datadog. It handles either a single experiment or a comparison of two, in exploratory or Q&A mode. Given one or two experiment IDs, it pulls metrics, segments failures, samples representative events, and produces a structured report with root-cause hypotheses and actionable recommendations.

Instead of manually querying experiment summaries, cross-referencing metrics by segment, and sampling failure events one by one, this skill runs the full analysis pipeline automatically and delivers a report with specific numbers and linked examples.
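To make the manual work described above concrete, here is a minimal sketch of one step: grouping events by segment and comparing failure rates between a baseline and a candidate run. The event shape (`segment` and `passed` fields), the function names, and the sample data are all hypothetical, chosen for illustration; this is not the skill's implementation and it makes no calls to Datadog.

```python
from collections import defaultdict

def failure_rates(events):
    """Return {segment: failure rate} for a list of event dicts.

    Each event is assumed (hypothetically) to look like
    {"segment": "billing", "passed": False}.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for event in events:
        totals[event["segment"]] += 1
        if not event["passed"]:
            failures[event["segment"]] += 1
    return {seg: failures[seg] / totals[seg] for seg in totals}

def regressions(baseline, candidate, threshold=0.05):
    """List segments whose failure rate rose by more than `threshold`,
    worst first. Segments missing from either run are skipped."""
    base = failure_rates(baseline)
    cand = failure_rates(candidate)
    deltas = {seg: cand[seg] - base[seg] for seg in cand if seg in base}
    worse = [(seg, delta) for seg, delta in deltas.items() if delta > threshold]
    return sorted(worse, key=lambda item: item[1], reverse=True)

# Made-up sample data: four events per segment in each run.
baseline = (
    [{"segment": "billing", "passed": p} for p in (True, True, True, False)]
    + [{"segment": "search", "passed": p} for p in (True, True, True, False)]
)
candidate = (
    [{"segment": "billing", "passed": p} for p in (True, False, False, False)]
    + [{"segment": "search", "passed": p} for p in (True, True, True, False)]
)

print(regressions(baseline, candidate))
```

With this sample data, only the `billing` segment is reported, because its failure rate rises from 0.25 to 0.75 while `search` is unchanged.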

When to use it

Comparing two LLM experiment runs to find where the candidate regressed
Drilling into the worst-performing segments of a single evaluation run
Answering a specific question about metric differences between a baseline and candidate
Exporting an experiment analysis report to a Datadog notebook for team review
Spotting error clusters and failure themes across experiment events
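
For the last use case, one simple way to picture error clustering is to normalize error messages and count the resulting groups. The sketch below assumes a hypothetical `error` field on each event and uses a deliberately crude normalization (lowercasing and masking digits); the skill itself may cluster failures differently.

```python
import re
from collections import Counter

def normalize(message):
    """Collapse messages that differ only in case or numeric details."""
    return re.sub(r"\d+", "<n>", message.lower())

def cluster_errors(events):
    """Count events per normalized error message, most frequent first.

    Events without an "error" key (a hypothetical field) are ignored.
    """
    counts = Counter(
        normalize(event["error"]) for event in events if event.get("error")
    )
    return counts.most_common()

# Made-up sample events.
events = [
    {"id": 1, "error": "Timeout after 30s"},
    {"id": 2, "error": "Timeout after 45s"},
    {"id": 3, "error": "Rate limit exceeded for key 12"},
    {"id": 4, "error": "timeout after 5s"},
    {"id": 5},  # a successful event with no error
]

for message, count in cluster_errors(events):
    print(count, message)
```

Here the three timeout messages fall into a single cluster, which is the kind of theme worth sampling representative events from.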

dd-llmo-experiment-analyzer

Similar Skills

minimax-xlsx

xlsx

meme-rush

query-token-info