Hugging Face/hugging-face-evaluation — Agent Skills | officialskills.sh

hugging-face-evaluation

official · ai-tools

Adds and manages evaluation results in Hugging Face model cards using the model-index metadata format.

Setup & Installation

npx skills add https://github.com/huggingface/skills --skill hugging-face-evaluation
Or paste the following link and ask your coding assistant to install it:
https://github.com/huggingface/skills/tree/main/skills/hugging-face-evaluation

What This Skill Does

Adds and manages evaluation results in Hugging Face model cards using the model-index metadata format. Supports extracting benchmark tables from README files, importing scores from the Artificial Analysis API, and running evaluations with vLLM or lighteval on local GPUs or HF Jobs infrastructure.
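For context, model-index metadata lives in the YAML front matter of a model card and groups results by task, dataset, and metric. A minimal sketch of the shape this skill produces (model name, scores, and dataset identifiers below are illustrative, not real results):

```yaml
model-index:
- name: example-model              # illustrative model name
  results:
  - task:
      type: text-generation
    dataset:
      name: MMLU
      type: cais/mmlu
    metrics:
    - type: accuracy
      value: 68.2                  # illustrative score
      name: MMLU (5-shot)
```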

Instead of manually converting markdown tables to model-index YAML and resolving merge conflicts, this skill handles extraction, formatting, deduplication, and PR creation in a single CLI workflow.
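The extraction step described above can be sketched in plain Python: parse a markdown benchmark table into rows, then map each row into a model-index-shaped dictionary. The column names (`Benchmark`, `Score`), task type, and helper functions are assumptions for illustration, not the skill's actual implementation.

```python
def parse_benchmark_table(md: str) -> list[dict]:
    """Parse a simple markdown table into a list of row dicts (illustrative helper)."""
    lines = [line.strip() for line in md.strip().splitlines()]
    header = [cell.strip() for cell in lines[0].strip("|").split("|")]
    rows = []
    for line in lines[2:]:  # skip the |---|---| separator row
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        rows.append(dict(zip(header, cells)))
    return rows


def to_model_index(model_name: str, rows: list[dict]) -> dict:
    """Map parsed rows into a model-index-shaped dict (assumed column names)."""
    return {
        "model-index": [{
            "name": model_name,
            "results": [{
                "task": {"type": "text-generation"},  # assumed task type
                "dataset": {"name": r["Benchmark"], "type": r["Benchmark"].lower()},
                "metrics": [{"type": "accuracy", "value": float(r["Score"])}],
            } for r in rows],
        }]
    }


# Illustrative README fragment
table = """
| Benchmark | Score |
|-----------|-------|
| MMLU      | 68.2  |
| GSM8K     | 74.5  |
"""
index = to_model_index("example-model", parse_benchmark_table(table))
```

The skill additionally handles deduplication against existing model-index entries and opens the PR, which this sketch omits.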

When to use it

  • Extracting benchmark scores from a model README and formatting them as model-index YAML
  • Importing Artificial Analysis benchmark results directly into a model card
  • Running MMLU or GSM8K evaluations on a Hugging Face model using a local GPU
  • Submitting lighteval jobs to HF Jobs for models without a public API endpoint
  • Opening a pull request with structured evaluation metadata on a model you don't own