Hugging Face/hugging-face-vision-trainer — Agent Skills | officialskills.sh

hugging-face-vision-trainer

official · ai-tools

Trains and fine-tunes vision models on Hugging Face Jobs cloud GPUs, covering object detection (D-FINE, RT-DETR v2, DETR, YOLOS), image classification (timm models including MobileNetV3, ResNet, ViT), and SAM/SAM2 segmentation.

Setup & Installation

npx skills add https://github.com/huggingface/skills --skill hugging-face-vision-trainer
Or paste the link below and ask your coding assistant to install it:
https://github.com/huggingface/skills/tree/main/skills/hugging-face-vision-trainer

What This Skill Does

Beyond submitting training runs, the skill handles COCO-format dataset prep, Albumentations augmentation, mAP/accuracy evaluation, and automatic model persistence to the Hugging Face Hub.

It also takes care of bbox format detection, string category remapping, dataset validation, and Trackio monitoring in one workflow, eliminating the manual infrastructure work that typically precedes each training run.
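To make the bbox-format-detection and category-remapping steps concrete, here is a minimal sketch of how such preprocessing could look. This is an illustrative, hypothetical helper, not the skill's actual implementation: it distinguishes COCO `[x, y, w, h]` boxes from Pascal VOC `[x_min, y_min, x_max, y_max]` boxes heuristically, and maps string class names to the contiguous integer ids that detection models expect.

```python
def detect_bbox_format(boxes):
    """Heuristically classify a list of boxes as COCO xywh or Pascal VOC xyxy.

    If any box violates x_max > x_min or y_max > y_min, it cannot be a
    valid Pascal VOC box, so we assume COCO [x, y, w, h]. Boxes that are
    valid under both interpretations default to xyxy.
    """
    for x0, y0, a, b in boxes:
        if a <= x0 or b <= y0:
            return "coco_xywh"
    return "pascal_voc_xyxy"


def remap_categories(labels):
    """Map string category names to contiguous integer ids.

    Returns the remapped label list and the name-to-id mapping, which
    would be saved alongside the model as id2label/label2id config.
    """
    name_to_id = {name: i for i, name in enumerate(sorted(set(labels)))}
    return [name_to_id[label] for label in labels], name_to_id
```

In practice the ambiguity between the two formats means a real inspector would also check boxes against image dimensions; the sketch above only shows the core idea.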

When to use it

  • Training a custom object detector on COCO-format data using D-FINE or RT-DETR v2 on cloud GPUs
  • Fine-tuning a MobileNetV3 or ViT classifier on a Hub image dataset without a local GPU
  • Fine-tuning SAM2 for image matting or segmentation using bounding box or point prompts
  • Running the dataset inspector to catch column format mismatches before submitting a GPU training job
  • Saving fine-tuned detection or classification checkpoints directly to the Hugging Face Hub from ephemeral cloud jobs
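The pre-flight validation idea behind the dataset inspector can be sketched as a per-record check run before any GPU job is submitted. The function below is a hypothetical example (the column names `image_id`, `bbox`, and `category` are illustrative assumptions, not the skill's actual schema):

```python
def inspect_coco_record(record, required=("image_id", "bbox", "category")):
    """Return a list of problems found in one COCO-style annotation record.

    An empty list means the record passed; collecting problems per record
    lets a caller report all issues before a cloud job is submitted.
    """
    problems = []
    for key in required:
        if key not in record:
            problems.append(f"missing column: {key}")
    bbox = record.get("bbox")
    if bbox is not None:
        if len(bbox) != 4:
            problems.append(f"bbox has {len(bbox)} values, expected 4")
        elif bbox[2] <= 0 or bbox[3] <= 0:
            problems.append("non-positive bbox width/height (expected [x, y, w, h])")
    return problems
```

Catching a schema mismatch like this locally is far cheaper than discovering it minutes into a billed GPU run.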