Sunbelt Computer Software

Audio QA Dataset

This repository contains a collection of audio clips along with annotated question-answer pairs stored in a structured metadata file.

AUDITA

This dataset accompanies the paper:

AUDITA: A New Dataset to Audit Humans vs. AI Skill at Audio QA
Tasnim Kabir, Dmytro Kurdydyk, Aadi Palnitkar, Liam Dorn, Ahmed Haj Ahmed, and Jordan Lee Boyd-Graber (2026)

ACL Findings 2026: https://aclanthology.org/2026.findings-acl.1292/
arXiv Preprint: https://arxiv.org/abs/2604.21766

🔎 Dataset Explorer

You can explore the dataset interactively at:

https://manchester.umiacs.umd.edu/audio

The explorer allows you to:

Browse questions by source dataset
Browse questions by audio category
Listen to audio clips
View question-answer pairs and metadata

Citation

If you use this dataset, please cite the ACL Findings paper:

@inproceedings{kabir-etal-2026-audita,
    title = "{AUDITA}: A New Dataset to Audit Humans vs. {AI} Skill at Audio {QA}",
    author = "Kabir, Tasnim and
      Kurdydyk, Dmytro and
      Palnitkar, Aadi and
      Dorn, Liam and
      Ahmed, Ahmed Haj and
      Boyd-Graber, Jordan Lee",
    editor = "Liakata, Maria and
      Moreira, Viviane P. and
      Zhang, Jiajun and
      Jurgens, David",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2026",
    month = jul,
    year = "2026",
    address = "San Diego, California, United States",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2026.findings-acl.1292/",
    pages = "25922--25951",
    ISBN = "979-8-89176-395-1"
}

If you wish to cite the preprint instead:

@article{kabir2026audita,
  title={AUDITA: A New Dataset to Audit Humans vs. AI Skill at Audio QA},
  author={Kabir, Tasnim and Kurdydyk, Dmytro and Palnitkar, Aadi and Dorn, Liam and Ahmed, Ahmed Haj and Boyd-Graber, Jordan Lee},
  journal={arXiv preprint arXiv:2604.21766},
  year={2026}
}

📁 Folder Structure

.
├── audio/             # Contains the audio files referenced in the metadata
└── combined.json      # Metadata with QA pairs and file references

📄 File Descriptions

`combined.json`

A list of JSON objects, each representing a question-answer annotation for an audio file.

Dataset Overview

This dataset consists of 9,690 human-ready question-answer pairs, organized as follows.

OUR Sources

Quizbowl-style

Pavements: 673 questions
Audio-Packets: 1,649 questions

Trivia-style

Quizmasters: 4,138 questions

Subtotal (OUR): 6,460 questions

EXTERNAL Sources

Closed-Ended Questions

OpenAQA: 882 questions
ClothoAQA: 323 questions

Open-Ended Questions

OpenAQA: 2,025 questions

Subtotal (EXTERNAL): 3,230 questions

Total Human-Ready Questions: 9,690
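The subtotals and the overall total follow directly from the per-source counts listed above. A minimal sketch of that arithmetic is shown below; the dictionary layout and key names are illustrative labels only and are not part of the dataset.

```python
# Per-source question counts, copied from the overview above.
# The grouping keys are illustrative labels, not dataset field values.
counts = {
    "OUR": {
        "Pavements": 673,
        "Audio-Packets": 1649,
        "Quizmasters": 4138,
    },
    "EXTERNAL": {
        "OpenAQA (closed-ended)": 882,
        "ClothoAQA (closed-ended)": 323,
        "OpenAQA (open-ended)": 2025,
    },
}

# Sum each group, then sum the groups.
subtotals = {group: sum(sources.values()) for group, sources in counts.items()}
total = sum(subtotals.values())

print(subtotals)  # {'OUR': 6460, 'EXTERNAL': 3230}
print(total)      # 9690
```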

Metadata Fields

Each entry in combined.json contains the following fields: question, dataset, file_name, task, ground_truth, Categories, and Subcategories.

Example Entry

{
  "question": "Are humans heard?",
  "dataset": "clotho_aqa",
  "file_name": "/data/clotho_aqa/Backyard nature.wav",
  "task": "closed_ended",
  "ground_truth": "yes",
  "Categories": "Character/Person",
  "Subcategories": "N/A"
}

Note: Only the final audio files (e.g., Backyard nature.wav or 123456.flac) are stored in the audio/ directory. The file_name field contains the original source path; simply extract the filename and locate it in the audio/ folder.
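The lookup described in the note can be sketched as follows. This is a minimal example rather than an official loader: the helper name `resolve_audio_path` is hypothetical, and the recursive fallback search assumes that some files may sit in subdirectories of audio/, which the note does not state.

```python
from pathlib import Path, PurePosixPath


def resolve_audio_path(file_name, audio_dir="audio"):
    """Map a metadata `file_name` to a local file under `audio_dir`.

    Returns a Path, or None if no file with that name is found.
    (Hypothetical helper, not shipped with the dataset.)
    """
    # The metadata stores POSIX-style source paths; keep only the last part.
    base_name = PurePosixPath(file_name).name
    audio_root = Path(audio_dir)
    if not audio_root.is_dir():
        return None

    # Common case: the file sits directly inside audio/.
    direct = audio_root / base_name
    if direct.is_file():
        return direct

    # Assumed fallback: search subdirectories. Names are compared directly
    # instead of being used as a glob pattern, since they may contain
    # characters such as brackets.
    for candidate in audio_root.rglob("*"):
        if candidate.is_file() and candidate.name == base_name:
            return candidate
    return None
```

For the example entry above, `resolve_audio_path("/data/clotho_aqa/Backyard nature.wav")` would look for `audio/Backyard nature.wav`.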

`audio/`

This directory contains all audio files referenced by combined.json in .wav or .flac format.

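Field reference for entries in `combined.json`:

| Field | Description |
| --- | --- |
| `question` | The question posed about the audio content |
| `dataset` | Source dataset (e.g., `clotho_aqa`) |
| `file_name` | Original source path of the corresponding audio file; only the filename portion is used to locate it in `audio/` |
| `task` | Question type (`closed_ended` or `open_ended`) |
| `ground_truth` | The correct answer |
| `Categories` | High-level category (e.g., `Character/Person`) |
| `Subcategories` | More specific category label (or `N/A`) |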
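Given these fields, a quick sanity check of the metadata might look like the sketch below. The function names are hypothetical, and the code assumes that `combined.json` is a single JSON array in the working directory and that every entry carries all seven fields; neither assumption is guaranteed beyond the description above.

```python
import json
from collections import Counter

# Field names as listed in the table above.
REQUIRED_FIELDS = {
    "question", "dataset", "file_name", "task",
    "ground_truth", "Categories", "Subcategories",
}


def load_entries(path="combined.json"):
    """Read the metadata file, assumed to be one JSON array of objects."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def missing_fields(entry):
    """Return the set of expected field names absent from one entry."""
    return REQUIRED_FIELDS - set(entry)


def tally_by_source_and_task(entries):
    """Count entries per (dataset, task) pair."""
    return Counter((entry["dataset"], entry["task"]) for entry in entries)
```

Calling `tally_by_source_and_task(load_entries())` gives per-source counts that can be compared against the figures in the Dataset Overview.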
