The GigaMIDI Dataset with Features for Expressive Music Performance Detection

📅 2025-02-07
🏛️ Transactions of the International Society for Music Information Retrieval
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the lack of explicit expressivity annotations in MIDI files by proposing a systematic methodology for distinguishing non-expressive from expressive tracks. We introduce GigaMIDI, the largest research-grade symbolic music dataset to date, comprising over 1.4 million MIDI files, and quantitatively characterize features of expressive MIDI performance. To overcome the scarcity of annotations, we propose three interpretable heuristics: the distinctive note velocity ratio (DNVR), the distinctive note onset deviation ratio (DNODR), and the note onset median metric level (NOMML). Leveraging symbolic music analysis, beat-aligned temporal normalization, and statistically derived thresholds, we identify 1,655,649 expressive tracks (31% of the dataset's tracks), spanning all General MIDI instrument classes. This yields the largest publicly available expressive MIDI subset, enabling large-scale, data-driven studies of musical expressivity in the symbolic domain.
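
A minimal sketch of the beat-aligned temporal normalization idea mentioned above, assuming notes arrive as onset ticks with a known ticks-per-beat resolution (the function name is illustrative, not the paper's API):

```python
def beat_align(onset_ticks, ticks_per_beat):
    """Map raw MIDI onset ticks to beat-relative positions in [0, 1).

    Dividing the within-beat offset by ticks_per_beat normalizes away the
    file's PPQ (pulses-per-quarter-note) resolution, so timing deviations
    measured on the result are comparable across MIDI files encoded at
    different resolutions.
    """
    return [(t % ticks_per_beat) / ticks_per_beat for t in onset_ticks]

# Example: at 480 ticks per beat, an onset at tick 487 sits about 1.5%
# of a beat late relative to the grid.
print(beat_align([0, 487, 960], 480))  # [0.0, 0.0145..., 0.0]
```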

📝 Abstract
The Musical Instrument Digital Interface (MIDI), introduced in 1983, revolutionized music production by allowing computers and instruments to communicate efficiently. MIDI files encode musical instructions compactly, facilitating convenient music sharing. They benefit music information retrieval (MIR), aiding in research on music understanding, computational musicology, and generative music. The GigaMIDI dataset contains over 1.4 million unique MIDI files, encompassing 1.8 billion MIDI note events and over 5.3 million MIDI tracks. GigaMIDI is currently the largest collection of symbolic music in MIDI format available for research purposes under fair dealing. Distinguishing between non-expressive and expressive MIDI tracks is challenging, as MIDI files do not inherently make this distinction. To address this issue, we introduce a set of innovative heuristics for detecting expressive music performance. These include the distinctive note velocity ratio (DNVR) heuristic, which analyzes MIDI note velocity; the distinctive note onset deviation ratio (DNODR) heuristic, which examines deviations in note onset times; and the note onset median metric level (NOMML) heuristic, which evaluates onset positions relative to metric levels. Our evaluation demonstrates these heuristics effectively differentiate between non-expressive and expressive MIDI tracks. Furthermore, after evaluation, we create the most substantial expressive MIDI dataset, employing our heuristic NOMML. This curated iteration of GigaMIDI encompasses expressively performed instrument tracks detected by NOMML, containing all General MIDI instruments, constituting 31% of the GigaMIDI dataset, totaling 1,655,649 tracks.
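
As a rough illustration of how such heuristics could be computed, here is a minimal Python sketch over a track represented as onset ticks and velocities. All function names, the binary grid construction, and the level cutoff are assumptions for illustration only; the paper's exact definitions and statistically derived thresholds differ:

```python
from statistics import median

def dnvr(velocities):
    """Distinctive note velocity ratio (sketch): fraction of distinct
    velocity values among a track's notes. Rigidly quantized tracks tend
    to reuse a handful of velocities, yielding a low ratio."""
    return len(set(velocities)) / len(velocities)

def metric_level(onset_tick, ticks_per_beat, max_level=6):
    """Coarsest binary metric level an onset aligns with (sketch):
    level 0 is the beat, level 1 the half-beat, and so on; onsets that
    align with no grid up to max_level are treated as off-grid."""
    for level in range(max_level + 1):
        grid = ticks_per_beat // (2 ** level)
        if grid and onset_tick % grid == 0:
            return level
    return max_level + 1  # off-grid

def dnodr(onsets, ticks_per_beat):
    """Distinctive note onset deviation ratio (crude proxy): fraction of
    onsets falling off the coarse metric grids. The cutoff of level 4 is
    an assumption, not the paper's definition."""
    off = sum(1 for t in onsets if metric_level(t, ticks_per_beat) > 4)
    return off / len(onsets)

def nomml(onsets, ticks_per_beat):
    """Note onset median metric level (sketch): median metric level over
    all onsets; a high median suggests micro-timed, expressive playing."""
    return median(metric_level(t, ticks_per_beat) for t in onsets)
```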
Problem

Research questions and friction points this paper is trying to address.

Detect expressive music performance in MIDI tracks.
Differentiate non-expressive and expressive MIDI tracks.
Create the largest expressive MIDI dataset using heuristics.
Innovation

Methods, ideas, or system contributions that make the work stand out.

DNVR heuristic for note-velocity analysis
DNODR heuristic for note-onset deviation analysis
NOMML heuristic for onset metric-level evaluation (usage sketch below)
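
To make the NOMML-based selection concrete, here is a hedged usage example that reuses the nomml() sketch from the abstract section; the threshold is a placeholder, not the paper's statistically derived cutoff:

```python
# Hypothetical gate for the expressive subset: keep tracks whose median
# onset metric level exceeds a threshold. The value 5 is an assumption
# for illustration; the paper derives its threshold statistically.
NOMML_THRESHOLD = 5

def is_expressive(onsets, ticks_per_beat):
    """Classify a track as expressive (sketch), reusing nomml() above."""
    return nomml(onsets, ticks_per_beat) > NOMML_THRESHOLD

# A rigidly quantized track (onsets exactly on the beat) vs. a
# micro-timed performance (onsets a few ticks off the grid).
quantized = [0, 480, 960, 1440]   # NOMML = 0 -> non-expressive
performed = [3, 487, 955, 1442]   # NOMML = 7 -> expressive
print(is_expressive(quantized, 480), is_expressive(performed, 480))
```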