Evaluating Interactive 2D Visualization as a Sample Selection Strategy for Biomedical Time-Series Data Annotation

📅 2026-03-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the high cost of annotating biomedical time-series data by comparing three sample selection strategies for annotation: random sampling (RND), farthest-first traversal (FAFT), and an interactive 2D visualization approach (2DV) that uses dimensionality reduction to let annotators explore high-dimensional data distributions. The methods are evaluated with real human annotators on infant motility assessment and speech emotion recognition tasks. 2DV performs best when labels are aggregated across multiple annotators and is the most effective at capturing rare classes, such as abnormal infant movements. However, its greater annotator-to-annotator label variability lowers performance when models are trained on individual annotators' labels, a setting in which FAFT excels. In speech emotion recognition, 2DV outperformed the other methods among expert annotators, and post-experiment interviews indicated that 2DV made the annotation task more interesting and enjoyable.
📝 Abstract
Reliable machine-learning models in biomedical settings depend on accurate labels, yet annotating biomedical time-series data remains challenging. Algorithmic sample selection may support annotation, but evidence from studies involving real human annotators is scarce. Consequently, we compare three sample selection methods for annotation: random sampling (RND), farthest-first traversal (FAFT), and a graphical user interface-based method enabling exploration of complementary 2D visualizations (2DVs) of high-dimensional data. We evaluated the methods across four classification tasks in infant motility assessment (IMA) and speech emotion recognition (SER). Twelve annotators, categorized as experts or non-experts, performed data annotation under a limited annotation budget, and post-annotation experiments were conducted to evaluate the sampling methods. Across all classification tasks, 2DV performed best when aggregating labels across annotators. In IMA, 2DV most effectively captured rare classes, but also exhibited greater annotator-to-annotator label distribution variability resulting from the limited annotation budget, decreasing classification performance when models were trained on individual annotators' labels; in these cases, FAFT excelled. For SER, 2DV outperformed the other methods among expert annotators and matched their performance for non-experts in the individual-annotator setting. A failure risk analysis revealed that RND was the safest choice when annotator count or annotator expertise was uncertain, whereas 2DV had the highest risk due to its greater label distribution variability. Furthermore, post-experiment interviews indicated that 2DV made the annotation task more interesting and enjoyable. Overall, 2DV-based sampling appears promising for biomedical time-series data annotation, particularly when the annotation budget is not highly constrained.
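To make the FAFT baseline concrete, the following is a minimal, generic sketch of farthest-first traversal, not the authors' implementation. It assumes Euclidean distance on precomputed feature vectors and a randomly chosen starting sample; the function name `farthest_first_traversal` and its parameters are hypothetical and chosen only for illustration.

```python
import math
import random


def farthest_first_traversal(points, budget, seed=0):
    """Greedily pick up to `budget` indices, each time choosing the point
    that lies farthest from all points selected so far."""
    if not points or budget <= 0:
        return []
    budget = min(budget, len(points))
    rng = random.Random(seed)
    first = rng.randrange(len(points))  # assumption: random starting sample
    selected = [first]
    # Distance from every point to its nearest already-selected point.
    nearest = [math.dist(p, points[first]) for p in points]
    nearest[first] = -1.0  # mark as taken so it is never picked again
    while len(selected) < budget:
        # Next pick: the unselected point farthest from the selected set.
        nxt = max(range(len(points)), key=nearest.__getitem__)
        selected.append(nxt)
        for i, p in enumerate(points):
            nearest[i] = min(nearest[i], math.dist(p, points[nxt]))
        nearest[nxt] = -1.0
    return selected


# Toy usage: a tight cluster near the origin plus two outlying samples.
features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (10.0, 0.0)]
picked = farthest_first_traversal(features, budget=3, seed=0)
```

Because each pick maximizes the distance to the already-selected set, the selection spreads across the feature space and tends to reach outlying regions early, whereas random sampling mostly returns samples from dense regions.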
Problem

Research questions and friction points this paper is trying to address.

sample selection
biomedical time-series data
data annotation
annotation budget
label quality
Innovation

Methods, ideas, or system contributions that make the work stand out.

interactive 2D visualization
sample selection
biomedical time-series annotation
farthest-first traversal
annotation budget
Einari Vaaras
Signal Processing Research Centre, Tampere University, Finland
Manu Airaksinen
BABA Center, Department of Physiology, University of Helsinki, Finland
Okko Räsänen
Professor, Tampere University, Finland
cognitive science, language acquisition, speech processing, machine learning