SILVI: Simple Interface for Labeling Video Interactions

📅 2025-11-05
🤖 AI Summary
Existing open-source video annotation tools struggle to support both precise localization of individuals and fine-grained annotation of social interaction behaviors at the same time, hindering animal social behavior research and the training of visual models. To address this, the authors propose SILVI, an open-source tool for jointly annotating individual identities and interactive behaviors. SILVI integrates behavioral semantic annotation with multi-object tracking, supporting temporal synchronization, dynamic scene graph generation, and explicit modeling of interaction relations. Its decoupled frontend-backend architecture produces structured, machine-readable label data compatible with both animal and human video analysis. Released with comprehensive documentation and a modular, extensible design, SILVI improves annotation efficiency for complex social behavior datasets and strengthens support for downstream model training. By bridging computer vision and behavioral ecology, SILVI facilitates interdisciplinary research and advances the development of socially aware visual understanding systems.

📝 Abstract
Computer vision methods are increasingly used for the automated analysis of large volumes of video data collected through camera traps, drones, or direct observations of animals in the wild. While recent advances have focused primarily on detecting individual actions, much less work has addressed the detection and annotation of interactions -- a crucial aspect for understanding social and individualized animal behavior. Existing open-source annotation tools support either behavioral labeling without localization of individuals, or localization without the capacity to capture interactions. To bridge this gap, we present SILVI, an open-source labeling software that integrates both functionalities. SILVI enables researchers to annotate behaviors and interactions directly within video data, generating structured outputs suitable for training and validating computer vision models. By linking behavioral ecology with computer vision, SILVI facilitates the development of automated approaches for fine-grained behavioral analyses. Although developed primarily in the context of animal behavior, SILVI could be useful more broadly to annotate human interactions in other videos that require extracting dynamic scene graphs. The software, along with documentation and download instructions, is available at: https://gitlab.gwdg.de/kanbertay/interaction-labelling-app.
Problem

Research questions and friction points this paper is trying to address.

Detecting and annotating interactions in video data for animal behavior analysis
Integrating behavioral labeling with individual localization in annotation tools
Generating structured outputs for training computer vision models on interactions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Open-source software for video interaction labeling
Combines behavior annotation with individual localization
Generates structured outputs for computer vision training
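To make the idea of "structured outputs" and "dynamic scene graphs" concrete, here is a minimal sketch of what a per-frame annotation record could look like: bounding boxes for identified individuals plus subject-predicate-object interaction edges. The field names and the `scene_graph_edges` helper are illustrative assumptions for this sketch, not SILVI's actual output schema.

```python
import json

# Hypothetical per-frame annotation record: localized individuals plus
# interaction edges. Field names are assumptions, not SILVI's schema.
annotation = {
    "video": "lemurs_clip_01.mp4",
    "frame": 1375,
    "individuals": [
        {"id": "A", "bbox": [120, 80, 64, 48]},  # [x, y, width, height]
        {"id": "B", "bbox": [210, 95, 60, 50]},
    ],
    "interactions": [
        {"subject": "A", "predicate": "grooms", "object": "B"},
    ],
}

def scene_graph_edges(record):
    """Collect (subject, predicate, object) triples, i.e. the edges
    of this frame's scene graph."""
    return [
        (i["subject"], i["predicate"], i["object"])
        for i in record["interactions"]
    ]

# Serialize for downstream model training / validation pipelines.
print(json.dumps(annotation))
print(scene_graph_edges(annotation))
```

Collecting such records over consecutive frames yields a sequence of scene graphs, which is one natural machine-readable form for the dynamic interaction annotations the abstract describes.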
👥 Authors
Ozan Kanbertay (Institute of Computer Science and Campus Institute Data Science, University of Göttingen)
Richard Vogg (University of Göttingen; computer vision, deep learning, animal behavior)
Elif Karakoc (Behavioral Ecology & Sociobiology Unit, German Primate Center, Göttingen, Germany)
Peter M. Kappeler (Department of Sociobiology/Anthropology, University of Göttingen, Göttingen, Germany)
Claudia Fichtel (Behavioral Ecology & Sociobiology Unit, German Primate Center, Göttingen, Germany)
Alexander S. Ecker (University of Göttingen, Germany; computational neuroscience, vision, machine learning, computer vision, data science)