🤖 AI Summary
This work addresses the lack of large-scale, open-source datasets for fine-grained visual analysis of the Eurasian lynx. We introduce the first benchmark dataset supporting individual re-identification, 2D pose estimation, and instance segmentation. It comprises 31,000 real camera-trap images (219 individuals, recorded over 15 years in multiple regions) and 102,000 high-fidelity synthetic images. Methodologically, we propose geography- and time-aware evaluation protocols, combine Unity-based simulation with diffusion-model-driven texture generation to synthesize diverse, realistic data, and provide comprehensive annotations, including 20-keypoint skeletons and pixel-accurate instance masks, alongside open- and closed-set temporal splits. Experiments demonstrate substantial improvements in cross-temporal and cross-regional generalization, establishing new state-of-the-art performance on individual re-identification and pose estimation. The dataset advances fine-grained wildlife visual analysis by enabling robust, scalable, and ecologically grounded computer vision research.
📝 Abstract
We introduce CzechLynx, the first large-scale, open-access dataset for individual identification, 2D pose estimation, and instance segmentation of the Eurasian lynx (Lynx lynx). CzechLynx includes more than 30k camera-trap images annotated with segmentation masks, identity labels, and 20-point skeletons. It covers 219 unique individuals across 15 years of systematic monitoring in two geographically distinct regions: Southwest Bohemia and the Western Carpathians. To increase data variability, we create a complementary synthetic set of more than 100k photorealistic images generated via a Unity-based pipeline and diffusion-driven text-to-texture modeling, covering diverse environments, poses, and coat-pattern variations. To test generalization across spatial and temporal domains, we define three tailored evaluation splits: (i) geo-aware, (ii) time-aware open-set, and (iii) time-aware closed-set. The dataset is intended to support benchmarking of state-of-the-art models and the development of novel methods, not only for individual animal re-identification.
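The three splits named in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the splits are constructed, so everything below is an assumption made for illustration: the record fields (`identity`, `region`, `date`), the toy records, the cutoff date, and the rule that the closed-set test keeps only identities already seen in training while the open-set test also keeps previously unseen ones. It is not the dataset's actual metadata schema or the authors' split procedure.

```python
from datetime import date

# Hypothetical toy records; field names are illustrative, not the real schema.
records = [
    {"identity": "lynx_a", "region": "Southwest Bohemia", "date": date(2012, 5, 1)},
    {"identity": "lynx_a", "region": "Southwest Bohemia", "date": date(2019, 3, 10)},
    {"identity": "lynx_b", "region": "Western Carpathians", "date": date(2013, 7, 20)},
    {"identity": "lynx_c", "region": "Western Carpathians", "date": date(2020, 1, 15)},
]


def geo_aware_split(records, train_region):
    """Train on one region and test on all other regions (assumed rule)."""
    train = [r for r in records if r["region"] == train_region]
    test = [r for r in records if r["region"] != train_region]
    return train, test


def time_aware_split(records, cutoff, closed_set):
    """Train on images taken before `cutoff`, test on images taken on or after it.

    closed_set=True  -> test keeps only identities that appear in training.
    closed_set=False -> test also keeps identities never seen in training.
    """
    train = [r for r in records if r["date"] < cutoff]
    known = {r["identity"] for r in train}
    later = [r for r in records if r["date"] >= cutoff]
    test = [r for r in later if r["identity"] in known] if closed_set else later
    return train, test, known


cutoff = date(2018, 1, 1)  # assumed cutoff, chosen only for this example

geo_train, geo_test = geo_aware_split(records, "Southwest Bohemia")
_, closed_test, known_ids = time_aware_split(records, cutoff, closed_set=True)
_, open_test, _ = time_aware_split(records, cutoff, closed_set=False)

# Identities in the open-set test that a model has never seen during training.
unseen_ids = {r["identity"] for r in open_test} - known_ids
```

In this toy setup, the open-set test contains an individual that is absent from training, so a re-identification model would need to flag it as new rather than match it to a known identity. The closed-set test omits it.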