🤖 AI Summary
This work addresses static object re-identification for mobile service robots operating over extended periods in dynamic outdoor environments, focusing on generalizable instance-level re-identification across varying viewpoints, illumination conditions, and weather. Existing approaches rely heavily on category-level priors or require precise foreground segmentation, and they do not robustly model complex outdoor appearance variation. To overcome these limitations, we: (1) introduce CODa Re-ID, the first large-scale in-the-wild object re-identification benchmark featuring real-world environmental diversity; (2) propose CLOVER, a segmentation-free, context-aware invariant representation learning framework that jointly incorporates multi-view geometric priors and environment-invariance constraints via contrastive self-supervised learning; and (3) demonstrate state-of-the-art performance on CODa Re-ID, with strong generalization to unseen instances and categories, enabling robust long-term object tracking and semantic understanding in realistic outdoor settings.
📝 Abstract
In many applications, robots can benefit from object-level understanding of their environments, including the ability to distinguish object instances and re-identify previously seen instances. Object re-identification is challenging across different viewpoints and in scenes with significant appearance variation arising from weather or lighting changes. Most works on object re-identification focus on specific classes; approaches that address general object re-identification require foreground segmentation and have limited consideration of challenges such as occlusions, outdoor scenes, and illumination changes. To address this problem, we introduce CODa Re-ID: an in-the-wild object re-identification dataset containing 1,037,814 observations of 557 objects of 8 classes under diverse lighting conditions and viewpoints. Further, we propose CLOVER, a representation learning method for object observations that can distinguish between static object instances. Our results show that CLOVER achieves superior performance in static object re-identification under varying lighting conditions and viewpoint changes, and can generalize to unseen instances and classes.
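To make the re-identification setting concrete, the sketch below shows how learned observation embeddings are commonly used at query time: each new observation is matched to the most similar previously seen instance by cosine similarity. This is an illustrative assumption, not CLOVER's actual pipeline; the function names, the instance labels, and the three-dimensional toy vectors are all hypothetical stand-ins for the output of a trained encoder.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products equal cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)


def reidentify(query_emb, gallery_emb, gallery_ids):
    """Match each query embedding to its nearest gallery instance.

    Hypothetical helper: returns the predicted instance ID and the cosine
    similarity of the best match for every query row.
    """
    q = l2_normalize(np.asarray(query_emb, dtype=float))
    g = l2_normalize(np.asarray(gallery_emb, dtype=float))
    sim = q @ g.T                      # (num_queries, num_gallery) cosine similarities
    best = sim.argmax(axis=1)          # index of the closest gallery entry per query
    ids = [gallery_ids[i] for i in best]
    return ids, sim[np.arange(len(q)), best]


# Toy embeddings (made up for illustration): one row per known object instance.
gallery = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
gallery_ids = ["instance_a", "instance_b", "instance_c"]

# Two new observations, e.g. the same objects seen from a different viewpoint.
queries = np.array([
    [0.9, 0.1, 0.0],
    [0.0, 0.2, 0.8],
])

pred_ids, scores = reidentify(queries, gallery, gallery_ids)
print(pred_ids)
```

In this framing, a representation that is invariant to lighting and viewpoint keeps observations of the same instance close together, so the nearest-neighbor lookup stays correct as conditions change.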