🤖 AI Summary
Fine-grained visual classification (FGVC) suffers from misalignment in training data caused by geometric biases, e.g., variations in scale and orientation, which make conventional canonicalization methods brittle because they rely on idealized alignment priors. To address this, we propose a bootstrapped data-realignment framework that iteratively optimizes each sample's geometric pose, enabling dynamic realignment under any compact group and thereby restoring the alignment assumption that canonicalization requires, with theoretical convergence guarantees. Our approach combines group-equivariant modeling, canonicalization functions, and a variance-reduction mechanism, without requiring strong data augmentation or specialized equivariant architectures. Evaluated on four FGVC benchmarks, it consistently outperforms both equivariant models and existing canonicalization methods, matches the performance of heavy-augmentation baselines, and significantly improves robustness to geometric perturbations.
📝 Abstract
Fine-grained visual classification (FGVC) tasks, such as insect and bird identification, demand sensitivity to subtle visual cues while remaining robust to spatial transformations. A key challenge is handling geometric biases and noise, such as differing object orientations and scales. Existing remedies rely either on heavy data augmentation, which demands powerful models, or on equivariant architectures, which constrain expressivity and add cost. Canonicalization offers an alternative by shielding the downstream model from such biases. In practice, canonicalization functions are often obtained using canonicalization priors, which assume aligned training data. Unfortunately, real-world datasets rarely fulfill this assumption, leaving the resulting canonicalizer brittle. We propose a bootstrapping algorithm that iteratively re-aligns training samples, progressively reducing variance and recovering the alignment assumption. We establish convergence guarantees under mild conditions for arbitrary compact groups, and show on four FGVC benchmarks that our method consistently outperforms equivariant and canonicalization baselines while performing on par with augmentation.
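To give intuition for the bootstrapped re-alignment idea, here is a minimal toy sketch in NumPy: generalized Procrustes analysis under the rotation group SO(2), applied to 2-D point sets. Each "sample" is the same shape in a random pose; the loop alternates between estimating a mean template and re-aligning every sample to it, which progressively shrinks pose variance. This is only an illustrative analogue under assumed simplifications (point sets instead of images, SO(2) instead of an arbitrary compact group); the function names `best_rotation` and `bootstrap_realign` are hypothetical and not from the paper.

```python
import numpy as np

def best_rotation(X, T):
    """Optimal 2-D rotation aligning point set X (N x 2) to template T
    via the orthogonal Procrustes problem: min_R ||X R - T||_F."""
    U, _, Vt = np.linalg.svd(X.T @ T)
    R = U @ Vt
    if np.linalg.det(R) < 0:              # force a proper rotation (det = +1)
        R = U @ np.diag([1.0, -1.0]) @ Vt
    return R

def bootstrap_realign(samples, n_iters=10):
    """Toy bootstrapping loop: repeatedly re-align all samples to the
    current mean template, reducing pose variance each round."""
    aligned = [s.copy() for s in samples]
    for _ in range(n_iters):
        template = np.mean(aligned, axis=0)           # current canonical pose
        aligned = [s @ best_rotation(s, template) for s in aligned]
    return aligned

# Demo: one base shape observed under random rotations (bounded so the
# initial mean template is non-degenerate).
rng = np.random.default_rng(0)
base = rng.normal(size=(20, 2))
rot = lambda a: np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])
samples = [base @ rot(a).T for a in rng.uniform(-1.0, 1.0, size=30)]
aligned = bootstrap_realign(samples)
```

After the loop, the per-point variance across the aligned samples collapses to near zero, since the samples differ only by their pose; in the paper's setting the analogous effect is what restores the alignment assumption needed by the canonicalization prior.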