Implicit Data Synthesis for Contrastive Unsupervised Data Augmentation

📅 2026-06-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Scientific observational data are typically unlabeled, and their structure is easily damaged by the input-space perturbations that conventional contrastive learning uses to build views. To address this, the work proposes an implicit data synthesis strategy that generates contrastive views by perturbing the neural network's weights rather than the raw input, which preserves the original data structure while still enabling unsupervised representation learning. The method is built on the SimCLR framework and evaluated on radar observations of meteors. Under matched evaluation protocols, the authors report performance gains over standard contrastive learning baselines.

📝 Abstract

Scientific observations generate large quantities of unlabeled data that are laborious to label by hand, making unsupervised learning techniques valuable for processing these datasets. Among these approaches, contrastive learning provides a convenient mechanism for extracting structural representations from unannotated datasets. For natural imagery, the general approach is to apply a variety of data-space augmentations to generate synthetic samples; for scientific observations, however, data-space perturbations can fundamentally alter the underlying data. We propose generating contrastive samples by perturbing the network weights rather than the underlying data, thus more closely preserving the structure of the data. We demonstrate this technique using a SimCLR-based pipeline applied to radar observations of meteors, and show performance gains under matched protocols.
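To make the idea concrete, here is a minimal sketch of weight-space view generation paired with a SimCLR-style NT-Xent loss. It is an illustration under stated assumptions, not the authors' implementation: the one-layer tanh encoder, the additive Gaussian noise model, the noise scale `sigma`, the temperature, and all function names (`encode`, `perturb`, `implicit_views`, `nt_xent`) are hypothetical choices made for this example. The abstract does not specify how the weights are perturbed.

```python
import math
import random


def encode(weights, x):
    """Toy encoder (assumption): one linear layer followed by tanh."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]


def perturb(weights, sigma, rng):
    """Copy the weights and add i.i.d. Gaussian noise (assumed noise model)."""
    return [[w + rng.gauss(0.0, sigma) for w in row] for row in weights]


def implicit_views(weights, batch, sigma, rng):
    """Build two views of each sample by perturbing the weights, not the data."""
    w_a = perturb(weights, sigma, rng)
    w_b = perturb(weights, sigma, rng)
    # The same untouched inputs pass through two differently perturbed encoders.
    return [encode(w_a, x) for x in batch], [encode(w_b, x) for x in batch]


def cosine(u, v):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def nt_xent(z1, z2, temperature=0.5):
    """NT-Xent loss as used in SimCLR, averaged over all 2N anchors."""
    z = z1 + z2
    n = len(z1)
    total = 0.0
    for i in range(2 * n):
        j = (i + n) % (2 * n)  # index of the positive partner of anchor i
        logits = [cosine(z[i], z[k]) / temperature for k in range(2 * n) if k != i]
        positive = cosine(z[i], z[j]) / temperature
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denominator = m + math.log(sum(math.exp(v - m) for v in logits))
        total += log_denominator - positive
    return total / (2 * n)


rng = random.Random(0)
weights = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(3)]
batch = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(8)]

z1, z2 = implicit_views(weights, batch, sigma=0.05, rng=rng)
loss = nt_xent(z1, z2)
```

Note that `batch` is never modified: both views come from the original samples, and only the encoder weights differ between them. In a real pipeline the loss would be backpropagated to the unperturbed weights; that training step is omitted here.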

Problem

Research questions and friction points this paper is trying to address.

contrastive learning

unsupervised data augmentation

scientific observations

data perturbation

structural preservation

Innovation

Methods, ideas, or system contributions that make the work stand out.

implicit data synthesis

contrastive learning

unsupervised data augmentation