🤖 AI Summary
This study systematically evaluates the applicability of the foundation model (FM) paradigm across three scientific domains—genomics, satellite imagery, and time-series analysis—to assess whether FMs can supplant traditional supervised learning. Method: We construct a cross-modal benchmark framework employing lightweight architectures (e.g., wide ResNet, U-Net), automated hyperparameter optimization, and standardized training protocols to rigorously compare domain-specific FMs against strong supervised baselines. Contribution/Results: Across all tasks, carefully tuned supervised models match or exceed state-of-the-art domain-specific FMs; large-scale pretraining yields no consistent empirical gains. This work provides multi-modal evidence that the FM paradigm remains immature for these domains. We open-source two automated evaluation workflows and underscore the necessity—and benchmarking value—of strong, well-tuned supervised baselines in scientific AI assessment.
📝 Abstract
Following its success for vision and text, the "foundation model" (FM) paradigm -- pretraining large models on massive data, then fine-tuning on target tasks -- has rapidly expanded to domains in the sciences, engineering, healthcare, and beyond. Has this achieved what the original FMs accomplished, i.e., the supplanting of traditional supervised learning in their domains? To answer, we look at three modalities -- genomics, satellite imaging, and time series -- with multiple recent FMs and compare them to a standard supervised learning workflow: model development, hyperparameter tuning, and training, all using only data from the target task. Across these three specialized domains, we find that it is consistently possible to train simple supervised models -- no more complicated than a lightly modified wide ResNet or U-Net -- that match or even outperform the latest foundation models. Our work demonstrates that the benefits of large-scale pretraining have yet to be realized in many specialized areas, reinforces the need to compare new FMs to strong, well-tuned baselines, and introduces two new, easy-to-use, open-source, and automated workflows for doing so.
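The abstract describes the supervised baseline workflow only at a high level. As a purely illustrative sketch (not the paper's actual code or tasks), the recipe -- pick a simple model, tune its hyperparameter on a held-out validation split, then fit using only target-task data, with no pretraining -- can be shown with a toy ridge-regression stand-in, where the regularization strength plays the role of the tuned hyperparameter:

```python
# Hypothetical illustration of the "strong supervised baseline" recipe:
# simple model + automated hyperparameter search + target-task data only.
# The model, data, and search grid here are toy stand-ins, not the paper's.
import random

random.seed(0)
# Synthetic target-task data: y = 2x + noise
data = [(x, 2 * x + random.gauss(0, 0.1)) for x in [i / 10 for i in range(50)]]
train, val = data[:40], data[40:]

def fit_ridge(points, lam):
    """Closed-form ridge fit of y = w * x (no intercept)."""
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    return sxy / (sxx + lam)

def mse(points, w):
    """Mean squared error of the linear model on a set of points."""
    return sum((y - w * x) ** 2 for x, y in points) / len(points)

# Automated hyperparameter search: pick the regularization strength
# that minimizes validation error.
best_lam, best_err = None, float("inf")
for lam in [0.0, 0.01, 0.1, 1.0, 10.0]:
    w = fit_ridge(train, lam)
    err = mse(val, w)
    if err < best_err:
        best_lam, best_err = lam, err

# Final training run with the selected hyperparameter, on all task data.
w_final = fit_ridge(train + val, best_lam)
```

The point of the sketch is the workflow shape, not the model: each of the paper's comparisons swaps in a domain-appropriate lightweight architecture (e.g., a wide ResNet or U-Net) for the toy regressor, while the tune-then-train loop stays the same.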