AI Summary
Insect classification is critical for agricultural and ecological research but faces challenges including morphological complexity, severe class imbalance, and large-scale data. To address these, this work pioneers the application of neural architecture search (NAS) to multimodal insect classification, proposing a joint modeling framework that fuses visual features and metadata. We introduce an alternating bilevel optimization strategy coupled with a zero-operation pruning mechanism to automatically construct high-performance, sparse networks. Furthermore, multi-cell stacking and sparsity constraints jointly balance representational capacity and computational efficiency. On the BIOSCAN-5M benchmark, our method achieves 96.81% accuracy and a 97.05% F1-score, surpassing state-of-the-art transfer learning, Transformer-based, and AutoML approaches by 8 to 16 percentage points. It also demonstrates strong generalization on Insects-1M, confirming its practicality and robustness across diverse insect datasets.
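The alternating bilevel optimization mentioned above can be illustrated with a toy, first-order sketch in the style of differentiable NAS. Everything in it is an assumption for illustration and not the authors' implementation: the scalar "network", the three candidate operations, the learning rates, and the use of finite-difference gradients. It only shows the alternation itself: architecture parameters `alpha` are updated on validation data, then the weight `w` is updated on training data.

```python
import math

# Hypothetical candidate operations on a single edge (illustrative only).
OPS = {
    "zero": lambda x: 0.0,        # contributes nothing: a removable edge
    "identity": lambda x: x,
    "square": lambda x: x * x,
}
OP_NAMES = list(OPS)


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def predict(x, w, alpha):
    """Softmax-weighted mixture of the candidate ops, scaled by weight w."""
    probs = softmax(alpha)
    mixed = sum(p * OPS[name](x) for p, name in zip(probs, OP_NAMES))
    return w * mixed


def mse(data, w, alpha):
    """Mean squared error of the toy model on (x, y) pairs."""
    return sum((predict(x, w, alpha) - y) ** 2 for x, y in data) / len(data)


def grad_w(data, w, alpha, eps=1e-5):
    """Central-difference gradient of the loss with respect to w."""
    return (mse(data, w + eps, alpha) - mse(data, w - eps, alpha)) / (2 * eps)


def grad_alpha(data, w, alpha, eps=1e-5):
    """Central-difference gradient of the loss with respect to each alpha."""
    grads = []
    for i in range(len(alpha)):
        up = list(alpha)
        down = list(alpha)
        up[i] += eps
        down[i] -= eps
        grads.append((mse(data, w, up) - mse(data, w, down)) / (2 * eps))
    return grads


def alternating_search(train, val, steps=500, lr_w=0.05, lr_alpha=0.5):
    """Alternate one architecture step (validation) with one weight step (training)."""
    w, alpha = 1.0, [0.0, 0.0, 0.0]
    for _ in range(steps):
        # Upper level: update architecture parameters with w held fixed.
        g_alpha = grad_alpha(val, w, alpha)
        alpha = [a - lr_alpha * g for a, g in zip(alpha, g_alpha)]
        # Lower level: update the weight with alpha held fixed.
        w -= lr_w * grad_w(train, w, alpha)
    return w, alpha


# Toy target y = 2x, so the "identity" operation should dominate.
train = [(x, 2.0 * x) for x in (-1.0, -0.5, 0.5, 1.0)]
val = [(x, 2.0 * x) for x in (-0.75, -0.25, 0.25, 0.75)]

w, alpha = alternating_search(train, val)
best_op = OP_NAMES[max(range(len(alpha)), key=alpha.__getitem__)]
print("selected op:", best_op)
print("validation loss:", mse(val, w, alpha))
```

In a real search the gradients would come from backpropagation over image batches; finite differences are used here only to keep the sketch dependency-free.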
Abstract
Insect classification is important for agricultural management and ecological research, as it directly affects crop health and production. However, this task remains challenging due to the complex characteristics of insects, class imbalance, and large-scale datasets. To address these issues, we propose BioAutoML-NAS, the first BioAutoML model to use multimodal data, namely images and metadata. It applies neural architecture search (NAS) to the image branch to automatically learn the best operation for each connection within each cell. Multiple cells are stacked to form the full network, each extracting detailed image feature representations. A multimodal fusion module combines image embeddings with metadata, allowing the model to use both visual and categorical biological information to classify insects. An alternating bi-level optimization training strategy jointly updates network weights and architecture parameters, while zero operations remove less important connections, producing sparse, efficient, and high-performing architectures. Extensive evaluation on the BIOSCAN-5M dataset demonstrates that BioAutoML-NAS achieves 96.81% accuracy, 97.46% precision, 96.81% recall, and a 97.05% F1 score, outperforming state-of-the-art transfer learning, transformer, AutoML, and NAS methods by approximately 16%, 10%, and 8% respectively. Further validation on the Insects-1M dataset yields 93.25% accuracy, 93.71% precision, 92.74% recall, and a 93.22% F1 score. These results demonstrate that BioAutoML-NAS provides accurate, confident insect classification that supports modern sustainable farming.
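How zero operations can yield a sparse cell is sketched below. The abstract does not state the exact selection rule, so this assumes a common DARTS-style discretization: each edge keeps its highest-weighted operation, and an edge whose strongest operation is the zero operation is dropped entirely. The operation names, edge indices, and architecture parameter values are hypothetical.

```python
import math

# Hypothetical candidate operations; index 0 is the zero operation.
OP_NAMES = ["zero", "skip_connect", "sep_conv_3x3", "max_pool_3x3"]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def derive_sparse_cell(edge_alphas, op_names=OP_NAMES, zero_name="zero"):
    """Keep the strongest op per edge; drop edges where the zero op wins."""
    kept = {}
    for edge, alphas in edge_alphas.items():
        probs = softmax(alphas)
        best = max(range(len(probs)), key=probs.__getitem__)
        if op_names[best] != zero_name:
            kept[edge] = (op_names[best], probs[best])
    return kept


# Made-up learned architecture parameters for a four-node cell,
# keyed by (source node, target node).
edge_alphas = {
    (0, 2): [0.1, 0.3, 2.0, 0.2],
    (1, 2): [3.0, 0.5, 0.2, 0.1],  # zero op dominates: edge is pruned
    (0, 3): [0.2, 1.5, 0.4, 0.3],
    (2, 3): [1.0, 0.1, 0.2, 1.8],
}

cell = derive_sparse_cell(edge_alphas)
for edge, (op, weight) in sorted(cell.items()):
    print(edge, op, round(weight, 3))
```

Stacking several such derived cells would give the full image branch; the fusion with metadata is left out because the abstract does not describe its form.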