Meta-Learning Loss Functions for Deep Neural Networks

📅 2024-06-14
🏛️ arXiv.org
📈 Citations: 4
✨ Influential: 1
🤖 AI Summary
To bridge the gap between data-hungry deep models and human-like few-shot learning efficiency, this paper investigates the long-overlooked loss function component within meta-learning frameworks. Methodologically, it introduces (1) EvoMAL, an interpretable symbolic loss-learning method that combines evolutionary symbolic search with gradient-based training to generate lightweight, task-adaptive, and interpretable loss functions on commodity hardware; (2) Sparse Label Smoothing Regularization (SparseLSR), a significantly faster and more memory-efficient reformulation of label smoothing regularization; (3) AdaLFL, which replaces the conventional static loss with an adaptive loss function learned during training; and (4) NPBML, a unified few-shot framework that jointly meta-learns the parameter initialization, optimizer, and loss function. Experiments across multiple few-shot benchmarks demonstrate state-of-the-art performance: classification accuracy improves significantly, while memory overhead for loss learning is reduced by over 80%.
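The bilevel structure the summary describes, an outer level that learns the loss function a model is trained with at the inner level, can be sketched in miniature. This is an illustrative toy with a hypothetical two-parameter loss and finite-difference meta-gradients, not the paper's EvoMAL, AdaLFL, or NPBML machinery (which use symbolic search and unrolled differentiation):

```python
import random

# Hypothetical two-parameter loss: a learned blend of squared and absolute
# error. An illustrative stand-in for a meta-learned loss, NOT the symbolic
# losses EvoMAL discovers.
def learned_loss_grad(pred, target, phi):
    """d/d pred of phi[0]*(pred-target)**2 + phi[1]*|pred-target|."""
    e = pred - target
    return 2.0 * phi[0] * e + phi[1] * (1.0 if e > 0 else -1.0)

def inner_train(phi, data, steps=50, lr=0.05):
    """Inner loop: fit a 1-parameter model w*x by full-batch gradient
    descent on the learned loss."""
    w = 0.0
    for _ in range(steps):
        g = sum(learned_loss_grad(w * x, y, phi) * x for x, y in data) / len(data)
        w -= lr * g
    return w

def meta_objective(phi, train_set, val_set):
    """Outer objective: held-out MSE of the inner-trained model."""
    w = inner_train(phi, train_set)
    return sum((w * x - y) ** 2 for x, y in val_set) / len(val_set)

random.seed(0)
train_set = [(i / 10, 3.0 * (i / 10) + random.gauss(0.0, 0.1)) for i in range(1, 11)]
val_set = [(x, 3.0 * x) for x in (0.15, 0.35, 0.55)]

phi = [0.5, 0.5]  # initial loss parameters
before = meta_objective(phi, train_set, val_set)
eps, meta_lr = 1e-3, 0.5
for _ in range(20):  # outer loop: finite-difference meta-gradient on phi
    for i in range(len(phi)):
        hi, lo = phi[:], phi[:]
        hi[i] += eps
        lo[i] -= eps
        g = (meta_objective(hi, train_set, val_set) -
             meta_objective(lo, train_set, val_set)) / (2 * eps)
        phi[i] -= meta_lr * g
after = meta_objective(phi, train_set, val_set)
```

After the outer loop, the learned loss parameters yield an inner-trained model with lower validation error than the initial hand-set loss. Practical methods replace the finite-difference meta-gradient with differentiation through the unrolled inner optimization, which is where the memory cost that SparseLSR and related techniques target comes from.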

๐Ÿ“ Abstract
Humans can often quickly and efficiently solve new complex learning tasks given only a small set of examples. In contrast, modern artificially intelligent systems often require thousands or millions of observations to solve even the most basic tasks. Meta-learning aims to resolve this issue by leveraging past experiences from similar learning tasks to embed the appropriate inductive biases into the learning system. Historically, methods for meta-learning components such as optimizers and parameter initializations have led to significant performance increases. This thesis explores meta-learning as a way to improve performance through the often-overlooked component of the loss function. The loss function is a vital component of a learning system, as it represents the primary learning objective; success is determined and quantified by the system's ability to optimize that objective. In this thesis, we develop methods for meta-learning the loss functions of deep neural networks. First, we introduce a method for meta-learning symbolic, model-agnostic loss functions called Evolved Model-Agnostic Loss (EvoMAL). This method consolidates recent advancements in loss function learning and enables the development of interpretable loss functions on commodity hardware. Through empirical and theoretical analysis, we uncover patterns in the learned loss functions, which inspired the development of Sparse Label Smoothing Regularization (SparseLSR), a significantly faster and more memory-efficient way to perform label smoothing regularization. Second, we challenge the conventional notion that a loss function must be static by developing Adaptive Loss Function Learning (AdaLFL), a method for meta-learning adaptive loss functions.
Lastly, we develop Neural Procedural Bias Meta-Learning (NPBML), a task-adaptive few-shot learning method that meta-learns the parameter initialization, optimizer, and loss function simultaneously.
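For context on the label smoothing regularization that SparseLSR accelerates: the standard formulation replaces the one-hot target with a distribution that reserves a small mass ε spread uniformly over all K classes. A minimal reference implementation of that standard technique (this is textbook label smoothing, not SparseLSR's sparse reformulation):

```python
import math

def label_smoothing_ce(logits, target, eps=0.1):
    """Cross-entropy against the smoothed target
    q_k = (1 - eps) * [k == target] + eps / K.

    Standard label smoothing regularization, shown for background;
    SparseLSR's faster reformulation is not reproduced here.
    """
    K = len(logits)
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    log_p = [l - log_z for l in logits]  # log-softmax
    q = [(1.0 - eps) * (1.0 if k == target else 0.0) + eps / K
         for k in range(K)]
    return -sum(q[k] * log_p[k] for k in range(K))
```

With `eps=0.0` this reduces to ordinary cross-entropy. Note that the smoothed target places probability mass on all K classes, so the naive loss touches every logit; that dense computation is plausibly the overhead a sparse reformulation like SparseLSR avoids.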
Problem

Research questions and friction points this paper is trying to address.

Closing the gap between data-hungry deep learning and human-like few-shot learning efficiency
Hand-crafted, static loss functions as an overlooked inductive bias in meta-learning
Reducing the compute and memory cost of loss function learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

EvoMAL: interpretable, symbolic, model-agnostic loss functions learnable on commodity hardware
SparseLSR: a faster, more memory-efficient form of label smoothing regularization
AdaLFL and NPBML: adaptive losses and jointly meta-learned initializations, optimizers, and loss functions