🤖 AI Summary
Addressing the trade-off between interpretability and computational efficiency in feature selection for high-dimensional tabular data, this paper proposes GFSNetwork, a differentiable feature selection framework. GFSNetwork uses a temperature-controlled Gumbel-Sigmoid relaxation to learn feature subsets end to end and introduces the first differentiable mechanism for automatically determining the number of features to select, while keeping computational overhead constant regardless of input dimensionality. By combining the representational power of deep learning with the transparency of classical feature selection, GFSNetwork outperforms baselines, including DeepLasso and attention-based methods, on classification and regression benchmarks as well as real-world metagenomic datasets, attaining higher predictive accuracy with fewer selected features.
📝 Abstract
Feature selection in deep learning remains a critical challenge, particularly for high-dimensional tabular data where interpretability and computational efficiency are paramount. We present GFSNetwork, a novel neural architecture that performs differentiable feature selection through temperature-controlled Gumbel-Sigmoid sampling. Unlike traditional methods, which require the user to specify the number of features to select, GFSNetwork determines it automatically during end-to-end training. Moreover, GFSNetwork maintains constant computational overhead regardless of the number of input features. We evaluate GFSNetwork on a series of classification and regression benchmarks, where it consistently outperforms recent methods, including DeepLasso and attention maps, as well as traditional feature selectors, while using significantly fewer features. Furthermore, we validate our approach on real-world metagenomic datasets, demonstrating its effectiveness on high-dimensional biological data. In conclusion, our method provides a scalable solution that bridges the gap between neural network flexibility and the interpretability of traditional feature selection. We share our Python implementation of GFSNetwork at https://github.com/wwydmanski/GFSNetwork, along with a PyPI package (gfs_network).
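The temperature-controlled Gumbel-Sigmoid sampling mentioned in the abstract can be sketched as follows. This is an illustrative NumPy implementation of the standard Gumbel-Sigmoid relaxation, not the authors' code; the function name, the gating setup, and the example logits are assumptions for demonstration.

```python
import numpy as np

def gumbel_sigmoid(logits, temperature=1.0, rng=None, eps=1e-20):
    """Temperature-controlled Gumbel-Sigmoid relaxation (illustrative sketch).

    Draws a soft binary gate in (0, 1) for each feature logit. As
    `temperature` approaches 0, the gates approach hard 0/1 selections,
    while remaining differentiable with respect to the logits at any
    temperature above 0.
    """
    rng = np.random.default_rng() if rng is None else rng
    logits = np.asarray(logits, dtype=float)
    # Two independent Gumbel(0, 1) samples per logit; their difference
    # is logistic noise, which relaxes a Bernoulli draw.
    u1 = rng.uniform(eps, 1.0, size=logits.shape)
    u2 = rng.uniform(eps, 1.0, size=logits.shape)
    g1 = -np.log(-np.log(u1))
    g2 = -np.log(-np.log(u2))
    # Relaxed Bernoulli sample: sigmoid((logits + noise) / temperature)
    return 1.0 / (1.0 + np.exp(-(logits + g1 - g2) / temperature))

# Example: gate logits for 5 candidate features; a low temperature
# pushes the sampled gates toward near-binary keep/drop decisions.
gates = gumbel_sigmoid(np.array([2.0, -2.0, 0.5, -0.5, 0.0]), temperature=0.1)
```

In a selection network like the one described, such gates would multiply the input features, so that training the logits end to end decides which features survive; positive logits tend to keep a feature, negative logits to drop it.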