🤖 AI Summary
Deep tabular Transformers deliver strong predictive performance, but their black-box nature obscures marginal feature effects, while classical additive architectures remain interpretable at the cost of accuracy. This paper proposes an adaptation of tabular Transformer networks that is constrained so that marginal feature effects can be explicitly identified, bridging that gap. The authors provide theoretical justification that these effects are recoverable under the proposed architecture, and an ablation study shows the model detects them even in the presence of complex feature interactions. Across multiple benchmark datasets, the method matches the predictive accuracy of black-box baselines such as XGBoost and FT-Transformer while producing faithful, visually intuitive marginal effect curves, reconciling performance and interpretability for tabular Transformers.
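To make the additive constraint concrete, here is a minimal PyTorch sketch of the general idea behind such architectures: each feature becomes its own token, tokens interact through self-attention, and a separate scalar head per feature produces contributions that are summed into the prediction. All names (`AdditiveTabularTransformer`, `d_model`, the head layout) are illustrative assumptions, not the paper's actual API.

```python
# A minimal sketch (not the paper's exact architecture) of an additively
# constrained tabular transformer: per-feature tokens, self-attention,
# and one scalar head per feature so contributions stay attributable.
import torch
import torch.nn as nn


class AdditiveTabularTransformer(nn.Module):
    def __init__(self, n_features: int, d_model: int = 32, n_heads: int = 4):
        super().__init__()
        # One learned embedding per feature: scalar value * feature vector + bias.
        self.weight = nn.Parameter(torch.randn(n_features, d_model) * 0.02)
        self.bias = nn.Parameter(torch.zeros(n_features, d_model))
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
        # A separate scalar head per feature keeps the output additive.
        self.heads = nn.ModuleList([nn.Linear(d_model, 1) for _ in range(n_features)])

    def feature_contributions(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_features) of numeric features.
        tokens = x.unsqueeze(-1) * self.weight + self.bias  # (batch, n_features, d_model)
        h = self.encoder(tokens)                            # (batch, n_features, d_model)
        # Per-feature scalar contributions, shape (batch, n_features).
        return torch.cat([head(h[:, j]) for j, head in enumerate(self.heads)], dim=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Additive prediction: sum of per-feature contributions.
        return self.feature_contributions(x).sum(dim=1)
```

Note that self-attention mixes information across tokens, so a per-feature head alone does not guarantee that its output equals that feature's marginal effect; the paper's contribution lies precisely in the conditions and constraints under which these contributions become identifiable.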
📝 Abstract
In recent years, deep neural networks have showcased their predictive power across a variety of tasks. Beyond natural language processing, the transformer architecture has proven effective on tabular data problems and challenges the previously dominant gradient-boosted decision trees in this domain. However, this predictive power comes at the cost of intelligibility: marginal feature effects are almost completely lost in the black-box nature of deep tabular transformer networks. Alternative architectures that adopt the additivity constraints of classical statistical regression models retain intelligible marginal feature effects, but often fall short of their more complex counterparts in predictive power. To bridge the gap between intelligibility and performance, we propose an adaptation of tabular transformer networks designed to identify marginal feature effects. We provide theoretical justification that marginal feature effects can be accurately identified, and our ablation study demonstrates that the proposed model efficiently detects these effects even amidst complex feature interactions. To demonstrate the model's predictive capabilities, we compare it to several interpretable as well as black-box models and find that it matches black-box performance while maintaining intelligibility. The source code is available at https://github.com/OpenTabular/NAMpy.
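Once per-feature contributions are exposed, a marginal effect curve can be read out generically by sweeping one feature over a grid while holding the others at reference values. The sketch below shows this partial-dependence-style readout; it is not necessarily the paper's reconstruction procedure, and it assumes a model exposing `feature_contributions()` as in the `AdditiveTabularTransformer` sketch above.

```python
# A minimal sketch of reading out a learned marginal effect curve from an
# additive model: vary one feature over a grid, hold the rest fixed, and
# record that feature's scalar contribution at each grid point.
import torch


def marginal_effect_curve(model, x_ref: torch.Tensor, feature: int,
                          grid: torch.Tensor) -> torch.Tensor:
    """Contribution of `feature` over `grid`, other features fixed at x_ref."""
    model.eval()
    x = x_ref.repeat(len(grid), 1)   # (len(grid), n_features)
    x[:, feature] = grid             # sweep the chosen feature
    with torch.no_grad():
        contrib = model.feature_contributions(x)  # (len(grid), n_features)
    return contrib[:, feature]


# Usage: the learned effect of feature 0 on a [-3, 3] grid (hypothetical setup).
model = AdditiveTabularTransformer(n_features=5)  # from the sketch above
grid = torch.linspace(-3.0, 3.0, steps=100)
curve = marginal_effect_curve(model, x_ref=torch.zeros(1, 5), feature=0, grid=grid)
```

Plotting `curve` against `grid` yields the kind of marginal effect visualization the abstract refers to; whether such a curve is statistically reliable is exactly what the paper's identifiability analysis addresses.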