TabSwift: An Efficient Tabular Foundation Model with Row-Wise Attention

📅 2026-06-05
🤖 AI Summary
This work addresses the high inference cost of existing tabular foundation models, which hinders their deployment in latency-sensitive applications. To this end, we propose TabSwift, a lightweight backbone architecture that leverages row-wise attention augmented with a gated attention stabilization mechanism and learnable register tokens to effectively capture tabular context. Furthermore, TabSwift incorporates an adaptive layer-wise early-exit strategy that dynamically balances computational efficiency and predictive accuracy. Experimental results demonstrate that TabSwift achieves competitive performance against state-of-the-art models such as TabPFN v2 and TabICL on both classification and regression tasks, while substantially reducing inference latency, making it well-suited for efficient real-world deployment.
📝 Abstract
Tabular foundation models, exemplified by TabPFN, perform prediction via in-context learning, inferring test labels directly from labeled training examples. They have demonstrated competitive performance, particularly on small-to-medium datasets. However, recent tabular foundation models often improve accuracy with increasingly complex architectures, incurring higher inference cost and limiting practical deployment. In this work, we revisit the original TabPFN design and show that a lightweight row-wise attention-only backbone can remain highly competitive with two simple enhancements: a gated attention stabilization mechanism and a small set of learnable register tokens that provide global context and improve pretraining quality. The resulting model, TabSwift, supports both classification and regression, and is competitive with stronger tabular foundation models (e.g., TabPFN v2 and TabICL) while being more efficient at inference. For latency-sensitive serving, we further introduce an adaptive layer-wise early-exit mechanism that dynamically adjusts inference depth per sample. Overall, TabSwift enables efficient and anytime tabular in-context learning for practical deployments.
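The abstract does not spell out how the adaptive layer-wise early exit decides when to stop. The sketch below illustrates one common way to implement per-sample early exit: attach a prediction head to every layer and stop as soon as the maximum class probability reaches a threshold. The confidence rule, the `threshold` parameter, and the names `early_exit_predict`, `layers`, and `heads` are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def early_exit_predict(x, layers, heads, threshold=0.9):
    """Run layers one by one and stop once the prediction is confident.

    x         : 1-D feature vector for a single sample
    layers    : list of callables, each mapping a hidden state to a hidden state
    heads     : list of callables (one per layer) mapping a hidden state to logits
    threshold : assumed confidence level at which inference stops (hypothetical)

    Returns (class probabilities, number of layers actually executed).
    """
    if not layers or len(layers) != len(heads):
        raise ValueError("need at least one layer and one head per layer")
    hidden = x
    for depth, (layer, head) in enumerate(zip(layers, heads), start=1):
        hidden = layer(hidden)
        probs = softmax(head(hidden))
        # Assumed exit rule: stop when the top class is confident enough.
        if probs.max() >= threshold:
            break
    return probs, depth

# Toy usage: each "layer" doubles the hidden state, so confidence grows with depth.
toy_layers = [lambda h: h * 2.0] * 3
toy_heads = [lambda h: h] * 3
probs, depth_used = early_exit_predict(np.array([1.0, 0.0]), toy_layers, toy_heads)
```

Raising `threshold` trades latency for accuracy: easy samples still exit after a few layers, while harder ones run through the full stack.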
Problem

Research questions and friction points this paper is trying to address.

tabular foundation models
inference efficiency
model complexity
practical deployment
in-context learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

row-wise attention
gated attention stabilization
learnable register tokens
adaptive early-exit
tabular foundation model
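
The first three ideas in the list above can be combined in a single attention layer. The following is a minimal NumPy sketch under stated assumptions: rows are treated as tokens, a few register tokens are prepended so every row can read from and write to shared global slots, and a sigmoid gate scales the attention output before the residual connection. The single-head layout, the exact gating form, and all names (`gated_row_attention`, `Wq`, `Wk`, `Wv`, `Wg`) are hypothetical; the listing does not give the paper's actual formulation. The sketch also omits the masking that in-context learning normally uses so that test rows attend only to labeled training rows.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_row_attention(rows, registers, Wq, Wk, Wv, Wg):
    """Single-head attention across rows with register tokens and an output gate.

    rows      : (n_rows, d) one embedding per table row
    registers : (n_reg, d) register tokens (learnable in a real model)
    Wq/Wk/Wv  : (d, d) query, key, and value projections
    Wg        : (d, d) gate projection (assumed gating form)

    Returns (updated rows, updated registers).
    """
    n_reg = registers.shape[0]
    tokens = np.concatenate([registers, rows], axis=0)  # (n_reg + n_rows, d)
    q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
    scores = q @ k.T / np.sqrt(q.shape[-1])              # scaled dot-product
    context = softmax(scores, axis=-1) @ v               # attend across rows
    gate = sigmoid(tokens @ Wg)                          # per-feature gate in (0, 1)
    out = tokens + gate * context                        # gated residual update
    return out[n_reg:], out[:n_reg]

# Toy usage with random weights.
rng = np.random.default_rng(0)
d = 4
rows = rng.normal(size=(5, d))
registers = rng.normal(size=(2, d))
Wq, Wk, Wv, Wg = (rng.normal(size=(d, d)) for _ in range(4))
new_rows, new_registers = gated_row_attention(rows, registers, Wq, Wk, Wv, Wg)
```

Because the gate multiplies the attention output, the layer can fall back to the identity when the gate closes, which is one plausible reading of "stabilization".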