LimiX-2M: Mitigating Low-Rank Collapse and Attention Bottlenecks in Tabular Foundation Models

📅 2026-06-03
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses a limitation of conventional tabular foundation models: standard affine scalar tokenization restricts feature expressiveness, which leads to low-rank collapse and attention bottlenecks, weakens shallow-layer sensitivity to feature values, and produces redundant representations. To overcome these issues, the authors propose a unified “tokenize-and-route” framework. RaBEL expands each scalar into compact, localized radial basis function (RBF) features, optionally with exponent gating, to improve conditioning and shallow-layer effective rank. A reordered bidirectional block, S→N→F, aggregates cross-sample context before feature mixing and uses attention pooling, aligning computation with the readout. The resulting model, LimiX-2M, has only 2 million parameters yet outperforms the larger TabPFN-v2 and TabICL baselines on widely used tabular benchmarks while reducing training and inference costs.
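The rank argument behind the RBF expansion can be illustrated with a small NumPy sketch. It is not the authors' RaBEL implementation: the Gaussian basis, the evenly spaced centers, the bandwidth, and the entropy-based effective-rank measure are all assumptions chosen for illustration, and the optional gating is left out because its form is not specified here. The sketch only shows that affine tokens of one feature span at most two directions, while localized RBF features span many more.

```python
import numpy as np


def affine_tokens(x, w, b):
    """Affine scalar tokenization: each value x_i maps to x_i * w + b."""
    return x[:, None] * w[None, :] + b[None, :]


def rbf_tokens(x, centers, bandwidth):
    """Localized Gaussian RBF features (illustrative choice of basis)."""
    diff = x[:, None] - centers[None, :]
    return np.exp(-0.5 * (diff / bandwidth) ** 2)


def effective_rank(m):
    """Exponential of the entropy of the normalized singular values."""
    s = np.linalg.svd(m, compute_uv=False)
    p = s / s.sum()
    p = p[p > 1e-12]  # drop numerically zero singular values
    return float(np.exp(-(p * np.log(p)).sum()))


rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=256)   # values of a single feature
dim = 16                               # hypothetical token width
w, b = rng.normal(size=dim), rng.normal(size=dim)
centers = np.linspace(-1.0, 1.0, dim)  # assumed: evenly spaced centers
bandwidth = centers[1] - centers[0]    # assumed: one grid step

affine = affine_tokens(x, w, b)
rbf = rbf_tokens(x, centers, bandwidth)

print("affine matrix rank:   ", np.linalg.matrix_rank(affine))
print("affine effective rank:", effective_rank(affine))
print("RBF effective rank:   ", effective_rank(rbf))
```

The affine token matrix is a sum of two outer products (`x` with `w`, and a vector of ones with `b`), so its rank cannot exceed 2 regardless of `dim`. That is the one-dimensional value channel the paper describes.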
📝 Abstract
Tabular foundation models (TFMs) increasingly rival tree ensembles, but their performance is often compute-inefficient: with standard affine scalar tokenization, each feature injects value variation through an essentially one-dimensional channel, and feature IDs/positional signals cannot increase within-feature value degrees of freedom, yielding weak early-layer value sensitivity and redundant hidden states. We present a unified tokenize-and-route framework for strong TFMs: RaBEL expands each scalar into compact localized RBF features (optionally exponent-gated) to improve conditioning and shallow-layer effective rank, while a reordered bidirectional block S→N→F aligns computation with the readout by aggregating cross-sample context before feature mixing and using attention pooling. Together, these changes yield LimiX-2M, a 2M-parameter model that outperforms larger TabPFN-v2 and TabICL baselines on widely used tabular benchmarks while reducing training and inference costs. These results highlight value-aware tokenization and readout-aligned routing as key levers for improving the accuracy-efficiency trade-off in TFMs. Model checkpoints and inference code are available at https://github.com/limix-ldm-ai/LimiX.
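The routing order in the abstract (cross-sample aggregation first, feature mixing second, attention pooling at the readout) can be sketched on a tensor of shape (samples, features, dim). This is a simplified illustration, not the LimiX-2M block: the attention here has no learned projections, heads, or normalization, the function names and the pooling query are hypothetical, and the middle "N" stage of S→N→F is omitted because this text does not define it.

```python
import numpy as np


def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(h):
    """Projection-free residual self-attention over the second-to-last axis."""
    scores = h @ np.swapaxes(h, -1, -2) / np.sqrt(h.shape[-1])
    return h + softmax(scores, axis=-1) @ h


def sample_then_feature(h):
    """h has shape (n_samples, n_features, dim)."""
    # Step 1: for each feature, attend across samples (cross-sample context).
    h = np.swapaxes(self_attention(np.swapaxes(h, 0, 1)), 0, 1)
    # Step 2: for each sample, attend across features (feature mixing).
    return self_attention(h)


def attention_pool(h, query):
    """Pool the feature axis with softmax weights from a single query vector."""
    weights = softmax(h @ query / np.sqrt(h.shape[-1]), axis=-1)  # (samples, features)
    return np.einsum("sf,sfd->sd", weights, h)                    # (samples, dim)


rng = np.random.default_rng(0)
tokens = rng.normal(size=(5, 3, 4))  # 5 samples, 3 features, width 4
query = rng.normal(size=4)           # hypothetical pooling query

mixed = sample_then_feature(tokens)
pooled = attention_pool(mixed, query)
print(mixed.shape, pooled.shape)
```

Pooling with attention weights returns, for each sample, a convex combination of that sample's feature tokens, so the readout yields one vector per sample whatever the number of features.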
Problem

Research questions and friction points this paper is trying to address.

Tabular Foundation Models
Low-Rank Collapse
Attention Bottlenecks
Tokenization
Compute Efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

tokenize-and-route
RaBEL
low-rank collapse
attention bottleneck
tabular foundation models