GOTabPFN: From Feature Ordering to Compact Tokenization for Tabular Foundation Models on High-Dimensional Data

📅 2026-06-03
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses the limited predictive performance of small tabular foundation models on high-dimensional, low-sample-size (HDLSS) tabular data. The authors propose the Graph-guided Ordering with Local Refinement (GO-LR) algorithm, which arranges features in a meaningful sequence, and they show that it is equivalent to the weighted minimum linear arrangement problem. Building on this ordering, they introduce a Neuro-Inspired Subunit Compression (NSC) unit that pools neighboring features into compact meta-features. The resulting model, GOTabPFN, brings compact tokenization to TabPFN-style architectures and improves prediction accuracy and stability under tight token budgets, without retraining large backbone models.
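For readers unfamiliar with the arrangement problem named above, the textbook form of weighted minimum linear arrangement is written out below. The symbols are assumptions for illustration: $d$ is the number of features, $w_{ij} \ge 0$ is a symmetric affinity between features $i$ and $j$, and $\pi$ assigns each feature a position in the sequence. The summary does not state the paper's own notation or how the weights are constructed, so this is only the standard definition, not necessarily the authors' exact formulation.

```latex
\min_{\pi \,:\, \{1,\dots,d\} \to \{1,\dots,d\} \text{ bijective}}
\;\; \sum_{1 \le i < j \le d} w_{ij}\, \bigl| \pi(i) - \pi(j) \bigr|
```

Under this objective, strongly related features (large $w_{ij}$) are pushed toward nearby positions, which is what makes pooling adjacent features into meta-features sensible.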
📝 Abstract
We investigate how to make small tabular foundation models effective for High-Dimensional, Low-Sample-Size (HDLSS) tabular prediction without retraining large backbones. We introduce Graph-guided Ordering with Local Refinement (GO-LR), show its equivalence to weighted Minimum Linear Arrangement, and interpret the practical solver as a TSP-path-style surrogate. We propose GOTabPFN, which combines GO-LR with a Neuro-Inspired Subunit Compression (NSC) unit that pools locally adjacent ordered features into meta-features, yielding a compact representation that makes TabPFN-style prediction practical in HDLSS regimes. Across tabular benchmarks, GOTabPFN improves stability and accuracy under tight token budgets.
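The order-then-pool idea in the abstract can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration, not the authors' implementation: the affinity graph is taken to be absolute Pearson correlation, the path is built greedily by nearest neighbor (a crude TSP-path-style heuristic), the local refinement is a plain adjacent-swap pass on the linear-arrangement cost, and mean pooling stands in for the NSC unit, whose internals the abstract does not describe. The function names `arrangement_cost`, `order_features`, and `pool_adjacent` are hypothetical.

```python
import numpy as np


def arrangement_cost(W, order):
    """Weighted linear-arrangement cost: sum over pairs of W[i, j] * |pos[i] - pos[j]|."""
    pos = np.empty(len(order), dtype=int)
    pos[np.asarray(order)] = np.arange(len(order))
    # W is symmetric, so the full double sum counts every pair twice.
    return 0.5 * float(np.sum(W * np.abs(pos[:, None] - pos[None, :])))


def order_features(X, n_sweeps=5):
    """Illustrative ordering: greedy path on |correlation|, then adjacent-swap refinement."""
    # Assumed affinity graph: absolute Pearson correlation between feature columns.
    W = np.nan_to_num(np.abs(np.corrcoef(X, rowvar=False)))  # NaN arises for constant columns
    np.fill_diagonal(W, 0.0)
    d = W.shape[0]

    # Greedy nearest-neighbor path, starting from the most connected feature.
    order = [int(np.argmax(W.sum(axis=1)))]
    remaining = set(range(d)) - set(order)
    while remaining:
        last = order[-1]
        nxt = max(remaining, key=lambda j: W[last, j])
        order.append(nxt)
        remaining.remove(nxt)

    # Local refinement: keep an adjacent swap only if it lowers the arrangement cost.
    best = arrangement_cost(W, order)
    for _ in range(n_sweeps):
        improved = False
        for k in range(d - 1):
            order[k], order[k + 1] = order[k + 1], order[k]
            cost = arrangement_cost(W, order)
            if cost < best - 1e-12:
                best, improved = cost, True
            else:
                order[k], order[k + 1] = order[k + 1], order[k]  # undo the swap
        if not improved:
            break
    return order


def pool_adjacent(X, order, group_size):
    """Mean-pool consecutive ordered features into meta-features (stand-in for NSC)."""
    Xo = X[:, order]
    groups = [Xo[:, s:s + group_size].mean(axis=1)
              for s in range(0, Xo.shape[1], group_size)]
    return np.stack(groups, axis=1)


# Toy HDLSS-like example: 20 samples, 12 features, compressed to 3 meta-features.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 12))
order = order_features(X)
Z = pool_adjacent(X, order, group_size=4)
print(Z.shape)  # (20, 3)
```

The compressed matrix `Z` has far fewer columns than `X`, which is the point of compact tokenization: a token-limited TabPFN-style model would see one token per meta-feature rather than one per raw feature. Mean pooling assumes roughly standardized, positively related neighbors; a learned or sign-aware pooling would likely be needed in practice.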
Problem

Research questions and friction points this paper is trying to address.

tabular foundation models
High-Dimensional Low-Sample Size
feature ordering
compact tokenization
HDLSS
Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph-guided Ordering
Neuro-Inspired Subunit Compression
Tabular Foundation Models
High-Dimensional Low-Sample Size
Compact Tokenization
Al Zadid Sultan Bin Habib
Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV 26506, USA
Md Younus Ahamed
Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV 26506, USA
Prashnna Kumar Gyawali
Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV 26506, USA
Gianfranco Doretto
West Virginia University
Computer Vision, Machine Learning, Biomedical Data Science, Artificial Intelligence
Donald A. Adjeroh
Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV 26506, USA