🤖 AI Summary
Existing table foundation models treat tables as isolated, static entities, neglecting operational context and domain-specific knowledge—thus failing to handle the dynamicity and domain dependence of real-world data. Method: We propose Semantic-Linked Tables (SLTs), a novel paradigm that deeply integrates tables with declarative knowledge (e.g., rules, constraints) and procedural knowledge (e.g., workflows, decision logic). We introduce FMSLT, a foundation model anchored on operational knowledge, emphasizing human–AI collaborative modeling. It leverages knowledge graph embedding, workflow-aware representation learning, multi-source heterogeneous data alignment, and context-aware semantic encoding. Contribution/Results: This work exposes fundamental limitations of conventional paradigms and establishes the first systematic framework for table modeling grounded in operational context. FMSLT delivers grounded, interpretable, and scenario-adaptive table semantics—providing robust, structured AI capabilities for high-stakes domains such as finance and healthcare.
📝 Abstract
Current research on tabular foundation models often overlooks the complexities of large-scale, real-world data by treating tables as isolated entities and assuming information completeness, thereby neglecting the vital operational context. To address this, we introduce the concept of Semantically Linked Tables (SLT), recognizing that tables are inherently connected to both declarative and procedural operational knowledge. We propose Foundation Models for Semantically Linked Tables (FMSLT), which integrate these components to ground tabular data within its true operational context. This comprehensive representation unlocks the full potential of machine learning for complex, interconnected tabular data across diverse domains. Realizing FMSLTs requires access to operational knowledge that is often unavailable in public datasets, highlighting the need for close collaboration between domain experts and researchers. Our work exposes the limitations of current tabular foundation models and proposes a new direction centered on FMSLTs, aiming to advance robust, context-aware models for structured data.