Bridging Streaming Continual Learning via In-Context Large Tabular Models

📅 2025-12-12
📈 Citations: 0 · Influential: 0
🤖 AI Summary
Addressing the coupled challenges of concept drift and catastrophic forgetting in unbounded data streams, this paper proposes a unified Streaming Continual Learning (SCL) framework. Methodologically, it positions large in-context tabular models (LTMs) as a central hub, first showing that such models can simultaneously satisfy streaming constraints (e.g., bounded memory, low-latency inference) and continual learning requirements (e.g., experience replay). The approach is grounded in two principled objectives, distribution matching and distribution compression, which jointly optimize plasticity (adaptation to novel distributions), stability (preservation of prior knowledge), memory diversity, and retrieval priority. It integrates online stream summarization, dynamic distribution matching, diversity-driven memory compression, and adaptive retrieval. Experiments across multiple SCL benchmarks show significant forgetting mitigation, sub-millisecond inference latency, strict memory boundedness, and overall performance superior to state-of-the-art continual learning and stream learning methods.
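
The loop the summary describes can be made concrete. Below is a minimal, self-contained sketch, assuming a prequential (test-then-train) protocol, a reservoir-sampled bounded memory as the stream sketch, and an off-the-shelf k-NN classifier standing in for a real LTM such as TabPFN. Every name here, along with the synthetic drifting stream, is an illustrative assumption, not the paper's implementation.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

class BoundedMemory:
    """Fixed-size sketch of the stream, maintained by reservoir sampling."""
    def __init__(self, capacity, rng):
        self.capacity, self.rng = capacity, rng
        self.X, self.y, self.seen = [], [], 0

    def update(self, X, y):
        for x_i, y_i in zip(X, y):
            self.seen += 1
            if len(self.X) < self.capacity:
                self.X.append(x_i)
                self.y.append(y_i)
            else:
                # Keep each new item with probability capacity/seen, so the
                # sketch stays a uniform sample of the whole stream so far.
                j = int(self.rng.integers(self.seen))
                if j < self.capacity:
                    self.X[j], self.y[j] = x_i, y_i

    def as_arrays(self):
        return np.asarray(self.X), np.asarray(self.y)

def drifting_stream(rng, n_batches=20, batch=64):
    """Synthetic two-class stream whose class means drift over time."""
    for t in range(n_batches):
        shift = np.array([0.2 * t, 0.0])                     # gradual concept drift
        X0 = rng.normal(-1.0, 1.0, (batch // 2, 2)) + shift
        X1 = rng.normal(+1.0, 1.0, (batch // 2, 2)) + shift
        y = np.r_[np.zeros(batch // 2), np.ones(batch // 2)]
        yield np.vstack([X0, X1]), y

rng = np.random.default_rng(0)
memory = BoundedMemory(capacity=256, rng=rng)
model = KNeighborsClassifier(n_neighbors=5)                  # stand-in for an LTM

for X_batch, y_batch in drifting_stream(rng):
    if memory.seen:                                          # prequential: test, then learn
        X_ctx, y_ctx = memory.as_arrays()
        model.fit(X_ctx, y_ctx)                              # "fit" = condition on context
        print(f"seen={memory.seen:5d}  batch acc={model.score(X_batch, y_batch):.2f}")
    memory.update(X_batch, y_batch)                          # bounded memory, no weight updates
```

The reservoir update gives the fixed-size guarantee the summary mentions: memory never exceeds `capacity`, and inference cost depends only on the context size, not on how much of the stream has already passed.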

📝 Abstract
In streaming scenarios, models must learn continuously, adapting to concept drifts without erasing previously acquired knowledge. However, existing research communities address these challenges in isolation. Continual Learning (CL) focuses on long-term retention and mitigating catastrophic forgetting, often without strict real-time constraints. Stream Learning (SL) emphasizes rapid, efficient adaptation to high-frequency data streams, but typically neglects forgetting. Recent efforts have tried to combine these paradigms, yet no clear algorithmic overlap exists. We argue that large in-context tabular models (LTMs) provide a natural bridge for Streaming Continual Learning (SCL). In our view, unbounded streams should be summarized on-the-fly into compact sketches that can be consumed by LTMs. This recovers the classical SL motivation of compressing massive streams with fixed-size guarantees, while simultaneously aligning with the experience-replay desiderata of CL. To clarify this bridge, we show how the SL and CL communities implicitly adopt a divide-and-conquer strategy to manage the tension between plasticity (performing well on the current distribution) and stability (retaining past knowledge), while also imposing a minimal complexity constraint that motivates diversification (avoiding redundancy in what is stored) and retrieval (re-prioritizing past information when needed). Within this perspective, we propose structuring SCL with LTMs around two core principles of data selection for in-context learning: (1) distribution matching, which balances plasticity and stability, and (2) distribution compression, which controls memory size through diversification and retrieval mechanisms.
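
A minimal sketch of principle (1), assuming a kernel-herding-style greedy rule: select a fixed-size subset whose kernel mean embedding stays close to that of the current stream window, i.e., whose squared MMD to the window is small. The abstract names the principle but not an estimator, so `rbf`, `match_distribution`, and the RBF bandwidth below are illustrative choices.

```python
import numpy as np

def rbf(A, B, gamma=0.5):
    """RBF kernel matrix k(a, b) = exp(-gamma * ||a - b||^2)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def match_distribution(candidates, window, budget, gamma=0.5):
    """Greedy herding-style selection: each step adds the candidate that most
    reduces the squared MMD between the selected set and the window."""
    mu = rbf(candidates, window, gamma).mean(axis=1)   # mean similarity to window
    selected, k_sum = [], np.zeros(len(candidates))    # k_sum = sum of k(x, chosen)
    for t in range(budget):
        scores = mu - k_sum / (t + 1)                  # herding objective
        scores[selected] = -np.inf                     # select without replacement
        i = int(np.argmax(scores))
        selected.append(i)
        k_sum += rbf(candidates, candidates[i:i + 1], gamma)[:, 0]
    return candidates[selected]

rng = np.random.default_rng(1)
window = rng.normal(0.0, 1.0, (200, 2))                # current stream window
candidates = rng.normal(0.0, 2.0, (500, 2))            # memory candidates
subset = match_distribution(candidates, window, budget=32)
print(subset.shape)                                    # (32, 2), distributed like the window
```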
Problem

Research questions and friction points this paper is trying to address.

Bridging streaming and continual learning via large tabular models.
Summarizing unbounded data streams into compact, in-context sketches.
Balancing plasticity and stability through distribution matching and compression (a minimal compression sketch follows this list).
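
As referenced in the last item above, here is a minimal sketch of the compression side, assuming greedy farthest-point (k-center) selection as the diversification rule. The paper motivates avoiding redundancy in what is stored; this particular algorithm is an illustrative choice, not the paper's.

```python
import numpy as np

def compress_memory(X, budget, rng):
    """Return indices of a diverse subset of X of size `budget`."""
    n = len(X)
    if n <= budget:
        return np.arange(n)
    chosen = [int(rng.integers(n))]                  # arbitrary seed point
    d = np.linalg.norm(X - X[chosen[0]], axis=1)     # distance to nearest chosen point
    for _ in range(budget - 1):
        i = int(np.argmax(d))                        # farthest remaining point
        chosen.append(i)
        d = np.minimum(d, np.linalg.norm(X - X[i], axis=1))
    return np.array(chosen)

rng = np.random.default_rng(2)
X = rng.normal(0.0, 1.0, (1000, 2))                  # over-full memory buffer
keep = compress_memory(X, budget=64, rng=rng)
print(len(keep), len(np.unique(keep)))               # 64 distinct, well-spread points
```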
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large tabular models bridge streaming and continual learning.
On-the-fly stream summarization into compact sketches.
Data selection via distribution matching and compression, with adaptive retrieval (sketched after this list).
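
A minimal sketch of the adaptive-retrieval step referenced in the last item above: rank stored memory rows by similarity to the incoming batch and promote the top-k into the LTM's context window. The RBF scoring rule and all names are illustrative assumptions.

```python
import numpy as np

def retrieve(memory_X, query_X, k, gamma=0.5):
    """Top-k memory rows by mean RBF similarity to the query batch."""
    d2 = ((memory_X[:, None, :] - query_X[None, :, :]) ** 2).sum(-1)
    scores = np.exp(-gamma * d2).mean(axis=1)        # relevance to the batch
    return np.argsort(scores)[::-1][:k]

rng = np.random.default_rng(3)
memory_X = rng.normal(0.0, 3.0, (500, 2))            # stored sketch
query_X = rng.normal(1.0, 0.5, (32, 2))              # incoming batch after a drift
ctx_idx = retrieve(memory_X, query_X, k=64)
print(memory_X[ctx_idx].mean(axis=0))                # retrieved context re-centred near the batch
```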
Afonso Lourenço
GECAD, ISEP, Polytechnic of Porto, Rua Dr. António Bernardino de Almeida, Porto, 4249-015, Portugal
João Gama
INESC-TEC, FEP, University of Porto, Rua Dr. Roberto Frias, Porto, 4200-465, Portugal
Eric P. Xing
Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE
Goreti Marreiros
Full Professor, GECAD/ISEP/Polytechnic of Porto