SEMixer: Semantics Enhanced MLP-Mixer for Multiscale Mixing and Long-term Time Series Forecasting

📅 2026-02-18
🤖 AI Summary
This work proposes SEMixer, a lightweight model designed to address the key challenges of long-term time series forecasting: the difficulty of modeling multiscale patterns, interference from redundant noise, and semantic gaps between non-adjacent scales. Built upon the MLP-Mixer architecture, SEMixer integrates a Random Attention Mechanism (RAM) with a Multiscale Progressive Mixing Chain (MPMC) to enhance the semantic representation of temporal segments, enabling efficient, memory-friendly modeling and fusion of multiscale temporal dependencies. The effectiveness of SEMixer is validated through extensive experiments on ten public datasets and a large-scale real-world wireless network dataset (21 GB) from the 2025 CCF AIOps Challenge, where it achieved third place.

📝 Abstract
Modeling multiscale patterns is crucial for long-term time series forecasting (TSF). However, redundancy and noise in time series, together with semantic gaps between non-adjacent scales, make the efficient alignment and integration of multiscale temporal dependencies challenging. To address this, we propose SEMixer, a lightweight multiscale model designed for long-term TSF. SEMixer features two key components: a Random Attention Mechanism (RAM) and a Multiscale Progressive Mixing Chain (MPMC). RAM captures diverse time-patch interactions during training and aggregates them via a dropout ensemble at inference, enhancing patch-level semantics and enabling MLP-Mixer to better model multiscale dependencies. MPMC further stacks RAM and MLP-Mixer in a memory-efficient manner, achieving more effective temporal mixing; it bridges semantic gaps across scales and improves multiscale modeling and forecasting performance. We validate the effectiveness of SEMixer not only on 10 public datasets but also in the 2025 CCF AIOps Challenge on 21 GB of real wireless network data, where SEMixer achieves third place. The code is available at https://github.com/Meteor-Stars/SEMixer.
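The dropout-ensemble idea behind RAM can be illustrated with a toy sketch: during training, a random subset of patch-pair interactions is masked on each forward pass, and at inference the stochastic passes are averaged. This is a minimal, hypothetical illustration of the general mechanism described in the abstract, not the authors' implementation; the function names and masking scheme are assumptions.

```python
import numpy as np

def random_attention(patches, drop_rate=0.3, rng=None):
    """Toy patch-level attention with random interaction masking.

    patches: (n_patches, d) array of patch embeddings.
    Each call randomly drops a subset of patch-pair interactions,
    so every pass sees a different interaction pattern (hypothetical
    sketch of the "random" part of RAM).
    """
    rng = np.random.default_rng(rng)
    n, d = patches.shape
    scores = patches @ patches.T / np.sqrt(d)
    # Randomly mask off-diagonal patch-pair interactions; keep the
    # self-interaction so every row retains at least one finite score.
    mask = (rng.random(scores.shape) < drop_rate) & ~np.eye(n, dtype=bool)
    scores = np.where(mask, -np.inf, scores)
    # Numerically stable softmax over the surviving interactions.
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ patches

def dropout_ensemble(patches, n_samples=8, drop_rate=0.3, seed=0):
    """Approximate inference-time aggregation by averaging several
    stochastic forward passes (a Monte Carlo dropout ensemble)."""
    outs = [random_attention(patches, drop_rate, rng=seed + i)
            for i in range(n_samples)]
    return np.mean(outs, axis=0)

x = np.random.default_rng(42).standard_normal((6, 4))  # 6 patches, dim 4
y = dropout_ensemble(x)
print(y.shape)  # (6, 4)
```

Averaging stochastic passes is one standard way to realize a dropout ensemble at inference; whether SEMixer averages passes or rescales weights deterministically is not stated in this summary.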
Problem

Research questions and friction points this paper is trying to address.

long-term time series forecasting
multiscale modeling
semantic gaps
temporal dependencies
time series redundancy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Random Attention Mechanism
Multiscale Progressive Mixing Chain
MLP-Mixer
Long-term Time Series Forecasting
Multiscale Temporal Dependencies
Xu Zhang
Shanghai Key Laboratory of Data Science, College of Computer Science and Artificial Intelligence, Fudan University, Shanghai, China
Qitong Wang
Harvard University
Data Systems
Peng Wang
Professor, Computer Science, Fudan University
Database, Data mining
Wei Wang
Shanghai Key Laboratory of Data Science, College of Computer Science and Artificial Intelligence, Fudan University, Shanghai, China