🤖 AI Summary
To capture temporal human activity patterns for fine-grained residential electricity demand forecasting, this paper proposes a hierarchical semi-Markov generative model: an upper layer uses a time-inhomogeneous Markov router to model activity transitions, while a lower layer explicitly models activity durations via hazard functions. To keep estimates stable under sparse observational data, the model is trained with survey design weights and pools information across related demographic groups and time blocks. Compared with purely Markovian models, the approach yields a substantial and statistically significant improvement on a held-out test set. It identifies sex, day type, and household size as the covariates with the largest predictive gains, while region and season contribute little to predicting the activity sequence itself, and it generates high-fidelity synthetic activity trajectories. The framework is intended as a foundation for downstream applications such as microgrid management and demand response.
📝 Abstract
Residential electricity demand at granular scales is driven by what people do and for how long. Accurately forecasting this demand for applications like microgrid management and demand response therefore requires generative models that can produce realistic daily activity sequences, capturing both the timing and duration of human behavior. This paper develops a generative model of human activity sequences using nationally representative time-use diaries at a 10-minute resolution. We use this model to quantify which demographic factors are most critical for improving predictive performance.
We propose a hierarchical semi-Markov framework that addresses two key modeling challenges. First, a time-inhomogeneous Markov router learns the patterns of "which activity comes next." Second, a semi-Markov hazard component explicitly models activity durations, capturing "how long" activities realistically last. To ensure statistical stability when data are sparse, the model pools information across related demographic groups and time blocks. The entire framework is trained and evaluated using survey design weights to ensure our findings are representative of the U.S. population.
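The router-plus-hazard structure described above can be illustrated with a minimal sampling sketch. It is not the paper's implementation. The function name `sample_day`, the table layouts, the equal-width time blocks, and all toy parameter values are assumptions made here for illustration. Pooling and survey weighting are left out.

```python
import random

SLOTS_PER_DAY = 144  # 24 hours at 10-minute resolution


def sample_day(router, hazard, start_activity, rng):
    """Sample one activity label per 10-minute slot (hypothetical layout).

    router[b][i][j]: P(next activity = j | current = i, time block b), with
                     router[b][i][i] == 0 so a transition always changes activity.
    hazard[a][d-1]:  P(spell of activity a ends at d slots | it has lasted d slots);
                     spells longer than the table reuse the last entry.
    """
    n_blocks = len(router)
    activities = range(len(hazard))
    day = []
    activity, elapsed = start_activity, 0
    for t in range(SLOTS_PER_DAY):
        day.append(activity)
        elapsed += 1
        h = hazard[activity][min(elapsed, len(hazard[activity])) - 1]
        if rng.random() < h:  # the hazard decides whether the spell ends now
            block = t * n_blocks // SLOTS_PER_DAY  # time-of-day block (assumed equal width)
            # the router decides which activity comes next
            activity = rng.choices(activities, weights=router[block][activity])[0]
            elapsed = 0
    return day


# Toy parameters, made up for illustration and not estimated from any data.
K, N_BLOCKS, MAX_D = 3, 4, 12
router = [[[0.0 if i == j else 1.0 / (K - 1) for j in range(K)]
           for i in range(K)] for _ in range(N_BLOCKS)]
hazard = [[0.2] * MAX_D for _ in range(K)]

day = sample_day(router, hazard, start_activity=0, rng=random.Random(0))
```

The two tables play separate roles: `hazard` is consulted every slot and depends on how long the current spell has lasted, while `router` is consulted only when a spell ends and depends on the time of day.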
On a held-out test set, we demonstrate that explicitly modeling durations with the hazard component provides a substantial and statistically significant improvement over purely Markovian models. Furthermore, our analysis reveals a clear hierarchy of demographic factors: Sex, Day-Type, and Household Size provide the largest predictive gains, while Region and Season, though important for energy calculations, contribute little to predicting the activity sequence itself. The result is an interpretable and robust generator of synthetic activity traces, providing a high-fidelity foundation for downstream energy systems modeling.
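One way to see why an explicit hazard can matter: in a first-order Markov chain over 10-minute slots, as long as the probability of staying in an activity is fixed, the exit probability per slot is constant and spell lengths are geometric. A duration-dependent hazard can instead produce spell lengths that peak around a typical duration. The sketch below converts a hazard table into a duration distribution. The helper name `duration_pmf` and both hazard vectors are hypothetical and not taken from the paper.

```python
def duration_pmf(hazard_row):
    """Convert h(d) = P(end at d | lasted d slots) into P(duration = d), d = 1..len."""
    pmf, survival = [], 1.0
    for h in hazard_row:
        pmf.append(survival * h)   # survive d-1 slots, then end at slot d
        survival *= 1.0 - h
    return pmf


# Constant hazard: memoryless, geometric decay (what a fixed stay probability implies).
constant = duration_pmf([0.2] * 6)

# Rising hazard: short spells are unlikely, so mass concentrates at longer durations.
rising = duration_pmf([0.05, 0.1, 0.2, 0.4, 0.8, 1.0])
```

With the constant hazard the most likely spell length is always a single slot, whereas the rising hazard puts its mode at five slots in this toy example.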