Towards Time Series Generation Conditioned on Unstructured Natural Language

📅 2025-06-28
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This paper addresses the novel problem of natural language–driven time-series generation. We propose the first diffusion-based time-series generation framework conditioned on unstructured text. Methodologically, it integrates a pre-trained language model with a conditional diffusion model, where text embeddings dynamically guide time-series modeling to ensure semantic fidelity and controllability. Our key contributions are threefold: (1) establishing the “language-to-time-series” generation paradigm; (2) empirically validating feasibility across diverse domains—including finance and climate—enabling customized forecasting, data augmentation, and cross-domain transfer; and (3) releasing the largest publicly available text–time-series paired dataset to date (63,010 instances). All code and data are open-sourced to advance foundational research and practical applications of generative AI in time-series analysis.

Technology Category

Application Category

📝 Abstract
Generative Artificial Intelligence (AI) has rapidly become a powerful tool, capable of generating various types of data, such as images and text. However, despite the significant advancement of generative AI, time series generative AI remains underdeveloped, even though the application of time series is essential in finance, climate, and numerous fields. In this research, we propose a novel method of generating time series conditioned on unstructured natural language descriptions. We use a diffusion model combined with a language model to generate time series from the text. Through the proposed method, we demonstrate that time series generation based on natural language is possible. The proposed method can provide various applications such as custom forecasting, time series manipulation, data augmentation, and transfer learning. Furthermore, we construct and propose a new public dataset for time series generation, consisting of 63,010 time series-description pairs.
Problem

Research questions and friction points this paper is trying to address.

Generate time series from unstructured natural language descriptions
Develop time series generative AI using diffusion and language models
Create a public dataset for time series generation research
Innovation

Methods, ideas, or system contributions that make the work stand out.

Diffusion model with language model
Generates time series from text
New public dataset provided
🔎 Similar Papers
No similar papers found.
J
Jaeyun Woo
Graduate School of Information Science and Electrical Engineering, Kyushu University, Fukuoka, Japan
J
Jiseok Lee
Graduate School of Information Science and Electrical Engineering, Kyushu University, Fukuoka, Japan
Brian Kenji Iwana
Brian Kenji Iwana
Kyushu University
Pattern RecognitionCharacter RecognitionDocument Image ProcessingMachine Learning