🤖 AI Summary
This paper addresses the novel problem of natural language–driven time-series generation. We propose the first diffusion-based time-series generation framework conditioned on unstructured text. The framework integrates a pre-trained language model with a conditional diffusion model, so that text embeddings guide the generation process toward semantic fidelity and controllability. Our key contributions are threefold: (1) establishing the “language-to-time-series” generation paradigm; (2) empirically validating its feasibility across diverse domains, including finance and climate, with applications to customized forecasting, data augmentation, and cross-domain transfer; and (3) releasing the largest publicly available text–time-series paired dataset to date (63,010 instances). All code and data are open-sourced to advance foundational research and practical applications of generative AI in time-series analysis.
📝 Abstract
Generative Artificial Intelligence (AI) has rapidly become a powerful tool, capable of generating various types of data, such as images and text. However, despite these significant advances, generative AI for time series remains underdeveloped, even though time series are essential in finance, climate, and numerous other fields. In this research, we propose a novel method for generating time series conditioned on unstructured natural language descriptions. We combine a diffusion model with a language model to generate time series from text. With the proposed method, we demonstrate that time series generation from natural language is possible. The method supports applications such as custom forecasting, time series manipulation, data augmentation, and transfer learning. Furthermore, we construct and present a new public dataset for time series generation, consisting of 63,010 time series-description pairs.
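The abstract does not specify the architecture, noise schedule, or conditioning mechanism, so the following is only a minimal sketch of how text-conditioned diffusion sampling for a time series might look in general. Everything in it is an assumption made for illustration: `embed_text` is a hypothetical stand-in for a pre-trained language model encoder, `toy_denoiser` is an untrained placeholder for a learned noise-prediction network, and the linear schedule and standard DDPM ancestral sampling step are generic choices rather than the authors' method.

```python
import hashlib
import numpy as np


def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.02):
    """Linear DDPM noise schedule (an assumed, generic choice)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars


def embed_text(text, dim=16):
    """Hypothetical stand-in for a language model encoder.

    Maps a description to a fixed pseudo-random vector derived from its
    SHA-256 hash. A real system would use a pre-trained text encoder.
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    seed = int.from_bytes(digest[:8], "little")
    return np.random.default_rng(seed).standard_normal(dim)


def toy_denoiser(x_t, t_frac, cond, w_x, w_c):
    """Untrained placeholder for a noise-prediction network eps(x_t, t, text)."""
    return np.tanh(x_t @ w_x + cond @ w_c + t_frac)


def sample(text, length=32, num_steps=50, seed=0):
    """Run DDPM ancestral sampling conditioned on a text embedding."""
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)
    cond = embed_text(text)

    # Random weights stand in for trained parameters (assumption).
    w_x = rng.standard_normal((length, length)) * 0.1
    w_c = rng.standard_normal((cond.size, length)) * 0.1

    x = rng.standard_normal(length)  # start from pure Gaussian noise
    for t in reversed(range(num_steps)):
        eps_hat = toy_denoiser(x, t / num_steps, cond, w_x, w_c)
        # Posterior mean of the reverse step given the predicted noise.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        # No noise is added at the final step.
        noise = rng.standard_normal(length) if t > 0 else np.zeros(length)
        x = mean + np.sqrt(betas[t]) * noise
    return x


if __name__ == "__main__":
    series = sample("a steady upward trend with weekly seasonality")
    print(series.shape)
```

Because the denoiser here is untrained, the output is not a meaningful time series; the sketch only shows where the text embedding enters the reverse diffusion loop. In a trained system the description would steer each denoising step through the learned network.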