Towards Time Series Generation Conditioned on Unstructured Natural Language

📅 2025-06-28

📈 Citations: 0

✨ Influential: 0

career value

171K/year

🤖 AI Summary

This paper addresses the novel problem of natural language–driven time-series generation. We propose the first diffusion-based time-series generation framework conditioned on unstructured text. Methodologically, it integrates a pre-trained language model with a conditional diffusion model, where text embeddings dynamically guide time-series modeling to ensure semantic fidelity and controllability. Our key contributions are threefold: (1) establishing the “language-to-time-series” generation paradigm; (2) empirically validating feasibility across diverse domains—including finance and climate—enabling customized forecasting, data augmentation, and cross-domain transfer; and (3) releasing the largest publicly available text–time-series paired dataset to date (63,010 instances). All code and data are open-sourced to advance foundational research and practical applications of generative AI in time-series analysis.

Technology Category

Application Category

📝 Abstract

Generative Artificial Intelligence (AI) has rapidly become a powerful tool, capable of generating various types of data, such as images and text. However, despite the significant advancement of generative AI, time series generative AI remains underdeveloped, even though the application of time series is essential in finance, climate, and numerous fields. In this research, we propose a novel method of generating time series conditioned on unstructured natural language descriptions. We use a diffusion model combined with a language model to generate time series from the text. Through the proposed method, we demonstrate that time series generation based on natural language is possible. The proposed method can provide various applications such as custom forecasting, time series manipulation, data augmentation, and transfer learning. Furthermore, we construct and propose a new public dataset for time series generation, consisting of 63,010 time series-description pairs.

Problem

Research questions and friction points this paper is trying to address.

Generate time series from unstructured natural language descriptions

Develop time series generative AI using diffusion and language models

Create a public dataset for time series generation research

Innovation

Methods, ideas, or system contributions that make the work stand out.

Diffusion model with language model

Generates time series from text

New public dataset provided

🔎 Similar Papers

No similar papers found.