🤖 AI Summary
Existing LLM-based approaches to data science focus narrowly on isolated subtasks and neglect inter-stage dependencies, so they fail to support end-to-end workflows. Method: This paper proposes a Jupyter Notebook-centric LLM agent framework that unifies interaction among the user, the agent, and the computational environment through Markdown cells and executable code cells. A finite-state transducer (FST) orchestrates four phases: depth-first search (DFS)-like task planning, incremental code execution, self-debugging, and post-filtering of results. Together these phases provide closed-loop feedback and autonomous error recovery. Contribution/Results: Across data analysis, visualization, and modeling tasks, the framework matches or surpasses state-of-the-art methods, which suggests that it generalizes across scenarios and that fully automated, end-to-end data science workflows are feasible.
📝 Abstract
Data science tasks are multifaceted, dynamic, and often domain-specific. Existing LLM-based approaches largely concentrate on isolated phases, neglecting the interdependent nature of many data science tasks and limiting their capacity for comprehensive end-to-end support. We propose DatawiseAgent, a notebook-centric LLM agent framework that unifies interactions among the user, the agent, and the computational environment through markdown and executable code cells, supporting flexible and adaptive automated data science. Built on a Finite State Transducer (FST), DatawiseAgent orchestrates four stages: DFS-like planning, incremental execution, self-debugging, and post-filtering. Specifically, the DFS-like planning stage systematically explores the solution space, while incremental execution harnesses real-time feedback and accommodates LLMs' limited capabilities to progressively complete tasks. The self-debugging and post-filtering modules further enhance reliability by diagnosing and correcting errors and pruning extraneous information. Extensive experiments on diverse tasks, including data analysis, visualization, and data modeling, show that DatawiseAgent consistently outperforms or matches state-of-the-art methods across multiple model settings. These results highlight its potential to generalize across data science scenarios and lay the groundwork for more efficient, fully automated workflows.
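To make the FST-based orchestration concrete, the following is a minimal sketch of a finite-state transducer that moves between the four stages named in the abstract. The abstract does not specify the transition table, so the signal names (`subtask_ready`, `error`, `cleaned`, and so on), the output actions, and the `run_fst` helper are all hypothetical and chosen only for illustration. This is not DatawiseAgent's actual implementation.

```python
from enum import Enum


class Stage(Enum):
    """The four stages from the abstract, plus a terminal state (assumed)."""
    PLAN = "dfs_like_planning"
    EXECUTE = "incremental_execution"
    DEBUG = "self_debugging"
    FILTER = "post_filtering"
    DONE = "done"


# Hypothetical transition table: (state, input signal) -> (next state, output action).
# An FST differs from a plain state machine in that each transition also emits an output.
TRANSITIONS = {
    (Stage.PLAN, "subtask_ready"): (Stage.EXECUTE, "emit_markdown_plan"),
    (Stage.PLAN, "task_complete"): (Stage.DONE, "emit_final_answer"),
    (Stage.EXECUTE, "success"): (Stage.PLAN, "keep_code_cell"),
    (Stage.EXECUTE, "error"): (Stage.DEBUG, "record_traceback"),
    (Stage.DEBUG, "error"): (Stage.DEBUG, "emit_fix_attempt"),
    (Stage.DEBUG, "success"): (Stage.FILTER, "accept_fix"),
    (Stage.FILTER, "cleaned"): (Stage.PLAN, "prune_failed_cells"),
}


def run_fst(signals, start=Stage.PLAN):
    """Consume input signals and return the final state and the emitted actions."""
    state = start
    outputs = []
    for signal in signals:
        key = (state, signal)
        if key not in TRANSITIONS:
            raise ValueError(f"no transition from {state.name} on signal {signal!r}")
        state, action = TRANSITIONS[key]
        outputs.append(action)
    return state, outputs


if __name__ == "__main__":
    # One subtask that fails, is debugged over two attempts, is cleaned up, then finishes.
    final_state, actions = run_fst(
        ["subtask_ready", "error", "error", "success", "cleaned", "task_complete"]
    )
    print(final_state.name, actions)
```

In this sketch, a failed execution routes through self-debugging and post-filtering before control returns to planning, which mirrors the closed loop the abstract describes. In a real agent, the signals would presumably come from LLM outputs and notebook execution results rather than from a fixed list.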