VirnyFlow: A Design Space for Responsible Model Development

📅 2025-06-02
🤖 AI Summary
Real-world AutoML problems are multi-objective, yet existing tools tend to automate them as a black box, with little accountability and limited room for customization. This paper introduces VirnyFlow, which the authors describe as the first design space for responsible ML model development. It lets data scientists state optimization objectives explicitly, configure experiments across pipeline stages, and iteratively refine pipelines in line with real-world constraints. Technically, it integrates multi-objective Bayesian optimization, cost-aware multi-armed bandits, query optimization, and a distributed parallel architecture. On five real-world benchmarks, the system significantly outperforms state-of-the-art AutoML methods in both optimization quality (measured by Pareto front coverage) and scalability (concurrent scheduling of over one thousand components). The authors position it as a responsible, interpretable, and customizable approach to ML pipeline development.

📝 Abstract
Developing machine learning (ML) models requires a deep understanding of real-world problems, which are inherently multi-objective. In this paper, we present VirnyFlow, the first design space for responsible model development, designed to assist data scientists in building ML pipelines that are tailored to the specific context of their problem. Unlike conventional AutoML frameworks, VirnyFlow enables users to define customized optimization criteria, perform comprehensive experimentation across pipeline stages, and iteratively refine models in alignment with real-world constraints. Our system integrates evaluation protocol definition, multi-objective Bayesian optimization, cost-aware multi-armed bandits, query optimization, and distributed parallelism into a unified architecture. We show that VirnyFlow significantly outperforms state-of-the-art AutoML systems in both optimization quality and scalability across five real-world benchmarks, offering a flexible, efficient, and responsible alternative to black-box automation in ML development.

Problem

Research questions and friction points this paper is trying to address.

Designing responsible ML models for multi-objective real-world problems
Enabling customized optimization criteria in ML pipeline development
Outperforming AutoML systems in optimization quality and scalability
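
Optimization quality in a multi-objective setting is usually judged against the Pareto front: the set of candidates that no other candidate beats on every objective at once. The sketch below is a minimal illustration of that idea, not VirnyFlow's implementation. The function names and the (accuracy, fairness) scores are hypothetical, and both objectives are assumed to be maximized.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is at least as good as `b` on every objective
    and strictly better on at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (accuracy, fairness) scores for four candidate pipelines.
candidates = [(0.90, 0.60), (0.85, 0.80), (0.80, 0.70), (0.70, 0.95)]
front = pareto_front(candidates)
print(front)
```

Here the candidate (0.80, 0.70) is dropped because (0.85, 0.80) beats it on both objectives; the remaining three represent genuine trade-offs that a data scientist would have to choose between.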

Innovation

Methods, ideas, or system contributions that make the work stand out.

Customized optimization criteria for ML pipelines
Multi-objective Bayesian optimization integration
Cost-aware multi-armed bandits for efficiency
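
To make the idea of a cost-aware multi-armed bandit concrete, here is a small sketch of one simple heuristic: a UCB-style rule that ranks each arm by its observed reward per unit cost plus an exploration bonus, and stops when a budget is spent. This is an assumption for illustration only. The summary does not say which bandit algorithm VirnyFlow uses, and `cost_aware_ucb`, `fixed_pull`, and the arm values are all hypothetical.

```python
import math
from typing import Callable, List, Tuple


def cost_aware_ucb(
    pull: Callable[[int], Tuple[float, float]],
    n_arms: int,
    budget: float,
) -> List[int]:
    """Spend `budget` across arms, favouring high reward per unit cost.

    `pull(arm)` must return (reward, cost) with cost > 0.
    Returns how many times each arm was pulled.
    """
    counts = [0] * n_arms
    reward_sum = [0.0] * n_arms
    cost_sum = [0.0] * n_arms
    spent = 0.0
    t = 0

    def score(arm: int) -> float:
        # Empirical reward-per-cost ratio plus a standard UCB exploration bonus.
        ratio = reward_sum[arm] / cost_sum[arm]
        bonus = math.sqrt(2.0 * math.log(t) / counts[arm])
        return ratio + bonus

    while spent < budget:
        t += 1
        untried = [a for a in range(n_arms) if counts[a] == 0]
        # Try every arm once first, then follow the cost-aware score.
        arm = untried[0] if untried else max(range(n_arms), key=score)
        reward, cost = pull(arm)
        counts[arm] += 1
        reward_sum[arm] += reward
        cost_sum[arm] += cost
        spent += cost
    return counts


# Hypothetical arms: (reward, cost) per pull, kept deterministic for clarity.
ARMS = [(0.5, 1.0), (0.6, 5.0), (0.2, 1.0)]


def fixed_pull(arm: int) -> Tuple[float, float]:
    return ARMS[arm]


counts = cost_aware_ucb(fixed_pull, n_arms=len(ARMS), budget=200.0)
print(counts)
```

Arm 1 has the highest raw reward but is five times as expensive, so a rule that looks at reward per unit cost ends up pulling arm 0 at least as often as any other arm. In a pipeline-search setting, an "arm" could stand for a candidate pipeline configuration and the cost for its training time, though that mapping is likewise an assumption here.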
Denys Herasymuk
Ukrainian Catholic University, Lviv, Ukraine
Nazar Protsiv
Ukrainian Catholic University, Lviv, Ukraine
Julia Stoyanovich
New York University
responsible AI, data management, algorithmic ranking, AI ethics, AI policy