🤖 AI Summary
In highly regulated domains such as finance, data quality control (QC) is often fragmented into isolated preprocessing steps, undermining end-to-end trustworthy AI pipelines. To address this, we propose the first AI-driven DataOps framework that embeds QC as a system-level core component. Our framework deeply integrates rule-based engines, statistical analysis, and custom AI-powered anomaly detection across the entire data lifecycle, from ingestion and transformation to model deployment, enabling dynamic remediation, policy-configurable workflows, and end-to-end auditability. Technically, it unifies data profiling, stream processing, cloud-native storage interfaces, and a proprietary AI detection module. Evaluated in a real-world financial production environment, the framework achieves significantly higher anomaly-detection recall, reduces manual intervention by 42%, and ensures audit completeness and full data traceability under high-throughput conditions, fully satisfying regulatory compliance requirements.
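To make the layered QC idea concrete, here is a minimal sketch of how a rule-based check and a statistical check might be composed into a single gate. This is an illustrative assumption, not the framework's actual API: the function names, bounds, and the z-score threshold are ours.

```python
import statistics

def rule_check(price: float, lo: float = 0.0, hi: float = 1e6) -> bool:
    """Rule-based layer: reject values outside configured hard bounds."""
    return lo <= price <= hi

def zscore_check(price: float, history: list[float], max_z: float = 4.0) -> bool:
    """Statistical layer: flag values far from the recent distribution."""
    if len(history) < 2:
        return True  # not enough data to judge yet
    mu = statistics.fmean(history)
    sigma = statistics.stdev(history)
    if sigma == 0:
        return price == mu
    return abs(price - mu) / sigma <= max_z

def qc_gate(price: float, history: list[float]) -> bool:
    """A record passes QC only if every layer accepts it."""
    return rule_check(price) and zscore_check(price, history)

history = [100.0, 101.2, 99.8, 100.5, 100.1]
print(qc_gate(100.3, history))  # in-distribution value passes
print(qc_gate(250.0, history))  # statistical outlier is flagged
```

In the framework described above, an AI-based detector would sit alongside these two layers as a third check; composing layers with a strict AND is one design choice, voting or scoring being alternatives.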
📝 Abstract
In regulated domains such as finance, the integrity and governance of data pipelines are critical, yet existing systems treat data quality control (QC) as an isolated preprocessing step rather than as a first-class system component. We present a unified AI-driven Data QC and DataOps Management framework that embeds rule-based, statistical, and AI-based QC methods into a continuous, governed layer spanning ingestion, model pipelines, and downstream applications. Our architecture integrates open-source tools with custom modules for profiling, audit logging, breach handling, configuration-driven policies, and dynamic remediation. We demonstrate deployment in a production-grade financial setting: handling streaming and tabular data across multiple asset classes and transaction streams, with configurable thresholds, cloud-native storage interfaces, and automated alerts. We show empirical gains in anomaly-detection recall, reduced manual remediation effort, and improved auditability and traceability in high-throughput data workflows. By treating QC as a system concern rather than an afterthought, our framework provides a foundation for trustworthy, scalable, and compliant AI pipelines in regulated environments.
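The abstract's "configuration-driven policies" with breach handling and audit logging can be sketched roughly as follows. The policy schema, field names, and actions here are assumptions for illustration only; the paper's actual configuration format is not specified in this summary.

```python
import json
import datetime

# Assumed policy schema: one profiled metric, a threshold, and an action
# to take on breach. In the framework, such policies would be externally
# configured rather than hard-coded.
POLICY = json.loads("""
{
  "metric": "null_ratio",
  "threshold": 0.05,
  "action": "quarantine"
}
""")

AUDIT_LOG: list[dict] = []  # append-only trail for traceability

def evaluate(batch_id: str, null_ratio: float, policy: dict = POLICY) -> str:
    """Compare a profiled metric against its policy and log the outcome."""
    breached = null_ratio > policy["threshold"]
    outcome = policy["action"] if breached else "pass"
    AUDIT_LOG.append({
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "batch": batch_id,
        "metric": policy["metric"],
        "value": null_ratio,
        "threshold": policy["threshold"],
        "outcome": outcome,
    })
    return outcome

print(evaluate("trades-2024-01-02", 0.01))  # → pass
print(evaluate("trades-2024-01-03", 0.12))  # → quarantine
```

Because every evaluation, passing or breaching, is appended to the audit trail with its timestamp and threshold, the log alone can later reconstruct why a batch was quarantined, which is the kind of end-to-end auditability the abstract claims.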