🤖 AI Summary
To address the trade-off between high computational overhead and classification accuracy in real-time multi-class data stream classification, this paper proposes a dynamic model chain construction framework. Our approach centers on a utility-driven safe chaining mechanism that integrates predicate-ranking principles for class-aware model selection and safe inference skipping. It further incorporates dynamic confidence updating, early-exit deep neural networks, stacked ensemble learning, and cost-sensitive sequential optimization to support dependency-aware model architectures. Extensive experiments on multiple benchmark streaming datasets demonstrate that our method achieves classification accuracy comparable to the best single-model baseline while reducing average computational cost by up to 62%. It significantly outperforms both static model deployment and naive cascading strategies. The proposed framework thus provides a scalable, high-quality solution for real-time stream classification under resource-constrained environments.
📝 Abstract
NOMAD (Navigating Optimal Model Application for Datastreams) is an intelligent framework for data enrichment during ingestion that optimizes real-time multi-class classification by dynamically constructing model chains, i.e., sequences of machine learning models with varying cost-quality trade-offs, selected via a utility-based criterion. Inspired by predicate-ordering techniques from database query processing, NOMAD leverages cheaper models as initial filters, proceeding to more expensive models only when necessary, while a formal chain-safety mechanism guarantees that classification quality remains comparable to that of a designated role model. It employs a dynamic belief-update strategy to adapt model selection based on per-event predictions and shifting data distributions, and it extends to scenarios with dependent models such as early-exit DNNs and stacking ensembles. Evaluation across multiple datasets demonstrates that NOMAD achieves significant computational savings over static and naive approaches while maintaining classification quality comparable to that of the most accurate (and often the most expensive) model.
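The core idea of chained inference described above can be illustrated with a minimal sketch: models are tried cheapest-first, and a more expensive model is invoked only when the cheaper one is not confident enough. Note that the function name `chain_predict`, the fixed confidence threshold, and the toy models are illustrative assumptions; NOMAD's actual utility-based selection, belief updates, and chain-safety guarantee are more involved than this.

```python
from typing import Callable, Dict, List, Tuple

# A "model" here is any callable mapping a feature vector to class probabilities.
Model = Callable[[list], Dict[str, float]]

def chain_predict(x: list, chain: List[Tuple[Model, float]],
                  threshold: float = 0.9) -> Tuple[str, float]:
    """Run models cheapest-first; stop as soon as one is confident enough.

    `chain` is a list of (model, cost) pairs, assumed sorted by cost.
    Returns (predicted_class, total_cost_incurred).
    """
    total_cost = 0.0
    top_class = None
    for model, cost in chain:
        total_cost += cost
        probs = model(x)
        top_class, top_p = max(probs.items(), key=lambda kv: kv[1])
        if top_p >= threshold:
            # Cheap model is confident: safely skip the remaining models.
            return top_class, total_cost
    # No model cleared the threshold: fall back to the last (best) model's answer.
    return top_class, total_cost

# Toy stand-ins: a cheap, less certain model and an expensive, confident one.
cheap = lambda x: {"a": 0.6, "b": 0.4}
expensive = lambda x: {"a": 0.95, "b": 0.05}

# The cheap model's 0.6 confidence is below 0.9, so inference escalates.
label, cost = chain_predict([1.0], [(cheap, 1.0), (expensive, 10.0)])
```

In this toy run the chain pays both models' costs; with a lower threshold (or a more confident cheap model) it would stop after the first, which is the source of the computational savings the paper reports.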