Use ADAS Data to Predict Near-Miss Events: A Group-Based Zero-Inflated Poisson Approach

📅 2025-08-31
🤖 AI Summary
Near-miss events (NMEs) exhibit extreme sparsity and zero-inflation, undermining the reliability of conventional count models. Method: We propose a Grouped Zero-Inflated Poisson (GZIP) model estimated via the EM algorithm, integrating ADAS warnings with multi-source in-vehicle sensor data to enable interpretable weekly driving risk prediction. The model automatically identifies heterogeneous driver subgroups, incorporates an offset term to account for exposure variation, and enhances contextual awareness through multi-sensor feature fusion. Contribution/Results: Evaluated on naturalistic driving data from 354 commercial drivers, GZIP significantly outperforms baseline models, yielding lower AIC/BIC, superior out-of-sample calibration, and robustness to misspecification of the number of latent groups. This provides a reliable, interpretable modeling framework for dynamic risk pricing and personalized safety interventions.

๐Ÿ“ Abstract
Driving behavior big data leverages multi-sensor telematics to understand how people drive and powers applications such as risk evaluation, insurance pricing, and targeted intervention. Usage-based insurance (UBI) built on these data has become mainstream. Telematics-captured near-miss events (NMEs) provide a timely alternative to claim-based risk, but weekly NMEs are sparse, highly zero-inflated, and behaviorally heterogeneous even after exposure normalization. Analyzing multi-sensor telematics and ADAS warnings, we show that the traditional statistical models underfit the dataset. We address these challenges by proposing a set of zero-inflated Poisson (ZIP) frameworks that learn latent behavior groups and fit offset-based count models via EM to yield calibrated, interpretable weekly risk predictions. Using a naturalistic dataset from a fleet of 354 commercial drivers over a year, during which the drivers completed 287,511 trips and logged 8,142,896 km in total, our results show consistent improvements over baselines and prior telematics models, with lower AIC/BIC values in-sample and better calibration out-of-sample. We also conducted sensitivity analyses on the EM-based grouping for the number of clusters, finding that the gains were robust and interpretable. Practically, this supports context-aware ratemaking on a weekly basis and fairer premiums by recognizing heterogeneous driving styles.
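
To make the "offset-based" zero-inflated Poisson idea concrete, here is a minimal sketch of a ZIP log-probability in which weekly exposure enters as an offset, so that log(lambda) = log(exposure) + log(rate). The function name `zip_log_pmf` and the parameters `pi`, `rate`, and `exposure` are illustrative assumptions; the abstract does not give the paper's exact parameterization or covariates.

```python
import math


def zip_log_pmf(y, pi, rate, exposure):
    """Log-probability of a weekly count y under a zero-inflated Poisson.

    pi       : probability of a structural (excess) zero, 0 <= pi < 1
    rate     : expected events per unit of exposure (assumed: per km), > 0
    exposure : exposure in that week (assumed: km driven), > 0

    Exposure acts as an offset: log(lam) = log(exposure) + log(rate).
    """
    lam = exposure * rate
    if y == 0:
        # A zero is either structural or a Poisson zero.
        return math.log(pi + (1.0 - pi) * math.exp(-lam))
    # A positive count can only come from the Poisson component.
    log_poisson = -lam + y * math.log(lam) - math.lgamma(y + 1)
    return math.log1p(-pi) + log_poisson


# Hypothetical week: 500 km driven, 0.002 events per km, 60% structural zeros.
for count in range(4):
    print(count, math.exp(zip_log_pmf(count, 0.6, 0.002, 500.0)))
```

Because exposure multiplies the rate, two weeks with the same driving style but different distance get different expected counts, which is what lets the model compare drivers after exposure normalization.
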
Problem

Research questions and friction points this paper is trying to address.

Predicting sparse near-miss events from ADAS telematics data
Addressing zero-inflated and behaviorally heterogeneous driving data
Improving risk prediction models for usage-based insurance pricing
Innovation

Methods, ideas, or system contributions that make the work stand out.

Group-based zero-inflated Poisson model
EM algorithm for latent behavior grouping
ADAS multi-sensor telematics data analysis
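
The combination of latent behavior groups and EM estimation listed above can be sketched as a small mixture model. The code below is an illustration only, not the authors' algorithm: it assumes each group has a single zero-inflation probability and a single per-exposure rate with no covariates, and the names `fit_grouped_zip`, `drivers`, and `n_groups` are hypothetical.

```python
import math
import random


def zip_pmf(y, pi, lam):
    """Zero-inflated Poisson probability of count y with mean parameter lam."""
    poisson = math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))
    return (pi if y == 0 else 0.0) + (1.0 - pi) * poisson


def fit_grouped_zip(drivers, n_groups, n_iter=200, seed=0):
    """EM for a mixture of ZIP groups (sketch; intercept-only rates assumed).

    drivers: list with one entry per driver, each a list of weekly
             (count, exposure) pairs.
    Returns group weights, zero-inflation probabilities, per-exposure rates,
    and per-driver group responsibilities.
    """
    rng = random.Random(seed)
    total_y = sum(y for d in drivers for y, _ in d)
    total_e = sum(e for d in drivers for _, e in d)
    w = [1.0 / n_groups] * n_groups
    pi = [0.5] * n_groups
    # Perturb the pooled rate so the groups start from different values.
    mu = [total_y / total_e * rng.uniform(0.5, 2.0) for _ in range(n_groups)]
    resp = []
    for _ in range(n_iter):
        # E-step (groups): posterior probability of each group per driver.
        resp = []
        for d in drivers:
            logp = [math.log(w[k]) + sum(math.log(zip_pmf(y, pi[k], e * mu[k]))
                                         for y, e in d)
                    for k in range(n_groups)]
            top = max(logp)
            unnorm = [math.exp(v - top) for v in logp]
            norm = sum(unnorm)
            resp.append([u / norm for u in unnorm])
        # E-step (structural zeros) and M-step, one group at a time.
        for k in range(n_groups):
            z_sum = n_sum = y_sum = e_sum = 0.0
            for d, r in zip(drivers, resp):
                for y, e in d:
                    # Posterior probability that an observed zero is structural.
                    z = 0.0
                    if y == 0:
                        z = pi[k] / (pi[k] + (1.0 - pi[k]) * math.exp(-e * mu[k]))
                    z_sum += r[k] * z
                    n_sum += r[k]
                    y_sum += r[k] * (1.0 - z) * y
                    e_sum += r[k] * (1.0 - z) * e
            w[k] = max(sum(r[k] for r in resp) / len(drivers), 1e-12)
            if n_sum > 1e-12:  # leave an empty group's parameters unchanged
                pi[k] = min(max(z_sum / n_sum, 1e-6), 1.0 - 1e-6)
                mu[k] = max(y_sum / e_sum, 1e-12)
    return w, pi, mu, resp


# Made-up data: ten low-risk and ten high-risk drivers, 20 weeks of 100 km each.
low = [[(0, 100.0), (0, 100.0), (0, 100.0), (1, 100.0)] * 5 for _ in range(10)]
high = [[(4, 100.0), (6, 100.0), (5, 100.0), (5, 100.0)] * 5 for _ in range(10)]
weights, zero_probs, rates, responsibilities = fit_grouped_zip(low + high, 2)
print(weights, zero_probs, rates)
```

Assigning whole drivers (rather than single weeks) to a group is a design choice made here so that a group reads as a driving style; the number of groups is fixed by the caller, which is the quantity the paper's sensitivity analysis varies.
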
Xinbo Zhang
ByteDance
Montserrat Guillen
ICREA Academia & Full Professor (University of Barcelona) ORCID 0000-0002-2644-6268
risk, insurance, microeconometrics, longevity, actuarial science
Lishuai Li
Department of Data Science, City University of Hong Kong, Hong Kong SAR, China
Xin Li
Department of Information Systems, City University of Hong Kong, Hong Kong SAR, China
Frank Youhua Chen
Department of Decision Analytics and Operations, City University of Hong Kong, Hong Kong SAR, China