🤖 AI Summary
This study addresses a limitation of traditional extreme value methods: when modeling heavy-tailed data via block maxima or threshold exceedances, they discard the bulk of the observations and so struggle to capture the overall distribution and the tail simultaneously. To overcome this, the authors propose a Bayesian nonparametric mixture model that unifies the modeling of the bulk (below a threshold) and the tail (above a threshold) through a two-building-block structure based on a four-parameter shifted gamma–gamma kernel with a normalized stable process as the mixing distribution. Posterior inference is carried out via an MCMC algorithm with adaptive Metropolis–Hastings steps; the posterior of the tail parameter quantifies the proportion of the data supporting a heavy-tailed component, so bulk and tail information are integrated coherently. Experiments on simulated and real-world datasets show that the proposed approach improves the accuracy of heavy-tailed data modeling while using all of the available data.
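The summary mentions adaptive Metropolis–Hastings steps. As a rough illustration of the general idea (not the authors' specific sampler), here is a minimal random-walk MH sampler whose proposal scale is tuned toward a target acceptance rate; the adaptation rule, target rate, and `log_post` example are generic assumptions for the sketch:

```python
import numpy as np

def adaptive_mh(log_post, x0, n_iter=5000, adapt_every=100, target=0.44):
    """Random-walk Metropolis-Hastings with a simple adaptive step size.

    Every `adapt_every` iterations the proposal scale is nudged so the
    empirical acceptance rate moves toward `target`. This is a generic
    adaptation scheme, not the paper's exact algorithm; in practice the
    adaptation phase is discarded as burn-in.
    """
    rng = np.random.default_rng(0)
    x, scale = x0, 1.0
    samples = np.empty(n_iter)
    accepted = 0
    for i in range(n_iter):
        prop = x + scale * rng.standard_normal()
        # standard MH accept/reject on the log scale
        if np.log(rng.uniform()) < log_post(prop) - log_post(x):
            x = prop
            accepted += 1
        samples[i] = x
        if (i + 1) % adapt_every == 0:
            rate = accepted / (i + 1)
            scale *= np.exp(rate - target)  # shrink/grow toward target rate
    return samples

# Usage: sample from a standard normal "posterior"
draws = adaptive_mh(lambda x: -0.5 * x**2, x0=0.0)
```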
📝 Abstract
Several models have been introduced for the study of heavy-tailed data. If the interest is in the tail of the distribution, block maxima or exceedances over a threshold are the typical approaches, wasting relevant information in the bulk of the data. To avoid this, mixture models with two building blocks, one for the body (below the threshold) and one for the tail (above the threshold), have been proposed. In this paper, we exploit the richness of nonparametric mixture models to model heavy-tailed data. We specifically consider mixtures of four-parameter shifted gamma-gamma distributions with a normalised stable process as the mixing distribution. One of these parameters is associated with the tail. By studying the posterior distribution of the tail parameter, we are able to estimate the proportion of the data that supports a heavy-tail component. We develop an efficient MCMC method with adaptive Metropolis-Hastings steps to obtain posterior inference and illustrate it with simulated and real datasets.
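The abstract's gamma-gamma kernel can be understood through the classical compound-gamma construction, in which a gamma variate with a gamma-distributed rate has a Pareto-type right tail. The sketch below uses that construction with an added location shift; the parameter names (`a`, `b`, `scale`, `shift`) are an illustrative four-parameter layout, not necessarily the authors' exact parameterisation:

```python
import numpy as np

rng = np.random.default_rng(1)

def shifted_gamma_gamma(a, b, scale, shift, size):
    """Draw from a shifted compound-gamma (gamma-gamma) distribution.

    X | lam ~ Gamma(a, rate=lam) with lam ~ Gamma(b, rate=scale);
    marginally X - shift has a heavy (Pareto-type) right tail whose
    index is governed by b, the parameter "associated with the tail".
    Illustrative parameterisation only, assumed for this sketch.
    """
    lam = rng.gamma(b, 1.0 / scale, size)  # gamma-distributed rates
    x = rng.gamma(a, 1.0 / lam)            # gamma draw given each rate
    return shift + x

draws = shifted_gamma_gamma(a=2.0, b=1.5, scale=1.0, shift=0.5, size=100_000)
```

With a small tail parameter `b` the marginal distribution has infinite higher moments, which is what lets one mixture component account for heavy-tailed observations while the remaining components fit the bulk.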