🤖 AI Summary
This study addresses the problem of estimating lead-lag relationships between asynchronously observed point processes, such as the order arrival timestamps found in high-frequency financial data. The problem is reframed as estimating the shape of the cross-pair correlation function (CPCF) of a bivariate stationary point process, with the lead-lag time defined as the location of the CPCF's sharpest peak. Whereas the earlier cross-count measure of Dobrev and Schaumburg relies on discrete-time observations in its original formulation, the authors work in continuous time and propose a lead-lag time estimator based on kernel density estimation. The authors show that the estimator has desirable theoretical properties and superior numerical performance, and they report empirical evidence from high-frequency financial data supporting its effectiveness.
📝 Abstract
This paper introduces a new theoretical framework for analyzing lead-lag relationships between point processes, with a special focus on applications to high-frequency financial data. In particular, we are interested in lead-lag relationships between two sequences of order arrival timestamps. The seminal work of Dobrev and Schaumburg proposed model-free measures of cross-market trading activity based on cross-counts of timestamps. While their method is known to yield reliable results, its original formulation inherently relies on discrete-time observations, a limitation we address in this study. Specifically, we formulate the problem of estimating lead-lag relationships between two point processes as that of estimating the shape of the cross-pair correlation function (CPCF) of a bivariate stationary point process, a quantity that is well studied in the neuroscience and spatial statistics literature. Within this framework, the prevailing lead-lag time is defined as the location of the CPCF's sharpest peak. Under this interpretation, the peak location in Dobrev and Schaumburg's cross-market activity measure can be viewed as an estimator of the lead-lag time in this sense. We further propose an alternative lead-lag time estimator based on kernel density estimation and show that it possesses desirable theoretical properties and delivers superior numerical performance. Empirical evidence from high-frequency financial data demonstrates the effectiveness of our proposed method.
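To make the idea of locating the CPCF peak concrete, the following is a minimal Python sketch of a generic kernel-smoothed CPCF estimate and the resulting lead-lag time. It is not the estimator proposed in the paper: the function names, the Epanechnikov kernel, the fixed bandwidth `h`, the absence of edge correction, the use of a plain grid argmax as a stand-in for the "sharpest peak", and the sign convention (differences are taken as `t2 - t1`, so a positive peak location means the first process leads) are all assumptions made for illustration.

```python
import numpy as np


def cpcf_kernel_estimate(t1, t2, T, grid, h):
    """Kernel-smoothed estimate of the cross-pair correlation function.

    Illustrative sketch only (hypothetical helper, no edge correction).
    t1, t2 : event timestamps observed on [0, T]
    grid   : increasing array of lags u at which the CPCF is evaluated
    h      : kernel bandwidth (assumed fixed and chosen by the user)
    """
    t1 = np.sort(np.asarray(t1, dtype=float))
    t2 = np.sort(np.asarray(t2, dtype=float))
    grid = np.asarray(grid, dtype=float)
    lam1, lam2 = len(t1) / T, len(t2) / T  # empirical intensities

    # Collect timestamp differences t2_j - t1_i that can reach the grid
    # through a kernel supported on [-h, h].
    lo, hi = grid[0] - h, grid[-1] + h
    diffs = []
    for s in t1:
        a = np.searchsorted(t2, s + lo, side="left")
        b = np.searchsorted(t2, s + hi, side="right")
        diffs.append(t2[a:b] - s)
    d = np.concatenate(diffs)

    # Epanechnikov kernel: K(z) = 0.75 * (1 - z^2) on [-1, 1], zero elsewhere.
    z = (grid[:, None] - d[None, :]) / h
    k = 0.75 * np.clip(1.0 - z**2, 0.0, None)

    # Normalise so that independent processes give values near 1.
    return k.sum(axis=1) / (h * T * lam1 * lam2)


def lead_lag_time(t1, t2, T, grid, h):
    """Lag at which the estimated CPCF is largest on the grid."""
    g = cpcf_kernel_estimate(t1, t2, T, grid, h)
    return float(np.asarray(grid)[np.argmax(g)])


# Synthetic example: process 2 echoes process 1 about 0.5 time units later,
# plus independent background events.
rng = np.random.default_rng(0)
T = 1000.0
n = rng.poisson(T * 1.0)
t1 = np.sort(rng.uniform(0.0, T, n))
echo = t1 + 0.5 + rng.normal(0.0, 0.01, n)
background = rng.uniform(0.0, T, rng.poisson(T * 0.5))
t2 = np.sort(np.concatenate([echo, background]))

grid = np.linspace(-2.0, 2.0, 401)
g_hat = cpcf_kernel_estimate(t1, t2, T, grid, h=0.05)
theta_hat = lead_lag_time(t1, t2, T, grid, h=0.05)
print(f"estimated lead-lag time: {theta_hat:.2f}")
```

Because the timestamps enter only through pairwise differences, no resampling onto a common discrete time grid is needed, which is the practical appeal of a continuous-time formulation. In practice the bandwidth and the lag range would need to be chosen with care, and the memory cost of the dense `grid x differences` array would have to be controlled for large samples.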