Finding Complex Patterns in Trajectory Data via Geometric Set Cover

📅 2023-08-28
🏛️ arXiv.org
📈 Citations: 1
Influential: 1
🤖 AI Summary
This paper addresses the problem of compactly representing large-scale trajectory data (e.g., GPS traces): given input trajectories of total complexity *n*, select the minimum number of representative polygonal curves, each of complexity at most *l*, such that every point of the input lies on a subtrajectory of the input whose Fréchet distance to some representative curve is at most Δ. The approach builds on the geometric set cover framework of Brüning et al. (2022), whose representatives are always line segments, and extends it to multi-segment polyline representatives. It returns *O(k log n)* curves, compared with *O(kl log(kl))* in the earlier work, where *k* is the optimal cover size. The coverage radius guarantee remains 11Δ, and the running time is Õ(*l²n⁴ + kln⁴*), compared with Õ(*(kl)²n + kln³*) before. Experiments on ocean current tracking data and full-body motion data suggest that the method is a valid tool for analyzing large spatio-temporal data sets.
📝 Abstract
Clustering trajectories is a central challenge when faced with large amounts of movement data such as GPS data. We study a clustering problem that can be stated as a geometric set cover problem: Given a polygonal curve of complexity $n$, find the smallest number $k$ of representative trajectories of complexity at most $l$ such that any point on the input trajectories lies on a subtrajectory of the input that has Fréchet distance at most $\Delta$ to one of the representative trajectories. In previous work, Brüning et al. (2022) developed a bicriteria approximation algorithm that returns a set of curves of size $O(kl\log(kl))$ which covers the input with a radius of $11\Delta$ in time $\widetilde{O}((kl)^2n + kln^3)$, where $k$ is the smallest number of curves of complexity $l$ needed to cover the input with a radius of $\Delta$. The representative trajectories computed by this algorithm are always line segments. In applications, however, one is usually interested in more complex representative curves which consist of several edges. We present a new approach that builds upon previous work computing a set of curves of size $O(k\log(n))$ in time $\widetilde{O}(l^2n^4 + kln^4)$ with the same distance guarantee of $11\Delta$, where each curve may consist of curves of complexity up to the given complexity parameter $l$. We conduct experiments on tracking data of ocean currents and full body motion data suggesting its validity as a tool for analyzing large spatio-temporal data sets.
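The covering condition in the abstract is stated in terms of the Fréchet distance between a subtrajectory and a representative curve. As a rough illustration of that distance measure, here is a minimal Python sketch of the *discrete* Fréchet distance, which only matches vertices and is an upper bound on the continuous Fréchet distance the paper uses. It is a simplified stand-in, not the paper's method; the function name `discrete_frechet` and the list-of-tuples curve representation are assumptions made for this example.

```python
from math import dist


def discrete_frechet(p, q):
    """Discrete Frechet distance between two non-empty vertex sequences.

    Assumption for this sketch: curves are lists of coordinate tuples.
    Only vertices are matched, so the result is an upper bound on the
    continuous Frechet distance between the polygonal curves.
    """
    n, m = len(p), len(q)
    # d[i][j] = discrete Frechet distance between p[:i+1] and q[:j+1]
    d = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            cost = dist(p[i], q[j])
            if i == 0 and j == 0:
                best_prev = 0.0
            elif i == 0:
                best_prev = d[0][j - 1]
            elif j == 0:
                best_prev = d[i - 1][0]
            else:
                # Advance on p, on q, or on both at once.
                best_prev = min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
            d[i][j] = max(cost, best_prev)
    return d[n - 1][m - 1]


if __name__ == "__main__":
    subtrajectory = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    representative = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
    delta = 1.0
    # The subtrajectory counts as covered if the distance is at most delta.
    print(discrete_frechet(subtrajectory, representative) <= delta)
```

The dynamic program takes time proportional to the product of the two curve lengths, which is fine for illustration but says nothing about the running times quoted in the abstract.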
Problem

Research questions and friction points this paper is trying to address.

Clustering complex trajectory data efficiently.
Finding minimal representative trajectories with geometric set cover.
Moving beyond line-segment representatives to more complex curves for spatio-temporal data analysis.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Geometric set cover for trajectory clustering
Algorithm returning a cover of O(k log(n)) representative curves
Supports complex curves up to complexity l
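To make the set cover view concrete, the following Python sketch shows the textbook greedy heuristic for set cover, which is known to return a cover within a logarithmic factor of the optimum in the number of elements. It is a generic illustration only and is not claimed to be the paper's algorithm; the names `greedy_set_cover`, `universe`, and `candidates`, and the toy data, are hypothetical. Here an element stands for a piece of the input trajectory and each candidate set for the pieces that one representative curve would cover.

```python
def greedy_set_cover(universe, candidates):
    """Pick candidate sets greedily until every element is covered.

    universe:   iterable of hashable elements (e.g. pieces of the input curve)
    candidates: dict mapping a candidate name to the set of elements it covers
    Returns the list of chosen candidate names, in the order they were picked.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Take the candidate that covers the most still-uncovered elements.
        best = max(candidates, key=lambda name: len(candidates[name] & uncovered))
        gain = candidates[best] & uncovered
        if not gain:
            raise ValueError("candidates do not cover the universe")
        chosen.append(best)
        uncovered -= gain
    return chosen


if __name__ == "__main__":
    # Hypothetical toy instance: six trajectory pieces, four candidate curves.
    pieces = range(6)
    coverage = {
        "curve_a": {0, 1, 2, 3},
        "curve_b": {3, 4},
        "curve_c": {4, 5},
        "curve_d": {0, 5},
    }
    print(greedy_set_cover(pieces, coverage))
```

In the trajectory setting, the hard part is generating and evaluating the candidate curves, which this sketch takes as given.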