Budgeted Spatial Data Acquisition: When Coverage and Connectivity Matter

📅 2024-12-06
🤖 AI Summary
This paper studies spatial data procurement by formalizing the Budgeted Maximum Coverage with Connectivity Constraint (BMCC) problem: select a subset of spatial datasets within a budget so that the covered geographic area is maximized while the covered region remains spatially connected. The paper proposes two greedy algorithms with provable approximation guarantees and polynomial time complexity, plus two acceleration strategies that combine spatial indexing with graph-connectivity-based pruning. Experiments on five real-world datasets report that the algorithms solve instances in milliseconds, reach an average coverage ratio above 92% (close to optimal), always satisfy the connectivity constraint, and run up to 8.3× faster than baseline methods.

📝 Abstract
Data is undoubtedly becoming a commodity like oil, land, and labor in the 21st century. Although many data marketplaces have been successful, existing ones do not consider the case where a buyer wants to acquire a collection of datasets (rather than a single one) and the overall spatial coverage and connectivity of that collection matter. In this paper, we make the first attempt to formulate this problem as Budgeted Maximum Coverage with Connectivity Constraint (BMCC), which aims to acquire a dataset collection with the maximum spatial coverage under a limited budget while maintaining spatial connectivity. To solve the problem, we propose two approximation algorithms with detailed theoretical guarantees and time complexity analysis, followed by two acceleration strategies that further improve the efficiency of the algorithms. Experiments on five real-world spatial dataset collections verify the efficiency and effectiveness of our algorithms.
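
To make the problem statement concrete, here is a minimal Python sketch of a budgeted, connectivity-preserving greedy heuristic. It is not the paper's algorithm and carries no approximation guarantee. Everything in it is an assumption made for illustration: each dataset is modeled as a set of grid cells that is itself connected, two datasets count as connected when they overlap or contain 4-adjacent cells, prices are positive, and the names `touches` and `greedy_connected_cover` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Cell = Tuple[int, int]  # assumed representation: one grid cell of the map


def touches(a: Set[Cell], b: Set[Cell]) -> bool:
    """True if the two cell sets overlap or contain 4-adjacent cells."""
    if a & b:
        return True
    return any((x + dx, y + dy) in b
               for (x, y) in a
               for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)))


def greedy_connected_cover(cells: Dict[str, Set[Cell]],
                           price: Dict[str, float],
                           budget: float) -> Tuple[List[str], Set[Cell]]:
    """Try every affordable dataset as a seed and grow a connected selection.

    At each step, add the affordable dataset that touches the region covered
    so far and has the best ratio of newly covered cells to price.
    Assumes every price is positive and every dataset is itself connected.
    """
    best_pick: List[str] = []
    best_cov: Set[Cell] = set()
    for seed in sorted(cells):
        if price[seed] > budget:
            continue
        pick, cov, spent = [seed], set(cells[seed]), price[seed]
        while True:
            best, best_ratio = None, 0.0
            for d in sorted(cells):
                if d in pick or spent + price[d] > budget:
                    continue  # already chosen, or it would exceed the budget
                if not touches(cov, cells[d]):
                    continue  # adding it would disconnect the covered region
                ratio = len(cells[d] - cov) / price[d]
                if ratio > best_ratio:
                    best, best_ratio = d, ratio
            if best is None:
                break  # nothing affordable adds new connected coverage
            pick.append(best)
            cov |= cells[best]
            spent += price[best]
        if len(cov) > len(best_cov):
            best_pick, best_cov = pick, cov
    return best_pick, best_cov


# Toy instance: C is the largest dataset but is isolated from the others.
toy_cells = {
    "A": {(0, 0), (1, 0)},
    "B": {(2, 0), (3, 0)},
    "C": {(10, 10), (11, 10), (12, 10)},
    "D": {(3, 0), (4, 0)},
}
toy_price = {"A": 1.0, "B": 1.0, "C": 1.0, "D": 1.0}
selection, covered = greedy_connected_cover(toy_cells, toy_price, budget=3.0)
```

In the toy instance, a budget of 3 is enough to buy the chain A, B, D, which covers more cells than the isolated dataset C. With a budget of 1 only a single dataset fits, so the largest one, C, wins. Seeding from every dataset keeps the sketch simple at the cost of an extra factor of n in running time.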
Problem

Research questions and friction points this paper is trying to address.

Optimal Dataset Selection
Budget Constraint
Maximal Coverage
Innovation

Methods, ideas, or system contributions that make the work stand out.

Budget-Constrained Data Source Selection
Approximation Algorithms
Real-World Dataset Efficiency
Wenzhe Yang
School of Computer Science, Wuhan University, Wuhan 430061, China
Shixun Huang
University of Wollongong
data mining, graph databases, machine learning, algorithms
Sheng Wang
School of Computer Science, Wuhan University, Wuhan 430061, China
Zhiyong Peng
School of Computer Science, Wuhan University, Wuhan 430061, China