Data-Locality-Aware Task Assignment and Scheduling for Distributed Job Executions

📅 2024-07-11
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses data-locality-aware task assignment and online scheduling for distributed job execution when future job arrivals are unknown, with the goal of minimizing job completion time. We propose the Optimal Balanced Task Assignment (OBTA) algorithm, which minimizes the completion time of each arriving job, and we reduce its computational overhead by narrowing the search space of candidate solutions. We extend the water-filling (WF) approximation algorithm and prove that its approximation factor equals the number of task groups in the job assignment. We also design a Replica Deletion (RD) heuristic that outperforms WF, and introduce a job reordering mechanism that orders outstanding jobs by a shortest-estimated-time-first policy to further reduce completion times. Trace-driven evaluations validate the performance and efficiency of the proposed algorithms.

📝 Abstract
This paper investigates a data-locality-aware task assignment and scheduling problem aimed at minimizing job completion times for distributed job executions. Without prior knowledge of future job arrivals, we propose an optimal balanced task assignment algorithm (OBTA) that minimizes the completion time of each arriving job. We significantly reduce OBTA's computational overhead by narrowing the search space of potential solutions. Additionally, we extend an approximate algorithm known as water-filling (WF) and nontrivially prove that its approximation factor equals the number of task groups in the job assignment. We also design a novel heuristic, replica-deletion (RD), which outperforms WF. To further reduce the completion time of each job, we expand the problem to include job reordering, where we adjust the order of outstanding jobs following the shortest-estimated-time-first policy. Extensive trace-driven evaluations validate the performance and efficiency of the proposed algorithms.
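To make the setting concrete, here is a minimal Python sketch of a greedy, water-filling-style assignment under data-locality constraints, followed by a shortest-estimated-time-first reordering of outstanding jobs. It is an illustration under stated assumptions, not the paper's OBTA, WF, or RD procedures: tasks are assumed to take one unit of time each, every task group may only run on servers holding a replica of its data, and the names `water_fill`, `shortest_estimated_first`, `groups`, and `backlog` are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def water_fill(
    groups: Dict[str, Tuple[int, Set[str]]],
    backlog: Dict[str, int],
) -> Tuple[Dict[str, Dict[str, int]], Dict[str, int]]:
    """Greedy assignment sketch (assumption: unit-time tasks).

    groups maps a task-group name to (number of tasks, servers holding its data).
    backlog maps each server to the work already queued on it.
    Returns the per-group task counts per server and the resulting server loads.
    """
    load = dict(backlog)  # copy so the caller's backlog is not mutated
    plan: Dict[str, Dict[str, int]] = {}
    for name, (num_tasks, servers) in groups.items():
        eligible = sorted(servers)  # sorted for deterministic tie-breaking
        counts = {s: 0 for s in eligible}
        for _ in range(num_tasks):
            # Place the next task on the least-loaded server that has the data.
            target = min(eligible, key=lambda s: load[s])
            load[target] += 1
            counts[target] += 1
        plan[name] = counts
    return plan, load


def shortest_estimated_first(estimates: Dict[str, float]) -> List[str]:
    """Order outstanding jobs by ascending estimated completion time."""
    return sorted(estimates, key=lambda job: estimates[job])


groups = {"g1": (4, {"s1", "s2"}), "g2": (2, {"s2", "s3"})}
backlog = {"s1": 0, "s2": 0, "s3": 3}
plan, load = water_fill(groups, backlog)
print(plan, load)
print(shortest_estimated_first({"a": 5.0, "b": 2.0, "c": 3.5}))
```

In this toy instance the greedy pass ends with a maximum server load of 4, while assigning three tasks of `g1` to `s1` and one to `s2`, with both tasks of `g2` on `s2`, would leave every server at load 3. A gap of this kind is what motivates an optimal balanced assignment over a purely greedy one.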
Problem

Research questions and friction points this paper is trying to address.

Minimizing job completion times without future arrival knowledge
Achieving data-locality-aware task assignment and scheduling
Reducing computational overhead while maintaining performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Optimal Balanced Task Assignment (OBTA) algorithm that minimizes the completion time of each arriving job
Extended Water-Filling (WF) algorithm with a proven approximation factor equal to the number of task groups
Replica-Deletion (RD) heuristic that outperforms Water-Filling