🤖 AI Summary
To address low cross-facility transfer efficiency and high integrity-verification overhead for terabyte-scale files in exascale computing environments, this paper proposes a client-driven dynamic chunking mechanism integrated into the Globus platform. Methodologically, it combines automated chunking scheduling, parallel transfer optimization, and incremental hash-based integrity verification, departing from prior transfer optimizations that centered on many smaller files. Experimental evaluation demonstrates up to a 3.2× improvement in end-to-end throughput for TB-scale file transfers and an 87% reduction in integrity verification latency compared to baseline approaches. The solution has been deployed and validated across multiple national flagship supercomputing facilities, improving the performance, reliability, and scalability of large-scale scientific data movement.
📝 Abstract
Many extreme-scale applications require the movement of large quantities of data to, from, and among leadership computing facilities, as well as other scientific facilities and the home institutions of facility users. These applications, particularly when leadership computing facilities are involved, can touch upon edge cases (e.g., terabyte files) that had not been a focus of previous Globus optimization work, which had instead emphasized the movement of many smaller (megabyte to gigabyte) files. We report here on how automated client-driven chunking can be used to accelerate both the movement of large files and the integrity-checking operations that have proven to be essential for large data transfers. We present detailed performance studies that provide insights into the benefits of these modifications in a range of file transfer scenarios.
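To make the core idea concrete, the following is a minimal, hypothetical sketch (not the Globus implementation or API) of chunk-level integrity checking: a large file is read once, a SHA-256 digest is computed per chunk alongside an incrementally updated whole-file digest, and after a transfer only chunks whose digests mismatch need to be re-sent. Chunk size, hash algorithm, and function names here are illustrative assumptions.

```python
import hashlib

# Illustrative chunk size; a real system would tune this per transfer.
CHUNK_SIZE = 4 * 1024 * 1024


def chunk_checksums(path, chunk_size=CHUNK_SIZE):
    """One pass over the file: per-chunk SHA-256 digests plus a
    whole-file digest built incrementally, so the file is read only once."""
    per_chunk = []
    whole = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            per_chunk.append(hashlib.sha256(data).hexdigest())
            whole.update(data)
    return per_chunk, whole.hexdigest()


def mismatched_chunks(expected, actual):
    """Indices of chunks whose digests differ: the only chunks that
    would need to be re-transferred after a failed verification."""
    return [i for i, (e, a) in enumerate(zip(expected, actual)) if e != a]
```

The incremental whole-file digest is what lets verification cost stay proportional to a single read, while the per-chunk digests bound the cost of recovery to the damaged chunks rather than the whole terabyte-scale file.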