AI Summary
This work addresses a limitation of traditional data transfer tools: each typically supports only streaming or only batch transfer, so migrating heterogeneous data across multi-cloud environments means operating separate systems. Building upon the Skyplane framework, we propose a control plane that natively unifies streaming and batch data transfers. Our system uses URI-driven routing to select the transfer mode automatically, and it combines record-level ingestion of structured data with chunked transmission of binary objects. Evaluated in an environmental monitoring scenario, the system reduces operational complexity while achieving competitive throughput for cross-region transfers, offering a unified approach to data migration across heterogeneous cloud infrastructures.
Abstract
Cloud and big data workloads increasingly distribute data across multiple cloud providers and regions to support rapid decision-making and analytics. Traditional transfer tools are typically specialized for a single paradigm, either stream replication or bulk transfer, which forces users to deploy and manage separate systems with different configurations for each transfer pattern. This paper presents SkyHOST (Hybrid Object and Stream Transfer), a unified data movement architecture built upon the Skyplane framework that bridges bulk object transfer and streaming workloads through a single control plane and CLI. SkyHOST uses URI-based routing to automatically select the appropriate transfer mechanism, supporting both record-level ingestion of structured data and chunk-based transfer of large binary objects. Through an environmental monitoring use case and an empirical evaluation, we demonstrate that SkyHOST simplifies operations by consolidating heterogeneous data movement patterns under a single control plane while achieving competitive throughput for cross-region transfers.
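The URI-based routing idea can be illustrated with a minimal sketch. This is not SkyHOST's implementation: the function name `select_transfer_mode`, the scheme sets, and the rule that any streaming endpoint selects the streaming path are all assumptions made for illustration, since the abstract does not specify which URI schemes are recognized or how mixed pairs are handled.

```python
from urllib.parse import urlparse

# Hypothetical scheme sets; the abstract does not list the schemes SkyHOST accepts.
STREAM_SCHEMES = {"kafka"}
OBJECT_SCHEMES = {"s3", "gs", "azure"}


def select_transfer_mode(src_uri: str, dst_uri: str) -> str:
    """Return 'stream' or 'bulk' for a source/destination URI pair."""
    src = urlparse(src_uri).scheme.lower()
    dst = urlparse(dst_uri).scheme.lower()

    # Reject anything outside the (assumed) supported schemes.
    for scheme in (src, dst):
        if scheme not in STREAM_SCHEMES | OBJECT_SCHEMES:
            raise ValueError(f"unsupported URI scheme: {scheme!r}")

    # Assumption: if either endpoint is a stream, use record-level transfer;
    # otherwise fall back to chunk-based bulk object transfer.
    if src in STREAM_SCHEMES or dst in STREAM_SCHEMES:
        return "stream"
    return "bulk"
```

Under these assumptions, a pair such as `s3://bucket/data` to `gs://bucket/data` would take the bulk path, while a pair with a `kafka://` endpoint would take the streaming path, so a single CLI entry point could dispatch both kinds of transfer.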