Streaming Democratized: Ease Across the Latency Spectrum with Delayed View Semantics and Snowflake Dynamic Tables

📅 2025-04-14
🤖 AI Summary
Stream processing faces persistent challenges, including manual incremental maintenance, SQL semantic inconsistencies between streaming and batch paradigms, and insufficient enterprise-grade operational capabilities. Moreover, most existing systems optimize for latencies of roughly 100 ms to 3 s, whereas many mainstream analytical workloads are well served by latencies of seconds to tens of minutes. To address these, we propose Delayed View Semantics (DVS), a formal framework that unifies stream and batch semantics across a broad latency spectrum. We design declarative Dynamic Table primitives to ensure end-to-end transactional consistency and high availability, and extend the snapshot isolation model to enforce invariants in streaming applications. Evaluated on a production system built atop Snowflake, our approach reduces development complexity by over 80%, supports configurable end-to-end latency, cuts operational overhead by 90%, and currently serves thousands of enterprise customers.

📝 Abstract
Streaming data pipelines remain challenging and expensive to build and maintain, despite significant advancements in stronger consistency, event time semantics, and SQL support over the last decade. Persistent obstacles continue to hinder usability, such as the need for manual incrementalization, semantic discrepancies across SQL implementations, and the lack of enterprise-grade operational features. While the rise of incremental view maintenance (IVM) as a way to integrate streaming with databases has been a huge step forward, transaction isolation in the presence of IVM remains underspecified, leaving the maintenance of application-level invariants as a painful exercise for the user. Meanwhile, most streaming systems optimize for latencies of 100 ms to 3 sec, whereas many practical use cases are well-served by latencies ranging from seconds to tens of minutes. We present delayed view semantics (DVS), a conceptual foundation that bridges the semantic gap between streaming and databases, and introduce Dynamic Tables, Snowflake's declarative streaming transformation primitive designed to democratize analytical stream processing. DVS formalizes the intuition that stream processing is primarily a technique to eagerly compute derived results asynchronously, while also addressing the need to reason about the resulting system end to end. Dynamic Tables then offer two key advantages: ease of use through DVS, enterprise-grade features, and simplicity; as well as scalable cost efficiency via IVM with an architecture designed for diverse latency requirements. We first develop extensions to transaction isolation that permit the preservation of invariants in streaming applications. We then detail the implementation challenges of Dynamic Tables and our experience operating it at scale. Finally, we share insights into user adoption and discuss our vision for the future of stream processing.
Problem

Research questions and friction points this paper is trying to address.

Addresses challenges in building streaming data pipelines
Resolves semantic gaps between streaming and databases
Optimizes for diverse latency requirements in streaming
Innovation

Methods, ideas, or system contributions that make the work stand out.

Delayed View Semantics bridges streaming and databases
Dynamic Tables enable declarative stream processing
Incremental View Maintenance for scalable efficiency
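
To make the two ideas above concrete, here is a minimal Python sketch. It is a toy model and an assumption on our part, not the paper's formal definition of DVS and not Snowflake's implementation. The class `DelayedSumView`, its `target_lag` field, and the helper `recompute` are hypothetical names chosen for illustration. The view maintains a grouped sum incrementally and is only refreshed when its staleness would exceed the target lag; at every read, its contents equal what a full recomputation over the source would have returned at the view's data timestamp.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class DelayedSumView:
    """Toy view for SUM(delta) GROUP BY key, refreshed lazily (illustrative only)."""
    target_lag: float                    # hypothetical knob: max tolerated staleness
    data_timestamp: float = 0.0          # source time the view currently reflects
    totals: dict = field(default_factory=lambda: defaultdict(int))
    _pending: list = field(default_factory=list)  # unapplied (ts, key, delta)

    def on_change(self, ts, key, delta):
        """Record a source change; the view contents are not updated yet."""
        self._pending.append((ts, key, delta))

    def refresh(self, now):
        """Incrementally apply only the pending changes with ts <= now."""
        remaining = []
        for ts, key, delta in self._pending:
            if ts <= now:
                self.totals[key] += delta
            else:
                remaining.append((ts, key, delta))
        self._pending = remaining
        self.data_timestamp = now

    def read(self, now):
        """Return (contents, data_timestamp), refreshing only if lag is exceeded."""
        if now - self.data_timestamp > self.target_lag:
            self.refresh(now)
        return dict(self.totals), self.data_timestamp


def recompute(log, as_of):
    """Batch reference: evaluate the same query from scratch as of a timestamp."""
    totals = defaultdict(int)
    for ts, key, delta in log:
        if ts <= as_of:
            totals[key] += delta
    return dict(totals)


log = [(1.0, "a", 5), (2.0, "b", 3), (9.0, "a", -2)]
view = DelayedSumView(target_lag=5.0)
for change in log:
    view.on_change(*change)

for now in (3.0, 6.0, 10.0, 12.0):
    contents, as_of = view.read(now)
    # The view may be stale, but it always matches the batch result at as_of.
    assert contents == recompute(log, as_of)
    print(now, as_of, contents)
```

The design choice worth noting is that staleness is bounded rather than eliminated: a reader may see older data, but never a state that no batch query over the source could have produced.
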
Dan Sotolongo (Snowflake)
Daniel Mills (Snowflake)
Tyler Akidau (Snowflake)
Anirudh Santhiar (Snowflake)
Attila-Péter Tóth (Snowflake)
Ilaria Battiston (CWI; work conducted at Snowflake)
Ankur Sharma
Botong Huang
Boyuan Zhang
Dzmitry Pauliukevich (Snowflake)
Enrico Sartorello
Igor Belianski
Ivan Kalev
Lawrence Benson (PostDoc @ Technische Universität München)
Leon Papke
Ling Geng
Matt Uhlar
Nikhil Shah
Niklas Semmler
Olivia Zhou (Snowflake)
Saras Nowak
Sasha Lionheart
Till Merker
Vlad Lifliand
Wendy E. Grus
Yi Huang
Yiwen Zhu (Microsoft)