VectraFlow: Long-Horizon Semantic Processing over Data and Event Streams with LLMs

📅 2026-04-04
🤖 AI Summary
This work addresses two limitations: current large language model (LLM) frameworks are stateless and struggle with long sequential streams of unstructured text, while traditional complex event processing (CEP) systems handle only structured, typed events and cannot interpret free text. To bridge this gap, the authors present VectraFlow, a semantic stream processing engine that integrates LLM-based semantic understanding with CEP. VectraFlow offers continuous semantic operators (filter, map, aggregate, join, group-by, and window), each with configurable throughput-accuracy trade-offs across LLM-based, embedding-based, and hybrid implementations. A semantic event pattern operator combines LLM-based event extraction with nondeterministic finite automaton (NFA)-based rule matching to detect temporal patterns in unstructured document streams. The system also provides an interface that compiles natural language intents into executable operator graphs. The demonstration runs end to end over clinical document streams, showing how real-time semantic pipelines are composed and how complex events are detected from raw text.
📝 Abstract
Monitoring continuous data for meaningful signals increasingly demands long-horizon, stateful reasoning over unstructured streams. However, today's LLM frameworks remain stateless and one-shot, and traditional Complex Event Processing (CEP) systems, while capable of temporal pattern detection, assume structured, typed event streams that leave unstructured text out of reach. We demonstrate VectraFlow, a semantic streaming dataflow engine, to address both gaps. VectraFlow extends traditional relational operators with LLM-powered execution over free-text streams, offering a suite of continuous semantic operators -- filter, map, aggregate, join, group-by, and window -- each with configurable throughput-accuracy tradeoffs across LLM-based, embedding-based, and hybrid implementations. Building on this, a semantic event pattern operator lifts complex event processing to unstructured document streams, combining LLM-based event extraction with NFA-based temporal rule matching for stateful reasoning over sequences of semantic events. In this demonstration, users will interact with VectraFlow's live query interface to compose semantic pipelines over clinical document streams. Attendees will compile natural language intents into executable operator graphs, inspect intermediate stateful outputs, and observe end-to-end temporal pattern detection, from raw text to matched event cohorts.
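The semantic event pattern operator described in the abstract pairs event extraction with NFA-based temporal matching. The sketch below is only an illustration of that general idea, not VectraFlow's implementation: `Doc`, `extract_event`, and `SequencePattern` are hypothetical names, the LLM extractor is replaced by a keyword stub, and the matching policy (a strict sequence per patient, with irrelevant events skipped and a time bound measured from the first event) is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence


@dataclass(frozen=True)
class Doc:
    """One unstructured document in the stream (hypothetical schema)."""
    patient_id: str
    hour: float  # arrival time in hours since stream start
    text: str


def extract_event(doc: Doc) -> Optional[str]:
    """Keyword stub standing in for LLM-based event extraction (assumption)."""
    text = doc.text.lower()
    for keyword, label in (
        ("fever", "FEVER"),
        ("antibiotic", "ANTIBIOTIC"),
        ("discharge", "DISCHARGE"),
    ):
        if keyword in text:
            return label
    return None


class SequencePattern:
    """Matches SEQ(e1, ..., en) WITHIN a time bound, tracked per patient.

    Each partial match ("run") is an active NFA state: a run holding k
    documents is waiting for pattern[k]. Non-matching events are skipped.
    """

    def __init__(self, pattern: Sequence[str], within_hours: float) -> None:
        if not pattern:
            raise ValueError("pattern must contain at least one event type")
        self.pattern = list(pattern)
        self.within_hours = within_hours
        self.runs: Dict[str, List[List[Doc]]] = {}

    def process(self, doc: Doc) -> List[List[Doc]]:
        """Consume one document and return any matches it completes."""
        event = extract_event(doc)
        if event is None:
            return []
        # Drop runs whose first event has fallen outside the time bound.
        active = [
            run for run in self.runs.get(doc.patient_id, [])
            if doc.hour - run[0].hour <= self.within_hours
        ]
        # Advance runs that were waiting for this event type; keep the rest.
        advanced = [
            run + [doc] if event == self.pattern[len(run)] else run
            for run in active
        ]
        # An event of the first type also opens a new run.
        if event == self.pattern[0]:
            advanced.append([doc])
        full = len(self.pattern)
        self.runs[doc.patient_id] = [r for r in advanced if len(r) < full]
        return [r for r in advanced if len(r) == full]


if __name__ == "__main__":
    matcher = SequencePattern(["FEVER", "ANTIBIOTIC", "DISCHARGE"], within_hours=72)
    stream = [
        Doc("p1", 0, "Patient reports fever and chills."),
        Doc("p1", 8, "IV antibiotic started."),
        Doc("p1", 30, "Cleared for discharge."),
    ]
    for doc in stream:
        for match in matcher.process(doc):
            print(match[0].patient_id, [d.hour for d in match])
```

Keeping one list of runs per patient is what makes the operator stateful: each incoming document is interpreted against earlier events for the same patient rather than in isolation. A real extractor would also need to handle negation ("no fever"), which the keyword stub does not.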
Problem

Research questions and friction points this paper is trying to address.

long-horizon reasoning
unstructured streams
semantic processing
complex event processing
stateful reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

semantic streaming
LLM-powered operators
complex event processing
unstructured text
stateful reasoning