How VoltSP Stream Processing Works

Stream data processing has become a critical component of business operations. The exponential growth of available data and the pressure to act on information in real time have made traditional computing approaches obsolete. It is no longer sufficient to gather data and post-process it to determine what actions to take. Businesses now need to operate on the data in flight to filter, format, validate, measure, and respond to events in a timely manner.

For simple operations, this works. Many operations, such as filtering data based on fixed rules or converting from one format to another, can be performed at speed. However, the Achilles' heel of stream data processing is that many operations still require access to up-to-date, trusted information such as customer accounts, inventory levels, and resource availability. These stateful operations, if performed against a traditional SQL database, incur the same latency that earlier centralized processing suffered from. This is where VoltSP comes in.

By integrating stream data processing with Volt Active Data — an ACID database designed to maximize throughput without sacrificing consistency, durability, or availability — VoltSP makes it possible to combine both stateless and stateful processing in flight and at speed.

Figure 3.1. VoltSP Architecture

VoltSP Architecture

The VoltSP architecture consists of three primary parts: sources, sinks, and processors. The domain-specific language you use to define VoltSP pipelines mirrors the same structure, letting you define a source, one or more processors, and a sink:

stream
   source
       processor
       processor
       processor
        [ ... ]
   sink

The processors can be any combination of stateless and stateful operations, with Volt Active Data providing real-time access to reference data that can be used to verify, authenticate, authorize, or otherwise validate and enhance the data as it passes through.
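To make the source, processor, sink flow concrete, here is a minimal sketch in plain Python. It is not the VoltSP DSL or API: `run_pipeline`, `drop_empty`, `enrich_with_account`, and the `KNOWN_ACCOUNTS` dictionary (standing in for reference data held in a database) are all hypothetical names chosen for illustration.

```python
from typing import Callable, Iterable, List, Optional

# Hypothetical types for illustration; these are not VoltSP classes.
Event = dict
Processor = Callable[[Event], Optional[Event]]  # return None to drop the event


def run_pipeline(source: Iterable[Event],
                 processors: List[Processor],
                 sink: Callable[[Event], None]) -> None:
    """Pull each event from the source, apply processors in order, emit to the sink."""
    for event in source:
        current: Optional[Event] = event
        for process in processors:
            current = process(current)
            if current is None:  # a processor filtered the event out
                break
        if current is not None:
            sink(current)


# Stand-in for reference data that a database lookup would provide.
KNOWN_ACCOUNTS = {"a1": "gold", "a2": "basic"}


def drop_empty(event: Event) -> Optional[Event]:
    # Stateless: the decision depends only on the event itself.
    return event if event.get("amount", 0) > 0 else None


def enrich_with_account(event: Event) -> Optional[Event]:
    # Stateful: the decision depends on reference data outside the event.
    tier = KNOWN_ACCOUNTS.get(event["account"])
    return {**event, "tier": tier} if tier is not None else None


results: List[Event] = []
run_pipeline(
    source=[{"account": "a1", "amount": 5},
            {"account": "a2", "amount": 0},
            {"account": "zz", "amount": 9}],
    processors=[drop_empty, enrich_with_account],
    sink=results.append,
)
print(results)  # only the first event survives both processors
```

The second event is dropped by the stateless filter and the third by the stateful lookup, so only the enriched first event reaches the sink.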

The advantages the VoltSP architecture offers are:

Cloud Native — VoltSP pipelines are designed from the ground up to run in the cloud. They are also self-contained and do not require any additional infrastructure (such as resource managers or schedulers). This allows for easy setup, scaling, and management.
Apache Kafka and Volt Active Data integration — Kafka is supported out of the box as a data source and both Kafka and Volt Active Data are supported as sinks for the pipeline, so that setting up the initial pipeline template is trivial.
Complex business logic — Partitioned procedures in Volt Active Data can be used to incorporate complex, stateful operations on the data without sacrificing latency.
Flexibility — The pipelines are designed as templates, using placeholders for key resources such as server addresses and topic names, so that different pipelines can be created from the same template by identifying different resources in the properties at runtime.
Scalability — The pipelines themselves can be scaled at runtime completely separately from the resources, such as Kafka servers or Volt Active Data cluster nodes, allowing you to optimize computing resources to match actual needs.
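The template idea described under Flexibility can be sketched with Python's standard `string.Template`. This is only an analogy for how placeholders are resolved from runtime properties; the keys and placeholder names (`kafka_servers`, `input_topic`, `volt_servers`) and the server addresses are hypothetical and do not reflect VoltSP's actual property names.

```python
from string import Template

# Hypothetical pipeline template: each setting is a placeholder to be
# filled in from properties supplied at runtime.
PIPELINE_TEMPLATE = {
    "source_servers": Template("${kafka_servers}"),
    "source_topic": Template("${input_topic}"),
    "sink_servers": Template("${volt_servers}"),
}


def instantiate(template: dict, properties: dict) -> dict:
    """Resolve every placeholder; substitute() raises KeyError if one is missing."""
    return {key: value.substitute(properties) for key, value in template.items()}


# Two different pipelines created from the same template.
orders = instantiate(PIPELINE_TEMPLATE, {
    "kafka_servers": "kafka-1:9092",
    "input_topic": "orders",
    "volt_servers": "volt-1:21212",
})
payments = instantiate(PIPELINE_TEMPLATE, {
    "kafka_servers": "kafka-1:9092",
    "input_topic": "payments",
    "volt_servers": "volt-1:21212",
})
print(orders["source_topic"], payments["source_topic"])
```

Using `substitute()` rather than `safe_substitute()` is a deliberate choice in this sketch: a missing property fails loudly instead of leaving an unresolved placeholder in the pipeline definition.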

Reliable data processing

VoltSP employs a sophisticated batch processing system that ensures reliable data handling even when interacting with remote systems. This system tracks requests and responses to external services, confirming that all operations within a processing batch have been successfully completed before considering the batch finished.

The batch processing mechanism works in conjunction with the circuit breaker pattern to provide:

Reliable tracking of asynchronous operations
Automatic retry capabilities for failed batches
Clear separation between processing phases
Proper handling of out-of-order responses

This approach ensures that data is processed reliably and consistently, even in the face of temporary system failures or network issues. By managing batches of operations as atomic units, VoltSP maintains data integrity throughout the processing pipeline.
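The following sketch illustrates the general idea of tracking a batch of asynchronous requests and retrying the whole batch on failure. It is an assumption-laden illustration, not VoltSP's implementation: `BatchTracker`, `process_batch`, and the `send` callback are hypothetical names, and the retry limit of three attempts is arbitrary.

```python
class BatchTracker:
    """Tracks outstanding requests for one batch (hypothetical helper)."""

    def __init__(self, request_ids):
        self.pending = set(request_ids)
        self.failed = set()

    def on_response(self, request_id, ok):
        # Responses may arrive in any order; only set membership matters.
        if request_id not in self.pending:
            return  # duplicate or unknown response, ignore it
        self.pending.discard(request_id)
        if not ok:
            self.failed.add(request_id)

    @property
    def finished(self):
        return not self.pending

    @property
    def succeeded(self):
        return self.finished and not self.failed


def process_batch(events, send, max_attempts=3):
    """Send the batch until every request succeeds; return the attempts used.

    `send(events)` is assumed to return (request_id, ok) pairs in any order.
    """
    for attempt in range(1, max_attempts + 1):
        tracker = BatchTracker(range(len(events)))
        for request_id, ok in send(events):
            tracker.on_response(request_id, ok)
        if tracker.succeeded:
            return attempt
    raise RuntimeError("batch failed after %d attempts" % max_attempts)


calls = {"count": 0}


def flaky_send(events):
    """Fake remote call: request 0 fails on the first attempt only."""
    calls["count"] += 1
    fail_first = calls["count"] == 1
    # Reverse order mimics out-of-order completion.
    return [(i, not (fail_first and i == 0))
            for i in reversed(range(len(events)))]


attempts_used = process_batch(["e0", "e1", "e2"], flaky_send)
print(attempts_used)  # 2: the first attempt fails, the retry succeeds
```

Treating the batch as the unit of retry keeps the bookkeeping simple: a batch is either fully acknowledged or it is sent again.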

Circuit Breaker

VoltSP incorporates circuit breaker patterns to enhance system resilience when interacting with remote systems. A circuit breaker is a mechanism that monitors the health of connections to external systems and temporarily halts processing when those systems experience problems.

When VoltSP detects that a remote system is experiencing issues (based on a sliding window of recent outcomes), the circuit breaker "opens" to prevent further requests from being sent to the troubled system. This approach:

Prevents overwhelming already struggling systems with additional requests
Allows remote systems time to recover without constant pressure
Conserves resources that would otherwise be wasted on failed requests
Enables graceful degradation of service rather than complete failure

VoltSP implements both local circuit breakers for individual components and a global circuit breaker that can temporarily pause the entire event processing pipeline when necessary. The system automatically attempts to resume normal operations once the remote systems return to a healthy state.

Count Window Circuit Breaker

The Count Window Circuit Breaker tracks outcomes (successes and failures) in a sliding window of the most recent events. Once the window is full, it continuously evaluates the percentage of failures. If the observed failure rate rises above a configured threshold, the breaker opens to temporarily halt interactions with the affected external system.

When open, the breaker prevents additional requests from being sent, allowing the remote system time to recover. After a configurable delay, the breaker transitions to a half‑open probe phase by resetting its window and allowing processing to resume. If subsequent outcomes again exceed the threshold (once the window fills), the breaker reopens; if they remain healthy, it stays closed and processing continues normally.
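The behavior described above can be sketched as a small state machine. This is a minimal illustration under stated assumptions, not VoltSP's code: the class name `CountWindowBreaker`, its method names, and the injectable `clock` are hypothetical, and outcomes reported while the breaker is open are simply ignored.

```python
import time
from collections import deque


class CountWindowBreaker:
    """Count-based sliding-window circuit breaker (illustrative sketch)."""

    def __init__(self, window_size, failure_rate_threshold, retry_delay,
                 clock=time.monotonic):
        self.window = deque(maxlen=window_size)  # most recent outcomes
        self.threshold = failure_rate_threshold  # percent
        self.retry_delay = retry_delay           # seconds
        self.clock = clock
        self.opened_at = None                    # None means not open

    def allow_request(self):
        if self.opened_at is None:
            return True
        if self.clock() - self.opened_at >= self.retry_delay:
            # Probe phase: reset the window and let processing resume.
            self.window.clear()
            self.opened_at = None
            return True
        return False

    def record(self, success):
        if self.opened_at is not None:
            return  # assumption: late outcomes are ignored while open
        self.window.append(success)
        if len(self.window) < self.window.maxlen:
            return  # the rate is only evaluated once the window is full
        failure_rate = 100.0 * self.window.count(False) / len(self.window)
        if failure_rate > self.threshold:
            self.opened_at = self.clock()


# Drive the breaker with a fake clock so the example is deterministic.
now = [0.0]
breaker = CountWindowBreaker(window_size=4, failure_rate_threshold=50,
                             retry_delay=10, clock=lambda: now[0])
for ok in (True, False, False, False):   # 3 of 4 failed: 75% > 50%
    breaker.record(ok)
blocked = not breaker.allow_request()    # breaker is open
now[0] = 10.0                            # retry delay has elapsed
resumed = breaker.allow_request()        # window reset, probing
for ok in (True, True, True, True):
    breaker.record(ok)
healthy = breaker.allow_request()        # outcomes healthy, stays closed
print(blocked, resumed, healthy)
```

Injecting the clock is a testing convenience in this sketch; a real deployment would rely on the default monotonic clock.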

The following aspects are configurable and influence behavior:

Sliding window size: how many of the most recent outcomes are considered when calculating the failure rate. Smaller windows react faster to changes; larger windows provide more stability against short spikes.
Failure rate threshold (percent): the failure percentage that triggers the breaker to open once the window is full. Lower thresholds make the breaker more sensitive; higher thresholds require more sustained failures.
Retry delay: how long the breaker stays open before entering the half‑open probe phase.
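The trade-off between window sizes can be checked with a little arithmetic. The sketch below is a standalone illustration; `first_trip_index` is a hypothetical helper, not a VoltSP function. It reports how many outcomes have been seen when the failure rate over the last `window_size` outcomes first exceeds the threshold.

```python
def first_trip_index(outcomes, window_size, threshold_percent):
    """Return the number of outcomes seen when the breaker would first open.

    Returns None if the failure rate never exceeds the threshold.
    """
    for seen in range(window_size, len(outcomes) + 1):
        window = outcomes[seen - window_size:seen]
        failure_rate = 100.0 * window.count(False) / window_size
        if failure_rate > threshold_percent:
            return seen
    return None


# A sustained outage: 10 successes followed by 10 failures.
outage = [True] * 10 + [False] * 10
small_outage = first_trip_index(outage, window_size=4, threshold_percent=50)
large_outage = first_trip_index(outage, window_size=10, threshold_percent=50)

# A short spike: 3 failures surrounded by successes.
spike = [True] * 10 + [False] * 3 + [True] * 10
small_spike = first_trip_index(spike, window_size=4, threshold_percent=50)
large_spike = first_trip_index(spike, window_size=10, threshold_percent=50)

print(small_outage, large_outage)  # 13 16
print(small_spike, large_spike)    # 13 None
```

With a 50 percent threshold, the window of 4 opens after the third consecutive failure in both scenarios, while the window of 10 needs six failures during the outage and never opens during the three-failure spike.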