Running in Kubernetes

VoltSP is a cloud-native service packaged with a Helm chart to automate its deployment and management in Kubernetes.

The Helm chart exposes standard Kubernetes settings, such as resource limits, in a familiar YAML format, alongside VoltSP-specific properties. A basic configuration for a VoltSP release looks like this:

replicaCount: 1

resources:
    limits:
        cpu: 2
        memory: 2G
    requests:
        cpu: 2
        memory: 2G

streaming:
    pipeline:
        className: org.acme.KafkaToVoltPipeline

In this example, the configuration creates a single pod running the pipeline class named by the streaming.pipeline.className property, with the specified CPU and memory requests and limits.

Note: The Java Virtual Machine (JVM) inside the container is automatically configured to use 80% of the available memory limit.
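To make the 80% rule concrete, the following sketch works through the arithmetic for the 2G limit used above (this is an illustration of the sizing rule only, not VoltSP code; "2G" is interpreted as the decimal Kubernetes quantity, i.e. 2 × 10⁹ bytes):

```java
public class HeapSizing {
    public static void main(String[] args) {
        // "2G" in Kubernetes resource notation is 2 * 10^9 bytes.
        long memoryLimitBytes = 2_000_000_000L;
        // The JVM heap is sized at 80% of the container memory limit.
        long jvmMaxHeapBytes = memoryLimitBytes * 80 / 100;
        System.out.println(jvmMaxHeapBytes); // prints 1600000000
    }
}
```

So with a 2G memory limit, the JVM is given roughly a 1.6G heap, leaving the remainder for off-heap memory and other container overhead.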

Configuring the Pipeline

The VoltSP Helm chart provides two main ways to configure your pipeline: passing custom properties to your Java code and automatically configuring built-in components like sources and sinks.

Passing Custom Configuration Values

You can pass custom configuration parameters to your pipeline by adding them under the streaming.pipeline.configuration path in your YAML file.

For example, to configure a custom producer, you can add the following:

# ... (resources and other settings)
streaming:
    pipeline:
        className: org.acme.KafkaToVoltPipeline
        configuration:
            producer:
                message:
                    count: 5000
                    bytes: 1024

At runtime, your Java pipeline code can access these values using the stream.getExecutionContext().configurator() API:

import io.confluent.kafka.serializers.KafkaAvroSerializer;
import org.voltdb.stream.api.Sinks;
import org.voltdb.stream.api.pipeline.VoltPipeline;
import org.voltdb.stream.api.pipeline.VoltStreamBuilder;
import org.voltdb.stream.api.pipeline.ExecutionContext;

public class ProducerPipeline implements VoltPipeline {

    @Override
    public void define(VoltStreamBuilder stream) {
        ExecutionContext.ConfigurationContext configurator = stream.getExecutionContext().configurator();

        // Retrieve custom configuration values
        int messageCount = configurator.findByPath("producer.message.count").asInt();
        int messageBytesLength = configurator.findByPath("producer.message.bytes").asInt();

        stream
                .withName("event producer")
                .consumeFromSource(new EventsSource(messageBytesLength))
                .terminateWithSink(Sinks.kafka() /* ... */);
    }
}
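To clarify how a dotted path such as producer.message.count maps onto the nested YAML above, here is a minimal, framework-independent sketch of dotted-path resolution over nested maps. This is illustrative only; it is not VoltSP's implementation, and PathLookupDemo is a hypothetical class:

```java
import java.util.Map;

public class PathLookupDemo {

    // Resolve a dotted path like "producer.message.count" against nested maps,
    // descending one map level per path segment.
    @SuppressWarnings("unchecked")
    static Object findByPath(Map<String, Object> root, String path) {
        Object current = root;
        for (String key : path.split("\\.")) {
            current = ((Map<String, Object>) current).get(key);
        }
        return current;
    }

    public static void main(String[] args) {
        // Mirrors the streaming.pipeline.configuration block from the YAML above.
        Map<String, Object> config = Map.of(
                "producer", Map.of(
                        "message", Map.of("count", 5000, "bytes", 1024)));

        System.out.println(findByPath(config, "producer.message.count")); // prints 5000
        System.out.println(findByPath(config, "producer.message.bytes")); // prints 1024
    }
}
```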

Automatic Component Configuration

A key feature of the VoltSP platform is its ability to implicitly configure its components. Instead of manually extracting every property, you can define them in the Helm configuration, and VoltSP will automatically apply them to the corresponding components in your pipeline.

For example, you can configure a Kafka sink directly in the YAML:

# ... (resources and other settings)
streaming:
    pipeline:
        className: org.acme.KafkaToVoltPipeline
        configuration:
            # Kafka Sink configuration
            sink:
                kafka:
                    topicName: "my-topic"
                    bootstrapServers: "kafka.example.com:9092"
                    # Optional parameters
                    schemaRegistry: "http://registry.example.com"
                    properties:
                        key1: value1
                        key2: value2
            # Custom application configuration
            producer:
                message:
                    count: 5000
                    bytes: 1024

With this configuration, you only need to declare the Kafka sink in your Java pipeline. VoltSP will handle its configuration automatically.

@Override
public void define(VoltStreamBuilder stream) {
    stream
            .withName("event producer")
            .consumeFromSource(new EventsSource(messageBytesLength))
            .terminateWithSink(
                    Sinks.kafka()
                            .accepting(EventMessage.class)
                            // Properties like topicName are configured from YAML
                            .withKeyExtractor(EventMessage::getSessionId)
                            .withValueSerializer(KafkaAvroSerializer.class)
            );
}

Note that some properties, such as the key extractor and value serializer, cannot be set from a textual configuration and must be provided explicitly in the code.

The configurable properties for each component are documented in their respective sections. For example, see the Kafka sink properties.

Mixing Configuration Styles

You can combine both automatic (YAML-based) and programmatic (DSL-based) configuration. If a property is defined in both the YAML file and the Java DSL, the value set in the DSL will take precedence.

In the example below, the topicName is overridden in the Java code, while all other Kafka sink properties (like bootstrapServers) are still applied from the Helm configuration.

@Override
public void define(VoltStreamBuilder stream) {
    String overrideTopicName = "my-overridden-topic";

    stream
            .withName("event producer")
            .consumeFromSource(new EventsSource(messageBytesLength))
            .terminateWithSink(
                    Sinks.kafka()
                            .accepting(EventMessage.class)
                            .withTopicName(overrideTopicName) // Overrides the YAML value
                            .withKeyExtractor(EventMessage::getSessionId)
                            .withValueSerializer(KafkaAvroSerializer.class)
            );
}

This hybrid approach provides flexibility, allowing you to define static configuration in Helm and handle dynamic or complex properties programmatically.