Voltdb-bulk-insert

This sink is suitable for efficiently inserting large volumes of data into VoltDB.

The voltdb-bulk-insert sink is used to insert data into a VoltDB table in batches. It supports configurations such as batch size, flush interval, table name, and the type of bulk operation (INSERT or UPSERT).

The voltdb-bulk-insert sink requires a VoltDBResource to be configured, either in YAML or in Java. Both variants are shown in the examples.

// If the resource is configured in YAML, only register it by name:
voltStreamBuilder.configureResource("primary-cluster", VoltDBResourceConfigBuilder.class);
// Or configure the resource entirely in Java:
voltStreamBuilder.configureResource("primary-cluster",
         VoltDBResourceConfigBuilder.class,
         new Consumer<VoltDBResourceConfigBuilder>() {
             @Override
             public void consume(VoltDBResourceConfigBuilder configurator) {
                 configurator
                   .addToServers("localhost", 12122)
                   .withClientBuilder(vcb -> builder -> {
                       builder.withMaxOutstandingTransactions(42000);
                       builder.withMaxTransactionsPerSecond(23);
                       builder.withRequestTimeout(Duration.ofSeconds(5));
                       builder.withAuthBuilder(authBuilder -> authBuilder
                          .withUsername("admin")
                          .withPassword("admin123"));
                       builder.withSslBuilder(sslBuilder -> sslBuilder
                          .withTrustStoreFile("c:/Users32/trust.me")
                          .withTrustStorePassword("got2have"));
                       builder.withRetryBuilder(retryBuilder -> retryBuilder
                          .withRetries(4)
                          .withBackoffDelay(Duration.ofSeconds(2))
                          .withMaxBackoffDelay(Duration.ofSeconds(11)));
                   });
             }
         });

voltStreamBuilder.terminateWithSink(VoltBulkInsertSinkConfigBuilder.builder()
   .withClientReferenceName("primary-cluster")
   .withTableName("my_table")
   .withBatchSize(100000)
   .withFlushInterval(5000)
   .withOperationType(VoltBulkOperationType.INSERT)
);

resources:
- name: primary-cluster
  voltdb-client:
    servers: localhost:12122
    client:
      maxTransactionsPerSecond: 3000
      maxOutstandingTransactions: 3000
      requestTimeout: PT10S
      auth:
        user: Admin
        password: 2r2Ffafw3V
      trustStore:
        file: file.pem
        password: got2have

sink:
   voltdb-bulk-insert:
       voltClientResource: primary-cluster
       retries: 3
       name: "my_table"
       batchSize: 100000
       flushInterval: 5000
       operationType: "INSERT"

Java dependency management

Add this declaration to your dependency management system to access the configuration DSL for this plugin in Java.

<dependency>
    <groupId>org.voltdb</groupId>
    <artifactId>volt-stream-plugin-volt-api</artifactId>
    <version>1.6.0</version>
</dependency>

implementation group: 'org.voltdb', name: 'volt-stream-plugin-volt-api', version: '1.6.0'

Properties

voltClientResource

Reference to the client resource used to connect to the VoltDB cluster. Required.

Type: object

operationType

The type of bulk operation to perform, either INSERT or UPSERT. INSERT adds new records, while UPSERT updates existing records or inserts new ones.

Type: object

Supported values: insert, upsert.

Default value: INSERT
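
The difference between the two operation types can be illustrated with a toy in-memory table. This is only a sketch of the general INSERT versus UPSERT semantics described above, not of how the sink or VoltDB implements them. The class `ToyTable` and its methods are hypothetical names introduced for this example, and rejecting a duplicate key on INSERT is an assumption based on ordinary primary-key behaviour.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical toy table keyed by primary key; used only to contrast the two operation types. */
final class ToyTable {
    private final Map<Integer, String> rows = new LinkedHashMap<>();

    /** INSERT: adds a new row. In this sketch a duplicate key is rejected and the old row is kept. */
    boolean insert(int key, String value) {
        return rows.putIfAbsent(key, value) == null;
    }

    /** UPSERT: updates the row if the key already exists, otherwise inserts it. */
    void upsert(int key, String value) {
        rows.put(key, value);
    }

    String get(int key) {
        return rows.get(key);
    }

    int size() {
        return rows.size();
    }
}

final class ToyTableDemo {
    public static void main(String[] args) {
        ToyTable table = new ToyTable();
        table.insert(1, "first");
        boolean accepted = table.insert(1, "second"); // duplicate key: rejected
        System.out.println(accepted + " " + table.get(1));
        table.upsert(1, "third");                     // existing key: row is updated
        table.upsert(2, "new");                       // missing key: row is inserted
        System.out.println(table.get(1) + " " + table.get(2) + " " + table.size());
    }
}
```

If the incoming stream can contain the same key more than once, UPSERT is usually the more forgiving choice. INSERT suits append-only data where a repeated key indicates a problem.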

batchSize

The maximum number of records to include in a single batch for insertion. Larger batches can improve performance but require more memory.

Type: number

Default value: 100000

flushInterval

The maximum time to wait for the desired batch size before forcing a data flush to VoltDB.

Type: object

Default value: 1s
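
The interplay of batchSize and flushInterval can be sketched as a buffer that flushes when either limit is reached. The following is a minimal illustration of that size-or-time policy, not the plugin's actual implementation. `BatchBuffer`, `add`, and `tick` are hypothetical names, timestamps are passed in explicitly to keep the example deterministic, and measuring the interval from the first buffered record is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical size-or-time batching buffer; an illustration only. */
final class BatchBuffer<T> {
    private final int batchSize;
    private final long flushIntervalMillis;
    private final Consumer<List<T>> flusher;
    private final List<T> pending = new ArrayList<>();
    private long firstPendingMillis; // arrival time of the oldest buffered record

    BatchBuffer(int batchSize, long flushIntervalMillis, Consumer<List<T>> flusher) {
        this.batchSize = batchSize;
        this.flushIntervalMillis = flushIntervalMillis;
        this.flusher = flusher;
    }

    /** Buffers a record and flushes immediately once the batch is full. */
    void add(T record, long nowMillis) {
        if (pending.isEmpty()) {
            firstPendingMillis = nowMillis;
        }
        pending.add(record);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Called periodically; flushes a partial batch once the oldest record has waited long enough. */
    void tick(long nowMillis) {
        if (!pending.isEmpty() && nowMillis - firstPendingMillis >= flushIntervalMillis) {
            flush();
        }
    }

    int pendingCount() {
        return pending.size();
    }

    private void flush() {
        flusher.accept(new ArrayList<>(pending)); // hand over a copy, then reset the buffer
        pending.clear();
    }
}

final class BatchBufferDemo {
    public static void main(String[] args) {
        List<List<String>> flushed = new ArrayList<>();
        BatchBuffer<String> buffer = new BatchBuffer<>(3, 5000, flushed::add);
        buffer.add("a", 0);
        buffer.add("b", 10);
        buffer.add("c", 20);  // batch size reached: flushed as one batch
        buffer.add("d", 30);
        buffer.tick(1000);    // interval not yet elapsed: nothing happens
        buffer.tick(5030);    // interval elapsed: partial batch is flushed
        System.out.println(flushed); // prints [[a, b, c], [d]]
    }
}
```

Under this model, a large batchSize with a short flushInterval behaves like time-based flushing under light load and like size-based flushing under heavy load.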

exceptionHandler

A custom exception handler to manage errors encountered during data insertion.

Type: object

JSON Schema

You can validate or explore the configuration using its JSON Schema.