Iceberg-table

The iceberg-table sink creates entries in an Iceberg catalog. The catalog is organised around tables. Iceberg tables can only catalog files whose content type is Parquet, ORC, or Avro. If the table does not exist yet, the sink creates it.

The schema provided in tableSchema must match the schema used by the component that wrote the files.

The sink tries to group all files available in a single batch into one commit. The commit should be atomic, and no other update to the table should be performed while it is in progress.

Note: Only a single worker should update an Iceberg table. Iceberg relies on optimistic concurrency control, which requires the catalog client to hold the most recent table metadata when performing an append. If another worker modifies the table concurrently, the local metadata becomes outdated and the operation may fail or be invalid, which can degrade performance.
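The optimistic-concurrency behaviour described in the note can be illustrated with a small in-memory model. This is only a sketch: OptimisticCommitSketch, Snapshot, load and commit are hypothetical names invented for illustration and are not part of the plugin or of the Iceberg API. A commit appends a whole batch of files in one step and succeeds only if the writer still holds the current metadata.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical in-memory stand-in for a catalog table; not the plugin's or Iceberg's API.
final class OptimisticCommitSketch {

    /** Immutable table metadata: a version number plus the files it lists. */
    static final class Snapshot {
        final long version;
        final List<String> files;

        Snapshot(long version, List<String> files) {
            this.version = version;
            this.files = List.copyOf(files);
        }
    }

    private final AtomicReference<Snapshot> current =
            new AtomicReference<>(new Snapshot(0, List.of()));

    /** Returns the metadata a writer must base its next commit on. */
    Snapshot load() {
        return current.get();
    }

    /**
     * Appends the whole batch in one commit. The swap succeeds only if
     * base is still the current metadata; otherwise nothing changes.
     */
    boolean commit(Snapshot base, List<String> batch) {
        List<String> files = new ArrayList<>(base.files);
        files.addAll(batch);
        Snapshot next = new Snapshot(base.version + 1, files);
        return current.compareAndSet(base, next);
    }

    public static void main(String[] args) {
        OptimisticCommitSketch table = new OptimisticCommitSketch();
        Snapshot seenByA = table.load();
        Snapshot seenByB = table.load(); // second writer reads the same metadata

        boolean a = table.commit(seenByA, List.of("f1.parquet", "f2.parquet"));
        boolean b = table.commit(seenByB, List.of("f3.parquet"));           // stale base
        boolean bRetry = table.commit(table.load(), List.of("f3.parquet")); // refreshed base

        // prints "true false true v2"
        System.out.println(a + " " + b + " " + bRetry + " v" + table.load().version);
    }
}
```

The second writer's first attempt is rejected because its metadata is stale; it has to reload and try again. Keeping updates on a single worker avoids that extra work.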

Note: Trigger Iceberg table updates only after a new file has been successfully written to the designated cloud storage system (e.g., S3, GCS, Azure Blob, HDFS). The update can run in the background as a slow path.
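The ordering in this note, write first and register afterwards in the background, might look like the following sketch. It uses only the JDK's CompletableFuture. WriteThenRegisterSketch, writeThenRegister and the registered list are hypothetical stand-ins for the real upload and catalog update, and the s3:// path is a placeholder.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Supplier;

// Hypothetical sketch; the list stands in for the catalog update.
final class WriteThenRegisterSketch {

    /** Paths of files that have been registered with the (simulated) catalog. */
    final List<String> registered = new CopyOnWriteArrayList<>();

    /**
     * Runs the upload asynchronously and registers the resulting path only
     * if the upload completes normally. A failed upload skips registration.
     */
    CompletableFuture<Void> writeThenRegister(Supplier<String> upload) {
        return CompletableFuture.supplyAsync(upload)
                .thenAcceptAsync(registered::add);
    }

    public static void main(String[] args) {
        WriteThenRegisterSketch sketch = new WriteThenRegisterSketch();

        // Successful upload: the path is registered afterwards.
        sketch.writeThenRegister(() -> "s3://bucket/data/part-0.parquet").join();

        // Failed upload: the registration step never runs.
        CompletableFuture<Void> failed = sketch.writeThenRegister(() -> {
            throw new IllegalStateException("upload failed");
        });
        failed.handle((ok, err) -> null).join(); // wait without rethrowing

        // prints "[s3://bucket/data/part-0.parquet]"
        System.out.println(sketch.registered);
    }
}
```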

.consumeFromSource(...)
.terminateWithSink(IcebergTableSinkConfigBuilder.builder()
    .withCatalogReferenceName(value)
    .withTableIdentifier(value)
    .withTableSchema(value)
    .withPartitionSpec(value)
    .withRetry(builder -> builder
        .withRetryCount(value)
    )
    .withExceptionHandler(value)
)

sink:
  iceberg-table:
    catalogReferenceName: value
    tableIdentifier: value
    tableSchema: value
    partitionSpec: value
    retry:
      retryCount: value
    exceptionHandler: value

Java dependency management

Add this declaration to your dependency management system to access the configuration DSL for this plugin in Java.

<dependency>
    <groupId>org.voltdb</groupId>
    <artifactId>volt-stream-plugin-iceberg-api</artifactId>
    <version>1.0-20250910-124207-release-1.5.3</version>
</dependency>

implementation group: 'org.voltdb', name: 'volt-stream-plugin-iceberg-api', version: '1.0-20250910-124207-release-1.5.3'

Properties

catalogReferenceName

Reference name of the configured Iceberg catalog. Required.

Type: string

tableIdentifier

Specifies the table's identifier in the Iceberg catalog. Required.

Type: object

tableSchema

Specifies the table's schema. Required.

Type: object

partitionSpec

Specifies the table's partitioning. The default is unpartitioned.

Type: object

retry

Type: object

Fields of retry:

retry.retryCount

How many times to retry before giving up.

Type: number

Default value: 3
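As a rough illustration of what a retry count means, the sketch below makes one initial attempt followed by up to retryCount further attempts. That counting is an assumption, and the plugin may count attempts differently or add delays between them. RetrySketch and callWithRetry are hypothetical names, not part of the plugin.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper; assumes "retryCount" excludes the initial attempt.
final class RetrySketch {

    /** Calls the action once, then retries up to retryCount times on failure. */
    static <T> T callWithRetry(Callable<T> action, int retryCount) throws Exception {
        if (retryCount < 0) {
            throw new IllegalArgumentException("retryCount must be >= 0");
        }
        Exception last = null;
        for (int attempt = 0; attempt <= retryCount; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e; // remember the failure and try again
            }
        }
        throw last; // every attempt failed
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger calls = new AtomicInteger();

        // Fails on the first two calls and succeeds on the third.
        String result = callWithRetry(() -> {
            if (calls.incrementAndGet() < 3) {
                throw new IllegalStateException("stale metadata");
            }
            return "committed";
        }, 3);

        // prints "committed after 3 calls"
        System.out.println(result + " after " + calls.get() + " calls");
    }
}
```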

exceptionHandler

Custom exception handler enabling interception of all errors related to this sink.

Type: object