Iceberg-table
The iceberg-table sink creates entries in an Iceberg catalog. The catalog is organised around tables.
Iceberg tables can only catalog files whose content type is Parquet, ORC, or Avro.
If the table does not exist yet, the sink creates it.
The schema provided must match the schema used by the file creator.
The sink tries to group all files available in a single batch into a single commit. The commit should be atomic, and no other update to the table should be performed while it is in progress.
Note: Updates to an Iceberg table should be performed by a single worker only. Iceberg relies on optimistic concurrency control, which requires the catalog client to hold the most recent table metadata when performing an append. If another worker modifies the table concurrently, the local metadata becomes stale and the append may fail or be invalid, which can degrade performance.
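The effect of stale metadata can be illustrated with a minimal sketch. It assumes a hypothetical in-memory catalog (`OptimisticCatalogSketch`, `Metadata`, and `tryAppend` are illustrative names, not part of the Iceberg or plugin API) in which an append succeeds only if the caller's base metadata is still the current one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical in-memory stand-in for a catalog; it does not use the Iceberg API.
final class OptimisticCatalogSketch {

    /** Immutable snapshot of table metadata: a version and the cataloged files. */
    static final class Metadata {
        final long version;
        final List<String> files;

        Metadata(long version, List<String> files) {
            this.version = version;
            this.files = Collections.unmodifiableList(new ArrayList<>(files));
        }
    }

    private final AtomicReference<Metadata> current =
            new AtomicReference<>(new Metadata(0, new ArrayList<>()));

    /** Returns the latest committed metadata. */
    Metadata load() {
        return current.get();
    }

    /**
     * Tries to append files on top of the metadata the caller last loaded.
     * Returns false when another writer has committed since then, because
     * the caller's base is no longer the current metadata.
     */
    boolean tryAppend(Metadata base, String... newFiles) {
        List<String> merged = new ArrayList<>(base.files);
        merged.addAll(Arrays.asList(newFiles));
        Metadata next = new Metadata(base.version + 1, merged);
        // Atomic swap: succeeds only if nobody replaced the metadata in between.
        return current.compareAndSet(base, next);
    }

    public static void main(String[] args) {
        OptimisticCatalogSketch catalog = new OptimisticCatalogSketch();
        Metadata seenByA = catalog.load();
        Metadata seenByB = catalog.load();

        // Worker A commits first, so its append succeeds.
        System.out.println(catalog.tryAppend(seenByA, "a.parquet"));
        // Worker B still holds the old metadata, so its append is rejected.
        System.out.println(catalog.tryAppend(seenByB, "b.parquet"));
        // After reloading the metadata, worker B's retry succeeds.
        System.out.println(catalog.tryAppend(catalog.load(), "b.parquet"));
    }
}
```

With a single worker the rejected second append never occurs, which is why the note recommends one writer per table.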
Note: Iceberg table updates should be triggered only after a new file has been successfully written to the designated storage system (e.g., S3, GCS, Azure Blob, or HDFS). The update can run in the background as a slow path.
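The ordering described above (write the file first, catalog it later, and group pending files into one commit) can be sketched as follows. This is a hypothetical illustration, not the sink's implementation: `BackgroundCommitSketch`, `onFileWritten`, and `commitPending` are invented names, and each commit is simulated as a list of paths instead of a real catalog call.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: files are queued once written and cataloged later in batches.
final class BackgroundCommitSketch {

    // Paths of files that are fully written to storage but not yet cataloged.
    private final BlockingQueue<String> written = new LinkedBlockingQueue<>();

    // Each inner list stands for one commit against the (simulated) table.
    private final List<List<String>> commits = new ArrayList<>();

    /** Fast path: call only after the file has been written successfully. */
    void onFileWritten(String path) {
        written.add(path);
    }

    /** Slow path: drains every pending file and records them as a single commit. */
    synchronized int commitPending() {
        List<String> batch = new ArrayList<>();
        written.drainTo(batch);
        if (!batch.isEmpty()) {
            commits.add(Collections.unmodifiableList(batch));
        }
        return batch.size();
    }

    /** Number of commits recorded so far. */
    synchronized int commitCount() {
        return commits.size();
    }
}
```

In a real pipeline `commitPending` would typically be invoked by one background thread, so the write path never waits on the catalog and the table still has a single writer.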
.consumeFromSource(...)
.terminateWithSink(IcebergTableSinkConfigBuilder.builder()
    .withCatalogReferenceName(value)
    .withTableIdentifier(value)
    .withTableSchema(value)
    .withPartitionSpec(value)
    .withRetry(builder -> builder
        .withRetryCount(value)
    )
    .withExceptionHandler(value)
)
sink:
  iceberg-table:
    catalogReferenceName: value
    tableIdentifier: value
    tableSchema: value
    partitionSpec: value
    retry:
      retryCount: value
    exceptionHandler: value
Java dependency management
Add this declaration to your dependency management system to access the configuration DSL for this plugin in Java.
<dependency>
    <groupId>org.voltdb</groupId>
    <artifactId>volt-stream-plugin-iceberg-api</artifactId>
    <version>1.0-20250910-124207-release-1.5.3</version>
</dependency>
implementation group: 'org.voltdb', name: 'volt-stream-plugin-iceberg-api', version: '1.0-20250910-124207-release-1.5.3'
Properties
catalogReferenceName
Reference name of the configured Iceberg catalog. Required.
Type: string
tableIdentifier
Specifies the table's identifier in the Iceberg catalog. Required.
Type: object
tableSchema
Specifies the table's schema. Required.
Type: object
partitionSpec
Specifies the table's partitioning. Defaults to unpartitioned.
Type: object
retry
Type: object
Fields of retry:
retry.retryCount
How many times to retry before giving up.
Type: number
Default value: 3
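As a rough illustration of what a retry count means, the sketch below runs an action once and then retries it up to `retryCount` more times. Whether the sink counts the first attempt separately is an assumption made here, and `RetrySketch` and `runWithRetry` are hypothetical names rather than plugin API.

```java
import java.util.function.BooleanSupplier;

// Hypothetical retry helper; assumes retryCount excludes the initial attempt.
final class RetrySketch {

    /**
     * Runs the action until it reports success or the retries are exhausted.
     * Returns the number of attempts used, or -1 if every attempt failed.
     */
    static int runWithRetry(int retryCount, BooleanSupplier action) {
        int maxAttempts = retryCount + 1; // one initial attempt plus the retries
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (action.getAsBoolean()) {
                return attempt;
            }
        }
        return -1; // gave up
    }
}
```

Under this assumption the default of 3 allows at most four attempts before the error is handed on.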
exceptionHandler
Custom exception handler enabling interception of all errors related to this sink.
Type: object