
Onnx

Runs ONNX model inference for general machine learning tasks. This processor can either use an existing model reference defined in the resources section or create a new model instance using the provided URI.

The onnx processor accepts float arrays (tensors) and uses the inputTensorName setting to present them to the model as a named input.
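Multidimensional inputs therefore need to be flattened into a single float array before they are handed to the processor. A minimal, generic sketch of that preparation step (the processor API itself is not shown here; row-major ordering, the ONNX tensor convention, is assumed):

```java
import java.util.Arrays;

// Sketch: flatten a 2D feature matrix into the flat float[] shape that
// tensor inputs conventionally use (row-major order).
public class TensorFlatten {
    static float[] flatten(float[][] matrix) {
        int rows = matrix.length, cols = matrix[0].length;
        float[] out = new float[rows * cols];
        for (int r = 0; r < rows; r++) {
            // Copy row r into its row-major slot of the output array.
            System.arraycopy(matrix[r], 0, out, r * cols, cols);
        }
        return out;
    }

    public static void main(String[] args) {
        float[][] features = {{1f, 2f}, {3f, 4f}};
        System.out.println(Arrays.toString(flatten(features)));
        // prints [1.0, 2.0, 3.0, 4.0]
    }
}
```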

The inference result is post-processed to convert ONNX-specific data types (e.g. OnnxSequence and OnnxMap) to their Java equivalents.

Outputs

The processing result is encapsulated in an InferenceResult object that contains the original input tensor and the output tensor as an array of floating-point numbers.

Models

Supports standard ONNX models (.onnx files) for various machine learning tasks including classification, regression, and anomaly detection.

Downloads

The modelUri can point to a location on a local disk (using the file:// scheme) or in remote storage. Remote storage support depends on the available plugins; for example, the S3 plugin allows downloads from an S3-compatible bucket.
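The two URI styles can be sketched as below; the paths and the "s3-models-storage" resource name are placeholders, not defaults:

```yaml
processor:
  onnx:
    # Local file on disk:
    modelUri: "file:///opt/models/classifier.onnx"
    # Or, with the S3 plugin and a resource named "s3-models-storage":
    # modelUri: "s3-models-storage://models/classifier.onnx"
    inputTensorName: "input"
```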

.processWith(OnnxProcessorConfigBuilder.builder()
    .withModelRef(value)
    .withModelUri(value)
    .withInputTensorName(value)
    .withPrintDownloadProgress(value)
    .withCache(builder -> builder
        .withDirectory(value)
        .withMaxCacheSize(value)
        .withExpirationTime(value)
        .withCleanupOnStart(value)
    )
    .withSessionOptions(builder -> builder
        .withIntraOpNumThreads(value)
        .withInterOpNumThreads(value)
        .withGraphOptimizationLevel(value)
        .withExecutionMode(value)
        .withMemoryPatternOptimization(value)
        .withEnableCpuMemArena(value)
        .withProfilerFilePath(value)
        .withEnableCuda(value)
        .withCudaDeviceId(value)
        .withCudaExecutionProviderOptions(value)
        .withEnableCpu(value)
        .withEnableMemoryReuse(value)
        .withLogSeverityLevel(value)
        .withLogId(value)
    )
)
processor:
  onnx:
    modelRef: value
    modelUri: value
    inputTensorName: value
    printDownloadProgress: value
    cache:
      directory: value
      maxCacheSize: value
      expirationTime: value
      cleanupOnStart: value
    sessionOptions:
      intraOpNumThreads: value
      interOpNumThreads: value
      graphOptimizationLevel: value
      executionMode: value
      memoryPatternOptimization: value
      enableCpuMemArena: value
      profilerFilePath: value
      enableCuda: value
      cudaDeviceId: value
      cudaExecutionProviderOptions: value
      enableCpu: value
      enableMemoryReuse: value
      logSeverityLevel: value
      logId: value

Java dependency management

Add this declaration to your dependency management system to access the configuration DSL for this plugin in Java.

<dependency>
    <groupId>org.voltdb</groupId>
    <artifactId>volt-stream-plugin-onnx-api</artifactId>
    <version>1.0-20250910-124207-release-1.5.3</version>
</dependency>
implementation group: 'org.voltdb', name: 'volt-stream-plugin-onnx-api', version: '1.0-20250910-124207-release-1.5.3'

Properties

modelRef

Reference to an existing ONNX model resource. If specified, modelUri is ignored. Type: string

modelUri

URI to the ONNX model file. Required if modelRef is not specified. Type: string

inputTensorName

Name of the input tensor in the ONNX model. Required.

Type: string

printDownloadProgress

Whether to display progress information during model file downloads. Type: boolean

Default value: false

cache

This configuration controls how model files are cached locally, including the cache location, size limits, expiration policy, and cleanup behavior. If not provided, files are cached in the /tmp directory.

Type: object

Fields of cache:

cache.directory

Directory where files will be cached. If not specified, a temporary directory will be created. Type: string

cache.maxCacheSize

Maximum size of the cache in bytes. Files will be evicted when the cache exceeds this size. Use 0 for unlimited. Type: number

Default value: 0

cache.expirationTime

Duration after which cached files are considered stale and will not be used by the system. Type: object

cache.cleanupOnStart

Whether to clean up expired or invalid cache entries when the cache is initialized. Type: boolean

Default value: false

sessionOptions

Configuration options for OrtSession.SessionOptions. Type: object

Fields of sessionOptions:

sessionOptions.intraOpNumThreads

Number of threads used to parallelize the execution within nodes. Type: number

sessionOptions.interOpNumThreads

Number of threads used to parallelize the execution of the graph (across nodes). Type: number

sessionOptions.graphOptimizationLevel

Graph optimization level enum. Allowed values are: NO_OPT - disable all optimizations; BASIC_OPT - enable basic optimizations; EXTENDED_OPT - enable extended optimizations; ALL_OPT - enable all available optimizations. Type: object

Supported values: no_opt, basic_opt, extended_opt, all_opt.

Default value: ALL_OPT

sessionOptions.executionMode

Execution mode for running the model graph. Type: object

Supported values: sequential, parallel.

Default value: PARALLEL
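The threading and execution-mode settings interact: in ONNX Runtime, inter-op threads are only used when the execution mode is parallel. An illustrative tuning fragment (the values here are examples, not recommendations):

```yaml
processor:
  onnx:
    sessionOptions:
      executionMode: parallel   # inter-op parallelism only applies in this mode
      intraOpNumThreads: 4      # threads used within a single node
      interOpNumThreads: 2      # threads used across independent nodes
      graphOptimizationLevel: all_opt
```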

sessionOptions.memoryPatternOptimization

Enable memory pattern optimization. Type: boolean

Default value: true

sessionOptions.enableCpuMemArena

Enable the CPU memory arena. Type: boolean

Default value: true

sessionOptions.profilerFilePath

Path of the file to write profiling information to. Setting this enables profiling. Type: object

sessionOptions.enableCuda

Enable CUDA execution provider. Type: boolean

Default value: false

sessionOptions.cudaDeviceId

CUDA device ID to use. Type: number

Default value: 0

sessionOptions.cudaExecutionProviderOptions

CUDA execution provider options as key-value pairs. Type: object
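These pairs are handed through to ONNX Runtime's CUDA execution provider. A hedged sketch is shown below; the option keys and values come from ONNX Runtime's CUDA provider options and are examples only, not defaults of this processor:

```yaml
processor:
  onnx:
    sessionOptions:
      enableCuda: true
      cudaDeviceId: 0
      cudaExecutionProviderOptions:
        arena_extend_strategy: "kSameAsRequested"
        cudnn_conv_algo_search: "DEFAULT"
```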

sessionOptions.enableCpu

Enable the CPU execution provider. Type: boolean

Default value: true

sessionOptions.enableMemoryReuse

Enable memory reuse. Type: boolean

Default value: true

sessionOptions.logSeverityLevel

Log severity level. 0 = verbose, 1 = info, 2 = warning, 3 = error, 4 = fatal. Type: object

sessionOptions.logId

Identifier attached to log messages produced by this session. Type: object

Usage Examples

version: 1
name: NetworkIntrusionDetection

resources:
  - name: "s3-models-storage"
    s3:
      credentials:
        accessKey: "..."
        secretKey: "..."

source:
  stdin: {}

pipeline:
  processors:
    - onnx:
        modelUri: "s3-models-storage://models/intrusion-detection.onnx"
        inputTensorName: "input"
        printDownloadProgress: true
        cache:
          directory: "/tmp/models/"

sink:
  stdout: {}