Onnx¶
Runs ONNX model inference for general machine learning tasks. This processor can either
use an existing model reference defined in the resources section or
create a new model instance using the provided URI.
The model's input and output tensor names can be inspected using https://netron.app/.
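For illustration, here is a minimal sketch of both options using the Java DSL shown further down this page; the file path is a hypothetical placeholder, and myDeclaredModelResource stands in for however the model resource is obtained from the resources section.

// Option 1: instantiate a new model from a URI (illustrative local path).
OnnxProcessorConfigBuilder.builder()
        .withModelUri("file:///opt/models/classifier.onnx");

// Option 2: reuse an existing model resource (hypothetical reference).
OnnxProcessorConfigBuilder.builder()
        .withModelResource(myDeclaredModelResource);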
Input¶
The onnx processor accepts float arrays (tensors) and uses the inputTensorName
setting to present them to the model as a named input.
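As a minimal sketch (the complete pipeline example at the end of this page shows the full flow), an input record implementing InferenceRequest exposes its float array to the processor; the record name here is hypothetical.

// Hypothetical feature record; getInputVector() supplies the values that the
// processor feeds to the model under the configured inputTensorName.
public record Features(float[] values) implements InferenceRequest {
    @Override
    public float[] getInputVector() {
        return values();
    }
}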
Outputs¶
The inference result is post-processed to convert ONNX-specific data types (e.g. OnnxSequence) to plain Java values.
Note that most models already output data as a map of named tensors and rarely return an OnnxMap as a single field.
The processing result is encapsulated in an InferenceResult object that contains the original input request
and the output values.
When outputTensorNames is provided, the results are filtered by those names, which improves performance:
ONNX Runtime stores result data off-heap, and accessing each result value comes at a cost.
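For example, a downstream processor can read the filtered output map, as in this sketch based on the accessors used in the full example at the end of this page; the tensor name and element type are illustrative.

.processWith(result -> {
    // Output values are keyed by tensor name; when outputTensorNames is set,
    // only the listed tensors are materialized here.
    Map<String, Object> output = result.outputTensor();
    float[] scores = (float[]) output.get("numbers_out"); // illustrative name and type
    return scores[0];
})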
Models¶
Supports standard ONNX models (.onnx files) for various machine learning tasks including
classification, regression, and anomaly detection.
Downloads¶
The modelUri can point to a location on local disk (using the file:// scheme) or to remote storage.
Remote storage support depends on the available plugins; for example, the S3 plugin allows downloads from an S3-compatible bucket.
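For example, a local model and a remote model served through the S3 plugin could be configured as in this sketch; the paths are illustrative and s3-models-storage refers to the resource defined in the usage example below.

// Local model file, no download involved.
OnnxProcessorConfigBuilder.builder()
        .withModelUri("file:///opt/models/intrusion-detection.onnx");

// Remote model fetched through the S3 plugin, reporting download progress.
OnnxProcessorConfigBuilder.builder()
        .withModelUri("s3-models-storage://models/intrusion-detection.onnx")
        .withPrintDownloadProgress(true);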
.processWith(OnnxProcessorConfigBuilder.builder()
.withModelResource(value)
.withModelUri(value)
.withInputTensorName(value)
.withOutputTensorNames(value)
.withPrintDownloadProgress(value)
.withCache(builder -> builder
.withDirectory(value)
.withMaxCacheSize(value)
.withExpirationTime(value)
.withCleanupOnStart(value)
)
.withSessionOptions(builder -> builder
.withIntraOpNumThreads(value)
.withInterOpNumThreads(value)
.withGraphOptimizationLevel(value)
.withExecutionMode(value)
.withMemoryPatternOptimization(value)
.withEnableCpuMemArena(value)
.withProfilerFilePath(value)
.withEnableCuda(value)
.withCudaDeviceId(value)
.withCudaExecutionProviderOptions(value)
.withEnableCpu(value)
.withEnableMemoryReuse(value)
.withLogSeverityLevel(value)
.withLogId(value)
.withOnnxSessionConfigurator(value)
)
)
processor:
  onnx:
    modelResource: value
    modelUri: value
    inputTensorName: value
    outputTensorNames: value
    printDownloadProgress: value
    cache:
      directory: value
      maxCacheSize: value
      expirationTime: value
      cleanupOnStart: value
    sessionOptions:
      intraOpNumThreads: value
      interOpNumThreads: value
      graphOptimizationLevel: value
      executionMode: value
      memoryPatternOptimization: value
      enableCpuMemArena: value
      profilerFilePath: value
      enableCuda: value
      cudaDeviceId: value
      cudaExecutionProviderOptions: value
      enableCpu: value
      enableMemoryReuse: value
      logSeverityLevel: value
      logId: value
      onnxSessionConfigurator: value
Java dependency management¶
Add this declaration to your dependency management system to access the configuration DSL for this plugin in Java.
<dependency>
<groupId>org.voltdb</groupId>
<artifactId>volt-stream-plugin-onnx-api</artifactId>
<version>1.6.0</version>
</dependency>
implementation group: 'org.voltdb', name: 'volt-stream-plugin-onnx-api', version: '1.6.0'
Properties¶
modelResource¶
Reference to an existing ONNX model resource. If specified, modelUri is ignored.
Type: object
modelUri¶
URI to the ONNX model file. Required if modelResource is not specified.
Type: string
inputTensorName¶
Name of the input tensor in the ONNX model. Required.
Type: string
outputTensorNames¶
Specifies the names of output tensors to retrieve from the ONNX model.
When provided, only the tensors matching these names are included in the results.
If omitted, all available output tensors are passed to the next operator.
Type: array
printDownloadProgress¶
Whether to display progress information during model file downloads.
Type: boolean
Default value: false
cache¶
This configuration controls how model files are cached locally, including the cache location,
size limits, expiration policy, and cleanup behavior. If not provided, files are cached in the /tmp directory.
A configuration sketch follows the field descriptions below.
Type: object
Fields of cache:
cache.directory¶
Directory where files will be cached. If not specified, a temporary directory will be created.
Type: string
cache.maxCacheSize¶
Maximum size of the cache in bytes. Files will be evicted when the cache exceeds this size. Use 0 for unlimited.
Type: number
Default value: 0
cache.expirationTime¶
Duration after which cached files are considered stale and will not be used by the system.
Type: object
cache.cleanupOnStart¶
Whether to clean up expired or invalid cache entries when the cache is initialized.
Type: boolean
Default value: false
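Putting these fields together, a minimal sketch using the Java DSL; the values are illustrative, and expirationTime is left out because its exact value format is schema-specific.

OnnxProcessorConfigBuilder.builder()
        .withModelUri("s3-models-storage://models/intrusion-detection.onnx")
        .withCache(builder -> builder
                .withDirectory("/tmp/models/")      // keep downloads across restarts
                .withMaxCacheSize(1_073_741_824L)   // evict once the cache exceeds ~1 GiB
                .withCleanupOnStart(true));         // drop expired or invalid entries at startup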
sessionOptions¶
Configuration options for OrtSession.SessionOptions. A tuning sketch follows the field descriptions below.
Type: object
Fields of sessionOptions:
sessionOptions.intraOpNumThreads¶
Number of threads used to parallelize the execution within nodes.
Type: number
sessionOptions.interOpNumThreads¶
Number of threads used to parallelize the execution of the graph (across nodes).
Type: number
sessionOptions.graphOptimizationLevel¶
Optimization level enum. Allowed values are:
- NO_OPT - disable all optimizations
- BASIC_OPT - enable basic optimizations
- EXTENDED_OPT - enable basic and extended optimizations
- ALL_OPT - enable all available optimizations
Type: object
Supported values: no_opt, basic_opt, extended_opt, all_opt.
Default value: ALL_OPT
sessionOptions.executionMode¶
Execution mode. Supported values are: sequential, parallel.
Type: object
Supported values: sequential, parallel.
Default value: PARALLEL
sessionOptions.memoryPatternOptimization¶
Enable memory pattern optimization.
Type: boolean
Default value: true
sessionOptions.enableCpuMemArena¶
Enable CPU memory arena. Default is true.
Type: boolean
Default value: true
sessionOptions.profilerFilePath¶
The file to write profile information to. Enables profiling.
Type: string
sessionOptions.enableCuda¶
Enable CUDA execution provider.
Type: boolean
Default value: false
sessionOptions.cudaDeviceId¶
CUDA device ID to use.
Type: number
Default value: 0
sessionOptions.cudaExecutionProviderOptions¶
CUDA execution provider options as key-value pairs.
Type: object
sessionOptions.enableCpu¶
Enable CPU execution provider. Default is true.
Type: boolean
Default value: true
sessionOptions.enableMemoryReuse¶
Enable memory reuse. Default is true.
Type: boolean
Default value: true
sessionOptions.logSeverityLevel¶
Log severity level. 0 = verbose, 1 = info, 2 = warning, 3 = error, 4 = fatal.
Type: number
sessionOptions.logId¶
Log ID.
Type: string
sessionOptions.onnxSessionConfigurator¶
This consumer is optional and complements the existing options exposed by the VoltSp builder.
It enables full configuration of the ONNX session, including memory management, execution behavior, and optimization settings.
Type: object
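A tuning sketch using the session options exposed by the DSL; the values are illustrative and the right settings depend on the model, the hardware, and whether the CUDA provider is available.

OnnxProcessorConfigBuilder.builder()
        .withInputTensorName("float_input")
        .withSessionOptions(builder -> builder
                .withIntraOpNumThreads(4)     // parallelism within a single node
                .withInterOpNumThreads(1)     // parallelism across nodes in the graph
                .withEnableCuda(true)         // offload to the GPU when CUDA is enabled
                .withCudaDeviceId(0)
                .withLogSeverityLevel(2));    // 2 = warning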
JSON Schema¶
You can validate or explore the configuration using its JSON Schema.
Usage Examples¶
version: 1
name: NetworkIntrusionDetection

resources:
  - name: "s3-models-storage"
    s3:
      credentials:
        accessKey: "..."
        secretKey: "..."

source:
  stdin: {}

processors:
  - onnx:
      modelUri: "s3-models-storage://models/intrusion-detection.onnx"
      inputTensorName: "input"
      outputTensorNames:
        - "numbers_out"
      printDownloadProgress: true
      cache:
        directory: "/tmp/models/"

sink:
  stdout: {}
From the pipeline’s perspective, ONNX requires an input request type that implements org.voltdb.stream.plugin.onnx.api.InferenceRequest.
Implementations of InferenceRequest provide access to the underlying tensor values used during model inference.
The example model expects float_input as the input tensor name and, among other outputs, returns output_label as a String[].
Let's examine the pseudo code.
import java.util.Map;

import org.voltdb.stream.plugin.onnx.api.InferenceRequest;

// Domain event carrying the raw feature vector.
public record Event(long id, float[] numbers) {}

// Classification result produced after inference.
public record ClassifiedEvent(long id, String label) {}

// Wraps an Event so the ONNX processor can read its input tensor values.
public record EventInferenceRequest(Event event) implements InferenceRequest {
    @Override
    public float[] getInputVector() {
        return event.numbers();
    }
}

public class EventConsumer implements VoltPipeline {
    @Override
    public void define(VoltStreamBuilder builder) {
        builder
                .consumeFromSource(...)
                // Adapt each event into an InferenceRequest.
                .processWith(event -> new EventInferenceRequest(event))
                // Run ONNX inference, keeping only the "output_label" tensor.
                .processWith(OnnxProcessorConfigBuilder.<EventInferenceRequest>builder()
                        .withInputTensorName("float_input")
                        .withOutputTensorNames("output_label"))
                // Post-process the InferenceResult into a domain object.
                .processWith(result -> {
                    EventInferenceRequest request = result.request();
                    Event event = request.event();
                    Map<String, Object> output = result.outputTensor();
                    String[] ol = (String[]) output.get("output_label");
                    String label = ol[0];
                    return new ClassifiedEvent(event.id(), label);
                })
                ...
    }
}