Onnx¶
Runs ONNX model inference for general machine learning tasks. This processor can either use an existing model reference defined in the resources section or create a new model instance from the provided URI.
The onnx processor accepts float arrays (tensors) and uses the inputTensorName setting to present them to the model as a named input. The inference result is post-processed to convert ONNX-specific data types (e.g. OnnxSequence and OnnxMap) to their Java equivalents.
Outputs¶
The processing result is encapsulated in an InferenceResult object that contains the original input tensor and the output tensor as an array of floating-point numbers.
Models¶
Supports standard ONNX models (.onnx files) for various machine learning tasks, including classification, regression, and anomaly detection.
Downloads¶
The modelUri can point to a location on a local disk (using the file:// scheme) or on remote storage. Remote storage support depends on the available plugins; for example, the S3 plugin allows downloads from an S3-compatible bucket.
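For instance, a local model can be referenced directly with a file:// URI (the path below is illustrative):

```yaml
processor:
  onnx:
    modelUri: "file:///opt/models/detector.onnx"  # hypothetical local path
    inputTensorName: "input"
```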
.processWith(OnnxProcessorConfigBuilder.builder()
    .withModelRef(value)
    .withModelUri(value)
    .withInputTensorName(value)
    .withPrintDownloadProgress(value)
    .withCache(builder -> builder
        .withDirectory(value)
        .withMaxCacheSize(value)
        .withExpirationTime(value)
        .withCleanupOnStart(value)
    )
    .withSessionOptions(builder -> builder
        .withIntraOpNumThreads(value)
        .withInterOpNumThreads(value)
        .withGraphOptimizationLevel(value)
        .withExecutionMode(value)
        .withMemoryPatternOptimization(value)
        .withEnableCpuMemArena(value)
        .withProfilerFilePath(value)
        .withEnableCuda(value)
        .withCudaDeviceId(value)
        .withCudaExecutionProviderOptions(value)
        .withEnableCpu(value)
        .withEnableMemoryReuse(value)
        .withLogSeverityLevel(value)
        .withLogId(value)
    )
)
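As a sketch, a minimal configuration using only the required settings might look like the following; the model path and tensor name are illustrative and must match your model:

```java
.processWith(OnnxProcessorConfigBuilder.builder()
    .withModelUri("file:///opt/models/detector.onnx") // hypothetical local model
    .withInputTensorName("input")                     // must match the model's input tensor
    .withPrintDownloadProgress(true)
)
```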
processor:
  onnx:
    modelRef: value
    modelUri: value
    inputTensorName: value
    printDownloadProgress: value
    cache:
      directory: value
      maxCacheSize: value
      expirationTime: value
      cleanupOnStart: value
    sessionOptions:
      intraOpNumThreads: value
      interOpNumThreads: value
      graphOptimizationLevel: value
      executionMode: value
      memoryPatternOptimization: value
      enableCpuMemArena: value
      profilerFilePath: value
      enableCuda: value
      cudaDeviceId: value
      cudaExecutionProviderOptions: value
      enableCpu: value
      enableMemoryReuse: value
      logSeverityLevel: value
      logId: value
Java dependency management¶
Add this declaration to your dependency management system to access the configuration DSL for this plugin in Java.
<dependency>
  <groupId>org.voltdb</groupId>
  <artifactId>volt-stream-plugin-onnx-api</artifactId>
  <version>1.0-20250910-124207-release-1.5.3</version>
</dependency>
implementation group: 'org.voltdb', name: 'volt-stream-plugin-onnx-api', version: '1.0-20250910-124207-release-1.5.3'
Properties¶
modelRef¶
Reference to an existing ONNX model resource. If specified, modelUri is ignored.
Type: string
modelUri¶
URI to the ONNX model file. Required if modelRef is not specified.
Type: string
inputTensorName¶
Name of the input tensor in the ONNX model. Required.
Type: string
printDownloadProgress¶
Whether to display progress information during model file downloads.
Type: boolean
Default value: false
cache¶
Controls how model files are cached locally, including the cache location, size limits, expiration policy, and cleanup behavior. If not provided, files are cached in the /tmp directory.
Type: object
Fields of cache:
cache.directory¶
Directory where files will be cached. If not specified, a temporary directory will be created.
Type: string
cache.maxCacheSize¶
Maximum size of the cache in bytes. Files will be evicted when the cache exceeds this size. Use 0 for unlimited.
Type: number
Default value: 0
cache.expirationTime¶
Duration after which cached files are considered stale and will not be used by the system.
Type: object
cache.cleanupOnStart¶
Whether to clean up expired or invalid cache entries when the cache is initialized.
Type: boolean
Default value: false
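Putting the cache fields together, a cache configuration might look like this sketch (the directory path and size are illustrative):

```yaml
cache:
  directory: "/var/cache/onnx-models"  # hypothetical cache location
  maxCacheSize: 1073741824             # 1 GiB; 0 means unlimited
  cleanupOnStart: true                 # purge stale entries at startup
```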
sessionOptions¶
Configuration options for OrtSession.SessionOptions.
Type: object
Fields of sessionOptions:
sessionOptions.intraOpNumThreads¶
Number of threads used to parallelize execution within nodes.
Type: number
sessionOptions.interOpNumThreads¶
Number of threads used to parallelize execution of the graph (across nodes).
Type: number
sessionOptions.graphOptimizationLevel¶
Graph optimization level. Allowed values are:
- NO_OPT - disable all optimizations
- BASIC_OPT - enable basic optimizations
- EXTENDED_OPT - enable basic and extended optimizations
- ALL_OPT - enable all optimizations, including layout optimizations
Type: object
Supported values: no_opt, basic_opt, extended_opt, all_opt.
Default value: ALL_OPT
sessionOptions.executionMode¶
Execution mode.
Type: object
Supported values: sequential, parallel.
Default value: PARALLEL
sessionOptions.memoryPatternOptimization¶
Enable memory pattern optimization.
Type: boolean
Default value: true
sessionOptions.enableCpuMemArena¶
Enable the CPU memory arena.
Type: boolean
Default value: true
sessionOptions.profilerFilePath¶
The file to write profiling information to. Setting this enables profiling.
Type: object
sessionOptions.enableCuda¶
Enable the CUDA execution provider.
Type: boolean
Default value: false
sessionOptions.cudaDeviceId¶
CUDA device ID to use.
Type: number
Default value: 0
sessionOptions.cudaExecutionProviderOptions¶
CUDA execution provider options as key-value pairs.
Type: object
sessionOptions.enableCpu¶
Enable the CPU execution provider.
Type: boolean
Default value: true
sessionOptions.enableMemoryReuse¶
Enable memory reuse.
Type: boolean
Default value: true
sessionOptions.logSeverityLevel¶
Log severity level: 0 = verbose, 1 = info, 2 = warning, 3 = error, 4 = fatal.
Type: object
sessionOptions.logId¶
Log identifier used in session log output.
Type: object
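As an illustration, a CPU-only session tuned for a small host might combine the fields above as follows; the thread counts are assumptions and should be sized to the machine:

```yaml
sessionOptions:
  intraOpNumThreads: 4        # parallelism within a node; illustrative value
  interOpNumThreads: 2        # parallelism across nodes; illustrative value
  graphOptimizationLevel: all_opt
  executionMode: parallel
  enableCuda: false           # stay on the CPU execution provider
```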
Usage Examples¶
version: 1 name: NetworkIntrusionDetection
resources: - name: "s3-models-storage" s3: credentials: accessKey: "..." secretKey: "..."
source: stdin: {}
pipeline: processors: - onnx: modelUri: "s3-models-storage://models/intrusion-detection.onnx" inputTensorName: "input" printDownloadProgress: true cache: directory: "/tmp/models/"
sink: stdout: {}