Custom operators

ONNX Runtime provides options to run custom operators that are not official ONNX operators. The contrib ops domain contains some common non-official ops; however, adding operators there is not recommended, because doing so increases the binary size of the core runtime package.

Register a custom operator

A new op can be registered with ONNX Runtime using the Custom Operator API (onnxruntime_c_api.h):

  1. Create an OrtCustomOpDomain with the domain name used by the custom ops
  2. Create an OrtCustomOp structure for each op and add them to the OrtCustomOpDomain with OrtCustomOpDomain_Add
  3. Call OrtAddCustomOpDomain to add the custom domain of ops to the session options

Examples

CUDA custom ops

When a model is run on GPU, ONNX Runtime inserts a MemcpyToHost op before a CPU custom op and a MemcpyFromHost op after it, so that the tensors are accessible throughout the call. No extra effort is required from the custom op developer in this case.

When using CUDA custom ops, ORT’s CUDA kernels and the custom CUDA kernels must all use the same CUDA compute stream to stay synchronized. To ensure this, first create a CUDA stream and pass it to the underlying Session via SessionOptions (using the OrtCUDAProviderOptions struct). ORT’s CUDA kernels will then use that stream, and if the custom CUDA kernels are launched on the same stream, synchronization is taken care of implicitly.

For a sample, see how MyCustomOp is launched and how the Session that uses this custom op is created.