ONNX Runtime Mobile Custom Build
Creating a custom ‘minimal’ build of ONNX Runtime gives you control over what is included in order to minimize the binary size whilst satisfying the needs of your scenario.
The configuration file that was generated during model conversion is used to specify the operators (and potentially the types) that the build will support.
The general ONNX Runtime inferencing build instructions apply, with additional options being specified to reduce the binary size.
Contents
- Binary size reduction options
- Build Configuration
- Example build commands
- Building ONNX Runtime Python Wheel
Binary size reduction options
The following options can be used to reduce the build size:
Enable the minimal build
--minimal_build [REQUIRED]
- A minimal build will ONLY support loading and executing ORT format models.
- RTTI is disabled by default in this build, unless the Python bindings (--build_wheel) are enabled.
- If you wish to enable execution providers that compile kernels, such as NNAPI or CoreML, specify --minimal_build extended.
  - See the documentation on using NNAPI and CoreML with ONNX Runtime Mobile for details.
Reduce build to required operator kernels
--include_ops_by_config [REQUIRED]
- Add --include_ops_by_config <config file produced during model conversion> --skip_tests to the build parameters.
- See the documentation on the Reduced Operator Kernel build for more information on how this works.
- NOTE: Building will edit some of the ONNX Runtime source files to exclude unused kernels. If you wish to go back to creating a full build, or wish to change the operator kernels included, you MUST run
  git reset --hard
  or
  git checkout HEAD -- ./onnxruntime/core/providers
  from the root directory of your local ONNX Runtime repository to undo these changes.
Reduce types supported by the required operators
--enable_reduced_operator_type_support [OPTIONAL]
- Enables operator type reduction.
- NOTE: Requires ONNX Runtime version 1.7 or higher, and type reduction must have been enabled during model conversion.
Disable exceptions
--disable_exceptions [OPTIONAL]
- Disables support for exceptions in the build.
- Any locations that would have thrown an exception will instead log the error message and call abort().
- Requires --minimal_build.
- NOTE: This is not a valid option if you need the Python bindings (--build_wheel), as the Python Wheel requires exceptions to be enabled.
- Exceptions are only used in ONNX Runtime for exceptional conditions. If you have validated the input to be used, and validated that the model can be loaded, it is unlikely that ORT would throw an exception unless there is a system-level issue (e.g. out of memory).
Disable ML operator support
--disable_ml_ops [OPTIONAL]
- Whilst the operator kernel reduction script will disable all unused ML operator kernels, additional savings can be achieved by removing support for ML specific types. If you know that your model has no ML ops, or no ML ops that use the Map type, this flag can be provided.
- See the specs for the ONNX ML Operators if unsure.
Use shared libc++ on Android
--android_cpp_shared [OPTIONAL]
- Building using the shared libc++ library instead of the default static libc++ library will result in a smaller libonnxruntime.so library.
- See Android NDK documentation for more information.
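The rules above (which flags are required, and which combinations are invalid) can be sketched as a small helper that assembles the argument list. This is only an illustration: `size_reduction_flags` and the config filename used in the usage note are hypothetical and not part of ONNX Runtime; only the flag names come from the options listed above.

```python
from typing import List


def size_reduction_flags(
    ops_config: str,
    extended: bool = False,
    reduce_types: bool = False,
    disable_exceptions: bool = False,
    disable_ml_ops: bool = False,
    android_cpp_shared: bool = False,
    build_wheel: bool = False,
) -> List[str]:
    """Assemble size-reduction flags (hypothetical helper, not an ONNX Runtime API)."""
    # The Python Wheel requires exceptions, so these two options conflict.
    if disable_exceptions and build_wheel:
        raise ValueError("--disable_exceptions cannot be combined with --build_wheel")

    # --minimal_build is always emitted, which also satisfies the
    # "--disable_exceptions requires --minimal_build" rule.
    flags = ["--minimal_build"]
    if extended:
        flags.append("extended")  # needed for NNAPI / CoreML

    # Required: restrict the build to the kernels listed in the config file.
    flags += ["--include_ops_by_config", ops_config, "--skip_tests"]

    if reduce_types:
        flags.append("--enable_reduced_operator_type_support")
    if disable_exceptions:
        flags.append("--disable_exceptions")
    if disable_ml_ops:
        flags.append("--disable_ml_ops")
    if android_cpp_shared:
        flags.append("--android_cpp_shared")
    if build_wheel:
        flags.append("--build_wheel")
    return flags
```

For example, `size_reduction_flags("required_operators.config", disable_exceptions=True, disable_ml_ops=True)` yields the size-related portion of the example build commands, where `required_operators.config` stands in for whatever file your model conversion produced.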
Build Configuration
The MinSizeRel configuration will produce the smallest binary size.
The Release configuration can also be used if you wish to prioritize performance over binary size.
Example build commands
Windows
From the <ONNX Runtime repository root> directory:
.\build.bat --config=MinSizeRel --cmake_generator="Visual Studio 16 2019" --build_shared_lib --minimal_build --disable_ml_ops --disable_exceptions --include_ops_by_config <config file from model conversion> --skip_tests
Linux
From the <ONNX Runtime repository root> directory:
./build.sh --config=MinSizeRel --build_shared_lib --minimal_build --disable_ml_ops --disable_exceptions --include_ops_by_config <config file from model conversion> --skip_tests
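If the build is driven from a script, the two commands above differ only in the launcher and the CMake generator. A minimal sketch, assuming a hypothetical helper named `example_build_command`; the flags are copied from the example commands and nothing is executed here:

```python
import platform
from typing import List, Optional


def example_build_command(ops_config: str, system: Optional[str] = None) -> List[str]:
    """Return the example MinSizeRel build command as an argument list.

    Hypothetical helper: it only mirrors the Windows and Linux examples above.
    """
    system = system or platform.system()
    if system == "Windows":
        # The Windows example also selects a Visual Studio generator.
        cmd = [".\\build.bat", "--config", "MinSizeRel",
               "--cmake_generator", "Visual Studio 16 2019"]
    else:
        cmd = ["./build.sh", "--config", "MinSizeRel"]

    # Flags shared by both example commands.
    cmd += [
        "--build_shared_lib",
        "--minimal_build",
        "--disable_ml_ops",
        "--disable_exceptions",
        "--include_ops_by_config", ops_config,
        "--skip_tests",
    ]
    return cmd
```

The resulting list is meant to be run with the ONNX Runtime repository root as the working directory, matching the commands above.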
Building ONNX Runtime Python Wheel
If you wish to use the ONNX Runtime Python bindings with a minimal build, exceptions must be enabled because the Python bindings require them.
Remove --disable_exceptions and add --build_wheel to the build command in order to build a Python Wheel with the ONNX Runtime bindings.
A .whl file will be produced in the build output directory under the <config>/dist folder.
- The Python Wheel for a Windows MinSizeRel build using build.bat would be in
<ONNX Runtime repository root>\build\Windows\MinSizeRel\MinSizeRel\dist\
- The Python Wheel for a Linux MinSizeRel build using build.sh would be in
<ONNX Runtime repository root>/build/Linux/MinSizeRel/dist/
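Note that the configuration name appears twice in the Windows path but only once in the Linux path. A small sketch of that layout, assuming a hypothetical helper `wheel_dist_dir` that covers only the two platforms listed above:

```python
from pathlib import PurePosixPath, PureWindowsPath
from typing import Union


def wheel_dist_dir(
    repo_root: str, system: str, config: str = "MinSizeRel"
) -> Union[PureWindowsPath, PurePosixPath]:
    """Return the folder expected to contain the built .whl file.

    Hypothetical helper; only the Windows and Linux layouts shown above
    are handled.
    """
    if system == "Windows":
        # The config name is repeated in the Windows output path.
        return PureWindowsPath(repo_root) / "build" / "Windows" / config / config / "dist"
    if system == "Linux":
        return PurePosixPath(repo_root) / "build" / "Linux" / config / "dist"
    raise ValueError(f"layout not documented here for system: {system}")
```

The same function applies to a Release build by passing `config="Release"`, on the assumption that the output layout follows the same pattern.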
The wheel can be installed using pip. Adjust the following command for your platform and the .whl filename.
pip install -U .\build\Windows\MinSizeRel\MinSizeRel\dist\onnxruntime-1.7.0-cp37-cp37m-win_amd64.whl