Accelerate PyTorch model inferencing
ONNX Runtime can be used to accelerate inference for PyTorch models.
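As a quick illustration of the end state (a PyTorch model already exported to ONNX and served with ONNX Runtime), here is a minimal Python sketch; the file name `model.onnx` and the input shape are placeholders, not part of any tutorial below.

```python
# Minimal sketch: run an already-exported ONNX model with ONNX Runtime.
# "model.onnx" and the (1, 3, 224, 224) input shape are illustrative placeholders.
import numpy as np
import onnxruntime as ort

# Create an inference session on the default CPU execution provider.
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

# Feed a dummy input under the graph's declared input name.
input_name = session.get_inputs()[0].name
dummy_input = np.random.randn(1, 3, 224, 224).astype(np.float32)

# run(None, ...) returns all model outputs as NumPy arrays.
outputs = session.run(None, {input_name: dummy_input})
print(outputs[0].shape)
```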
Convert model to ONNX
- Basic PyTorch export through torch.onnx (a minimal export sketch follows this list)
- Super-resolution with ONNX Runtime
- Export PyTorch model with custom ops
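As a rough sketch of the basic `torch.onnx` export path listed above, the following exports a model to ONNX; `torchvision`'s `resnet18`, the opset version, and the file name are illustrative choices rather than anything the tutorials prescribe.

```python
# Minimal sketch: export a PyTorch model to ONNX via torch.onnx.export.
# resnet18, opset 17, and "resnet18.onnx" are illustrative choices.
import torch
import torchvision.models as models

model = models.resnet18(weights=None).eval()
dummy_input = torch.randn(1, 3, 224, 224)

torch.onnx.export(
    model,
    dummy_input,
    "resnet18.onnx",
    input_names=["input"],
    output_names=["output"],
    # Let the batch dimension vary at inference time.
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
    opset_version=17,
)
```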
Accelerate PyTorch model inferencing
BERT
- Accelerate BERT model on CPU
- Accelerate BERT model on GPU (an execution provider sketch follows this list)
- Accelerate reduced size BERT model through quantization (a quantization sketch follows this list)
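For the CPU and GPU tutorials above, the device is chosen through ONNX Runtime execution providers rather than through model changes. A minimal sketch, assuming a BERT model has already been exported to `bert.onnx` (the file name and sequence length are placeholders):

```python
# Minimal sketch: run an exported BERT model on GPU when available, CPU otherwise.
# "bert.onnx" and the (1, 128) dummy shapes are illustrative placeholders.
import numpy as np
import onnxruntime as ort

# Providers are tried in order: CUDA first, then the CPU fallback.
providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]
session = ort.InferenceSession("bert.onnx", providers=providers)

# Feed dummy int64 tensors for whatever inputs the exported graph declares
# (typically input_ids / attention_mask / token_type_ids for BERT exports).
feed = {inp.name: np.ones((1, 128), dtype=np.int64) for inp in session.get_inputs()}

outputs = session.run(None, feed)
print(outputs[0].shape)
```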
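The quantization tutorial relies on ONNX Runtime's post-training quantization tooling; a minimal sketch using dynamic quantization (the file names are placeholders):

```python
# Minimal sketch: shrink an exported BERT model with post-training dynamic
# quantization. Weights are stored as 8-bit integers, reducing model size and
# typically speeding up CPU inference. File names are placeholders.
from onnxruntime.quantization import QuantType, quantize_dynamic

quantize_dynamic(
    model_input="bert.onnx",
    model_output="bert.quant.onnx",
    weight_type=QuantType.QInt8,
)
```

The quantized model loads with the same `InferenceSession` API as the full-precision one.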