Accelerate Hugging Face models
ONNX Runtime can accelerate training and inferencing of popular Hugging Face NLP models.
Accelerate Hugging Face model inferencing
- General export and inference: Hugging Face Transformers (a minimal sketch follows this list)
- Accelerate GPT2 model on CPU
- Accelerate BERT model on CPU
- Accelerate BERT model on GPU
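
The tutorials above walk through each scenario in detail. As a quick orientation, the sketch below shows the general export-and-inference flow they describe: export a Transformers model to ONNX with PyTorch, then run it with an ONNX Runtime InferenceSession on CPU. The `bert-base-uncased` checkpoint, output file name, and opset version here are illustrative choices, not fixed requirements from the tutorials.

```python
import torch
import onnxruntime as ort
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Example checkpoint; any compatible Hugging Face model can be substituted.
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# return_dict=False makes the model return plain tuples, which trace cleanly.
model = AutoModelForSequenceClassification.from_pretrained(model_name, return_dict=False)
model.eval()

# Trace and export the PyTorch model to ONNX with dynamic batch/sequence axes.
sample = tokenizer("Hello, ONNX Runtime!", return_tensors="pt")
torch.onnx.export(
    model,
    (sample["input_ids"], sample["attention_mask"]),
    "model.onnx",
    input_names=["input_ids", "attention_mask"],
    output_names=["logits"],
    dynamic_axes={
        "input_ids": {0: "batch", 1: "sequence"},
        "attention_mask": {0: "batch", 1: "sequence"},
        "logits": {0: "batch"},
    },
    opset_version=14,
)

# Run the exported model on CPU with ONNX Runtime.
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
outputs = session.run(
    None,
    {
        "input_ids": sample["input_ids"].numpy(),
        "attention_mask": sample["attention_mask"].numpy(),
    },
)
print(outputs[0].shape)  # logits for the single input example
```

Swapping `CPUExecutionProvider` for `CUDAExecutionProvider` (with the onnxruntime-gpu package installed) runs the same exported model on GPU, the scenario the BERT-on-GPU tutorial covers in more detail.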
 
Additional resources
- Blog post: Faster and smaller quantized NLP with Hugging Face and ONNX Runtime
- Blog post: Accelerate your NLP pipelines using Hugging Face Transformers and ONNX Runtime