Overview
ONNX Runtime is a cross-platform accelerator for machine learning models. It supports models originating in frameworks such as PyTorch, TensorFlow, and scikit-learn, and speeds up inference through graph optimizations and hardware-specific execution providers.
Features
- Compatibility: Run models exported in the Open Neural Network Exchange (ONNX) format.
- Hardware Acceleration: Execution providers for CUDA, TensorRT, ROCm, and CoreML, with CPU execution as the default fallback.
- Tiny Runtime: Reduced-size builds for mobile and web environments.
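As a rough illustration of the hardware acceleration feature, the sketch below picks execution providers for a session using the Python API. The preference order, the helper names pick_providers and open_session, and the model path passed in are assumptions made for this example, not part of ONNX Runtime itself. The provider name strings and the calls onnxruntime.get_available_providers() and onnxruntime.InferenceSession(..., providers=...) come from the onnxruntime Python package.

```python
# Illustrative preference order (an assumption, not an official recommendation).
PREFERRED = [
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "ROCMExecutionProvider",
    "CoreMLExecutionProvider",
    "CPUExecutionProvider",
]


def pick_providers(available, preferred=PREFERRED):
    """Return the preferred providers that are actually available, in order."""
    chosen = [name for name in preferred if name in available]
    # Keep the CPU provider as the final fallback so a session can always run.
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")
    return chosen


def open_session(model_path):
    """Open an inference session for model_path (a hypothetical .onnx file)."""
    # Imported lazily so pick_providers can be used without the package installed.
    import onnxruntime as ort

    providers = pick_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)
```

A session created this way would typically be used as session.run(None, {input_name: array}), where input_name can be read from session.get_inputs()[0].name. Listing providers in order lets ONNX Runtime fall back to later entries for operators the earlier ones do not support.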
Use Cases
- Deploying PyTorch models into C++ or C# applications.
- Accelerating inference for real-time edge computing.
- Standardizing model delivery across different engineering teams.