NVIDIA Triton Inference Server
ml-serving server
NVIDIA's open-source inference server for deploying trained ML models from multiple frameworks in production
Pros and Cons
Pros
- High performance, low-latency serving
- Multi-framework support (TensorRT, TensorFlow, PyTorch, ONNX)
- Dynamic batching
- Optimized for NVIDIA GPUs
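Dynamic batching is enabled per model in its `config.pbtxt`. The sketch below shows a minimal configuration for a hypothetical ONNX model (the model name, tensor names, and dims are illustrative assumptions, not from the source); the `dynamic_batching` block lets Triton group individual requests into larger batches server-side:

```protobuf
# config.pbtxt for a hypothetical model "resnet50_onnx"
name: "resnet50_onnx"
platform: "onnxruntime_onnx"
max_batch_size: 32
input [
  {
    name: "input"           # tensor name is an assumption for illustration
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "output"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
dynamic_batching {
  # Prefer forming these batch sizes; wait up to 100 us to fill a batch.
  preferred_batch_size: [ 8, 16, 32 ]
  max_queue_delay_microseconds: 100
}
instance_group [
  { kind: KIND_GPU, count: 1 }   # one model instance on the GPU
]
```

Tuning `max_queue_delay_microseconds` trades a small amount of added latency for larger batches and higher throughput.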
Cons
- Configuration complexity
- Best performance requires NVIDIA GPUs
Use Cases
- Production inference
- Multi-model serving
- ML pipelines
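For the ML-pipelines use case, Triton's ensemble scheduler chains models server-side so intermediate tensors never leave the server. A minimal sketch, assuming a hypothetical two-step pipeline (the model names "preprocess" and "classifier" and all tensor names are illustrative, not from the source):

```protobuf
# config.pbtxt for a hypothetical pipeline "preprocess_and_classify"
name: "preprocess_and_classify"
platform: "ensemble"
max_batch_size: 32
input [
  { name: "RAW_IMAGE", data_type: TYPE_UINT8, dims: [ -1 ] }
]
output [
  { name: "SCORES", data_type: TYPE_FP32, dims: [ 1000 ] }
]
ensemble_scheduling {
  step [
    {
      model_name: "preprocess"     # hypothetical preprocessing model
      model_version: -1            # -1 = latest version
      input_map { key: "INPUT", value: "RAW_IMAGE" }
      output_map { key: "OUTPUT", value: "preprocessed_tensor" }
    },
    {
      model_name: "classifier"     # hypothetical classifier model
      model_version: -1
      input_map { key: "INPUT", value: "preprocessed_tensor" }
      output_map { key: "OUTPUT", value: "SCORES" }
    }
  ]
}
```

Clients call the ensemble as if it were a single model; Triton routes each request through both steps and returns only the final output.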