Triton Inference Server
TWSC provides a pay-as-you-go working environment based on NGC's Triton Inference Server (formerly the TensorRT Inference Server). The server exposes an inference service over an HTTP endpoint, allowing remote clients to request inferencing for any model it manages. The server itself is included in the Triton Inference Server container; additional C++ and Python client libraries and further documentation are available externally on GitHub: Inference Server.
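For example, once a server instance is running, a remote client can send inference requests to its HTTP endpoint using the Python client library. The following is a minimal sketch, assuming the tritonclient package is installed (pip install tritonclient[http]) and a hypothetical model named my_model with one FP32 input INPUT0 of shape [1, 16] and one output OUTPUT0; the model name, tensor names, shapes, and URL are placeholders to adapt to your own deployment.

```python
import numpy as np
import tritonclient.http as httpclient

# Connect to the Triton HTTP endpoint (default port 8000).
client = httpclient.InferenceServerClient(url="localhost:8000")

# Confirm the server and the (hypothetical) model are ready before sending requests.
assert client.is_server_ready()
assert client.is_model_ready("my_model")

# Build the request: one FP32 input tensor filled with random data.
inputs = [httpclient.InferInput("INPUT0", [1, 16], "FP32")]
inputs[0].set_data_from_numpy(np.random.rand(1, 16).astype(np.float32))
outputs = [httpclient.InferRequestedOutput("OUTPUT0")]

# Send the inference request over HTTP and read back the result as a NumPy array.
result = client.infer(model_name="my_model", inputs=inputs, outputs=outputs)
print(result.as_numpy("OUTPUT0"))
```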
Image versions
Container Version | Ubuntu | CUDA Toolkit | Triton Inference Server | TensorRT | cuDNN | TWCC Release Date |
---|---|---|---|---|---|---|
tritonserver-24.12-trtllm-python-py3 | 24.04 | NVIDIA CUDA 12.6.3 | 2.53.0 | TensorRT 10.7.0.23 | - | 16JUN25 |
tritonserver-24.05-trtllm-python-py3 | 22.04 | NVIDIA CUDA 12.4.1 | 2.46.0 | TensorRT 10.0.1.6 | - | 19JUL24 |
tritonserver-22.11-py3 | 20.04 | NVIDIA CUDA 11.8.0 | 2.28.0 | TensorRT 8.5.1 | - | 19JUL24 |
tritonserver-22.08-py3 | 20.04 | NVIDIA CUDA 11.7.1 | 2.25.0 | TensorRT 8.4.2.4 | - | 30SEP22 |
tritonserver-22.05-py3 | 20.04 | NVIDIA CUDA 11.7.0 | 2.22.0 | TensorRT 8.2.5.1 | - | 21JUN22 |
tritonserver-22.02-py3 | 20.04 | NVIDIA CUDA 11.6.0 | 2.19.0 | TensorRT 8.2.3 | - | 18MAY22 |
tritonserver-21.11-py3 | 20.04 | NVIDIA CUDA 11.5.0 | 2.16.0 | TensorRT 8.0.3.4 | 8.3.0.96 | 18MAY22 |
tritonserver-21.08-py3 | 20.04 | NVIDIA CUDA 11.4.1 | 2.13.0 | TensorRT 8.0.1.6 | 8.2.2.6 | 16SEP21 |
tritonserver-21.06-py3 | 20.04 | NVIDIA CUDA 11.3.1 | 2.11.0 | TensorRT 7.2.3.4 | 8.2.1 | 16SEP21 |
tritonserver-21.02-py3 | 20.04 | NVIDIA CUDA 11.2.0 | 2.7.0 | TensorRT 7.2.2.3+cuda11.1.0.024 | 8.0.5 | 12MAY21 |
tensorrtserver-20.02-py3 | 18.04 | NVIDIA CUDA 10.2.89 | 1.12.0 | TensorRT 7.0.0 | 7.6.5 | - |
tensorrtserver-19.02-py3-v1 | 16.04 | NVIDIA CUDA 10.0.130 | 0.11.0 beta | TensorRT 5.0.2 | 7.4.2 | - |
tensorrtserver-18.12-py3-v1 | 16.04 | NVIDIA CUDA 10.0.130 | 0.9.0 beta | TensorRT 5.0.2 | 7.4.1 | - |
tensorrtserver-18.10-py3-v1 | 16.04 | NVIDIA CUDA 10.0.130 | 0.7.0 beta | TensorRT 5.0.0 RC | 7.3.0 | - |
tensorrtserver-18.10-py2-v1 | 16.04 | NVIDIA CUDA 9.0.176 | 0.5.0 beta | TensorRT 4.0.1 | 7.2.1 | - |
info: py3 and py2 indicate the Python version included in the image (Python 3 and Python 2, respectively).
Detailed package versions
- tritonserver-24.12-py3
- tritonserver-24.05-py3
- tritonserver-22.11-py3
- tritonserver-22.08-py3
- tritonserver-22.05-py3
- tritonserver-22.02-py3
- tritonserver-21.11-py3
- tritonserver-21.08-py3
- tritonserver-21.06-py3
- tritonserver-21.02-py3
- tensorrtserver-20.02-py3
- tensorrtserver-19.02-py3-v1
- tensorrtserver-18.12-py3-v1
- tensorrtserver-18.10-py3-v1
- tensorrtserver-18.08.1-py2/py3-v1