
Releases: triton-inference-server/model_navigator

Triton Model Navigator v0.6.0

30 Jun 16:52
  • new: Zero-copy runners for Torch, ONNX and TensorRT - omit host-to-device (H2D) and device-to-host (D2H) memory copies between runner executions

  • new: nav.package.profile API method to profile generated models on a provided dataloader

  • change: ProfilerConfig replaced with OptimizationProfile:

    • new: OptimizationProfile impacts the TensorRT conversion
    • new: batch_sizes and max_batch_size limit the maximum profile in the TensorRT conversion
    • new: Allow providing a separate dataloader for profiling - only the first sample is used
  • new: allow running nav.package.optimize on an empty package - status generation only

  • new: use torch.inference_mode for inference runner when PyTorch 2.x is available

  • fix: Missing model in config when a package generated during nav.{framework}.optimize is passed directly to the nav.package.optimize command

  • Other minor fixes and improvements

  • Version of external components used during testing:
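The "first sample used only" semantics of the new profiling dataloader can be illustrated with a plain-Python sketch. This is not the Model Navigator API; `first_sample` is a hypothetical helper showing only the described behavior (a dataloader is treated as any iterable of samples):

```python
# Hypothetical sketch of the "first sample used only" profiling semantics;
# `first_sample` is NOT part of the Model Navigator API.

def first_sample(dataloader):
    """Return only the first sample from a dataloader (any iterable)."""
    return next(iter(dataloader))

# A dataloader here is simply an iterable of samples.
profiling_dataloader = [{"input": [1.0, 2.0]}, {"input": [3.0, 4.0]}]

sample = first_sample(profiling_dataloader)  # only this sample is profiled
```

With this behavior, a large validation dataloader can be passed for profiling without every sample being executed.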

Triton Model Navigator v0.5.6

23 Jun 13:52

Triton Model Navigator v0.5.5

24 May 13:21
  • new: Public nav.utilities module with UnpackedDataloader wrapper
  • new: Added support for strict flag in Torch custom config
  • new: Extended TensorRT custom config to support builder optimization level and hardware compatibility flags
  • fix: Invalid optimal shape calculation for odd values in max batch size
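The optimal-shape fix concerns the "opt" entry of a TensorRT (min, opt, max) shape profile when derived from the max batch size. The helper below is a hypothetical illustration of the kind of rounding issue involved, not the project's actual code: with floor division, an odd max batch size truncates to a poor midpoint, while rounding up keeps the optimal batch closer to the maximum:

```python
import math

def optimal_batch_size(max_batch_size: int) -> int:
    """Hypothetical sketch: choose the 'opt' batch of a TensorRT
    (min, opt, max) profile as half the max batch size.

    Rounding up matters for odd values: with floor division a max of 3
    would yield opt == 1, while ceil gives the closer midpoint 2.
    """
    return math.ceil(max_batch_size / 2)
```

The key property is that `1 <= opt <= max` holds for every positive max batch size, including odd ones.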

Triton Model Navigator v0.5.4

18 May 11:39
  • new: Custom implementation for ONNX and TensorRT runners

  • new: Use CUDA 12 for JAX in unit tests and functional tests

  • new: Step-by-step examples

  • new: Updated documentation

  • new: TensorRTCUDAGraph runner introduced with support for CUDA graphs

  • fix: Optimal shape not set correctly during adaptive conversion

  • fix: Find max batch size command for JAX

  • fix: Save stdout to logfiles in debug mode

  • Version of external components used during testing:

Triton Model Navigator v0.5.3

19 Apr 13:09
  • fix: filter outputs using output_metadata in ONNX runners
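The idea behind this fix can be sketched in plain Python: runner outputs are restricted to the names declared in the output metadata, so undeclared tensors returned by an ONNX session are dropped. `filter_outputs` is a hypothetical helper, not the library's implementation:

```python
def filter_outputs(outputs: dict, output_metadata) -> dict:
    """Hypothetical sketch: keep only the outputs whose names appear in
    the declared output metadata, dropping any extra tensors."""
    return {name: value for name, value in outputs.items()
            if name in output_metadata}

raw = {"logits": [0.1, 0.9], "hidden_state": [0.0] * 4}
filtered = filter_outputs(raw, output_metadata={"logits"})
```

After filtering, only `logits` survives, matching what the declared metadata promises to downstream consumers.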

Triton Model Navigator v0.5.2

11 Apr 12:28

Triton Model Navigator v0.5.1

29 Mar 16:12

Triton Model Navigator v0.5.0

23 Mar 09:21
  • new: Support for PyTriton deployment

  • new: Support for Python models with python.optimize API

  • new: PyTorch 2 compile CPU and CUDA runners

  • new: Collect conversion max batch size in status

  • new: PyTorch runners with compile support

  • change: Improved handling of CUDA and CPU runners

  • change: Reduced the time to find the device max batch size by running it once as a separate pipeline

  • change: Stored the find max batch size result in a separate field in the status

  • Version of external components used during testing:
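A max-batch-size search of the kind run as a separate pipeline can be sketched as a doubling probe: keep doubling the batch size while inference still succeeds, then report the last size that worked. This is a hypothetical illustration (`fits` stands in for a real inference attempt), not the project's actual pipeline:

```python
def find_max_batch_size(fits, start=1, limit=4096):
    """Hypothetical sketch of a device max-batch-size search.

    `fits(batch)` stands in for "inference at this batch size succeeds"
    (e.g. without running out of device memory). Doubles the batch size
    until it fails or hits `limit`, returning the last working size.
    """
    batch = start
    best = 0
    while batch <= limit and fits(batch):
        best = batch
        batch *= 2
    return best
```

Running such a probe once and storing the result in the status avoids repeating the (expensive) search in every conversion path.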

Triton Model Navigator v0.4.4

14 Mar 15:37
  • fix: when exporting a single-input model to SavedModel, unwrap the one-element list of inputs
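The unwrapping behavior can be shown with a small stand-alone sketch: a single-input model exported to SavedModel expects the tensor itself rather than a one-element list. `unwrap_single_input` is a hypothetical helper illustrating the fix, not the library's code:

```python
def unwrap_single_input(inputs):
    """Hypothetical sketch: a single-input model takes the tensor itself,
    so a one-element list of inputs is unwrapped; multi-input lists are
    passed through unchanged."""
    if isinstance(inputs, (list, tuple)) and len(inputs) == 1:
        return inputs[0]
    return inputs
```

Multi-input models keep their list of tensors; only the single-input case is unwrapped.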

Triton Model Navigator v0.4.3

13 Mar 15:50
  • fix: in Keras inference, use model.predict(tensor) for single-input models