Performance
The following tutorials show how to tune MXNet and use tools that improve training and inference performance.
Essential
Improving Performance (/api/faq/perf)
How to get the best performance from MXNet.
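One of the first recommendations in that guide is to hybridize Gluon models. A minimal sketch, assuming an arbitrary model-zoo network and input shape:

```python
import mxnet as mx
from mxnet.gluon.model_zoo import vision

# Hybridizing compiles the imperative Gluon graph into a symbolic one,
# a common first step toward better throughput.
net = vision.resnet18_v1(pretrained=False)
net.initialize()
net.hybridize(static_alloc=True, static_shape=True)

x = mx.nd.random.uniform(shape=(1, 3, 224, 224))
y = net(x)       # the first call triggers graph construction and caching
mx.nd.waitall()  # block until the asynchronous computation finishes
```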
Profiler (backend/profiler.html)
How to profile MXNet models.
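For orientation, a minimal sketch of the profiler API; the output filename is chosen arbitrarily here:

```python
import mxnet as mx
from mxnet import profiler

# Profile a few operations and dump the trace to a JSON file that can
# be inspected in chrome://tracing.
profiler.set_config(profile_all=True, aggregate_stats=True,
                    filename='profile_output.json')
profiler.set_state('run')

a = mx.nd.random.uniform(shape=(1024, 1024))
b = mx.nd.dot(a, a)
mx.nd.waitall()  # wait for asynchronous work before stopping

profiler.set_state('stop')
profiler.dump()
print(profiler.dumps())  # aggregate statistics as text
```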
Compression
Compression: float16 (/api/faq/float16)
How to use float16 in your model to boost training speed.
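A minimal sketch of the casting workflow, assuming a GPU context and an arbitrary model-zoo network; float16 delivers meaningful speedups mainly on GPUs, ideally with Tensor Cores:

```python
import mxnet as mx
from mxnet.gluon.model_zoo import vision

ctx = mx.gpu(0)  # assumption: a GPU is available
net = vision.resnet18_v1(pretrained=False)
net.initialize(ctx=ctx)
net.cast('float16')  # cast all parameters to float16

# Inputs must be cast to match the network's dtype.
x = mx.nd.random.uniform(shape=(1, 3, 224, 224), ctx=ctx).astype('float16')
y = net(x)
print(y.dtype)  # float16
```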
Gradient Compression (/api/faq/gradient_compression)
How to use gradient compression to reduce communication bandwidth and increase speed.
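A minimal sketch of enabling 2-bit compression through a Gluon Trainer; the threshold value is illustrative only:

```python
import mxnet as mx
from mxnet import gluon
from mxnet.gluon.model_zoo import vision

net = vision.resnet18_v1(pretrained=False)
net.initialize()

# compression_params switches on gradient compression for the
# underlying KVStore; gradients smaller than the threshold are
# quantized to 2 bits before communication.
trainer = gluon.Trainer(
    net.collect_params(), 'sgd', {'learning_rate': 0.1},
    compression_params={'type': '2bit', 'threshold': 0.5})
```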
Accelerated Backend
TensorRT (backend/tensorrt/tensorrt.html)
How to use NVIDIA’s TensorRT to boost inference performance.
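The integration API has varied across MXNet releases; the sketch below assumes the 1.7/1.8-era symbol partitioning style and a previously saved checkpoint named 'resnet18':

```python
import mxnet as mx

# Assumption: 'resnet18-symbol.json' and 'resnet18-0000.params'
# exist from an earlier mx.model save/export.
sym, arg_params, aux_params = mx.model.load_checkpoint('resnet18', 0)

# Partition the graph so that supported subgraphs run under the
# TensorRT backend (requires an MXNet build with TensorRT enabled).
trt_sym = sym.optimize_for('TensorRT', args=arg_params, aux=aux_params,
                           ctx=mx.gpu(0))
```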
Distributed Training
Distributed Training Using the KVStore API (/api/faq/distributed_training.html)
How to use the KVStore API to distribute training across multiple GPUs and machines.
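A minimal sketch of the pattern, assuming the script is launched with MXNet's launcher (e.g. tools/launch.py) so that a scheduler, servers, and workers actually exist:

```python
import mxnet as mx
from mxnet import gluon
from mxnet.gluon.model_zoo import vision

# 'dist_sync' synchronizes gradients through parameter servers;
# run standalone this call will wait for the cluster to come up.
kv = mx.kv.create('dist_sync')

net = vision.resnet18_v1(pretrained=False)
net.initialize()
trainer = gluon.Trainer(net.collect_params(), 'sgd',
                        {'learning_rate': 0.1}, kvstore=kv)
print('workers:', kv.num_workers, 'rank:', kv.rank)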
Training with Multiple GPUs Using Model Parallelism (/api/faq/model_parallel_lstm.html)
An overview of using multiple GPUs when training an LSTM.
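The core idea, sketched with placeholder Dense layers rather than the tutorial's LSTM, assuming two GPUs:

```python
import mxnet as mx
from mxnet import gluon

# Model parallelism: different parts of the network live on different
# devices, and activations are copied between them.
ctx1, ctx2 = mx.gpu(0), mx.gpu(1)
part1 = gluon.nn.Dense(256, activation='relu')
part2 = gluon.nn.Dense(10)
part1.initialize(ctx=ctx1)
part2.initialize(ctx=ctx2)

x = mx.nd.random.uniform(shape=(32, 128), ctx=ctx1)
h = part1(x)                      # computed on GPU 0
y = part2(h.as_in_context(ctx2))  # copied to and computed on GPU 1
```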
Data Parallelism in MXNet (/api/faq/multi_device)
An overview of distributed training strategies.
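A minimal sketch of the data-parallel pattern with gluon.utils.split_and_load, assuming two GPUs and arbitrary shapes:

```python
import mxnet as mx
from mxnet import gluon, autograd
from mxnet.gluon.model_zoo import vision

# Data parallelism: each GPU gets a slice of the batch; the Trainer
# aggregates the gradients across devices.
ctx = [mx.gpu(0), mx.gpu(1)]
net = vision.resnet18_v1(pretrained=False)
net.initialize(ctx=ctx)
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss()
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.1})

data = mx.nd.random.uniform(shape=(8, 3, 224, 224))
label = mx.nd.zeros((8,))
data_parts = gluon.utils.split_and_load(data, ctx)
label_parts = gluon.utils.split_and_load(label, ctx)
with autograd.record():
    losses = [loss_fn(net(X), y) for X, y in zip(data_parts, label_parts)]
for l in losses:
    l.backward()
trainer.step(batch_size=8)
```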
MXNet with Horovod (https://github.com/apache/mxnet/tree/master/example/distributed_training-horovod)
A set of example scripts demonstrating MNIST and ImageNet training with Horovod as the distributed training backend.
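A minimal sketch of the usual Horovod pattern (run with, e.g., horovodrun -np 2 python train.py; the model here is an arbitrary stand-in):

```python
import mxnet as mx
import horovod.mxnet as hvd
from mxnet import gluon
from mxnet.gluon.model_zoo import vision

# One process per GPU: init Horovod, pin a device, broadcast the
# initial parameters from rank 0, and wrap the optimizer.
hvd.init()
ctx = mx.gpu(hvd.local_rank())
net = vision.resnet18_v1(pretrained=False)
net.initialize(ctx=ctx)
net(mx.nd.zeros((1, 3, 224, 224), ctx=ctx))  # force deferred initialization

params = net.collect_params()
hvd.broadcast_parameters(params, root_rank=0)
trainer = hvd.DistributedTrainer(params, 'sgd', {'learning_rate': 0.1})
```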