vision.transforms

Gluon provides pre-defined vision transformation and data augmentation functions in the mxnet.gluon.data.vision.transforms module.

transforms.Compose

Sequentially composes multiple transforms.

transforms.Cast

Cast inputs to a specific data type.

transforms.ToTensor

Converts an image NDArray or batch of image NDArray to a tensor NDArray.

transforms.Normalize

Normalize a tensor of shape (C x H x W) or (N x C x H x W) with mean and standard deviation.

transforms.RandomResizedCrop

Crop the input image with random scale and aspect ratio.

transforms.CenterCrop

Crops the image src to the given size by trimming on all four sides and preserving the center of the image.

transforms.Resize

Resize an image NDArray or a batch of image NDArrays to the given size.

transforms.RandomFlipLeftRight

Randomly flip the input image left to right with a probability of p (0.5 by default).

transforms.RandomFlipTopBottom

Randomly flip the input image top to bottom with a probability of p (0.5 by default).

transforms.RandomBrightness

Randomly jitters image brightness with a factor chosen from [max(0, 1 - brightness), 1 + brightness].

transforms.RandomContrast

Randomly jitters image contrast with a factor chosen from [max(0, 1 - contrast), 1 + contrast].

transforms.RandomSaturation

Randomly jitters image saturation with a factor chosen from [max(0, 1 - saturation), 1 + saturation].

transforms.RandomHue

Randomly jitters image hue with a factor chosen from [max(0, 1 - hue), 1 + hue].

transforms.RandomColorJitter

Randomly jitters the brightness, contrast, saturation, and hue of an image.

transforms.RandomLighting

Add AlexNet-style PCA-based noise to an image.
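
The jitter transforms listed above (RandomBrightness, RandomContrast, RandomSaturation, RandomHue) all quote the same factor range, [max(0, 1 - strength), 1 + strength]. The following is a minimal plain-Python sketch of that range, not MXNet's implementation. The helper names `jitter_interval` and `sample_jitter_factor` are hypothetical, and drawing the factor uniformly is an assumption, since the text only says the factor is "chosen from" the interval.

```python
import random


def jitter_interval(strength):
    """Return (low, high) = (max(0, 1 - strength), 1 + strength)."""
    return max(0.0, 1.0 - strength), 1.0 + strength


def sample_jitter_factor(strength, rng=random):
    """Draw one factor from the interval (uniform draw is an assumption)."""
    low, high = jitter_interval(strength)
    return rng.uniform(low, high)


# A strength above 1 clamps the lower bound at 0 rather than going negative.
print(jitter_interval(0.4))
print(jitter_interval(1.5))
```

A strength of 0 collapses the interval to the single factor 1, which leaves the image unchanged.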

API Reference

Vision transforms.

class mxnet.gluon.data.vision.transforms.Cast(dtype='float32')[source]

Bases: mxnet.gluon.block.HybridBlock

Cast inputs to a specific data type.

Parameters

dtype (str, default 'float32') – The target data type, given as a string or a numpy.dtype.

Inputs:
  • data: input tensor with arbitrary shape and dtype.

Outputs:
  • out: output tensor with the same shape as data and the data type given by dtype.

hybrid_forward(F, *args)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class mxnet.gluon.data.vision.transforms.Compose(transforms)[source]

Bases: mxnet.gluon.nn.basic_layers.Sequential

Sequentially composes multiple transforms.

Parameters

transforms (list of transform Blocks) – The list of transforms to be composed.

Inputs:
  • data: input tensor with the shape that the first transform Block requires.

Outputs:
  • out: output tensor with the shape that the last transform Block produces.
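
The chaining behavior described above can be sketched in plain Python. This is an illustration only, not the Compose class itself: `compose`, `add_one`, and `double` are hypothetical stand-ins, and ordinary functions take the place of transform Blocks.

```python
def compose(transforms):
    """Return a callable that feeds each transform's output into the next."""
    def apply(data):
        for transform in transforms:
            data = transform(data)
        return data
    return apply


def add_one(x):
    return x + 1


def double(x):
    return x * 2


# Order matters: the first transform in the list runs first.
print(compose([add_one, double])(3))  # (3 + 1) * 2 = 8
print(compose([double, add_one])(3))  # (3 * 2) + 1 = 7
```

Because each step consumes the previous step's output, the input only has to satisfy the first transform, and the result has whatever shape the last transform produces.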

Examples

>>> import mxnet as mx
>>> import numpy as np
>>> from mxnet.gluon.data.vision import transforms
>>> transformer = transforms.Compose([transforms.Resize(300),
...                                   transforms.CenterCrop(256),
...                                   transforms.ToTensor()])
>>> image = mx.nd.random.uniform(0, 255, (224, 224, 3)).astype(dtype=np.uint8)
>>> transformer(image)
<NDArray 3x256x256 @cpu(0)>

class mxnet.gluon.data.vision.transforms.HybridCompose(transforms)[source]

Bases: mxnet.gluon.nn.basic_layers.HybridSequential

Sequentially composes multiple transforms. This is the Hybrid version of Compose.

Parameters

transforms (list of transform Blocks) – The list of transforms to be composed.

Inputs:
  • data: input tensor with the shape that the first transform Block requires.

Outputs:
  • out: output tensor with the shape that the last transform Block produces.

Examples

>>> transformer = transforms.HybridCompose([transforms.Resize(300),
...                                         transforms.CenterCrop(256),
...                                         transforms.ToTensor()])
>>> image = mx.nd.random.uniform(0, 255, (224, 224, 3)).astype(dtype=np.uint8)
>>> transformer(image)
<NDArray 3x256x256 @cpu(0)>

class mxnet.gluon.data.vision.transforms.HybridRandomApply(transforms, p=0.5)[source]

Bases: mxnet.gluon.nn.basic_layers.HybridSequential

Apply a list of transformations randomly with a given probability.

Parameters
  • transforms – List of transformations which must be HybridBlocks.

  • p (float) – Probability of applying the transformations.

Inputs:
  • data: input tensor.

Outputs:
  • out: transformed image.

hybrid_forward(F, x, *args)[source]

Overrides to construct symbolic graph for this Block.

Parameters
  • x (Symbol or NDArray) – The first input tensor.

  • *args (list of Symbol or list of NDArray) – Additional input tensors.

class mxnet.gluon.data.vision.transforms.RandomApply(transforms, p=0.5)[source]

Bases: mxnet.gluon.nn.basic_layers.Sequential

Apply a list of transformations randomly with a given probability.

Parameters
  • transforms – List of transformations.

  • p (float) – Probability of applying the transformations.

Inputs:
  • data: input tensor.

Outputs:
  • out: transformed image.
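
The parameters above can be read as a single random draw per call. The following plain-Python sketch shows one way this could behave and is not the RandomApply class itself. The helper `random_apply` is hypothetical, and treating the list as a unit that is either applied in full or skipped in full is an assumption of the sketch.

```python
import random


def random_apply(transforms, p=0.5, rng=random):
    """Return a callable that runs all transforms with probability p."""
    def apply(data):
        # One draw per call: skip the whole chain with probability 1 - p.
        if rng.random() >= p:
            return data
        for transform in transforms:
            data = transform(data)
        return data
    return apply


def negate(x):
    return -x


always = random_apply([negate], p=1.0)
never = random_apply([negate], p=0.0)
print(always(5), never(5))  # -5 5
```

With p=1.0 the chain always runs and with p=0.0 the input passes through untouched, which makes the two extremes easy to check when debugging an augmentation pipeline.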

forward(x, *args)[source]

Overrides to implement forward computation using NDArray. Only accepts positional arguments.

Parameters

*args (list of NDArray) – Input tensors.