Data Loading API
Overview
This document summarizes the supported data formats and the iterator APIs used to read them, including:
mxnet.io | Data iterators for common data formats.
mxnet.recordio | Read and write for the RecordIO data format.
mxnet.image | Image iterators and image augmentation functions.
First, let’s see how to write an iterator for a new data format.
The following iterator can be used to train a symbol whose input data variable has the name data and whose input label variable has the name softmax_label.
The iterator also provides information about the batch, including the shapes and names.
>>> nd_iter = mx.io.NDArrayIter(data={'data':mx.nd.ones((100,10))},
... label={'softmax_label':mx.nd.ones((100,))},
... batch_size=25)
>>> print(nd_iter.provide_data)
[DataDesc[data,(25, 10L),,NCHW]]
>>> print(nd_iter.provide_label)
[DataDesc[softmax_label,(25,),,NCHW]]
Let’s see a complete example of how to use a data iterator in model training.
>>> data = mx.sym.Variable('data')
>>> label = mx.sym.Variable('softmax_label')
>>> fullc = mx.sym.FullyConnected(data=data, num_hidden=1)
>>> loss = mx.sym.SoftmaxOutput(data=fullc, label=label)
>>> mod = mx.mod.Module(loss, data_names=['data'], label_names=['softmax_label'])
>>> mod.bind(data_shapes=nd_iter.provide_data, label_shapes=nd_iter.provide_label)
>>> mod.fit(nd_iter, num_epoch=2)
A detailed tutorial is available at Iterators - Loading data.
Data iterators
io.NDArrayIter | Returns an iterator for mx.nd.NDArray, numpy.ndarray, h5py.Dataset, mx.nd.sparse.CSRNDArray or scipy.sparse.csr_matrix.
io.CSVIter | Returns the CSV file iterator.
io.LibSVMIter | Returns the LibSVM iterator which returns data with csr storage type.
io.ImageRecordIter | Iterates on image RecordIO files.
io.ImageRecordUInt8Iter | Iterates on image RecordIO files.
io.MNISTIter | Iterates on the MNIST dataset.
recordio.MXRecordIO | Reads/writes RecordIO data format, supporting sequential read and write.
recordio.MXIndexedRecordIO | Reads/writes RecordIO data format, supporting random access.
image.ImageIter | Image data iterator with a large number of augmentation choices.
image.ImageDetIter | Image iterator with a large number of augmentation choices for detection.
Helper classes and functions
Data structures and other iterators
io.DataDesc | DataDesc is used to store name, shape, type and layout information of the data or the label.
io.DataBatch | A data batch.
io.DataIter | The base class for an MXNet data iterator.
io.ResizeIter | Resize a data iterator to a given number of batches.
io.PrefetchingIter | Performs pre-fetch for other data iterators.
io.MXDataIter | A Python wrapper for a C++ data iterator.
Functions to read and write RecordIO files
recordio.pack | Pack a string into MXImageRecord.
recordio.unpack | Unpack a MXImageRecord to string.
recordio.unpack_img | Unpack a MXImageRecord to image.
recordio.pack_img | Pack an image into MXImageRecord.
How to develop a new iterator
Writing a new data iterator in Python is straightforward. Most MXNet training/inference programs accept an iterable object with provide_data and provide_label properties.
This tutorial explains how to write an iterator from scratch.
The following example demonstrates how to combine multiple data iterators into a single one. It can be used for multi-modality training such as image captioning, in which images are read by ImageRecordIter while documents are read by CSVIter.
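import mxnet as mx
from mxnet.io import DataBatch

class MultiIter:
    def __init__(self, iter_list):
        self.iters = iter_list
    def next(self):
        batches = [i.next() for i in self.iters]
        # Flatten the per-iterator lists into a single list of arrays.
        return DataBatch(data=[d for b in batches for d in b.data],
                         label=[l for b in batches for l in b.label])
    def reset(self):
        for i in self.iters:
            i.reset()
    @property
    def provide_data(self):
        return [d for i in self.iters for d in i.provide_data]
    @property
    def provide_label(self):
        return [l for i in self.iters for l in i.provide_label]

# data_shape and batch_size are required; the values below are illustrative only.
multi_iter = MultiIter([
    mx.io.ImageRecordIter(path_imgrec='image.rec', data_shape=(3, 224, 224), batch_size=32),
    mx.io.CSVIter(data_csv='txt.csv', data_shape=(10,), batch_size=32)])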
Parsing and other pre-processing, such as augmentation, may be expensive. If performance is critical, we can implement a data iterator in C++. Refer to src/io for examples.
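To make the iterator contract described above concrete, here is a minimal from-scratch sketch in plain Python. Desc and Batch are hypothetical stand-ins for mxnet.io.DataDesc and mxnet.io.DataBatch so that the sketch runs without MXNet; SimpleIter and its choice to drop an incomplete final batch are likewise assumptions made for illustration, not MXNet behaviour.

```python
from collections import namedtuple

# Hypothetical stand-ins for mxnet.io.DataDesc and mxnet.io.DataBatch,
# used only so the sketch runs without MXNet installed.
Desc = namedtuple("Desc", ["name", "shape"])
Batch = namedtuple("Batch", ["data", "label"])


class SimpleIter:
    """Yields fixed-size batches from two parallel Python lists."""

    def __init__(self, data, label, batch_size):
        self.data, self.label = data, label
        self.batch_size = batch_size
        self.cursor = 0

    @property
    def provide_data(self):
        # One descriptor per input: name plus the per-batch shape.
        return [Desc("data", (self.batch_size, len(self.data[0])))]

    @property
    def provide_label(self):
        return [Desc("softmax_label", (self.batch_size,))]

    def reset(self):
        self.cursor = 0

    def __iter__(self):
        return self

    def __next__(self):
        end = self.cursor + self.batch_size
        if end > len(self.data):  # drop an incomplete final batch
            raise StopIteration
        batch = Batch([self.data[self.cursor:end]], [self.label[self.cursor:end]])
        self.cursor = end
        return batch

    next = __next__  # alias matching the next() calls used in this document


it = SimpleIter([[i, i + 1] for i in range(5)], list(range(5)), batch_size=2)
batches = list(it)
```

With five examples and a batch size of 2, the sketch yields two batches and silently drops the fifth example; calling reset() rewinds the cursor so the iterator can be reused for another pass.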
How to change the batch layout
By default, the backend engine treats the first dimension of each data and label variable in data iterators as the batch size (i.e. NCHW or NT layout). To override the axis used for the batch size, the provide_data property (and provide_label, if there is a label) should include the layouts. This is especially useful for RNNs, since TNC layouts are often more efficient. For example:
@property
def provide_data(self):
    return [DataDesc(name='seq_var', shape=(seq_length, batch_size), layout='TN')]
The backend engine will recognize the index of N in the layout as the axis for batch size.
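The rule that the position of N in the layout string gives the batch axis can be sketched in a few lines. The helper name batch_axis is hypothetical, and falling back to axis 0 when no layout is given is an assumption based on the default described above.

```python
def batch_axis(layout):
    """Return the index of 'N' in a layout string.

    Assumption: a missing layout falls back to axis 0 (the documented default).
    A layout without 'N' gives -1, because str.find returns -1 when not found.
    """
    if layout is None:
        return 0
    return layout.find("N")


axes = {name: batch_axis(name) for name in ("NCHW", "NT", "TN", "TNC")}
```

For the 'TN' layout used in the example above, the batch size is therefore read from axis 1 rather than axis 0.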
API Reference
mxnet.io - Data Iterators
Data iterators for common data formats.
class mxnet.io.NDArrayIter(data, label=None, batch_size=1, shuffle=False, last_batch_handle='pad', data_name='data', label_name='softmax_label')
Returns an iterator for mx.nd.NDArray, numpy.ndarray, h5py.Dataset, mx.nd.sparse.CSRNDArray or scipy.sparse.csr_matrix.
>>> data = np.arange(40).reshape((10,2,2))
>>> labels = np.ones([10, 1])
>>> dataiter = mx.io.NDArrayIter(data, labels, 3, True, last_batch_handle='discard')
>>> for batch in dataiter:
...     print(batch.data[0].asnumpy())
...     batch.data[0].shape
...
[[[ 36. 37.]
  [ 38. 39.]]
 [[ 16. 17.]
  [ 18. 19.]]
 [[ 12. 13.]
  [ 14. 15.]]]
(3L, 2L, 2L)
[[[ 32. 33.]
  [ 34. 35.]]
 [[ 4. 5.]
  [ 6. 7.]]
 [[ 24. 25.]
  [ 26. 27.]]]
(3L, 2L, 2L)
[[[ 8. 9.]
  [ 10. 11.]]
 [[ 20. 21.]
  [ 22. 23.]]
 [[ 28. 29.]
  [ 30. 31.]]]
(3L, 2L, 2L)
>>> dataiter.provide_data # Returns a list of `DataDesc`
[DataDesc[data,(3, 2L, 2L),,NCHW]]
>>> dataiter.provide_label # Returns a list of `DataDesc`
[DataDesc[softmax_label,(3, 1L),,NCHW]]
In the above example, the data is shuffled because the shuffle parameter is set to True, and the remaining examples are discarded because the last_batch_handle parameter is set to discard.
Usage of last_batch_handle parameter:
>>> dataiter = mx.io.NDArrayIter(data, labels, 3, True, last_batch_handle='pad')
>>> batchidx = 0
>>> for batch in dataiter:
...     batchidx += 1
...
>>> batchidx # Padding added after the examples read are over. So, 10/3+1 batches are created.
4
>>> dataiter = mx.io.NDArrayIter(data, labels, 3, True, last_batch_handle='discard')
>>> batchidx = 0
>>> for batch in dataiter:
...     batchidx += 1
...
>>> batchidx # Remaining examples are discarded. So, 10/3 batches are created.
3
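The batch counts in the example above follow from simple integer arithmetic, which the sketch below reproduces. The helper batch_plan is hypothetical. It models only 'pad' and 'discard' and leaves out 'roll_over', and the padding count it reports is an assumption about how many filler examples complete the last batch.

```python
def batch_plan(num_examples, batch_size, last_batch_handle="pad"):
    """Return (number of batches, examples padded into the last batch)."""
    if last_batch_handle not in ("pad", "discard"):
        raise ValueError("this sketch only models 'pad' and 'discard'")
    full, remainder = divmod(num_examples, batch_size)
    if remainder == 0 or last_batch_handle == "discard":
        # Either everything fits exactly, or the leftover examples are dropped.
        return full, 0
    # 'pad': one extra batch, filled up to batch_size.
    return full + 1, batch_size - remainder


pad_plan = batch_plan(10, 3, "pad")
discard_plan = batch_plan(10, 3, "discard")
```

For 10 examples and a batch size of 3, this gives 4 batches with 'pad' and 3 batches with 'discard', matching the counts shown above.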
NDArrayIter also supports multiple input and labels.
>>> data = {'data1':np.zeros(shape=(10,2,2)), 'data2':np.zeros(shape=(20,2,2))}
>>> label = {'label1':np.zeros(shape=(10,1)), 'label2':np.zeros(shape=(20,1))}
>>> dataiter = mx.io.NDArrayIter(data, label, 3, True, last_batch_handle='discard')
NDArrayIter also supports mx.nd.sparse.CSRNDArray with last_batch_handle set to discard.
>>> csr_data = mx.nd.array(np.arange(40).reshape((10,4))).tostype('csr')
>>> labels = np.ones([10, 1])
>>> dataiter = mx.io.NDArrayIter(csr_data, labels, 3, last_batch_handle='discard')
>>> [batch.data[0] for batch in dataiter]
[<CSRNDArray 3x4 @cpu(0)>, <CSRNDArray 3x4 @cpu(0)>, <CSRNDArray 3x4 @cpu(0)>]
Parameters:
- data (array or list of array or dict of string to array) – The input data.
- label (array or list of array or dict of string to array, optional) – The input label.
- batch_size (int) – Batch size of data.
- shuffle (bool, optional) – Whether to shuffle the data. Only supported if no h5py.Dataset inputs are used.
- last_batch_handle (str, optional) – How to handle the last batch. This parameter can be ‘pad’, ‘discard’ or ‘roll_over’. ‘roll_over’ is intended for training and can cause problems if used for prediction.
- data_name (str, optional) – The data name.
- label_name (str, optional) – The label name.
provide_data
The name and shape of data provided by this iterator.
provide_label
The name and shape of label provided by this iterator.
mxnet.io.CSVIter(*args, **kwargs)
Returns the CSV file iterator.
In this function, the data_shape parameter is used to set the shape of each line of the input data. If a row in an input file is 1,2,3,4,5,6 and data_shape is (3,2), that row will be reshaped, yielding the array [[1,2],[3,4],[5,6]] of shape (3,2).
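The reshaping of one CSV row described above can be illustrated without MXNet. The helper reshape_row is hypothetical. It splits a flat list of values into nested lists in row-major order, which is the ordering implied by the [[1,2],[3,4],[5,6]] example.

```python
from functools import reduce
from operator import mul


def reshape_row(values, shape):
    """Reshape a flat list into nested lists of the given shape (row-major)."""
    if len(values) != reduce(mul, shape, 1):
        raise ValueError("row length does not match data_shape")
    # Group from the innermost dimension outwards.
    for dim in reversed(shape[1:]):
        values = [values[i:i + dim] for i in range(0, len(values), dim)]
    return values


row = [float(x) for x in "1,2,3,4,5,6".split(",")]
reshaped = reshape_row(row, (3, 2))
```

A row whose length does not match the product of data_shape raises an error in this sketch; how CSVIter itself reports such a mismatch is not covered here.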
By default, the CSVIter has the round_batch parameter set to True. So, if batch_size is 3 and there are 4 total rows in the CSV file, 2 more examples are consumed in the first round. If the reset function is called after the first round, the call is ignored and the remaining examples are returned in the second round.
If one wants all the instances in the second round after calling reset, make sure to set round_batch to False.
If data_csv = 'data/' is set, then all the files in this directory will be read.
reset() is expected to be called only after a complete pass of data.
By default, the CSVIter parses all entries in the data file as float32 data type. If the dtype argument is set to 'int32' or 'int64', then CSVIter will parse all entries in the file as int32 or int64 data type accordingly.
Examples:
# Contents of CSV file ``data/data.csv``.
1,2,3
2,3,4
3,4,5
4,5,6

# Creates a `CSVIter` with `batch_size`=2 and default `round_batch`=True.
CSVIter = mx.io.CSVIter(data_csv = 'data/data.csv', data_shape = (3,), batch_size = 2)

# Two batches read from the above iterator are as follows:
[[ 1. 2. 3.]
 [ 2. 3. 4.]]
[[ 3. 4. 5.]
 [ 4. 5. 6.]]

# Creates a `CSVIter` with default `round_batch` set to True.
CSVIter = mx.io.CSVIter(data_csv = 'data/data.csv', data_shape = (3,), batch_size = 3)

# Two batches read from the above iterator in the first pass are as follows:
[[1. 2. 3.]
 [2. 3. 4.]
 [3. 4. 5.]]
[[4. 5. 6.]
 [1. 2. 3.]
 [2. 3. 4.]]

# Now, `reset` method is called.
CSVIter.reset()

# Batch read from the above iterator in the second pass is as follows:
[[ 3. 4. 5.]
 [ 4. 5. 6.]
 [ 1. 2. 3.]]

# Creates a `CSVIter` with `round_batch`=False.
CSVIter = mx.io.CSVIter(data_csv = 'data/data.csv', data_shape = (3,), batch_size = 3, round_batch=False)

# Contents of two batches read from the above iterator in both passes, after calling
# `reset` method before second pass, is as follows:
[[1. 2. 3.]
 [2. 3. 4.]
 [3. 4. 5.]]
[[4. 5. 6.]
 [2. 3. 4.]
 [3. 4. 5.]]

# Creates a `CSVIter` with `dtype`='int32'
CSVIter = mx.io.CSVIter(data_csv = 'data/data.csv', data_shape = (3,), batch_size = 3, round_batch=False, dtype='int32')

# Contents of two batches read from the above iterator in both passes, after calling
# `reset` method before second pass, is as follows:
[[1 2 3]
 [2 3 4]
 [3 4 5]]
[[4 5 6]
 [2 3 4]
 [3 4 5]]
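The round_batch=True behaviour shown in the example can be modelled as a cursor that wraps around the file and carries over between passes. The class RoundBatchSim below is a hypothetical, simplified model built only from the batches listed above. It is not the C++ implementation, and it has not been checked against cases the example does not cover.

```python
import math


class RoundBatchSim:
    """Simplified model of round-robin batching over a fixed list of rows."""

    def __init__(self, rows, batch_size):
        self.rows, self.batch_size = rows, batch_size
        self.start = 0  # index of the first row of the next pass

    def one_pass(self):
        n, b = len(self.rows), self.batch_size
        # Enough batches to cover the rows not yet seen in this pass.
        num_batches = math.ceil((n - self.start) / b)
        picked = [self.rows[(self.start + k) % n] for k in range(num_batches * b)]
        # Rows borrowed from the beginning are skipped in the next pass.
        self.start = (self.start + num_batches * b) % n
        return [picked[i:i + b] for i in range(0, len(picked), b)]


rows = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
sim = RoundBatchSim(rows, batch_size=3)
first_pass = sim.one_pass()
second_pass = sim.one_pass()
```

The first pass borrows two rows from the start of the file to fill its second batch, so the second pass begins at the third row and consists of a single batch, as in the example above.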
Defined in src/io/iter_csv.cc:L308
Parameters:
- data_csv (string, required) – The input CSV file or a directory path.
- data_shape (Shape(tuple), required) – The shape of one example.
- label_csv (string, optional, default='NULL') – The input CSV file or a directory path. If NULL, all labels will be returned as 0.
- label_shape (Shape(tuple), optional, default=[1]) – The shape of one label.
- batch_size (int (non-negative), required) – Batch size.
- round_batch (boolean, optional, default=1) – Whether to use round robin to handle overflow batch or not.
- prefetch_buffer (long (non-negative), optional, default=4) – Maximum number of batches to prefetch.
- dtype ({None, 'float16', 'float32', 'float64', 'int32', 'int64', 'uint8'}, optional, default='None') – Output data type. None means no change.
Returns: The result iterator.
Return type: MXDataIter
mxnet.io.LibSVMIter(*args, **kwargs)
Returns the LibSVM iterator which returns data with csr storage type. This iterator is experimental and should be used with care.
The input data is stored in a format similar to LibSVM file format, except that the indices are expected to be zero-based instead of one-based, and the column indices for each row are expected to be sorted in ascending order. Details of the LibSVM format are available here.
The data_shape parameter is used to set the shape of each line of the data. The dimension of both data_shape and label_shape are expected to be 1.
The data_libsvm parameter is used to set the path to the input LibSVM file. When it is set to a directory, all the files in the directory will be read.
When label_libsvm is set to NULL, both data and label are read from the file specified by data_libsvm. In this case, the data is stored in csr storage type, while the label is a 1D dense array.
The LibSVMIter only supports the round_batch parameter set to True. Therefore, if batch_size is 3 and there are 4 total rows in the libsvm file, 2 more examples are consumed in the first round.
When num_parts and part_index are provided, the data is split into num_parts partitions, and the iterator only reads the part_index-th partition. However, the partitions are not guaranteed to be even.
reset() is expected to be called only after a complete pass of data.
Example:
# Contents of libsvm file ``data.t``.
1.0 0:0.5 2:1.2
-2.0
-3.0 0:0.6 1:2.4 2:1.2
4 2:-1.2

# Creates a `LibSVMIter` with `batch_size`=3.
>>> data_iter = mx.io.LibSVMIter(data_libsvm = 'data.t', data_shape = (3,), batch_size = 3)
# The data of the first batch is stored in csr storage type
>>> batch = data_iter.next()
>>> csr = batch.data[0]
<CSRNDArray 3x3 @cpu(0)>
>>> csr.asnumpy()
[[ 0.5 0. 1.2 ]
 [ 0. 0. 0. ]
 [ 0.6 2.4 1.2]]
# The label of first batch
>>> label = batch.label[0]
>>> label
[ 1. -2. -3.]
<NDArray 3 @cpu(0)>
>>> second_batch = data_iter.next()
# The data of the second batch
>>> second_batch.data[0].asnumpy()
[[ 0. 0. -1.2 ]
 [ 0.5 0. 1.2 ]
 [ 0. 0. 0. ]]
# The label of the second batch
>>> second_batch.label[0].asnumpy()
[ 4. 1. -2.]
>>> data_iter.reset()
# To restart the iterator for the second pass of the data
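To show how a line in the zero-based LibSVM-like format maps to a label and a feature row, the sketch below parses the lines of data.t from the example above into dense Python lists. The function parse_libsvm_line is hypothetical and is not part of MXNet, which returns the data in csr storage rather than as dense lists.

```python
def parse_libsvm_line(line, num_features):
    """Parse 'label idx:val idx:val ...' with zero-based indices into (label, dense row)."""
    tokens = line.split()
    label = float(tokens[0])
    row = [0.0] * num_features
    for token in tokens[1:]:
        index, value = token.split(":")
        row[int(index)] = float(value)  # indices are zero-based in this format
    return label, row


lines = ["1.0 0:0.5 2:1.2", "-2.0", "-3.0 0:0.6 1:2.4 2:1.2", "4 2:-1.2"]
parsed = [parse_libsvm_line(line, 3) for line in lines]
```

The first three parsed rows correspond to the dense view of the first batch printed by csr.asnumpy() above, and a line with only a label yields an all-zero row.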
When label_libsvm is set to the path to another LibSVM file, data is read from data_libsvm and label from label_libsvm. In this case, both data and label are stored in the csr format. The label column in the data_libsvm file is ignored.
Example:
# Contents of libsvm file ``label.t``
1.0
-2.0 0:0.125
-3.0 2:1.2
4 1:1.0 2:-1.2

# Creates a `LibSVMIter` with specified label file
>>> data_iter = mx.io.LibSVMIter(data_libsvm = 'data.t', data_shape = (3,), label_libsvm = 'label.t', label_shape = (3,), batch_size = 3)
# Both data and label are in csr storage type
>>> batch = data_iter.next()
>>> csr_data = batch.data[0]
<CSRNDArray 3x3 @cpu(0)>
>>> csr_data.asnumpy()
[[ 0.5 0. 1.2 ]
 [ 0. 0. 0. ]
 [ 0.6 2.4 1.2 ]]
>>> csr_label = batch.label[0]
<CSRNDArray 3x3 @cpu(0)>
>>> csr_label.asnumpy()
[[ 0. 0. 0. ]
 [ 0.125 0. 0. ]
 [ 0. 0. 1.2 ]]
Defined in src/io/iter_libsvm.cc:L298
Parameters:
- data_libsvm (string, required) – The input zero-base indexed LibSVM data file or a directory path.
- data_shape (Shape(tuple), required) – The shape of one example.
- label_libsvm (string, optional, default='NULL') – The input LibSVM label file or a directory path. If NULL, all labels will be read from data_libsvm.
- label_shape (Shape(tuple), optional, default=[1]) – The shape of one label.
- num_parts (int, optional, default='1') – Partition the data into multiple parts.
- part_index (int, optional, default='0') – The index of the part to read.
- batch_size (int (non-negative), required) – Batch size.
- round_batch (boolean, optional, default=1) – Whether to use round robin to handle overflow batch or not.
- prefetch_buffer (long (non-negative), optional, default=4) – Maximum number of batches to prefetch.
- dtype ({None, 'float16', 'float32', 'float64', 'int32', 'int64', 'uint8'}, optional, default='None') – Output data type. None means no change.
Returns: The result iterator.
Return type: MXDataIter
mxnet.io.ImageRecordIter(*args, **kwargs)
Iterates on image RecordIO files.
Reads batches of images from .rec RecordIO files. One can use the im2rec.py tool (in tools/) to pack raw image files into RecordIO files. This iterator is less flexible to customization but is fast and has many language bindings. To iterate over raw images directly, use ImageIter instead (in Python).
Example:
data_iter = mx.io.ImageRecordIter(
    path_imgrec="./sample.rec",  # The target record file.
    data_shape=(3, 227, 227),    # Output data shape; 227x227 region will be cropped from the original image.
    batch_size=4,                # Number of items per batch.
    resize=256                   # Resize the shorter edge to 256 before cropping.
    # You can specify more augmentation options. Use help(mx.io.ImageRecordIter) to see all the options.
)
# You can now use the data_iter to access batches of images.
batch = data_iter.next()  # first batch.
images = batch.data[0]    # This will contain 4 (=batch_size) images each of 3x227x227.
# process the images
...
data_iter.reset()  # To restart the iterator from the beginning.
Defined in src/io/iter_image_recordio_2.cc:L751
Parameters:
- path_imglist (string, optional, default='') – Path to the image list (.lst) file. Generally created with tools/im2rec.py. Format: tab separated.
- path_imgrec (string, optional, default='') – Path to the image RecordIO (.rec) file or a directory path. Created with tools/im2rec.py.
- path_imgidx (string, optional, default='') – Path to the image RecordIO index (.idx) file. Created with tools/im2rec.py.
- aug_seq (string, optional, default='aug_default') – The augmenter names to represent sequence of augmenters to be applied, separated by comma. Additional keyword parameters will be seen by these augmenters.
- label_width (int, optional, default='1') – The number of labels per image.
- data_shape (Shape(tuple), required) – The shape of one output image in (channels, height, width) format.
- preprocess_threads (int, optional, default='4') – The number of threads to do preprocessing.
- verbose (boolean, optional, default=1) – Whether to output verbose information.
- num_parts (int, optional, default='1') – Virtually partition the data into these many parts.
- part_index (int, optional, default='0') – The i-th virtual partition to be read.
- shuffle_chunk_size (long (non-negative), optional, default=0) – The data shuffle buffer size in MB. Only valid if shuffle is true.
- shuffle_chunk_seed (int, optional, default='0') – The random seed for shuffling
- shuffle (boolean, optional, default=0) – Whether to shuffle data randomly or not.
- seed (int, optional, default='0') – The random seed.
- batch_size (int (non-negative), required) – Batch size.
- round_batch (boolean, optional, default=1) – Whether to use round robin to handle overflow batch or not.
- prefetch_buffer (long (non-negative), optional, default=4) – Maximum number of batches to prefetch.
- dtype ({None, 'float16', 'float32', 'float64', 'int32', 'int64', 'uint8'}, optional, default='None') – Output data type. None means no change.
- resize (int, optional, default='-1') – Down scale the shorter edge to a new size before applying other augmentations.
- rand_crop (boolean, optional, default=0) – Whether to randomly crop the image.
- random_resized_crop (boolean, optional, default=0) – Whether to perform random resized cropping on the image, as a standard preprocessing for resnet training on ImageNet data.
- max_rotate_angle (int, optional, default='0') – Rotate by a random degree in [-v, v].
- max_aspect_ratio (float, optional, default=0) – Change the aspect (namely width/height) to a random value. If min_aspect_ratio is None then the aspect ratio is sampled from [1 - max_aspect_ratio, 1 + max_aspect_ratio], else it is in [min_aspect_ratio, max_aspect_ratio].
- min_aspect_ratio (float or None, optional, default=None) – Change the aspect (namely width/height) to a random value in [min_aspect_ratio, max_aspect_ratio].
- max_shear_ratio (float, optional, default=0) – Apply a shear transformation (namely (x,y)->(x+my,y)) with m randomly chosen from [-max_shear_ratio, max_shear_ratio].
- max_crop_size (int, optional, default='-1') – Crop both width and height into a random size in [min_crop_size, max_crop_size]. Ignored if random_resized_crop is True.
- min_crop_size (int, optional, default='-1') – Crop both width and height into a random size in [min_crop_size, max_crop_size]. Ignored if random_resized_crop is True.
- max_random_scale (float, optional, default=1) – Resize into [width*s, height*s] with s randomly chosen from [min_random_scale, max_random_scale]. Ignored if random_resized_crop is True.
- min_random_scale (float, optional, default=1) – Resize into [width*s, height*s] with s randomly chosen from [min_random_scale, max_random_scale]. Ignored if random_resized_crop is True.
- max_random_area (float, optional, default=1) – Change the area (namely width * height) to a random value in [min_random_area, max_random_area]. Ignored if random_resized_crop is False.
- min_random_area (float, optional, default=1) – Change the area (namely width * height) to a random value in [min_random_area, max_random_area]. Ignored if random_resized_crop is False.
- max_img_size (float, optional, default=1e+10) – Set the maximal width and height after all resize and rotate augmentations are applied.
- min_img_size (float, optional, default=0) – Set the minimal width and height after all resize and rotate augmentations are applied.
- brightness (float, optional, default=0) – Add a random value in [-brightness, brightness] to the brightness of image.
- contrast (float, optional, default=0) – Add a random value in [-contrast, contrast] to the contrast of image.
- saturation (float, optional, default=0) – Add a random value in [-saturation, saturation] to the saturation of image.
- pca_noise (float, optional, default=0) – Add PCA based noise to the image.
- random_h (int, optional, default='0') – Add a random value in [-random_h, random_h] to the H channel in HSL color space.
- random_s (int, optional, default='0') – Add a random value in [-random_s, random_s] to the S channel in HSL color space.
- random_l (int, optional, default='0') – Add a random value in [-random_l, random_l] to the L channel in HSL color space.
- rotate (int, optional, default='-1') – Rotate by an angle. If set, it overwrites the max_rotate_angle option.
- fill_value (int, optional, default='255') – Set the padding pixels value to fill_value.
- inter_method (int, optional, default='1') – The interpolation method: 0-NN 1-bilinear 2-cubic 3-area 4-lanczos4 9-auto 10-rand.
- pad (int, optional, default='0') – Change size from [width, height] into [pad + width + pad, pad + height + pad] by padding pixels.
- seed_aug (int or None, optional, default='None') – Random seed for augmentations.
- mirror (boolean, optional, default=0) – Whether to mirror the image or not. If true, images are flipped along the horizontal axis.
- rand_mirror (boolean, optional, default=0) – Whether to randomly mirror images or not. If true, 50% of the images will be randomly mirrored (flipped along the horizontal axis)
- mean_img (string, optional, default='') – Filename of the mean image.
- mean_r (float, optional, default=0) – The mean value to be subtracted on the R channel
- mean_g (float, optional, default=0) – The mean value to be subtracted on the G channel
- mean_b (float, optional, default=0) – The mean value to be subtracted on the B channel
- mean_a (float, optional, default=0) – The mean value to be subtracted on the alpha channel
- std_r (float, optional, default=1) – Augmentation Param: Standard deviation on R channel.
- std_g (float, optional, default=1) – Augmentation Param: Standard deviation on G channel.
- std_b (float, optional, default=1) – Augmentation Param: Standard deviation on B channel.
- std_a (float, optional, default=1) – Augmentation Param: Standard deviation on Alpha channel.
- scale (float, optional, default=1) – Multiply the image with a scale value.
- max_random_contrast (float, optional, default=0) – Change the contrast with a value randomly chosen from [-max_random_contrast, max_random_contrast].
- max_random_illumination (float, optional, default=0) – Change the illumination with a value randomly chosen from [-max_random_illumination, max_random_illumination].
Returns: The result iterator.
Return type: MXDataIter
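As a rough illustration of how the mean_*, std_* and scale parameters of ImageRecordIter combine, the sketch below normalizes a single RGB pixel. The function normalize_pixel is hypothetical. The order used here (subtract the mean, divide by the standard deviation, multiply by scale) is an assumption, since the parameter descriptions above do not state the exact order of operations.

```python
def normalize_pixel(rgb, mean=(0.0, 0.0, 0.0), std=(1.0, 1.0, 1.0), scale=1.0):
    """Per-channel normalization under the assumed order: (value - mean) / std * scale."""
    return tuple((v - m) / s * scale for v, m, s in zip(rgb, mean, std))


# Defaults (mean 0, std 1, scale 1) leave the pixel values unchanged.
unchanged = normalize_pixel((128, 64, 255))
# Illustrative per-channel statistics, not values taken from any dataset.
out = normalize_pixel((128, 64, 255), mean=(128.0, 64.0, 0.0), std=(1.0, 2.0, 255.0))
```

With the documented defaults the pixel passes through unchanged, which is consistent with the defaults of 0 for the means and 1 for the standard deviations and scale.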
mxnet.io.ImageRecordUInt8Iter(*args, **kwargs)
Iterates on image RecordIO files.
This iterator is identical to ImageRecordIter except for using uint8 as the data type instead of float.
Defined in src/io/iter_image_recordio_2.cc:L768
Parameters:
- path_imglist (string, optional, default='') – Path to the image list (.lst) file. Generally created with tools/im2rec.py. Format: tab separated.
- path_imgrec (string, optional, default='') – Path to the image RecordIO (.rec) file or a directory path. Created with tools/im2rec.py.
- path_imgidx (string, optional, default='') – Path to the image RecordIO index (.idx) file. Created with tools/im2rec.py.
- aug_seq (string, optional, default='aug_default') – The augmenter names to represent sequence of augmenters to be applied, separated by comma. Additional keyword parameters will be seen by these augmenters.
- label_width (int, optional, default='1') – The number of labels per image.
- data_shape (Shape(tuple), required) – The shape of one output image in (channels, height, width) format.
- preprocess_threads (int, optional, default='4') – The number of threads to do preprocessing.
- verbose (boolean, optional, default=1) – Whether to output verbose information.
- num_parts (int, optional, default='1') – Virtually partition the data into these many parts.
- part_index (int, optional, default='0') – The i-th virtual partition to be read.
- shuffle_chunk_size (long (non-negative), optional, default=0) – The data shuffle buffer size in MB. Only valid if shuffle is true.
- shuffle_chunk_seed (int, optional, default='0') – The random seed for shuffling
- shuffle (boolean, optional, default=0) – Whether to shuffle data randomly or not.
- seed (int, optional, default='0') – The random seed.
- batch_size (int (non-negative), required) – Batch size.
- round_batch (boolean, optional, default=1) – Whether to use round robin to handle overflow batch or not.
- prefetch_buffer (long (non-negative), optional, default=4) – Maximum number of batches to prefetch.
- dtype ({None, 'float16', 'float32', 'float64', 'int32', 'int64', 'uint8'}, optional, default='None') – Output data type. None means no change.
- resize (int, optional, default='-1') – Down scale the shorter edge to a new size before applying other augmentations.
- rand_crop (boolean, optional, default=0) – Whether to randomly crop the image.
- random_resized_crop (boolean, optional, default=0) – Whether to perform random resized cropping on the image, as a standard preprocessing for resnet training on ImageNet data.
- max_rotate_angle (int, optional, default='0') – Rotate by a random degree in [-v, v].
- max_aspect_ratio (float, optional, default=0) – Change the aspect (namely width/height) to a random value. If min_aspect_ratio is None then the aspect ratio is sampled from [1 - max_aspect_ratio, 1 + max_aspect_ratio], else it is in [min_aspect_ratio, max_aspect_ratio].
- min_aspect_ratio (float or None, optional, default=None) – Change the aspect (namely width/height) to a random value in [min_aspect_ratio, max_aspect_ratio].
- max_shear_ratio (float, optional, default=0) – Apply a shear transformation (namely (x,y)->(x+my,y)) with m randomly chosen from [-max_shear_ratio, max_shear_ratio].
- max_crop_size (int, optional, default='-1') – Crop both width and height into a random size in [min_crop_size, max_crop_size]. Ignored if random_resized_crop is True.
- min_crop_size (int, optional, default='-1') – Crop both width and height into a random size in [min_crop_size, max_crop_size]. Ignored if random_resized_crop is True.
- max_random_scale (float, optional, default=1) – Resize into [width*s, height*s] with s randomly chosen from [min_random_scale, max_random_scale]. Ignored if random_resized_crop is True.
- min_random_scale (float, optional, default=1) – Resize into [width*s, height*s] with s randomly chosen from [min_random_scale, max_random_scale]. Ignored if random_resized_crop is True.
- max_random_area (float, optional, default=1) – Change the area (namely width * height) to a random value in [min_random_area, max_random_area]. Ignored if random_resized_crop is False.
- min_random_area (float, optional, default=1) – Change the area (namely width * height) to a random value in [min_random_area, max_random_area]. Ignored if random_resized_crop is False.
- max_img_size (float, optional, default=1e+10) – Set the maximal width and height after all resize and rotate augmentations are applied.
- min_img_size (float, optional, default=0) – Set the minimal width and height after all resize and rotate augmentations are applied.
- brightness (float, optional, default=0) – Add a random value in [-brightness, brightness] to the brightness of image.
- contrast (float, optional, default=0) – Add a random value in [-contrast, contrast] to the contrast of image.
- saturation (float, optional, default=0) – Add a random value in [-saturation, saturation] to the saturation of image.
- pca_noise (float, optional, default=0) – Add PCA based noise to the image.
- random_h (int, optional, default='0') – Add a random value in [-random_h, random_h] to the H channel in HSL color space.
- random_s (int, optional, default='0') – Add a random value in [-random_s, random_s] to the S channel in HSL color space.
- random_l (int, optional, default='0') – Add a random value in [-random_l, random_l] to the L channel in HSL color space.
- rotate (int, optional, default='-1') – Rotate by an angle. If set, it overwrites the max_rotate_angle option.
- fill_value (int, optional, default='255') – Set the padding pixels value to fill_value.
- inter_method (int, optional, default='1') – The interpolation method: 0-NN 1-bilinear 2-cubic 3-area 4-lanczos4 9-auto 10-rand.
- pad (int, optional, default='0') – Change size from [width, height] into [pad + width + pad, pad + height + pad] by padding pixels.
- seed_aug (int or None, optional, default='None') – Random seed for augmentations.
Returns: The result iterator.
Return type: MXDataIter
mxnet.io.MNISTIter(*args, **kwargs)
Iterates on the MNIST dataset.
One can download the dataset from http://yann.lecun.com/exdb/mnist/
Defined in src/io/iter_mnist.cc:L265
Parameters:
- image (string, optional, default='./train-images-idx3-ubyte') – Dataset Param: Mnist image path.
- label (string, optional, default='./train-labels-idx1-ubyte') – Dataset Param: Mnist label path.
- batch_size (int, optional, default='128') – Batch Param: Batch Size.
- shuffle (boolean, optional, default=1) – Augmentation Param: Whether to shuffle data.
- flat (boolean, optional, default=0) – Augmentation Param: Whether to flat the data into 1D.
- seed (int, optional, default='0') – Augmentation Param: Random Seed.
- silent (boolean, optional, default=0) – Auxiliary Param: Whether to print out data info.
- num_parts (int, optional, default='1') – partition the data into multiple parts
- part_index (int, optional, default='0') – the index of the part will read
- prefetch_buffer (long (non-negative), optional, default=4) – Maximum number of batches to prefetch.
- dtype ({None, 'float16', 'float32', 'float64', 'int32', 'int64', 'uint8'}, optional, default='None') – Output data type. None means no change.
Returns: The result iterator.
Return type: MXDataIter
mxnet.io - Helper Classes & Functions
Data iterators for common data formats.
class mxnet.io.DataDesc
DataDesc is used to store name, shape, type and layout information of the data or the label.
The layout describes how the axes in shape should be interpreted. For example, for image data, setting layout=NCHW indicates that the first axis is the number of examples in the batch (N), C is the number of channels, H is the height and W is the width of the image.
For sequential data, by default layout is set to NTC, where N is the number of examples in the batch, T is the temporal axis representing time and C is the number of channels.
Parameters:
- cls (DataDesc) – The class.
- name (str) – Data name.
- shape (tuple of int) – Data shape.
- dtype (np.dtype, optional) – Data type.
- layout (str, optional) – Data layout.
-
static
get_batch_axis
(layout)[source]¶ Get the dimension that corresponds to the batch size.
When data parallelism is used, the data will be automatically split and concatenated along the batch-size dimension. Axis can be -1, which means the whole array will be copied for each data-parallelism device.
Parameters: layout (str) – Layout string, for example “NCHW”.
Returns: An axis indicating the batch_size dimension.
Return type: int
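The lookup described above can be sketched in plain Python. This is an illustration rather than the library's code: batch_axis is a hypothetical helper, and it assumes that the batch axis is the position of the letter N in the layout string, that a missing layout defaults to axis 0, and that a layout without N yields -1.

```python
def batch_axis(layout):
    """Return the index of the batch axis 'N' in a layout string.

    Hypothetical helper, not part of mxnet. Assumes axis 0 when no layout
    is given; str.find returns -1 when the layout has no 'N' axis.
    """
    if layout is None:
        return 0
    return layout.find('N')


print(batch_axis('NCHW'))  # 0
print(batch_axis('TNC'))   # 1
print(batch_axis('HW'))    # -1
```

A result of -1 corresponds to the case mentioned above where the whole array is copied to every data-parallelism device.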
-
class
mxnet.io.
DataBatch
(data, label=None, pad=None, index=None, bucket_key=None, provide_data=None, provide_label=None)[source]¶ A data batch.
MXNet’s data iterator returns a batch of data for each next call. Each batch contains batch_size examples.
If the input data consists of images, the shape of these images depends on the layout attribute of the DataDesc object in the provide_data parameter.
If layout is set to ‘NCHW’, images should be stored in a 4-D matrix of shape (batch_size, num_channel, height, width). If layout is set to ‘NHWC’, images should be stored in a 4-D matrix of shape (batch_size, height, width, num_channel). The channels are often in RGB order.
Parameters: - data (list of NDArray, each array containing batch_size examples.) – A list of input data.
- label (list of NDArray, each array often containing a 1-dimensional array. optional) – A list of input labels.
- pad (int, optional) – The number of examples padded at the end of a batch. It is used when the total number of examples read is not divisible by the batch_size. These extra padded examples are ignored in prediction.
- index (numpy.array, optional) – The example indices in this batch.
- bucket_key (int, optional) – The bucket key, used for bucketing module.
- provide_data (list of DataDesc, optional) – A list of DataDesc objects. DataDesc is used to store name, shape, type and layout information of the data. The i-th element describes the name and shape of data[i].
- provide_label (list of DataDesc, optional) – A list of DataDesc objects. DataDesc is used to store name, shape, type and layout information of the label. The i-th element describes the name and shape of label[i].
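The arithmetic behind the pad parameter can be illustrated with a short sketch. The helper batch_pads is hypothetical and does not call MXNet; it assumes that only the final batch of a pass is padded and that the pad equals the number of filler examples needed to reach batch_size.

```python
def batch_pads(num_examples, batch_size):
    """Return the pad value of every batch in one pass over the data.

    Hypothetical helper, not part of mxnet. Only the final batch can be
    padded; its pad is the number of filler examples appended to it.
    """
    num_batches = -(-num_examples // batch_size)  # ceiling division
    pads = [0] * num_batches
    remainder = num_examples % batch_size
    if remainder:
        pads[-1] = batch_size - remainder
    return pads


pads = batch_pads(num_examples=100, batch_size=30)
print(pads)  # [0, 0, 0, 20]
```

Under these assumptions, the last batch holds 10 real examples followed by 20 padded ones, so a consumer would keep only the first batch_size - pad outputs of that batch.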
-
class
mxnet.io.
DataIter
(batch_size=0)[source]¶ The base class for an MXNet data iterator.
All I/O in MXNet is handled by specializations of this class. Data iterators in MXNet are similar to standard iterators in Python. On each call to next they return a DataBatch, which represents the next batch of data. When there is no more data to return, next raises a StopIteration exception.
Parameters: batch_size (int, optional) – The batch size, namely the number of items in the batch.
See also
NDArrayIter - Data iterator for MXNet NDArray or numpy.ndarray objects.
CSVIter - Data iterator for CSV data.
LibSVMIter - Data iterator for LibSVM data.
ImageIter - Data iterator for images.
-
next
()[source]¶ Get next data batch from iterator.
Returns: The data of the next batch.
Return type: DataBatch
Raises: StopIteration – If the end of the data is reached.
-
iter_next
()[source]¶ Move to the next batch.
Returns: Whether the move is successful.
Return type: boolean
-
getdata
()[source]¶ Get data of current batch.
Returns: The data of the current batch.
Return type: list of NDArray
-
getlabel
()[source]¶ Get label of the current batch.
Returns: The label of the current batch.
Return type: list of NDArray
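How next, iter_next and getdata fit together can be sketched with a plain-Python stand-in. ListBatchIter is a hypothetical class: it does not inherit from mxnet.io.DataIter, it returns plain lists instead of DataBatch objects, and it leaves out labels and padding. It only mirrors the calling protocol described above.

```python
class ListBatchIter:
    """Stand-in for a DataIter-style iterator over a Python list.

    Hypothetical class, not part of mxnet: it returns plain lists rather
    than DataBatch objects and omits labels and padding.
    """

    def __init__(self, examples, batch_size):
        self.examples = examples
        self.batch_size = batch_size
        self.cursor = -batch_size  # positioned before the first batch

    def reset(self):
        """Move back to the position before the first batch."""
        self.cursor = -self.batch_size

    def iter_next(self):
        """Advance to the next batch; return False when data is exhausted."""
        self.cursor += self.batch_size
        return self.cursor < len(self.examples)

    def getdata(self):
        """Return the examples of the current batch."""
        return self.examples[self.cursor:self.cursor + self.batch_size]

    def next(self):
        """Return the next batch or raise StopIteration at the end."""
        if self.iter_next():
            return self.getdata()
        raise StopIteration

    __next__ = next  # Python 3 iterator protocol

    def __iter__(self):
        return self


it = ListBatchIter(list(range(10)), batch_size=4)
batches = [batch for batch in it]
print(batches)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Calling reset() on the stand-in before another loop starts a new pass, which is roughly how an epoch boundary would be handled.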
-
class
mxnet.io.
ResizeIter
(data_iter, size, reset_internal=True)[source]¶ Resize a data iterator to a given number of batches.
Parameters: - data_iter (DataIter) – The data iterator to be resized.
- size (int) – The number of batches per epoch to resize to.
- reset_internal (bool) – Whether to reset internal iterator on ResizeIter.reset.
Examples
>>> nd_iter = mx.io.NDArrayIter(mx.nd.ones((100,10)), batch_size=25)
>>> resize_iter = mx.io.ResizeIter(nd_iter, 2)
>>> for batch in resize_iter:
...     print(batch.data)
[<NDArray 25x10 @cpu(0)>]
[<NDArray 25x10 @cpu(0)>]
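The effect of resizing can be sketched without MXNet. The generator resized_epoch below is hypothetical: a list stands in for the wrapped iterator, and it assumes that an epoch always yields exactly size batches and restarts the wrapped data from the beginning when it runs out. The reset_internal option is not modeled.

```python
def resized_epoch(batches, size):
    """Yield exactly `size` batches, wrapping around when `batches` runs out.

    Hypothetical helper, not part of mxnet. `batches` is a non-empty list
    standing in for the wrapped iterator; restarting at index 0 stands in
    for calling reset() on it.
    """
    position = 0
    for _ in range(size):
        if position == len(batches):
            position = 0  # the inner data is exhausted, so start over
        yield batches[position]
        position += 1


source = ['b0', 'b1', 'b2', 'b3']
print(list(resized_epoch(source, 2)))  # ['b0', 'b1']
print(list(resized_epoch(source, 6)))  # ['b0', 'b1', 'b2', 'b3', 'b0', 'b1']
```

A size smaller than the number of available batches shortens the epoch, while a larger size repeats batches from the start.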
-
class
mxnet.io.
PrefetchingIter
(iters, rename_data=None, rename_label=None)[source]¶ Performs pre-fetch for other data iterators.
This iterator creates another thread to perform iter_next and then stores the data in memory. It can accelerate data reading at the cost of more memory usage.
Parameters: - iters (DataIter or list of DataIter) – The data iterators to be pre-fetched.
- rename_data (None or list of dict) – The i-th element is a renaming map for the i-th iter, in the form of {‘original_name’ : ‘new_name’}. Should have one entry for each entry in iter[i].provide_data.
- rename_label (None or list of dict) – Similar to rename_data.
Examples
>>> iter1 = mx.io.NDArrayIter({'data':mx.nd.ones((100,10))}, batch_size=25)
>>> iter2 = mx.io.NDArrayIter({'data':mx.nd.ones((100,10))}, batch_size=25)
>>> piter = mx.io.PrefetchingIter([iter1, iter2],
...                               rename_data=[{'data': 'data_1'}, {'data': 'data_2'}])
>>> print(piter.provide_data)
[DataDesc[data_1,(25, 10L),<type 'numpy.float32'>,NCHW],
 DataDesc[data_2,(25, 10L),<type 'numpy.float32'>,NCHW]]
-
class
mxnet.io.
MXDataIter
(handle, data_name='data', label_name='softmax_label', **_)[source]¶ A Python wrapper for a C++ data iterator.
This iterator is the Python wrapper for all native C++ data iterators, such as CSVIter, ImageRecordIter and MNISTIter. When you initialize CSVIter, for example, you get an MXDataIter instance to use in your Python code. Calls to next, reset, etc. are delegated to the underlying C++ data iterator.
Usually you don’t need to interact with MXDataIter directly unless you are implementing your own data iterators in C++. To do that, please refer to examples under the src/io folder.
Parameters: - handle (DataIterHandle, required) – The handle to the underlying C++ Data Iterator.
- data_name (str, optional) – Data name. Defaults to “data”.
- label_name (str, optional) – Label name. Defaults to “softmax_label”.
See also
src/io : The underlying C++ data iterator implementation, e.g., CSVIter.
mxnet.recordio¶
Read and write for the RecordIO data format.
-
class
mxnet.recordio.
MXRecordIO
(uri, flag)[source]¶ Reads/writes RecordIO data format, supporting sequential read and write.
>>> record = mx.recordio.MXRecordIO('tmp.rec', 'w')
>>> for i in range(5):
...     record.write('record_%d'%i)
>>> record.close()
>>> record = mx.recordio.MXRecordIO('tmp.rec', 'r')
>>> for i in range(5):
...     item = record.read()
...     print(item)
record_0
record_1
record_2
record_3
record_4
>>> record.close()
Parameters: - uri (string) – Path to the record file.
- flag (string) – ‘w’ for write or ‘r’ for read.
-
reset
()[source]¶ Resets the pointer to the first item.
If the record is opened with ‘w’, this function truncates the file to empty.
>>> record = mx.recordio.MXRecordIO('tmp.rec', 'r')
>>> for i in range(2):
...     item = record.read()
...     print(item)
record_0
record_1
>>> record.reset()  # Pointer is reset.
>>> print(record.read())  # Started reading from start again.
record_0
>>> record.close()
-
class
mxnet.recordio.
MXIndexedRecordIO
(idx_path, uri, flag, key_type=int)[source]¶ Reads/writes RecordIO data format, supporting random access.
>>> record = mx.recordio.MXIndexedRecordIO('tmp.idx', 'tmp.rec', 'w')
>>> for i in range(5):
...     record.write_idx(i, 'record_%d'%i)
>>> record.close()
>>> record = mx.recordio.MXIndexedRecordIO('tmp.idx', 'tmp.rec', 'r')
>>> record.read_idx(3)
record_3
Parameters: - idx_path (str) – Path to the index file.
- uri (str) – Path to the record file. Only supports seekable file types.
- flag (str) – ‘w’ for write or ‘r’ for read.
- key_type (type) – Data type for keys.
-
seek
(idx)[source]¶ Sets the current read pointer position.
This function is internally called by read_idx(idx) to find the current reader pointer position. It doesn’t return anything.
-
tell
()[source]¶ Returns the current position of the write head.
>>> record = mx.recordio.MXIndexedRecordIO('tmp.idx', 'tmp.rec', 'w')
>>> print(record.tell())
0
>>> for i in range(5):
...     record.write_idx(i, 'record_%d'%i)
...     print(record.tell())
16
32
48
64
80
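The offsets printed above grow by 16 bytes for each 8-byte payload. A possible explanation can be sketched as follows, under assumptions that are not stated in this document: each record is taken to carry an 8-byte header, and its payload is taken to be padded to a multiple of 4 bytes. The helper record_offsets is hypothetical and does not read or write any file.

```python
def record_offsets(payloads, header_bytes=8, alignment=4):
    """Return the write-head position after each payload is appended.

    Hypothetical helper, not part of mxnet. Assumes an 8-byte per-record
    header and payloads padded to a multiple of 4 bytes.
    """
    offsets = []
    position = 0
    for payload in payloads:
        padded = -(-len(payload) // alignment) * alignment  # round up
        position += header_bytes + padded
        offsets.append(position)
    return offsets


payloads = [('record_%d' % i).encode() for i in range(5)]
print(record_offsets(payloads))  # [16, 32, 48, 64, 80]
```

Under these assumptions the sketch reproduces the positions reported by tell() in the example.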
-
read_idx
(idx)[source]¶ Returns the record at given index.
>>> record = mx.recordio.MXIndexedRecordIO('tmp.idx', 'tmp.rec', 'w')
>>> for i in range(5):
...     record.write_idx(i, 'record_%d'%i)
>>> record.close()
>>> record = mx.recordio.MXIndexedRecordIO('tmp.idx', 'tmp.rec', 'r')
>>> record.read_idx(3)
record_3
-
mxnet.recordio.
IRHeader
¶ An alias for HEADER. Used to store metadata (e.g. labels) accompanying a record. See mxnet.recordio.pack and mxnet.recordio.pack_img for example uses.
Parameters: - flag (int) – Available for convenience, can be set arbitrarily.
- label (float or an array of float) – Typically used to store label(s) for a record.
- id (int) – Usually a unique id representing the record.
- id2 (int) – Higher order bits of the unique id, should be set to 0 (in most cases).
alias of HEADER
-
mxnet.recordio.
pack
(header, s)[source]¶ Pack a string into MXImageRecord.
Parameters: - header (IRHeader) – Header of the image record. header.label can be a number or an array. See IRHeader for more detail.
- s (str) – Raw image string to be packed.
Returns: s – The packed string.
Return type: str
Examples
>>> label = 4  # label can also be a 1-D array, for example: label = [1,2,3]
>>> id = 2574
>>> header = mx.recordio.IRHeader(0, label, id, 0)
>>> with open(path, 'rb') as file:  # read the raw image bytes in binary mode
...     s = file.read()
>>> packed_s = mx.recordio.pack(header, s)
-
mxnet.recordio.
unpack
(s)[source]¶ Unpack a MXImageRecord to string.
Parameters: s (str) – String buffer from MXRecordIO.read.
Returns: - header (IRHeader) – Header of the image record.
- s (str) – Unpacked string.
Examples
>>> record = mx.recordio.MXRecordIO('test.rec', 'r')
>>> item = record.read()
>>> header, s = mx.recordio.unpack(item)
>>> header
HEADER(flag=0, label=14.0, id=20129312, id2=0)
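The round trip between pack and unpack can be sketched with the standard library alone. Everything here is illustrative: Header, pack_sketch and unpack_sketch are hypothetical names, the binary layout (uint32 flag, float32 label, two uint64 ids, little-endian) is an assumption rather than a statement about the library's format, and only the scalar-label case is covered.

```python
import struct
from collections import namedtuple

# Stand-in for mxnet.recordio.IRHeader with the four fields listed above.
Header = namedtuple('Header', ['flag', 'label', 'id', 'id2'])

# Assumed binary layout: uint32 flag, float32 label, uint64 id, uint64 id2.
HEADER_FORMAT = '<IfQQ'
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def pack_sketch(header, payload):
    """Prepend the fixed-size binary header to the raw payload bytes."""
    return struct.pack(HEADER_FORMAT, *header) + payload


def unpack_sketch(record):
    """Split a packed record back into its header and payload."""
    header = Header(*struct.unpack(HEADER_FORMAT, record[:HEADER_SIZE]))
    return header, record[HEADER_SIZE:]


packed = pack_sketch(Header(flag=0, label=4.0, id=2574, id2=0), b'raw image bytes')
header, payload = unpack_sketch(packed)
print(header)   # Header(flag=0, label=4.0, id=2574, id2=0)
print(payload)  # b'raw image bytes'
```

Keeping the header at a fixed size means the payload can be recovered by slicing, without storing a separate length for the metadata.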
-
mxnet.recordio.
unpack_img
(s, iscolor=-1)[source]¶ Unpack a MXImageRecord to image.
Parameters: - s (str) – String buffer from MXRecordIO.read.
- iscolor (int) – Image format option for cv2.imdecode.
Returns: - header (IRHeader) – Header of the image record.
- img (numpy.ndarray) – Unpacked image.
Examples
>>> record = mx.recordio.MXRecordIO('test.rec', 'r')
>>> item = record.read()
>>> header, img = mx.recordio.unpack_img(item)
>>> header
HEADER(flag=0, label=14.0, id=20129312, id2=0)
>>> img
array([[[ 23,  27,  45],
        [ 28,  32,  50],
        ...,
        [ 36,  40,  59],
        [ 35,  39,  58]],
       ...,
       [[ 91,  92, 113],
        [ 97,  98, 119],
        ...,
        [168, 169, 167],
        [166, 167, 165]]], dtype=uint8)
-
mxnet.recordio.
pack_img
(header, img, quality=95, img_fmt='.jpg')[source]¶ Pack an image into MXImageRecord.
Parameters: - header (IRHeader) – Header of the image record. header.label can be a number or an array. See IRHeader for more detail.
- img (numpy.ndarray) – Image to be packed.
- quality (int) – Quality for JPEG encoding in range 1-100, or compression for PNG encoding in range 1-9.
- img_fmt (str) – Encoding of the image (.jpg for JPEG, .png for PNG).
Returns: s – The packed string.
Return type: str
Examples
>>> label = 4  # label can also be a 1-D array, for example: label = [1,2,3]
>>> id = 2574
>>> header = mx.recordio.IRHeader(0, label, id, 0)
>>> img = cv2.imread('test.jpg')
>>> packed_s = mx.recordio.pack_img(header, img)