Symbol API¶

Overview¶

This document lists the routines of the symbolic expression package:

`mxnet.symbol`	Symbolic configuration API of MXNet.

The Symbol API, defined in the symbol (or simply sym) package, provides neural network graphs and auto-differentiation. A symbol represents a multi-output symbolic expression. Symbols are composed from operators, such as simple matrix operations (e.g. “+”) or neural network layers (e.g. a convolution layer). An operator can take several input variables, produce more than one output variable, and have internal state variables. A variable can be either free, in which case we can bind it to a value later, or an output of another symbol.

>>> import mxnet as mx
>>> a = mx.sym.Variable('a')
>>> b = mx.sym.Variable('b')
>>> c = 2 * a + b
>>> type(c)
<class 'mxnet.symbol.Symbol'>
>>> e = c.bind(mx.cpu(), {'a': mx.nd.array([1,2]), 'b':mx.nd.array([2,3])})
>>> y = e.forward()
>>> y
[<NDArray 2 @cpu(0)>]
>>> y[0].asnumpy()
array([ 4.,  7.], dtype=float32)
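
To make the idea of composing first and binding values later more concrete, here is a minimal pure-Python sketch. The `Sym` class and its methods are hypothetical stand-ins written for illustration; they are not MXNet's implementation and only mimic the shape of the API described above.

```python
# Toy model of declarative composition (hypothetical `Sym` class, not MXNet code).
class Sym:
    """A node in a tiny expression graph: a free variable or an operator."""

    def __init__(self, op, inputs=(), name=None, scalar=None):
        self.op = op            # 'var', 'add', or 'mul_scalar'
        self.inputs = inputs    # child Sym nodes
        self.name = name        # only set for free variables
        self.scalar = scalar    # only set for 'mul_scalar'

    def __add__(self, other):
        return Sym('add', (self, other))

    def __rmul__(self, scalar):
        return Sym('mul_scalar', (self,), scalar=scalar)

    def list_arguments(self):
        """Collect free-variable names depth-first, without duplicates."""
        if self.op == 'var':
            return [self.name]
        names = []
        for child in self.inputs:
            for n in child.list_arguments():
                if n not in names:
                    names.append(n)
        return names

    def eval(self, **bindings):
        """Bind values to the free variables and compute element-wise."""
        if self.op == 'var':
            return list(bindings[self.name])
        if self.op == 'mul_scalar':
            return [self.scalar * v for v in self.inputs[0].eval(**bindings)]
        lhs = self.inputs[0].eval(**bindings)
        rhs = self.inputs[1].eval(**bindings)
        return [x + y for x, y in zip(lhs, rhs)]


a = Sym('var', name='a')
b = Sym('var', name='b')
c = 2 * a + b                         # builds a graph; nothing is computed yet
args = c.list_arguments()             # ['a', 'b']
result = c.eval(a=[1, 2], b=[2, 3])   # [4, 7]
```

The expression `c` is only a description of the computation until values are supplied, which is the property the real `bind` and `forward` calls rely on.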

A detailed tutorial is available at Symbol - Neural network graphs and auto-differentiation.

Note

Most operators provided in symbol are similar to those in ndarray, although there are a few differences:

symbol adopts declarative programming. In other words, we first compose the computations and then feed them with data for execution, whereas ndarray adopts imperative programming.
Most binary operators in symbol, such as + and >, don’t broadcast. We need to call the broadcast version of the operator, such as broadcast_add, explicitly.
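
As a rough illustration of what "broadcasting" means for operand shapes, the sketch below computes the result shape under the usual NumPy-style rule. The helper `broadcast_shape` is hypothetical, and it is an assumption here that the `broadcast_*` operators follow this same rule in every detail.

```python
def broadcast_shape(lhs, rhs):
    """Return the broadcast result shape, or raise ValueError if incompatible."""
    ndim = max(len(lhs), len(rhs))
    # Left-pad the shorter shape with 1s so both have the same rank.
    lhs = (1,) * (ndim - len(lhs)) + tuple(lhs)
    rhs = (1,) * (ndim - len(rhs)) + tuple(rhs)
    out = []
    for l, r in zip(lhs, rhs):
        if l == r or r == 1:
            out.append(l)       # equal sizes, or rhs is stretched along this axis
        elif l == 1:
            out.append(r)       # lhs is stretched along this axis
        else:
            raise ValueError('shapes %s and %s cannot be broadcast' % (lhs, rhs))
    return tuple(out)


same = broadcast_shape((2, 3), (2, 3))        # identical shapes
stretched = broadcast_shape((2, 3), (1, 3))   # size-1 axis is stretched
```

A plain `+` between symbols corresponds to the first case only; the second case is where an explicit broadcast operator is needed.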

In the rest of this document, we first give an overview of the methods provided by the symbol.Symbol class, and then list the other routines provided by the symbol package.

The `Symbol` class¶

Composition¶

Compose multiple symbols into a new one using an operator.

`Symbol.__call__`	Composes symbol using inputs.

Arithmetic operations¶

`Symbol.__add__`	x.__add__(y) <=> x+y
`Symbol.__sub__`	x.__sub__(y) <=> x-y
`Symbol.__rsub__`	x.__rsub__(y) <=> y-x
`Symbol.__neg__`	x.__neg__() <=> -x
`Symbol.__mul__`	x.__mul__(y) <=> x*y
`Symbol.__div__`	x.__div__(y) <=> x/y
`Symbol.__rdiv__`	x.__rdiv__(y) <=> y/x
`Symbol.__mod__`	x.__mod__(y) <=> x%y
`Symbol.__rmod__`	x.__rmod__(y) <=> y%x
`Symbol.__pow__`	x.__pow__(y) <=> x**y

Comparison operators¶

`Symbol.__lt__`	x.__lt__(y) <=> x<y
`Symbol.__le__`	x.__le__(y) <=> x<=y
`Symbol.__gt__`	x.__gt__(y) <=> x>y
`Symbol.__ge__`	x.__ge__(y) <=> x>=y
`Symbol.__eq__`	x.__eq__(y) <=> x==y
`Symbol.__ne__`	x.__ne__(y) <=> x!=y

Query information¶

`Symbol.name`	Gets the name string from the symbol. This function only works for non-grouped symbols.
`Symbol.list_arguments`	Lists all the arguments in the symbol.
`Symbol.list_outputs`	Lists all the outputs in the symbol.
`Symbol.list_auxiliary_states`	Lists all the auxiliary states in the symbol.
`Symbol.list_attr`	Gets all attributes from the symbol.
`Symbol.attr`	Returns the attribute string for corresponding input key from the symbol.
`Symbol.attr_dict`	Recursively gets all attributes from the symbol and its children.

Get internal and output symbol¶

`Symbol.__getitem__`	x.__getitem__(i) <=> x[i]
`Symbol.__iter__`	Returns a generator object of symbol.
`Symbol.get_internals`	Gets a new grouped symbol sgroup.
`Symbol.get_children`	Gets a new grouped symbol whose output contains inputs to output nodes of the original symbol.

Inference type and shape¶

`Symbol.infer_type`	Infers the type of all arguments and all outputs, given the known types for some arguments.
`Symbol.infer_shape`	Infers the shapes of all arguments and all outputs given the known shapes of some arguments.
`Symbol.infer_shape_partial`	Infers the shape partially.

Bind¶

`Symbol.bind`	Binds the current symbol to an executor and returns it.
`Symbol.simple_bind`	Binds the current symbol to get an executor and allocates all the arguments needed.

Save¶

`Symbol.save`	Saves symbol to a file.
`Symbol.tojson`	Saves symbol to a JSON string.
`Symbol.debug_str`	Gets a debug string of symbol.

Symbol creation routines¶

`var`	Creates a symbolic variable with specified name.
`zeros`	Returns a new symbol of given shape and type, filled with zeros.
`ones`	Returns a new symbol of given shape and type, filled with ones.
`arange`	Returns evenly spaced values within a given interval.

Symbol manipulation routines¶

Changing shape and type¶

`cast`	Casts all elements of the input to a new type.
`reshape`	Reshapes the input array.
`flatten`	Flattens the input array into a 2-D array by collapsing the higher dimensions.
`expand_dims`	Inserts a new axis of size 1 into the array shape.

Expanding elements¶

`broadcast_to`	Broadcasts the input array to a new shape.
`broadcast_axes`	Broadcasts the input array over particular axes.
`repeat`	Repeats elements of an array.
`tile`	Repeats the whole array multiple times.
`pad`	Pads an input array with a constant or edge values of the array.

Rearranging elements¶

`transpose`	Permutes the dimensions of an array.
`swapaxes`	Interchanges two axes of an array.
`flip`	Reverses the order of elements along given axis while preserving array shape.

Joining and splitting symbols¶

`concat`	Joins input arrays along a given axis.
`split`	Splits an array along a particular axis into multiple sub-arrays.

Indexing routines¶

`slice`	Slices a contiguous region of the array.
`slice_axis`	Slices along a given axis.
`take`	Takes elements from an input array along the given axis.
`batch_take`	Takes elements from a data batch.
`one_hot`	Returns a one-hot array.

Mathematical functions¶

Arithmetic operations¶

`broadcast_add`	Returns element-wise sum of the input arrays with broadcasting.
`broadcast_sub`	Returns element-wise difference of the input arrays with broadcasting.
`broadcast_mul`	Returns element-wise product of the input arrays with broadcasting.
`broadcast_div`	Returns element-wise division of the input arrays with broadcasting.
`broadcast_mod`	Returns element-wise modulo of the input arrays with broadcasting.
`negative`	Numerical negative of the argument, element-wise.
`reciprocal`	Returns the reciprocal of the argument, element-wise.
`dot`	Dot product of two arrays.
`batch_dot`	Batchwise dot product.
`add_n`	Adds all input arguments element-wise.

Trigonometric functions¶

`sin`	Computes the element-wise sine of the input array.
`cos`	Computes the element-wise cosine of the input array.
`tan`	Computes the element-wise tangent of the input array.
`arcsin`	Returns element-wise inverse sine of the input array.
`arccos`	Returns element-wise inverse cosine of the input array.
`arctan`	Returns element-wise inverse tangent of the input array.
`hypot`	Given the “legs” of a right triangle, returns its hypotenuse.
`broadcast_hypot`	Returns the hypotenuse of a right angled triangle, given its “legs” with broadcasting.
`degrees`	Converts each element of the input array from radians to degrees.
`radians`	Converts each element of the input array from degrees to radians.

Hyperbolic functions¶

`sinh`	Returns the hyperbolic sine of the input array, computed element-wise.
`cosh`	Returns the hyperbolic cosine of the input array, computed element-wise.
`tanh`	Returns the hyperbolic tangent of the input array, computed element-wise.
`arcsinh`	Returns the inverse hyperbolic sine of the input array, computed element-wise.
`arccosh`	Returns the inverse hyperbolic cosine of the input array, computed element-wise.
`arctanh`	Returns the inverse hyperbolic tangent of the input array, computed element-wise.

Reduce functions¶

`sum`	Computes the sum of array elements over given axes.
`nansum`	Computes the sum of array elements over given axes, treating Not a Number (`NaN`) values as zero.
`prod`	Computes the product of array elements over given axes.
`nanprod`	Computes the product of array elements over given axes, treating Not a Number (`NaN`) values as one.
`mean`	Computes the mean of array elements over given axes.
`max`	Computes the max of array elements over given axes.
`min`	Computes the min of array elements over given axes.
`norm`	Flattens the input array and then computes the l2 norm.

Rounding¶

`round`	Returns element-wise rounded value to the nearest integer of the input.
`rint`	Returns element-wise rounded value to the nearest integer of the input.
`fix`	Returns element-wise rounded value to the nearest integer towards zero of the input.
`floor`	Returns element-wise floor of the input.
`ceil`	Returns element-wise ceiling of the input.
`trunc`	Returns the element-wise truncated value of the input.

Exponents and logarithms¶

`exp`	Returns element-wise exponential value of the input.
`expm1`	Returns `exp(x) - 1` computed element-wise on the input.
`log`	Returns element-wise natural logarithmic value of the input.
`log10`	Returns element-wise base-10 logarithmic value of the input.
`log2`	Returns element-wise base-2 logarithmic value of the input.
`log1p`	Returns element-wise `log(1 + x)` value of the input.

Powers¶

`broadcast_power`	Returns result of first array elements raised to powers from second array, element-wise with broadcasting.
`sqrt`	Returns element-wise square-root value of the input.
`rsqrt`	Returns element-wise inverse square-root value of the input.
`square`	Returns element-wise squared value of the input.

Logic functions¶

`broadcast_equal`	Returns the result of element-wise equal to (==) comparison operation with broadcasting.
`broadcast_not_equal`	Returns the result of element-wise not equal to (!=) comparison operation with broadcasting.
`broadcast_greater`	Returns the result of element-wise greater than (>) comparison operation with broadcasting.
`broadcast_greater_equal`	Returns the result of element-wise greater than or equal to (>=) comparison operation with broadcasting.
`broadcast_lesser`	Returns the result of element-wise lesser than (<) comparison operation with broadcasting.
`broadcast_lesser_equal`	Returns the result of element-wise lesser than or equal to (<=) comparison operation with broadcasting.

Random sampling¶

`random_uniform`	Draw random samples from a uniform distribution.
`random_normal`	Draw random samples from a normal (Gaussian) distribution.
`random_gamma`	Draw random samples from a gamma distribution.
`random_exponential`	Draw random samples from an exponential distribution.
`random_poisson`	Draw random samples from a Poisson distribution.
`random_negative_binomial`	Draw random samples from a negative binomial distribution.
`random_generalized_negative_binomial`	Draw random samples from a generalized negative binomial distribution.
`sample_uniform`	Concurrent sampling from multiple uniform distributions on the intervals given by [low,high).
`sample_normal`	Concurrent sampling from multiple normal distributions with parameters mu (mean) and sigma (standard deviation).
`sample_gamma`	Concurrent sampling from multiple gamma distributions with parameters alpha (shape) and beta (scale).
`sample_exponential`	Concurrent sampling from multiple exponential distributions with parameters lambda (rate).
`sample_poisson`	Concurrent sampling from multiple Poisson distributions with parameters lambda (rate).
`sample_negative_binomial`	Concurrent sampling from multiple negative binomial distributions with parameters k (failure limit) and p (failure probability).
`sample_generalized_negative_binomial`	Concurrent sampling from multiple generalized negative binomial distributions with parameters mu (mean) and alpha (dispersion).
`mxnet.random.seed`	Seeds the random number generators in MXNet.

Sorting and searching¶

`sort`	Returns a sorted copy of an input array along the given axis.
`topk`	Returns the top k elements in an input array along the given axis.
`argsort`	Returns the indices that would sort an input array along the given axis.
`argmax`	Returns indices of the maximum values along an axis.
`argmin`	Returns indices of the minimum values along an axis.

Linear Algebra¶

`linalg_gemm`	Performs general matrix multiplication and accumulation.
`linalg_gemm2`	Performs general matrix multiplication.
`linalg_potrf`	Performs Cholesky factorization of a symmetric positive-definite matrix.
`linalg_potri`	Performs matrix inversion from a Cholesky factorization.
`linalg_trmm`	Performs multiplication with a triangular matrix.
`linalg_trsm`	Solves matrix equations involving a triangular matrix.
`linalg_sumlogdiag`	Computes the sum of the logarithms of all diagonal elements in a matrix.

Miscellaneous¶

`maximum`	Returns element-wise maximum of the input elements.
`minimum`	Returns element-wise minimum of the input elements.
`broadcast_maximum`	Returns element-wise maximum of the input arrays with broadcasting.
`broadcast_minimum`	Returns element-wise minimum of the input arrays with broadcasting.
`clip`	Clips (limits) the values in an array.
`abs`	Returns element-wise absolute value of the input.
`sign`	Returns element-wise sign of the input.
`gamma`	Returns the gamma function (extension of the factorial function to the reals), computed element-wise on the input array.
`gammaln`	Returns element-wise log of the absolute value of the gamma function of the input.

Neural network¶

Basic¶

`FullyConnected`	Applies a linear transformation: \(Y = XW^T + b\).
`Convolution`	Compute N-D convolution on (N+2)-D input.
`Activation`	Applies an activation function element-wise to the input.
`BatchNorm`	Batch normalization.
`Pooling`	Performs pooling on the input.
`SoftmaxOutput`	Computes the gradient of cross entropy loss with respect to softmax output.
`softmax`	Applies the softmax function.
`log_softmax`	Computes the log softmax of the input.

More¶

`Correlation`	Applies correlation to inputs.
`Deconvolution`	Computes 2D transposed convolution (aka fractionally strided convolution) of the input tensor.
`RNN`	Applies a recurrent layer to input.
`Embedding`	Maps integer indices to vector representations (embeddings).
`LeakyReLU`	Applies Leaky rectified linear unit activation element-wise to the input.
`InstanceNorm`	Applies instance normalization to the n-dimensional input array.
`L2Normalization`	Normalize the input array using the L2 norm.
`LRN`	Applies local response normalization to the input.
`ROIPooling`	Performs region of interest (ROI) pooling on the input array.
`SoftmaxActivation`	Applies softmax activation to input.
`Dropout`	Applies dropout operation to input array.
`BilinearSampler`	Applies bilinear sampling to input feature map.
`GridGenerator`	Generates 2D sampling grid for bilinear sampling.
`UpSampling`	Performs nearest neighbor/bilinear up sampling to inputs.
`SpatialTransformer`	Applies a spatial transformer to input feature map.
`LinearRegressionOutput`	Computes and optimizes for squared loss during backward propagation.
`LogisticRegressionOutput`	Applies a logistic function to the input.
`MAERegressionOutput`	Computes mean absolute error of the input.
`SVMOutput`	Computes support vector machine based transformation of the input.
`softmax_cross_entropy`	Calculate cross entropy of softmax output and one-hot label.
`smooth_l1`	Calculate Smooth L1 Loss(lhs, scalar) by summing
`IdentityAttachKLSparseReg`	Apply a sparse regularization to the output of a sigmoid activation function.
`MakeLoss`	Make your own loss function in network construction.
`BlockGrad`	Stops gradient computation.
`Custom`	Apply a custom operator implemented in a frontend language (like Python).

Contrib¶

Warning

This package contains experimental APIs and may change in the near future.

The contrib.symbol module contains many useful experimental APIs for new features. This is a place for the community to try out the new features, so that feature contributors can receive feedback.

`CTCLoss`	Connectionist Temporal Classification Loss.
`DeformableConvolution`	Compute 2-D deformable convolution on 4-D input.
`DeformablePSROIPooling`	Performs deformable position-sensitive region-of-interest pooling on inputs. The DeformablePSROIPooling operation is described in https://arxiv.org/abs/1703.06211. batch_size will change to the number of region bounding boxes after DeformablePSROIPooling.
`MultiBoxDetection`	Convert multibox detection predictions.
`MultiBoxPrior`	Generate prior (anchor) boxes from data, sizes and ratios.
`MultiBoxTarget`	Compute Multibox training targets.
`MultiProposal`	Generate region proposals via RPN.
`PSROIPooling`	Performs region-of-interest pooling on inputs.
`Proposal`	Generate region proposals via RPN.
`count_sketch`	Apply CountSketch to input: map d-dimensional data to k-dimensional data.
`ctc_loss`	Connectionist Temporal Classification Loss.
`dequantize`	Dequantize the input tensor into a float tensor.
`fft`	Apply 1D FFT to input.
`ifft`	Apply 1D inverse FFT to input.
`quantize`	Quantize an input tensor from float to out_type, with user-specified min_range and max_range.

API Reference¶

Symbolic configuration API of MXNet.

class mxnet.symbol.Symbol(handle)[source]¶

Symbol is the symbolic graph of MXNet.

name¶

Gets the name string from the symbol. This function only works for non-grouped symbols.

Returns:	value – The name of this symbol, or `None` for a grouped symbol.
Return type:	str

attr(key)[source]¶

Returns the attribute string for corresponding input key from the symbol.

This function only works for non-grouped symbols.

>>> data = mx.sym.Variable('data', attr={'mood': 'angry'})
>>> data.attr('mood')
'angry'

Parameters:	key (str) – The key corresponding to the desired attribute.
Returns:	value – The desired attribute value, or `None` if the attribute does not exist.
Return type:	str

list_attr(recursive=False)[source]¶

Gets all attributes from the symbol.

>>> data = mx.sym.Variable('data', attr={'mood': 'angry'})
>>> data.list_attr()
{'mood': 'angry'}

Returns:	ret – A dictionary mapping attribute keys to values.
Return type:	Dict of str to str

attr_dict()[source]¶

Recursively gets all attributes from the symbol and its children.

>>> a = mx.sym.Variable('a', attr={'a1':'a2'})
>>> b = mx.sym.Variable('b', attr={'b1':'b2'})
>>> c = a+b
>>> c.attr_dict()
{'a': {'a1': 'a2'}, 'b': {'b1': 'b2'}}

Returns:	ret – There is a key in the returned dict for every child with a non-empty attribute set. For each symbol, the name of the symbol is its key in the dict and the corresponding value is that symbol’s attribute list (itself a dictionary).
Return type:	Dict of str to dict

get_internals()[source]¶

Gets a new grouped symbol sgroup. The output of sgroup is a list of outputs of all of the internal nodes.

Consider the following code:

>>> a = mx.sym.var('a')
>>> b = mx.sym.var('b')
>>> c = a + b
>>> d = c.get_internals()
>>> d
<Symbol Grouped>
>>> d.list_outputs()
['a', 'b', '_plus4_output']

Returns:	sgroup – A symbol group containing all internal and leaf nodes of the computation graph used to compute the symbol.
Return type:	Symbol

get_children()[source]¶

Gets a new grouped symbol whose output contains inputs to output nodes of the original symbol.

>>> x = mx.sym.Variable('x')
>>> y = mx.sym.Variable('y')
>>> z = mx.sym.Variable('z')
>>> a = y+z
>>> b = x+a
>>> b.get_children()
<Symbol Grouped>
>>> b.get_children().list_outputs()
['x', '_plus10_output']
>>> b.get_children().get_children().list_outputs()
['y', 'z']

Returns:	sgroup – The children of the head node. If the symbol has no inputs then `None` will be returned.
Return type:	Symbol or None

list_arguments()[source]¶

Lists all the arguments in the symbol.

>>> a = mx.sym.var('a')
>>> b = mx.sym.var('b')
>>> c = a + b
>>> c.list_arguments()
['a', 'b']

Returns:	args – List containing the names of all the arguments required to compute the symbol.
Return type:	list of str

list_outputs()[source]¶

Lists all the outputs in the symbol.

>>> a = mx.sym.var('a')
>>> b = mx.sym.var('b')
>>> c = a + b
>>> c.list_outputs()
['_plus12_output']

Returns:	List of all the outputs. For most symbols, this list contains only the name of this symbol. For symbol groups, this is a list with the names of all symbols in the group.
Return type:	list of str

list_auxiliary_states()[source]¶

Lists all the auxiliary states in the symbol.

>>> a = mx.sym.var('a')
>>> b = mx.sym.var('b')
>>> c = a + b
>>> c.list_auxiliary_states()
[]

Example of auxiliary states in BatchNorm.

>>> data = mx.symbol.Variable('data')
>>> weight = mx.sym.Variable(name='fc1_weight')
>>> fc1  = mx.symbol.FullyConnected(data = data, weight=weight, name='fc1', num_hidden=128)
>>> fc2 = mx.symbol.BatchNorm(fc1, name='batchnorm0')
>>> fc2.list_auxiliary_states()
['batchnorm0_moving_mean', 'batchnorm0_moving_var']

Returns:	aux_states – List of the auxiliary states in input symbol.
Return type:	list of str

Notes

Auxiliary states are special states of symbols that do not correspond to an argument and are not updated by gradient descent. Common examples of auxiliary states include the moving_mean and moving_var states in BatchNorm. Most operators do not have auxiliary states.
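
The following sketch illustrates why such a state sits outside gradient descent: a running statistic is refreshed from each batch by an exponential moving average rather than by a gradient step. The helper name `update_moving_stat` and the momentum value of 0.9 are illustrative assumptions, not values read from BatchNorm.

```python
def update_moving_stat(moving, batch_stat, momentum=0.9):
    """Exponential moving average: blend the old state with the new batch value."""
    return momentum * moving + (1.0 - momentum) * batch_stat


moving_mean = 0.0
for batch in ([1.0, 3.0], [2.0, 4.0]):
    batch_mean = sum(batch) / len(batch)
    # No gradient is involved; the state is overwritten during the forward pass.
    moving_mean = update_moving_stat(moving_mean, batch_mean)
```

After the two batches above (means 2.0 and 3.0), `moving_mean` is approximately 0.48.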

list_inputs()[source]¶

Lists all arguments and auxiliary states of this Symbol.

Returns:	inputs – List of all inputs.
Return type:	list of str

Examples

>>> bn = mx.sym.BatchNorm(name='bn')
>>> bn.list_arguments()
['bn_data', 'bn_gamma', 'bn_beta']
>>> bn.list_auxiliary_states()
['bn_moving_mean', 'bn_moving_var']
>>> bn.list_inputs()
['bn_data', 'bn_gamma', 'bn_beta', 'bn_moving_mean', 'bn_moving_var']

infer_type(*args, **kwargs)[source]¶

Infers the type of all arguments and all outputs, given the known types for some arguments.

This function takes the known types of some arguments, given either positionally or as keyword arguments. It returns a tuple of None values if there is not enough information to deduce the missing types.

Inconsistencies in the known types will cause an error to be raised.

>>> a = mx.sym.var('a')
>>> b = mx.sym.var('b')
>>> c = a + b
>>> arg_types, out_types, aux_types = c.infer_type(a='float32')
>>> arg_types
[<type 'numpy.float32'>, <type 'numpy.float32'>]
>>> out_types
[<type 'numpy.float32'>]
>>> aux_types
[]

Parameters:

*args – Types of known arguments, given positionally. An unknown type can be marked as None.
**kwargs – Keyword arguments of known types.

Returns:

arg_types (list of numpy.dtype or None) – List of argument types. The order is same as the order of list_arguments().
out_types (list of numpy.dtype or None) – List of output types. The order is same as the order of list_outputs().
aux_types (list of numpy.dtype or None) – List of auxiliary state types. The order is same as the order of list_auxiliary_states().

infer_shape(*args, **kwargs)[source]¶

Infers the shapes of all arguments and all outputs given the known shapes of some arguments.

This function takes the known shapes of some arguments, given either positionally or as keyword arguments. It returns a tuple of None values if there is not enough information to deduce the missing shapes.

>>> a = mx.sym.var('a')
>>> b = mx.sym.var('b')
>>> c = a + b
>>> arg_shapes, out_shapes, aux_shapes = c.infer_shape(a=(3,3))
>>> arg_shapes
[(3L, 3L), (3L, 3L)]
>>> out_shapes
[(3L, 3L)]
>>> aux_shapes
[]
>>> c.infer_shape(a=(0,3)) # 0s in shape means unknown dimensions. So, returns None.
(None, None, None)

Inconsistencies in the known shapes will cause an error to be raised. See the following example:

>>> data = mx.sym.Variable('data')
>>> out = mx.sym.FullyConnected(data=data, name='fc1', num_hidden=1000)
>>> out = mx.sym.Activation(data=out, act_type='relu')
>>> out = mx.sym.FullyConnected(data=out, name='fc2', num_hidden=10)
>>> weight_shape= (1, 100)
>>> data_shape = (100, 100)
>>> out.infer_shape(data=data_shape, fc1_weight=weight_shape)
Error in operator fc1: Shape inconsistent, Provided=(1,100), inferred shape=(1000,100)

Parameters:

*args – Shapes of arguments, given positionally. An unknown shape can be marked as None.
**kwargs – Keyword arguments of the known shapes.

Returns:

arg_shapes (list of tuple or None) – List of argument shapes. The order is same as the order of list_arguments().
out_shapes (list of tuple or None) – List of output shapes. The order is same as the order of list_outputs().
aux_shapes (list of tuple or None) – List of auxiliary state shapes. The order is same as the order of list_auxiliary_states().
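
The consistency check described above can be pictured with a small sketch for a single fully connected layer. The function `infer_fc_shapes` is hypothetical and assumes 2-D input data; it only reproduces the shape arithmetic, not the real inference pass.

```python
def infer_fc_shapes(data_shape, num_hidden, weight_shape=None):
    """Infer (arg_shapes, out_shapes) for a toy fully connected layer."""
    batch, in_dim = data_shape
    inferred_weight = (num_hidden, in_dim)
    # A user-provided shape must agree with the one implied by the data.
    if weight_shape is not None and tuple(weight_shape) != inferred_weight:
        raise ValueError('Shape inconsistent, Provided=%s, inferred shape=%s'
                         % (tuple(weight_shape), inferred_weight))
    arg_shapes = [tuple(data_shape), inferred_weight, (num_hidden,)]  # data, weight, bias
    out_shapes = [(batch, num_hidden)]
    return arg_shapes, out_shapes


arg_shapes, out_shapes = infer_fc_shapes((100, 100), num_hidden=1000)

try:
    infer_fc_shapes((100, 100), num_hidden=1000, weight_shape=(1, 100))
    conflict = None
except ValueError as err:
    conflict = str(err)   # mirrors the fc1 error in the example above
```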

infer_shape_partial(*args, **kwargs)[source]¶

Infers the shape partially.

This function works the same way as infer_shape, except that this function can return partial results.

In the following example, information about fc2 is not available. So, infer_shape will return a tuple of None values but infer_shape_partial will return partial values.

>>> data = mx.sym.Variable('data')
>>> prev = mx.sym.Variable('prev')
>>> fc1  = mx.sym.FullyConnected(data=data, name='fc1', num_hidden=128)
>>> fc2  = mx.sym.FullyConnected(data=prev, name='fc2', num_hidden=128)
>>> out  = mx.sym.Activation(data=mx.sym.elemwise_add(fc1, fc2), act_type='relu')
>>> out.list_arguments()
['data', 'fc1_weight', 'fc1_bias', 'prev', 'fc2_weight', 'fc2_bias']
>>> out.infer_shape(data=(10,64))
(None, None, None)
>>> out.infer_shape_partial(data=(10,64))
([(10L, 64L), (128L, 64L), (128L,), (), (), ()], [(10L, 128L)], [])
>>> # infers shape if you give information about fc2
>>> out.infer_shape(data=(10,64), prev=(10,128))
([(10L, 64L), (128L, 64L), (128L,), (10L, 128L), (128L, 128L), (128L,)], [(10L, 128L)], [])

Parameters:

*args – Shapes of arguments, given positionally. An unknown shape can be marked as None.
**kwargs – Keyword arguments of known shapes.

Returns:

arg_shapes (list of tuple or None) – List of argument shapes. The order is same as the order of list_arguments().
out_shapes (list of tuple or None) – List of output shapes. The order is same as the order of list_outputs().
aux_shapes (list of tuple or None) – List of auxiliary state shapes. The order is same as the order of list_auxiliary_states().
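
The difference between full and partial inference can be sketched for the two-branch network in the example above. All function names below are hypothetical, and the toy only models two fully connected layers feeding an element-wise add, with `()` standing for an unknown shape.

```python
def fc_shapes(data_shape, num_hidden):
    """Shapes of (data, weight, bias, output) for a toy FC layer; () means unknown."""
    if not data_shape:
        return (), (), (), ()
    batch, in_dim = data_shape
    return tuple(data_shape), (num_hidden, in_dim), (num_hidden,), (batch, num_hidden)


def infer_partial(known, num_hidden=128):
    """Return whatever can be deduced, leaving () where information is missing."""
    d1, w1, b1, o1 = fc_shapes(known.get('data', ()), num_hidden)
    d2, w2, b2, o2 = fc_shapes(known.get('prev', ()), num_hidden)
    out = o1 or o2   # element-wise add: either known input fixes the output shape
    return [d1, w1, b1, d2, w2, b2], [out], []


def infer_full(known, num_hidden=128):
    """Return complete results, or a tuple of None values if anything is unknown."""
    args, outs, aux = infer_partial(known, num_hidden)
    if any(shape == () for shape in args + outs):
        return None, None, None
    return args, outs, aux


partial = infer_partial({'data': (10, 64)})
full_missing = infer_full({'data': (10, 64)})
full_ok = infer_full({'data': (10, 64), 'prev': (10, 128)})
```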

debug_str()[source]¶

Gets a debug string of symbol.

It contains Symbol output, variables and operators in the computation graph with their inputs, variables and attributes.

Returns:	Debug string of the symbol.
Return type:	str

Examples

>>> a = mx.sym.Variable('a')
>>> b = mx.sym.sin(a)
>>> c = 2 * a + b
>>> d = mx.sym.FullyConnected(data=c, num_hidden=10)
>>> print(d.debug_str())
Symbol Outputs:
        output[0]=fullyconnected0(0)
Variable:a
--------------------
Op:_mul_scalar, Name=_mulscalar0
Inputs:
        arg[0]=a(0) version=0
Attrs:
        scalar=2
--------------------
Op:sin, Name=sin0
Inputs:
        arg[0]=a(0) version=0
--------------------
Op:elemwise_add, Name=_plus0
Inputs:
        arg[0]=_mulscalar0(0)
        arg[1]=sin0(0)
Variable:fullyconnected0_weight
Variable:fullyconnected0_bias
--------------------
Op:FullyConnected, Name=fullyconnected0
Inputs:
        arg[0]=_plus0(0)
        arg[1]=fullyconnected0_weight(0) version=0
        arg[2]=fullyconnected0_bias(0) version=0
Attrs:
        num_hidden=10

save(fname)[source]¶

Saves symbol to a file.

You can also use pickle to do the job if you only work in Python. The advantage of the load/save functions is that the file contents are language agnostic. This means the model saved by one language binding can be loaded by a different language binding of MXNet. You also get the benefit of being able to directly load/save from cloud storage (S3, HDFS).

Parameters:

fname (str) –

The name of the file.

“s3://my-bucket/path/my-s3-symbol”
“hdfs://my-bucket/path/my-hdfs-symbol”
“/path-to/my-local-symbol”

See also

symbol.load(): Used to load symbol from file.

tojson()[source]¶

Saves symbol to a JSON string.

See also

symbol.load_json(): Used to load symbol from JSON string.

simple_bind(ctx, grad_req='write', type_dict=None, group2ctx=None, shared_arg_names=None, shared_exec=None, shared_buffer=None, **kwargs)[source]¶

Binds the current symbol to get an executor and allocates all the arguments needed. Allows specifying data types.

This function simplifies the binding procedure. You need to specify only input data shapes. Before binding the executor, the function allocates arguments and auxiliary states that were not explicitly specified.

>>> x = mx.sym.Variable('x')
>>> y = mx.sym.FullyConnected(x, num_hidden=4)
>>> exe = y.simple_bind(mx.cpu(), x=(5,4), grad_req='null')
>>> exe.forward()
[<NDArray 5x4 @cpu(0)>]
>>> exe.outputs[0].asnumpy()
array([[ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.]], dtype=float32)
>>> exe.arg_arrays
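[<NDArray 5x4 @cpu(0)>, <NDArray 4x4 @cpu(0)>, <NDArray 4 @cpu(0)>]
>>> exe.grad_arrays
[<NDArray 5x4 @cpu(0)>, <NDArray 4x4 @cpu(0)>, <NDArray 4 @cpu(0)>]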

Parameters:

ctx (Context) – The device context on which the generated executor runs.
grad_req (string) – {‘write’, ‘add’, ‘null’}, or list of str or dict of str to str, optional. Specifies how the gradient is written to args_grad. ‘write’ means the gradient is written to the specified args_grad NDArray every time. ‘add’ means the gradient is added to the specified NDArray every time. ‘null’ means no action is taken, and the gradient may not be calculated.
type_dict (Dict of str->numpy.dtype) – Input type dictionary, name->dtype.
group2ctx (Dict of string to mx.Context) – The dict mapping the ctx_group attribute to the context assignment.
shared_arg_names (List of string) – The argument names whose NDArray of shared_exec can be reused for initializing the current executor.
shared_exec (Executor) – The executor whose arg_arrays, grad_arrays, and aux_arrays can be reused for initializing the current executor.
shared_buffer (Dict of string to NDArray) – The dict mapping argument names to the NDArray that can be reused for initializing the current executor. This buffer will be checked for reuse if an argument name of the current executor is not found in shared_arg_names.
kwargs (Dict of str->shape) – Input shape dictionary, name->shape.

Returns:	executor – The generated executor
Return type:	mxnet.Executor
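
The three grad_req modes can be summarised by how they treat an existing gradient buffer. The sketch below uses plain Python lists and a hypothetical helper `apply_grad`; it illustrates the described semantics and is not the executor's code.

```python
def apply_grad(buffer, grad, req='write'):
    """Update a gradient buffer in place according to a grad_req-style mode."""
    if req == 'write':
        buffer[:] = grad                  # overwrite the previous gradient
    elif req == 'add':
        for i, g in enumerate(grad):
            buffer[i] += g                # accumulate across backward passes
    elif req != 'null':                   # 'null' leaves the buffer untouched
        raise ValueError('unknown grad_req: %r' % (req,))
    return buffer


grad = [1.0, 2.0]
# Two backward passes with the same gradient under each mode:
written = apply_grad(apply_grad([0.0, 0.0], grad, 'write'), grad, 'write')
added = apply_grad(apply_grad([0.0, 0.0], grad, 'add'), grad, 'add')
ignored = apply_grad([0.0, 0.0], grad, 'null')
```

Under ‘write’ the buffer holds only the latest gradient, under ‘add’ it holds the running sum, and under ‘null’ it is never modified.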

bind(ctx, args, args_grad=None, grad_req='write', aux_states=None, group2ctx=None, shared_exec=None)[source]¶

Binds the current symbol to an executor and returns it.

We first declare the computation and then bind it to the data to run. This function returns an executor, which provides a forward() method for evaluation and an outputs attribute to get all the results.

>>> a = mx.sym.Variable('a')
>>> b = mx.sym.Variable('b')
>>> c = a + b

>>> ex = c.bind(ctx=mx.cpu(), args={'a' : mx.nd.ones([2,3]), 'b' : mx.nd.ones([2,3])})
>>> ex.forward()
[<NDArray 2x3 @cpu(0)>]
>>> ex.outputs[0].asnumpy()
[[ 2.  2.  2.]
[ 2.  2.  2.]]

Parameters:

ctx (Context) – The device context on which the generated executor runs.
args (list of NDArray or dict of str to NDArray) – Input arguments to the symbol. If the input type is a list of NDArray, the order should be the same as the order of list_arguments(). If the input type is a dict of str to NDArray, then it maps the names of arguments to the corresponding NDArray. In either case, all the arguments must be provided.
args_grad (list of NDArray or dict of str to NDArray, optional) – When specified, args_grad provides NDArrays to hold the gradient values computed in backward. If the input type is a list of NDArray, the order should be the same as the order of list_arguments(). If the input type is a dict of str to NDArray, then it maps the names of arguments to the corresponding NDArray. When the type is a dict of str to NDArray, you only need to provide entries for the argument gradients you require. Only the specified argument gradients will be calculated.
grad_req ({'write', 'add', 'null'}, or list of str or dict of str to str, optional) – Specifies how the gradient is written to args_grad. ‘write’ means the gradient is written to the specified args_grad NDArray every time. ‘add’ means the gradient is added to the specified NDArray every time. ‘null’ means no action is taken, and the gradient may not be calculated.
aux_states (list of NDArray, or dict of str to NDArray, optional) – Input auxiliary states to the symbol, only needed when the output of list_auxiliary_states() is not empty. If the input type is a list of NDArray, the order should be the same as the order of list_auxiliary_states(). If the input type is a dict of str to NDArray, then it maps the names of auxiliary states to the corresponding NDArray. In either case, all the auxiliary states need to be provided.
group2ctx (Dict of string to mx.Context) – The dict mapping the ctx_group attribute to the context assignment.
shared_exec (mx.executor.Executor) – Executor to share memory with. This is intended for runtime reshaping, variable length sequences, etc. The returned executor shares state with shared_exec, and should not be used in parallel with it.

Returns:	executor – The generated executor
Return type:	Executor

Notes

Auxiliary states are special states of symbols that do not correspond to an argument and do not have a gradient, but are still useful for specific operations. Common examples of auxiliary states include the moving_mean and moving_var states in BatchNorm. Most operators do not have auxiliary states, and in those cases this parameter can be safely ignored.

To skip gradients that are not needed, pass a dict as args_grad and specify only the gradients of interest.
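The three grad_req modes can be illustrated without MXNet. The following is a minimal pure-Python sketch, not the library's implementation: apply_grad_req and the list-based buffers are hypothetical names introduced here, standing in for one args_grad NDArray.

```python
def apply_grad_req(buffer, new_grad, grad_req):
    """Return the updated gradient buffer for one argument (toy model)."""
    if grad_req == 'write':
        # Overwrite the buffer with the freshly computed gradient.
        return list(new_grad)
    if grad_req == 'add':
        # Accumulate the new gradient into the existing buffer.
        return [old + g for old, g in zip(buffer, new_grad)]
    if grad_req == 'null':
        # Leave the buffer untouched; the gradient is ignored.
        return list(buffer)
    raise ValueError("grad_req must be 'write', 'add' or 'null'")


# Two successive backward passes producing different gradients.
backward_grads = [[1.0, 2.0], [0.5, 0.5]]

results = {}
for req in ('write', 'add', 'null'):
    buf = [0.0, 0.0]
    for grad in backward_grads:
        buf = apply_grad_req(buf, grad, req)
    results[req] = buf

print(results)
```

In this toy model, 'write' keeps only the most recent gradient, 'add' keeps the running sum until the caller resets the buffer, and 'null' never changes it.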

gradient(wrt)[source]¶

Gets the autodiff of current symbol.

This function can only be used if current symbol is a loss function.

Note

This function is currently not implemented.

Parameters:	wrt (Array of String) – Names of the symbol's arguments with respect to which the gradients are taken.
Returns:	grad – A gradient Symbol whose outputs are the corresponding gradients.
Return type:	Symbol

eval(ctx=None, **kwargs)[source]¶

Evaluates a symbol given arguments.

The eval method combines a call to bind (which returns an executor) with a call to forward (an executor method). Because it binds anew on every call, eval is slow for the common use case of repeatedly evaluating with the same arguments. In that case, call bind once and then call forward repeatedly. eval exists to offer a simpler syntax for quick introspection.

>>> a = mx.sym.Variable('a')
>>> b = mx.sym.Variable('b')
>>> c = a + b
>>> ex = c.eval(ctx = mx.cpu(), a = mx.nd.ones([2,3]), b = mx.nd.ones([2,3]))
>>> ex
[<NDArray 2x3 @cpu(0)>]
>>> ex[0].asnumpy()
array([[ 2.,  2.,  2.],
       [ 2.,  2.,  2.]], dtype=float32)

Parameters:

ctx (Context) – The device context on which the generated executor runs.
kwargs (Keyword arguments of type NDArray) – Input arguments to the symbol. All the arguments must be provided.

Returns:	result – A list of NDArrays corresponding to the values taken by each symbol when evaluated on the given args. When called on a single symbol (not a group), the result is a list with one element.

reshape(shape)[source]¶

Shorthand for mxnet.sym.reshape.

Parameters:	shape (tuple of int) – The new shape should not change the array size, namely `np.prod(new_shape)` should be equal to `np.prod(self.shape)`. One shape dimension can be -1. In this case, the value is inferred from the length of the array and remaining dimensions.
Returns:	A reshaped symbol.
Return type:	Symbol
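The rule for a -1 dimension can be sketched in plain Python. This is only an illustration of the size constraint described above, not MXNet's shape inference; infer_reshape is a hypothetical helper name, and it models only the -1 case.

```python
def infer_reshape(size, shape):
    """Resolve a single -1 entry in `shape` for an array with `size` elements."""
    unknown = [i for i, dim in enumerate(shape) if dim == -1]
    if len(unknown) > 1:
        raise ValueError("at most one dimension can be -1")

    # Product of all explicitly given dimensions.
    known = 1
    for dim in shape:
        if dim != -1:
            known *= dim

    if unknown:
        if known == 0 or size % known != 0:
            raise ValueError("cannot infer the -1 dimension for size %d" % size)
        # The missing dimension is whatever keeps the total size unchanged.
        return tuple(size // known if dim == -1 else dim for dim in shape)

    if known != size:
        raise ValueError("new shape must keep the total size %d" % size)
    return tuple(shape)


print(infer_reshape(12, (3, -1)))     # (3, 4)
print(infer_reshape(12, (2, -1, 3)))  # (2, 2, 3)
```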

mxnet.symbol.var(name, attr=None, shape=None, lr_mult=None, wd_mult=None, dtype=None, init=None, **kwargs)[source]¶

Creates a symbolic variable with specified name.

>>> data = mx.sym.Variable('data', attr={'a': 'b'})
>>> data
<Symbol data>

Parameters:

name (str) – Variable name.
attr (Dict of strings) – Additional attributes to set on the variable. Format {string : string}.
shape (tuple) – The shape of a variable. If specified, this will be used during shape inference. If a different shape is specified for this variable through a keyword argument when calling shape inference, this shape information is ignored.
lr_mult (float) – The learning rate multiplier for the input variable.
wd_mult (float) – Weight decay multiplier for the input variable.
dtype (str or numpy.dtype) – The dtype for the input variable. If not specified, this value will be inferred.
init (initializer (mxnet.init.*)) – Initializer for this variable to (optionally) override the default initializer.
kwargs (Additional attribute variables) – Additional attributes must start and end with double underscores.

Returns:	variable – A symbol corresponding to an input to the computation graph.
Return type:	Symbol

mxnet.symbol.Variable(name, attr=None, shape=None, lr_mult=None, wd_mult=None, dtype=None, init=None, **kwargs)¶

Creates a symbolic variable with specified name.

>>> data = mx.sym.Variable('data', attr={'a': 'b'})
>>> data
<Symbol data>

Parameters:

name (str) – Variable name.
attr (Dict of strings) – Additional attributes to set on the variable. Format {string : string}.
shape (tuple) – The shape of a variable. If specified, this will be used during shape inference. If a different shape is specified for this variable through a keyword argument when calling shape inference, this shape information is ignored.
lr_mult (float) – The learning rate multiplier for the input variable.
wd_mult (float) – Weight decay multiplier for the input variable.
dtype (str or numpy.dtype) – The dtype for the input variable. If not specified, this value will be inferred.
init (initializer (mxnet.init.*)) – Initializer for this variable to (optionally) override the default initializer.
kwargs (Additional attribute variables) – Additional attributes must start and end with double underscores.

Returns:	variable – A symbol corresponding to an input to the computation graph.
Return type:	Symbol

mxnet.symbol.Group(symbols)[source]¶

Creates a symbol that contains a collection of other symbols, grouped together.

>>> a = mx.sym.Variable('a')
>>> b = mx.sym.Variable('b')
>>> mx.sym.Group([a,b])
<Symbol Grouped>

Parameters:	symbols (list) – List of symbols to be grouped.
Returns:	sym – A group symbol.
Return type:	Symbol

mxnet.symbol.load(fname)[source]¶

Loads symbol from a JSON file.

You can also use pickle for this if you only work in Python. The advantage of load/save is that the file is language agnostic: a file written with save can be loaded by the other language bindings of MXNet. You can also load and save directly from cloud storage (S3, HDFS).

Parameters:

fname (str) –

The name of the file, examples:

s3://my-bucket/path/my-s3-symbol
hdfs://my-bucket/path/my-hdfs-symbol
/path-to/my-local-symbol

Returns: sym – The loaded symbol.

Return type: Symbol

See also

Symbol.save(): Used to save symbol into file.

mxnet.symbol.load_json(json_str)[source]¶

Loads a symbol from a JSON string.

Parameters:	json_str (str) – A JSON string.
Returns:	sym – The loaded symbol.
Return type:	Symbol

See also

Symbol.tojson(): Used to save a symbol into a JSON string.

mxnet.symbol.pow(base, exp)[source]¶

Returns element-wise result of base element raised to powers from exp element.

Both inputs can be a Symbol or a scalar number. Broadcasting is not supported; use broadcast_pow when broadcasting is needed.

Parameters:

base (Symbol or scalar) – The base symbol.
exp (Symbol or scalar) – The exponent symbol.

Returns:	The elements of base raised to the corresponding exponents in exp.
Return type:	Symbol or scalar

Examples

>>> mx.sym.pow(2, 3)
8
>>> x = mx.sym.Variable('x')
>>> y = mx.sym.Variable('y')
>>> z = mx.sym.pow(x, 2)
>>> z.eval(x=mx.nd.array([1,2]))[0].asnumpy()
array([ 1.,  4.], dtype=float32)
>>> z = mx.sym.pow(3, y)
>>> z.eval(y=mx.nd.array([2,3]))[0].asnumpy()
array([  9.,  27.], dtype=float32)
>>> z = mx.sym.pow(x, y)
>>> z.eval(x=mx.nd.array([3,4]), y=mx.nd.array([2,3]))[0].asnumpy()
array([  9.,  64.], dtype=float32)

mxnet.symbol.maximum(left, right)[source]¶

Returns element-wise maximum of the input elements.

Both inputs can be Symbol or scalar number. Broadcasting is not supported.

Parameters:

left (Symbol or scalar) – First symbol to be compared.
right (Symbol or scalar) – Second symbol to be compared.

Returns:	The element-wise maximum of the input symbols.
Return type:	Symbol or scalar

Examples

>>> mx.sym.maximum(2, 3.5)
3.5
>>> x = mx.sym.Variable('x')
>>> y = mx.sym.Variable('y')
>>> z = mx.sym.maximum(x, 4)
>>> z.eval(x=mx.nd.array([3,5,2,10]))[0].asnumpy()
array([  4.,   5.,   4.,  10.], dtype=float32)
>>> z = mx.sym.maximum(x, y)
>>> z.eval(x=mx.nd.array([3,4]), y=mx.nd.array([10,2]))[0].asnumpy()
array([ 10.,   4.], dtype=float32)

mxnet.symbol.minimum(left, right)[source]¶

Returns element-wise minimum of the input elements.

Both inputs can be Symbol or scalar number. Broadcasting is not supported.

Parameters:

left (Symbol or scalar) – First symbol to be compared.
right (Symbol or scalar) – Second symbol to be compared.

Returns:	The element-wise minimum of the input symbols.
Return type:	Symbol or scalar

Examples

>>> mx.sym.minimum(2, 3.5)
2
>>> x = mx.sym.Variable('x')
>>> y = mx.sym.Variable('y')
>>> z = mx.sym.minimum(x, 4)
>>> z.eval(x=mx.nd.array([3,5,2,10]))[0].asnumpy()
array([ 3.,  4.,  2.,  4.], dtype=float32)
>>> z = mx.sym.minimum(x, y)
>>> z.eval(x=mx.nd.array([3,4]), y=mx.nd.array([10,2]))[0].asnumpy()
array([ 3.,  2.], dtype=float32)

mxnet.symbol.hypot(left, right)[source]¶

Given the “legs” of a right triangle, returns its hypotenuse.

Equivalent to \(\sqrt{left^2 + right^2}\), element-wise. Both inputs can be a Symbol or a scalar number. Broadcasting is not supported.

Parameters:

left (Symbol or scalar) – First leg of the triangle(s).
right (Symbol or scalar) – Second leg of the triangle(s).

Returns:	The hypotenuse of the triangle(s)
Return type:	Symbol or scalar

Examples

>>> mx.sym.hypot(3, 4)
5.0
>>> x = mx.sym.Variable('x')
>>> y = mx.sym.Variable('y')
>>> z = mx.sym.hypot(x, 4)
>>> z.eval(x=mx.nd.array([3,5,2]))[0].asnumpy()
array([ 5.,  6.40312433,  4.47213602], dtype=float32)
>>> z = mx.sym.hypot(x, y)
>>> z.eval(x=mx.nd.array([3,4]), y=mx.nd.array([10,2]))[0].asnumpy()
array([ 10.44030666,   4.47213602], dtype=float32)
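The values in the examples above can be cross-checked with the standard library alone. The sketch below is a plain-Python stand-in for the element-wise formula, not the MXNet operator; hypot_elementwise is a hypothetical helper name, and Python lists stand in for NDArrays.

```python
import math


def hypot_elementwise(left, right):
    """Element-wise sqrt(left^2 + right^2) for equal-length lists or a scalar."""
    # A scalar operand is applied to every element of the other operand.
    if isinstance(right, (int, float)):
        right = [right] * len(left)
    # No broadcasting between sequences: the lengths must match exactly.
    if len(left) != len(right):
        raise ValueError("operands must have the same shape")
    return [math.sqrt(l * l + r * r) for l, r in zip(left, right)]


legs_with_scalar = hypot_elementwise([3, 5, 2], 4)
legs_with_list = hypot_elementwise([3, 4], [10, 2])

print(legs_with_scalar)
print(legs_with_list)
```

The printed values agree with the float32 results shown above up to float32 rounding.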

mxnet.symbol.zeros(shape, dtype=None, **kwargs)[source]¶

Returns a new symbol of given shape and type, filled with zeros.

Parameters:

shape (int or sequence of ints) – Shape of the new array.
dtype (str or numpy.dtype, optional) – The value type of the inner value, defaults to `np.float32`.

Returns:	out – The created Symbol.
Return type:	Symbol

mxnet.symbol.ones(shape, dtype=None, **kwargs)[source]¶

Returns a new symbol of given shape and type, filled with ones.

Parameters:

shape (int or sequence of ints) – Shape of the new array.
dtype (str or numpy.dtype, optional) – The value type of the inner value, defaults to `np.float32`.

Returns:	out – The created Symbol
Return type:	Symbol

mxnet.symbol.full(shape, val, dtype=None, **kwargs)[source]¶

Returns a new symbol of given shape and type, filled with the given value val.

Parameters:

shape (int or sequence of ints) – Shape of the new array.
val (scalar) – Fill value.
dtype (str or numpy.dtype, optional) – The value type of the inner value, defaults to `np.float32`.

Returns:	out – The created Symbol
Return type:	Symbol

mxnet.symbol.arange(start, stop=None, step=1.0, repeat=1, name=None, dtype=None)[source]¶

Returns evenly spaced values within a given interval.

Parameters:

start (number) – Start of interval. The interval includes this value. The default start value is 0.
stop (number, optional) – End of interval. The interval does not include this value.
step (number, optional) – Spacing between values.
repeat (int, optional) – Number of times each element is repeated. E.g. with repeat=3, the element a is repeated three times: a, a, a.
dtype (str or numpy.dtype, optional) – The value type of the inner value, defaults to `np.float32`.

Returns:	out – The created Symbol
Return type:	Symbol
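How start, stop, step and repeat interact can be sketched in plain Python. This illustrates the parameter descriptions above and is not MXNet code; arange_values is a hypothetical name, and treating a single positional argument as stop (with start 0) is an assumption based on the stated default start value.

```python
def arange_values(start, stop=None, step=1.0, repeat=1):
    """List the values of a half-open interval, each repeated `repeat` times."""
    if step == 0:
        raise ValueError("step must be non-zero")
    if stop is None:
        # Assumption: a single argument is read as the stop value.
        start, stop = 0.0, start

    values = []
    index = 0
    current = float(start)
    # The interval includes start and excludes stop.
    while (step > 0 and current < stop) or (step < 0 and current > stop):
        values.extend([current] * repeat)
        index += 1
        # Recompute from start to avoid accumulating rounding error.
        current = start + index * step
    return values


print(arange_values(3))                       # [0.0, 1.0, 2.0]
print(arange_values(2, 6, step=2, repeat=3))  # [2.0, 2.0, 2.0, 4.0, 4.0, 4.0]
```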

mxnet.symbol.Activation(data=None, act_type=_Null, name=None, attr=None, out=None, **kwargs)¶

Applies an activation function element-wise to the input.

The following activation functions are supported:

relu: Rectified Linear Unit, \(y = max(x, 0)\)
sigmoid: \(y = \frac{1}{1 + exp(-x)}\)
tanh: Hyperbolic tangent, \(y = \frac{exp(x) - exp(-x)}{exp(x) + exp(-x)}\)
softrelu: Soft ReLU, or SoftPlus, \(y = log(1 + exp(x))\)

Defined in src/operator/activation.cc:L91

Parameters:

data (Symbol) – Input array to activation function.
act_type ({'relu', 'sigmoid', 'softrelu', 'tanh'}, required) – Activation function to be applied.
name (string, optional.) – Name of the resulting symbol.

Returns:	The result symbol.
Return type:	Symbol

Examples

A one-hidden-layer MLP with ReLU activation:

>>> data = Variable('data')
>>> mlp = FullyConnected(data=data, num_hidden=128, name='proj')
>>> mlp = Activation(data=mlp, act_type='relu', name='activation')
>>> mlp = FullyConnected(data=mlp, num_hidden=10, name='mlp')
>>> mlp
<Symbol mlp>

Checking each supported activation against a NumPy reference implementation:

>>> test_suites = [
...     ('relu', lambda x: np.maximum(x, 0)),
...     ('sigmoid', lambda x: 1 / (1 + np.exp(-x))),
...     ('tanh', lambda x: np.tanh(x)),
...     ('softrelu', lambda x: np.log(1 + np.exp(x)))
... ]
>>> x = test_utils.random_arrays((2, 3, 4))
>>> for act_type, numpy_impl in test_suites:
...     op = Activation(act_type=act_type, name='act')
...     y = test_utils.simple_forward(op, act_data=x)
...     y_np = numpy_impl(x)
...     print('%s: %s' % (act_type, test_utils.almost_equal(y, y_np)))
relu: True
sigmoid: True
tanh: True
softrelu: True

mxnet.symbol.BatchNorm(data=None, gamma=None, beta=None, moving_mean=None, moving_var=None, eps=_Null, momentum=_Null, fix_gamma=_Null, use_global_stats=_Null, output_mean_var=_Null, axis=_Null, cudnn_off=_Null, name=None, attr=None, out=None, **kwargs)¶

Batch normalization.

Normalizes a data batch by mean and variance, and applies a scale gamma as well as offset beta.

Assume the input has more than one dimension and we normalize along axis 1. We first compute the mean and variance along this axis:

\[\begin{split}data\_mean[i] = mean(data[:,i,:,...]) \\ data\_var[i] = var(data[:,i,:,...])\end{split}\]

Then compute the normalized output, which has the same shape as the input, as follows:

\[out[:,i,:,...] = \frac{data[:,i,:,...] - data\_mean[i]}{\sqrt{data\_var[i]+\epsilon}} * gamma[i] + beta[i]\]

Both mean and var return a scalar by treating the input as a vector.

Assume the input has size k on axis 1; then both gamma and beta have shape (k,). If output_mean_var is set to true, the operator also outputs data_mean and data_var, which are needed for the backward pass.

Besides the inputs and the outputs, this operator accepts two auxiliary states, moving_mean and moving_var, which are k-length vectors. They are global statistics for the whole dataset, which are updated by:

moving_mean = moving_mean * momentum + data_mean * (1 - momentum)
moving_var = moving_var * momentum + data_var * (1 - momentum)

If use_global_stats is set to be true, then moving_mean and moving_var are used instead of data_mean and data_var to compute the output. It is often used during inference.

The parameter axis specifies which axis of the input shape denotes the ‘channel’ (separately normalized groups). The default is 1. Specifying -1 sets the channel axis to be the last item in the input shape.

Both gamma and beta are learnable parameters. However, if fix_gamma is true, gamma is set to 1 and its gradient to 0.

Defined in src/operator/batch_norm.cc:L399

Parameters:

data (Symbol) – Input data to batch normalization.
gamma (Symbol) – gamma array.
beta (Symbol) – beta array.
moving_mean (Symbol) – Running mean of input.
moving_var (Symbol) – Running variance of input.
eps (double, optional, default=0.001) – Epsilon to prevent division by zero. Must be no less than CUDNN_BN_MIN_EPSILON defined in cudnn.h when using cuDNN (usually 1e-5).
momentum (float, optional, default=0.9) – Momentum for the moving average.
fix_gamma (boolean, optional, default=True) – Fix gamma while training.
use_global_stats (boolean, optional, default=False) – Whether to use the global moving statistics instead of the local batch statistics. This turns batch normalization into a scale-shift operator.
output_mean_var (boolean, optional, default=False) – Output the mean and variance in addition to the normalized data.
axis (int, optional, default='1') – Specifies which axis of the input shape is the channel axis.
cudnn_off (boolean, optional, default=False) – Do not select the cuDNN operator, if available.
name (string, optional.) – Name of the resulting symbol.

Returns:	The result symbol.
Return type:	Symbol
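The BatchNorm formulas above can be traced by hand on a small input. The following is a minimal pure-Python sketch for a 2-D input of shape (batch, k) normalized along axis 1, not the operator's implementation. batch_norm_2d is a hypothetical name; using the population variance and leaving the moving statistics untouched when use_global_stats is true are assumptions made for this illustration.

```python
import math


def batch_norm_2d(data, gamma, beta, moving_mean, moving_var,
                  eps=1e-3, momentum=0.9, use_global_stats=False):
    """Normalize a (batch, k) nested list per channel; return output and stats."""
    n, k = len(data), len(data[0])

    # Per-channel batch statistics (assumption: population variance).
    data_mean = [sum(row[i] for row in data) / n for i in range(k)]
    data_var = [sum((row[i] - data_mean[i]) ** 2 for row in data) / n
                for i in range(k)]

    if use_global_stats:
        # Inference-style: normalize with the stored moving statistics.
        mean, var = moving_mean, moving_var
    else:
        mean, var = data_mean, data_var
        # Moving-average update, as in the formulas above.
        moving_mean = [m * momentum + dm * (1 - momentum)
                       for m, dm in zip(moving_mean, data_mean)]
        moving_var = [v * momentum + dv * (1 - momentum)
                      for v, dv in zip(moving_var, data_var)]

    out = [[(row[i] - mean[i]) / math.sqrt(var[i] + eps) * gamma[i] + beta[i]
            for i in range(k)] for row in data]
    return out, moving_mean, moving_var


data = [[1.0, 10.0], [3.0, 30.0]]
out, new_mean, new_var = batch_norm_2d(
    data, gamma=[1.0, 1.0], beta=[0.0, 0.0],
    moving_mean=[0.0, 0.0], moving_var=[1.0, 1.0], eps=0.0)

print(out)       # [[-1.0, -1.0], [1.0, 1.0]]
print(new_mean)  # approximately [0.2, 2.0]
print(new_var)   # approximately [1.0, 10.9]
```

eps is set to 0.0 here only to keep the numbers round; with the default eps the outputs are slightly smaller in magnitude.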

mxnet.symbol.BatchNorm_v1(data=None, gamma=None, beta=None, eps=_Null, momentum=_Null, fix_gamma=_Null, use_global_stats=_Null, output_mean_var=_Null, name=None, attr=None, out=None, **kwargs)¶