Symbol API¶

Overview¶

This document lists the routines of the symbolic expression package:

mxnet.symbol Symbol API of MXNet.

The Symbol API, defined in the symbol (or simply sym) package, provides neural network graphs and auto-differentiation. A symbol represents a multi-output symbolic expression. Symbols are composed of operators, such as simple matrix operations (e.g. “+”) or neural network layers (e.g. a convolution layer). An operator can take several input variables, produce more than one output variable, and have internal state variables. A variable can be either free, in which case we can bind it to a value later, or an output of another symbol.

>>> import mxnet as mx
>>> a = mx.sym.Variable('a')
>>> b = mx.sym.Variable('b')
>>> c = 2 * a + b
>>> type(c)
<class 'mxnet.symbol.Symbol'>
>>> e = c.bind(mx.cpu(), {'a': mx.nd.array([1,2]), 'b':mx.nd.array([2,3])})
>>> y = e.forward()
>>> y
[<NDArray 2 @cpu(0)>]
>>> y[0].asnumpy()
array([ 4.,  7.], dtype=float32)

A detailed tutorial is available at Symbol - Neural network graphs and auto-differentiation.

Note

Most operators provided in symbol are similar to those in ndarray, although there are a few differences:

symbol adopts declarative programming. In other words, we first compose the computation and then feed it with data for execution, whereas ndarray adopts imperative programming.
Most binary operators in symbol, such as + and >, don’t broadcast. We need to call the broadcast version of the operator, such as broadcast_add, explicitly.
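
The declarative style described above can be illustrated with a toy expression graph. This is a minimal sketch in plain Python, not MXNet code: the classes `Node`, `Var`, `Const`, and `Op` and the `evaluate` method are hypothetical names invented for this illustration. It shows only the idea that composing an expression builds a graph, and that values are supplied later.

```python
# Toy illustration only: these classes are hypothetical and are NOT part of MXNet.

class Node:
    """Base class: arithmetic on nodes builds a graph instead of computing."""

    def __add__(self, other):
        return Op(lambda x, y: x + y, self, _wrap(other))

    def __rmul__(self, other):
        # Handles `scalar * node`; computes other * self at evaluation time.
        return Op(lambda x, y: y * x, self, _wrap(other))


class Var(Node):
    """A free variable that receives its value at evaluation time."""

    def __init__(self, name):
        self.name = name

    def evaluate(self, bindings):
        return bindings[self.name]


class Const(Node):
    """A constant captured while the graph is being composed."""

    def __init__(self, value):
        self.value = value

    def evaluate(self, bindings):
        return self.value


class Op(Node):
    """An operator node that stores a function and its input nodes."""

    def __init__(self, fn, *inputs):
        self.fn = fn
        self.inputs = inputs

    def evaluate(self, bindings):
        return self.fn(*(i.evaluate(bindings) for i in self.inputs))


def _wrap(value):
    """Turn plain Python numbers into Const nodes."""
    return value if isinstance(value, Node) else Const(value)


a = Var('a')
b = Var('b')
c = 2 * a + b                                 # step 1: compose; nothing is computed yet
result = c.evaluate({'a': 1.0, 'b': 2.0})     # step 2: feed data and execute
```

The same graph `c` can be evaluated repeatedly with different bindings, which is the property the declarative style relies on.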

In the rest of this document, we first give an overview of the methods provided by the symbol.Symbol class, and then list other routines provided by the symbol package.

The `Symbol` class¶

Composition¶

Composes multiple symbols into a new one using an operator.

Symbol.__call__ Composes symbol using inputs.

Arithmetic operations¶

`Symbol.__add__`	x.__add__(y) <=> x+y
`Symbol.__sub__`	x.__sub__(y) <=> x-y
`Symbol.__rsub__`	x.__rsub__(y) <=> y-x
`Symbol.__neg__`	x.__neg__() <=> -x
`Symbol.__mul__`	x.__mul__(y) <=> x*y
`Symbol.__div__`	x.__div__(y) <=> x/y
`Symbol.__rdiv__`	x.__rdiv__(y) <=> y/x
`Symbol.__mod__`	x.__mod__(y) <=> x%y
`Symbol.__rmod__`	x.__rmod__(y) <=> y%x
`Symbol.__pow__`	x.__pow__(y) <=> x**y

Trigonometric functions¶

`Symbol.sin`	Convenience fluent method for `sin()`.
`Symbol.cos`	Convenience fluent method for `cos()`.
`Symbol.tan`	Convenience fluent method for `tan()`.
`Symbol.arcsin`	Convenience fluent method for `arcsin()`.
`Symbol.arccos`	Convenience fluent method for `arccos()`.
`Symbol.arctan`	Convenience fluent method for `arctan()`.
`Symbol.degrees`	Convenience fluent method for `degrees()`.
`Symbol.radians`	Convenience fluent method for `radians()`.

Hyperbolic functions¶

`Symbol.sinh`	Convenience fluent method for `sinh()`.
`Symbol.cosh`	Convenience fluent method for `cosh()`.
`Symbol.tanh`	Convenience fluent method for `tanh()`.
`Symbol.arcsinh`	Convenience fluent method for `arcsinh()`.
`Symbol.arccosh`	Convenience fluent method for `arccosh()`.
`Symbol.arctanh`	Convenience fluent method for `arctanh()`.

Exponents and logarithms¶

`Symbol.exp`	Convenience fluent method for `exp()`.
`Symbol.expm1`	Convenience fluent method for `expm1()`.
`Symbol.log`	Convenience fluent method for `log()`.
`Symbol.log10`	Convenience fluent method for `log10()`.
`Symbol.log2`	Convenience fluent method for `log2()`.
`Symbol.log1p`	Convenience fluent method for `log1p()`.

Powers¶

`Symbol.sqrt`	Convenience fluent method for `sqrt()`.
`Symbol.rsqrt`	Convenience fluent method for `rsqrt()`.
`Symbol.cbrt`	Convenience fluent method for `cbrt()`.
`Symbol.rcbrt`	Convenience fluent method for `rcbrt()`.
`Symbol.square`	Convenience fluent method for `square()`.

Basic neural network functions¶

`Symbol.relu`	Convenience fluent method for `relu()`.
`Symbol.sigmoid`	Convenience fluent method for `sigmoid()`.
`Symbol.softmax`	Convenience fluent method for `softmax()`.
`Symbol.log_softmax`	Convenience fluent method for `log_softmax()`.

Comparison operators¶

`Symbol.__lt__`	x.__lt__(y) <=> x<y
`Symbol.__le__`	x.__le__(y) <=> x<=y
`Symbol.__gt__`	x.__gt__(y) <=> x>y
`Symbol.__ge__`	x.__ge__(y) <=> x>=y
`Symbol.__eq__`	x.__eq__(y) <=> x==y
`Symbol.__ne__`	x.__ne__(y) <=> x!=y

Symbol creation¶

`Symbol.zeros_like`	Convenience fluent method for `zeros_like()`.
`Symbol.ones_like`	Convenience fluent method for `ones_like()`.

Changing shape and type¶

`Symbol.astype`	Convenience fluent method for `cast()`.
`Symbol.reshape`	Convenience fluent method for `reshape()`.
`Symbol.reshape_like`	Convenience fluent method for `reshape_like()`.
`Symbol.flatten`	Convenience fluent method for `flatten()`.
`Symbol.expand_dims`	Convenience fluent method for `expand_dims()`.

Expanding elements¶

`Symbol.broadcast_to`	Convenience fluent method for `broadcast_to()`.
`Symbol.broadcast_axes`	Convenience fluent method for `broadcast_axes()`.
`Symbol.tile`	Convenience fluent method for `tile()`.
`Symbol.pad`	Convenience fluent method for `pad()`.

Rearranging elements¶

`Symbol.transpose`	Convenience fluent method for `transpose()`.
`Symbol.swapaxes`	Convenience fluent method for `swapaxes()`.
`Symbol.flip`	Convenience fluent method for `flip()`.

Reduce functions¶

`Symbol.sum`	Convenience fluent method for `sum()`.
`Symbol.nansum`	Convenience fluent method for `nansum()`.
`Symbol.prod`	Convenience fluent method for `prod()`.
`Symbol.nanprod`	Convenience fluent method for `nanprod()`.
`Symbol.mean`	Convenience fluent method for `mean()`.
`Symbol.max`	Convenience fluent method for `max()`.
`Symbol.min`	Convenience fluent method for `min()`.
`Symbol.norm`	Convenience fluent method for `norm()`.

Rounding¶

`Symbol.round`	Convenience fluent method for `round()`.
`Symbol.rint`	Convenience fluent method for `rint()`.
`Symbol.fix`	Convenience fluent method for `fix()`.
`Symbol.floor`	Convenience fluent method for `floor()`.
`Symbol.ceil`	Convenience fluent method for `ceil()`.
`Symbol.trunc`	Convenience fluent method for `trunc()`.

Sorting and searching¶

`Symbol.sort`	Convenience fluent method for `sort()`.
`Symbol.argsort`	Convenience fluent method for `argsort()`.
`Symbol.topk`	Convenience fluent method for `topk()`.
`Symbol.argmax`	Convenience fluent method for `argmax()`.
`Symbol.argmin`	Convenience fluent method for `argmin()`.
`Symbol.argmax_channel`	Convenience fluent method for `argmax_channel()`.

Query information¶

`Symbol.name`	Gets the name string from the symbol. This function only works for non-grouped symbols.
`Symbol.list_arguments`	Lists all the arguments in the symbol.
`Symbol.list_outputs`	Lists all the outputs in the symbol.
`Symbol.list_auxiliary_states`	Lists all the auxiliary states in the symbol.
`Symbol.list_attr`	Gets all attributes from the symbol.
`Symbol.attr`	Returns the attribute string for corresponding input key from the symbol.
`Symbol.attr_dict`	Recursively gets all attributes from the symbol and its children.

Indexing¶

`Symbol.slice`	Convenience fluent method for `slice()`.
`Symbol.slice_axis`	Convenience fluent method for `slice_axis()`.
`Symbol.take`	Convenience fluent method for `take()`.
`Symbol.one_hot`	Convenience fluent method for `one_hot()`.
`Symbol.pick`	Convenience fluent method for `pick()`.

Get internal and output symbol¶

`Symbol.__getitem__`	x.__getitem__(i) <=> x[i]
`Symbol.__iter__`	Returns a generator object of symbol.
`Symbol.get_internals`	Gets a new grouped symbol sgroup.
`Symbol.get_children`	Gets a new grouped symbol whose output contains inputs to output nodes of the original symbol.

Inference type and shape¶

`Symbol.infer_type`	Infers the type of all arguments and all outputs, given the known types for some arguments.
`Symbol.infer_shape`	Infers the shapes of all arguments and all outputs given the known shapes of some arguments.
`Symbol.infer_shape_partial`	Infers the shape partially.

Bind¶

`Symbol.bind`	Binds the current symbol to an executor and returns it.
`Symbol.simple_bind`	Binds the current symbol to get an executor and allocates all the arguments needed.

Save¶

`Symbol.save`	Saves symbol to a file.
`Symbol.tojson`	Saves symbol to a JSON string.
`Symbol.debug_str`	Gets a debug string of symbol.

Miscellaneous¶

`Symbol.clip`	Convenience fluent method for `clip()`.
`Symbol.sign`	Convenience fluent method for `sign()`.

Symbol creation routines¶

`var`	Creates a symbolic variable with specified name.
`zeros`	Returns a new symbol of given shape and type, filled with zeros.
`zeros_like`	Return an array of zeros with the same shape and type as the input array.
`ones`	Returns a new symbol of given shape and type, filled with ones.
`ones_like`	Return an array of ones with the same shape and type as the input array.
`arange`	Returns evenly spaced values within a given interval.

Symbol manipulation routines¶

Changing shape and type¶

`cast`	Casts all elements of the input to a new type.
`reshape`	Reshapes the input array.
`reshape_like`	Reshape lhs to have the same shape as rhs.
`flatten`	Flattens the input array into a 2-D array by collapsing the higher dimensions.
`expand_dims`	Inserts a new axis of size 1 into the array shape

Expanding elements¶

`broadcast_to`	Broadcasts the input array to a new shape.
`broadcast_axes`	Broadcasts the input array over particular axes.
`repeat`	Repeats elements of an array.
`tile`	Repeats the whole array multiple times.
`pad`	Pads an input array with a constant or edge values of the array.

Rearranging elements¶

`transpose`	Permutes the dimensions of an array.
`swapaxes`	Interchanges two axes of an array.
`flip`	Reverses the order of elements along given axis while preserving array shape.

Joining and splitting symbols¶

`concat`	Joins input arrays along a given axis.
`split`	Splits an array along a particular axis into multiple sub-arrays.
`stack`	Join a sequence of arrays along a new axis.

Indexing routines¶

`slice`	Slices a contiguous region of the array.
`slice_axis`	Slices along a given axis.
`take`	Takes elements from an input array along the given axis.
`batch_take`	Takes elements from a data batch.
`one_hot`	Returns a one-hot array.
`pick`	Picks elements from an input array according to the input indices along the given axis.
`where`	Given three ndarrays, condition, x, and y, returns an ndarray with the elements from x or y, depending on whether the elements from condition are true or false.
`gather_nd`	Gather elements or slices from data and store to a tensor whose shape is defined by indices.
`scatter_nd`	Scatters data into a new tensor according to indices.

Mathematical functions¶

Arithmetic operations¶

`broadcast_add`	Returns element-wise sum of the input arrays with broadcasting.
`broadcast_sub`	Returns element-wise difference of the input arrays with broadcasting.
`broadcast_mul`	Returns element-wise product of the input arrays with broadcasting.
`broadcast_div`	Returns element-wise division of the input arrays with broadcasting.
`broadcast_mod`	Returns element-wise modulo of the input arrays with broadcasting.
`negative`	Numerical negative of the argument, element-wise.
`dot`	Dot product of two arrays.
`batch_dot`	Batchwise dot product.
`add_n`	Adds all input arguments element-wise.

Trigonometric functions¶

`sin`	Computes the element-wise sine of the input array.
`cos`	Computes the element-wise cosine of the input array.
`tan`	Computes the element-wise tangent of the input array.
`arcsin`	Returns element-wise inverse sine of the input array.
`arccos`	Returns element-wise inverse cosine of the input array.
`arctan`	Returns element-wise inverse tangent of the input array.
`hypot`	Given the “legs” of a right triangle, returns its hypotenuse.
`broadcast_hypot`	Returns the hypotenuse of a right angled triangle, given its “legs” with broadcasting.
`degrees`	Converts each element of the input array from radians to degrees.
`radians`	Converts each element of the input array from degrees to radians.

Hyperbolic functions¶

`sinh`	Returns the hyperbolic sine of the input array, computed element-wise.
`cosh`	Returns the hyperbolic cosine of the input array, computed element-wise.
`tanh`	Returns the hyperbolic tangent of the input array, computed element-wise.
`arcsinh`	Returns the inverse hyperbolic sine of the input array, computed element-wise.
`arccosh`	Returns the inverse hyperbolic cosine of the input array, computed element-wise.
`arctanh`	Returns the inverse hyperbolic tangent of the input array, computed element-wise.

Reduce functions¶

`sum`	Computes the sum of array elements over given axes.
`nansum`	Computes the sum of array elements over given axes, treating Not a Number (`NaN`) values as zero.
`prod`	Computes the product of array elements over given axes.
`nanprod`	Computes the product of array elements over given axes, treating Not a Number (`NaN`) values as one.
`mean`	Computes the mean of array elements over given axes.
`max`	Computes the max of array elements over given axes.
`min`	Computes the min of array elements over given axes.
`norm`	Flattens the input array and then computes the l2 norm.

Rounding¶

`round`	Returns element-wise rounded value to the nearest integer of the input.
`rint`	Returns element-wise rounded value to the nearest integer of the input.
`fix`	Returns element-wise rounded value to the nearest integer towards zero of the input.
`floor`	Returns element-wise floor of the input.
`ceil`	Returns element-wise ceiling of the input.
`trunc`	Return the element-wise truncated value of the input.

Exponents and logarithms¶

`exp`	Returns element-wise exponential value of the input.
`expm1`	Returns `exp(x) - 1` computed element-wise on the input.
`log`	Returns element-wise Natural logarithmic value of the input.
`log10`	Returns element-wise Base-10 logarithmic value of the input.
`log2`	Returns element-wise Base-2 logarithmic value of the input.
`log1p`	Returns element-wise `log(1 + x)` value of the input.

Powers¶

`broadcast_power`	Returns result of first array elements raised to powers from second array, element-wise with broadcasting.
`sqrt`	Returns element-wise square-root value of the input.
`rsqrt`	Returns element-wise inverse square-root value of the input.
`cbrt`	Returns element-wise cube-root value of the input.
`rcbrt`	Returns element-wise inverse cube-root value of the input.
`square`	Returns element-wise squared value of the input.
`reciprocal`	Returns the reciprocal of the argument, element-wise.

Comparison¶

`broadcast_equal`	Returns the result of element-wise equal to (==) comparison operation with broadcasting.
`broadcast_not_equal`	Returns the result of element-wise not equal to (!=) comparison operation with broadcasting.
`broadcast_greater`	Returns the result of element-wise greater than (>) comparison operation with broadcasting.
`broadcast_greater_equal`	Returns the result of element-wise greater than or equal to (>=) comparison operation with broadcasting.
`broadcast_lesser`	Returns the result of element-wise lesser than (<) comparison operation with broadcasting.
`broadcast_lesser_equal`	Returns the result of element-wise lesser than or equal to (<=) comparison operation with broadcasting.

Random sampling¶

`sample_uniform`	Concurrent sampling from multiple uniform distributions on the intervals given by [low,high).
`sample_normal`	Concurrent sampling from multiple normal distributions with parameters mu (mean) and sigma (standard deviation).
`sample_gamma`	Concurrent sampling from multiple gamma distributions with parameters alpha (shape) and beta (scale).
`sample_exponential`	Concurrent sampling from multiple exponential distributions with parameters lambda (rate).
`sample_poisson`	Concurrent sampling from multiple Poisson distributions with parameters lambda (rate).
`sample_negative_binomial`	Concurrent sampling from multiple negative binomial distributions with parameters k (failure limit) and p (failure probability).
`sample_generalized_negative_binomial`	Concurrent sampling from multiple generalized negative binomial distributions with parameters mu (mean) and alpha (dispersion).
`mxnet.random.seed`	Seeds the random number generators in MXNet.

Sorting and searching¶

`sort`	Returns a sorted copy of an input array along the given axis.
`topk`	Returns the top k elements in an input array along the given axis.
`argsort`	Returns the indices that would sort an input array along the given axis.
`argmax`	Returns indices of the maximum values along an axis.
`argmin`	Returns indices of the minimum values along an axis.
`argmax_channel`	Returns argmax indices of each channel from the input array.

Sequence operation¶

`SequenceLast`	Takes the last element of a sequence.
`SequenceMask`	Sets all elements outside the sequence to a constant value.
`SequenceReverse`	Reverses the elements of each sequence.

Miscellaneous¶

`maximum`	Returns element-wise maximum of the input elements.
`minimum`	Returns element-wise minimum of the input elements.
`broadcast_maximum`	Returns element-wise maximum of the input arrays with broadcasting.
`broadcast_minimum`	Returns element-wise minimum of the input arrays with broadcasting.
`clip`	Clips (limits) the values in an array.
`abs`	Returns element-wise absolute value of the input.
`sign`	Returns element-wise sign of the input.
`gamma`	Returns the gamma function (extension of the factorial function to the reals), computed element-wise on the input array.
`gammaln`	Returns element-wise log of the absolute value of the gamma function of the input.

Neural network¶

Basic¶

`FullyConnected`	Applies a linear transformation: \(Y = XW^T + b\).
`Convolution`	Compute N-D convolution on (N+2)-D input.
`Activation`	Applies an activation function element-wise to the input.
`BatchNorm`	Batch normalization.
`Pooling`	Performs pooling on the input.
`SoftmaxOutput`	Computes the gradient of cross entropy loss with respect to softmax output.
`softmax`	Applies the softmax function.
`log_softmax`	Computes the log softmax of the input.
`relu`	Computes rectified linear.
`sigmoid`	Computes sigmoid of x element-wise.

More¶

`Correlation`	Applies correlation to inputs.
`Deconvolution`	Computes 2D transposed convolution (aka fractionally strided convolution) of the input tensor.
`RNN`	Applies a recurrent layer to input.
`Embedding`	Maps integer indices to vector representations (embeddings).
`LeakyReLU`	Applies Leaky rectified linear unit activation element-wise to the input.
`InstanceNorm`	Applies instance normalization to the n-dimensional input array.
`L2Normalization`	Normalize the input array using the L2 norm.
`LRN`	Applies local response normalization to the input.
`ROIPooling`	Performs region of interest(ROI) pooling on the input array.
`SoftmaxActivation`	Applies softmax activation to input.
`Dropout`	Applies dropout operation to input array.
`BilinearSampler`	Applies bilinear sampling to input feature map.
`GridGenerator`	Generates 2D sampling grid for bilinear sampling.
`UpSampling`	Performs nearest neighbor/bilinear up sampling to inputs.
`SpatialTransformer`	Applies a spatial transformer to input feature map.
`LinearRegressionOutput`	Computes and optimizes for squared loss during backward propagation.
`LogisticRegressionOutput`	Applies a logistic function to the input.
`MAERegressionOutput`	Computes mean absolute error of the input.
`SVMOutput`	Computes support vector machine based transformation of the input.
`softmax_cross_entropy`	Calculate cross entropy of softmax output and one-hot label.
`smooth_l1`	Calculate Smooth L1 Loss(lhs, scalar) by summing
`IdentityAttachKLSparseReg`	Applies a sparse regularization to the output of a sigmoid activation function.
`MakeLoss`	Make your own loss function in network construction.
`BlockGrad`	Stops gradient computation.
`Custom`	Apply a custom operator implemented in a frontend language (like Python).

API Reference¶

class mxnet.symbol.Symbol(handle)[source]¶

Symbol is the symbolic graph of MXNet.

__repr__()[source]¶

Gets a string representation of the symbol.

__iter__()[source]¶

Returns a generator object of symbol.

One can loop through the returned object list to get outputs.

Example

>>> a = mx.sym.Variable('a')
>>> b = mx.sym.Variable('b')
>>> c = a+b
>>> d = mx.sym.Variable('d')
>>> e = d+c
>>> out = e.get_children()
>>> out
<Symbol Grouped>
>>> for i in out:
...     i
...
<Symbol d>
<Symbol _plus0>

__add__(other)[source]¶

x.__add__(y) <=> x+y

Scalar input is supported. Broadcasting is not supported. Use broadcast_add instead.
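
To make the distinction concrete, the sketch below computes the result shape of a broadcasting operation from two input shapes. It is plain Python and not MXNet code: `broadcast_shape` is a hypothetical helper, and it assumes NumPy-style rules (shapes are aligned on trailing axes, and an axis of size 1 is stretched to match). The operator documentation remains the authority on the exact rules `broadcast_add` applies.

```python
def broadcast_shape(lhs, rhs):
    """Return the broadcast shape of two shapes under NumPy-style rules.

    Hypothetical helper for illustration; raises ValueError when the
    shapes cannot be broadcast together.
    """
    # Left-pad the shorter shape with 1s so both have the same rank.
    ndim = max(len(lhs), len(rhs))
    lhs = (1,) * (ndim - len(lhs)) + tuple(lhs)
    rhs = (1,) * (ndim - len(rhs)) + tuple(rhs)

    out = []
    for l, r in zip(lhs, rhs):
        if l == r or r == 1:
            out.append(l)      # equal sizes, or rhs is stretched to l
        elif l == 1:
            out.append(r)      # lhs is stretched to r
        else:
            raise ValueError("cannot broadcast %s with %s" % (lhs, rhs))
    return tuple(out)


same = broadcast_shape((2, 3), (2, 3))       # identical shapes: no broadcasting needed
stretched = broadcast_shape((2, 3), (1, 3))  # needs broadcasting along axis 0
```

In this picture, the plain `+` operator corresponds to the first case only (identical shapes, or a scalar operand), while the second case is what the broadcast variants are for.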

__sub__(other)[source]¶

x.__sub__(y) <=> x-y

Scalar input is supported. Broadcasting is not supported. Use broadcast_sub instead.

__rsub__(other)[source]¶

x.__rsub__(y) <=> y-x

Only NDArray is supported for now.

Example

>>> x = mx.nd.ones((2,3))*3
>>> y = mx.nd.ones((2,3))
>>> x.__rsub__(y).asnumpy()
array([[-2., -2., -2.],
       [-2., -2., -2.]], dtype=float32)

__mul__(other)[source]¶

x.__mul__(y) <=> x*y

Scalar input is supported. Broadcasting is not supported. Use broadcast_mul instead.

__div__(other)[source]¶

x.__div__(y) <=> x/y

Scalar input is supported. Broadcasting is not supported. Use broadcast_div instead.

__rdiv__(other)[source]¶

x.__rdiv__(y) <=> y/x

Only NDArray is supported for now.

Example

>>> x = mx.nd.ones((2,3))*3
>>> y = mx.nd.ones((2,3))
>>> x.__rdiv__(y).asnumpy()
array([[ 0.33333334,  0.33333334,  0.33333334],
       [ 0.33333334,  0.33333334,  0.33333334]], dtype=float32)

__mod__(other)[source]¶

x.__mod__(y) <=> x%y

Scalar input is supported. Broadcasting is not supported. Use broadcast_mod instead.

__rmod__(other)[source]¶

x.__rmod__(y) <=> y%x

Only NDArray is supported for now.

Example

>>> x = mx.nd.ones((2,3))*3
>>> y = mx.nd.ones((2,3))
>>> x.__rmod__(y).asnumpy()
array([[ 1.,  1.,  1.],
       [ 1.,  1.,  1.]], dtype=float32)

__pow__(other)[source]¶

x.__pow__(y) <=> x**y

Scalar input is supported. Broadcasting is not supported. Use broadcast_power instead.

__neg__()[source]¶

x.__neg__() <=> -x

Numerical negative, element-wise.

Example

>>> a = mx.sym.Variable('a')
>>> a
<Symbol a>
>>> -a
<Symbol _mulscalar0>
>>> a_neg = a.__neg__()
>>> b = mx.sym.Variable('b')
>>> c = a_neg*b
>>> ex = c.eval(ctx=mx.cpu(), a=mx.nd.ones([2,3]), b=mx.nd.ones([2,3]))
>>> ex[0].asnumpy()
array([[-1., -1., -1.],
       [-1., -1., -1.]], dtype=float32)

__deepcopy__(_)[source]¶

Returns a deep copy of the input object.

This function returns a deep copy of the input object including the current state of all its parameters such as weights, biases, etc.

Any changes made to the deep copy do not reflect in the original object.

Example

>>> import copy
>>> data = mx.sym.Variable('data')
>>> data_1 = copy.deepcopy(data)
>>> data_1 = 2*data
>>> data_1.tojson()
>>> data_1 is data    # Data got modified
False

__eq__(other)[source]¶

x.__eq__(y) <=> x==y

Scalar input is supported. Broadcasting is not supported. Use broadcast_equal instead.

__ne__(other)[source]¶

x.__ne__(y) <=> x!=y

Scalar input is supported. Broadcasting is not supported. Use broadcast_not_equal instead.

__gt__(other)[source]¶

x.__gt__(y) <=> x>y

Scalar input is supported. Broadcasting is not supported. Use broadcast_greater instead.

__ge__(other)[source]¶

x.__ge__(y) <=> x>=y

Scalar input is supported. Broadcasting is not supported. Use broadcast_greater_equal instead.

__lt__(other)[source]¶

x.__lt__(y) <=> x<y

Scalar input is supported. Broadcasting is not supported. Use broadcast_lesser instead.

__le__(other)[source]¶

x.__le__(y) <=> x<=y

Scalar input is supported. Broadcasting is not supported. Use broadcast_lesser_equal instead.

__call__(*args, **kwargs)[source]¶

Composes symbol using inputs.

x.__call__(y, z) <=> x(y,z)

This function internally calls _compose to compose the symbol and returns the composed symbol.

Example

>>> data = mx.symbol.Variable('data')
>>> net1 = mx.symbol.FullyConnected(data=data, name='fc1', num_hidden=10)
>>> net2 = mx.symbol.FullyConnected(name='fc3', num_hidden=10)
>>> composed = net2(fc3_data=net1, name='composed')
>>> composed
<Symbol composed>
>>> called = net2.__call__(fc3_data=net1, name='composed')
>>> called
<Symbol composed>

Parameters:

args – Positional arguments.
kwargs – Keyword arguments.

Returns:	The resulting symbol.

__getitem__(index)[source]¶

x.__getitem__(i) <=> x[i]

Returns a sliced view of the input symbol.

Example

>>> a = mx.sym.var('a')
>>> a.__getitem__(0)
<Symbol a>
>>> a[0]
<Symbol a>

Parameters:	index (int or str) – Indexing key

name¶

Gets the name string from the symbol. This function only works for non-grouped symbols.

Returns:	value – The name of this symbol, returns `None` for grouped symbol.
Return type:	str

attr(key)[source]¶

Returns the attribute string for corresponding input key from the symbol.

This function only works for non-grouped symbols.

Example

>>> data = mx.sym.Variable('data', attr={'mood': 'angry'})
>>> data.attr('mood')
'angry'

Parameters:	key (str) – The key corresponding to the desired attribute.
Returns:	value – The desired attribute value, returns `None` if the attribute does not exist.
Return type:	str

list_attr(recursive=False)[source]¶

Gets all attributes from the symbol.

Example

>>> data = mx.sym.Variable('data', attr={'mood': 'angry'})
>>> data.list_attr()
{'mood': 'angry'}

Returns:	ret – A dictionary mapping attribute keys to values.
Return type:	Dict of str to str

attr_dict()[source]¶

Recursively gets all attributes from the symbol and its children.

Example

>>> a = mx.sym.Variable('a', attr={'a1':'a2'})
>>> b = mx.sym.Variable('b', attr={'b1':'b2'})
>>> c = a+b
>>> c.attr_dict()
{'a': {'a1': 'a2'}, 'b': {'b1': 'b2'}}

Returns:	ret – There is a key in the returned dict for every child with a non-empty attribute set. For each symbol, the name of the symbol is its key in the dict and the corresponding value is that symbol’s attribute list (itself a dictionary).
Return type:	Dict of str to dict

get_internals()[source]¶

Gets a new grouped symbol sgroup. The output of sgroup is a list of outputs of all of the internal nodes.

Consider the following code:

Example

>>> a = mx.sym.var('a')
>>> b = mx.sym.var('b')
>>> c = a + b
>>> d = c.get_internals()
>>> d
<Symbol Grouped>
>>> d.list_outputs()
['a', 'b', '_plus4_output']

Returns:	sgroup – A symbol group containing all internal and leaf nodes of the computation graph used to compute the symbol.
Return type:	Symbol

get_children()[source]¶

Gets a new grouped symbol whose output contains inputs to output nodes of the original symbol.

Example

>>> x = mx.sym.Variable('x')
>>> y = mx.sym.Variable('y')
>>> z = mx.sym.Variable('z')
>>> a = y+z
>>> b = x+a
>>> b.get_children()
<Symbol Grouped>
>>> b.get_children().list_outputs()
['x', '_plus10_output']
>>> b.get_children().get_children().list_outputs()
['y', 'z']

Returns:	sgroup – The children of the head node. If the symbol has no inputs then `None` will be returned.
Return type:	Symbol or None
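
The difference between get_internals and get_children can be pictured on a toy graph. The sketch below is plain Python, not MXNet code: `GraphNode`, `children`, and `internals` are hypothetical names, and the depth-first ordering is an assumption chosen to match the ordering seen in the examples above. `children` returns only the direct inputs of the head node, while `internals` walks the whole graph.

```python
# Toy illustration only: GraphNode is hypothetical and is NOT part of MXNet.

class GraphNode:
    """A named node with a list of input nodes (empty for variables)."""

    def __init__(self, name, inputs=()):
        self.name = name
        self.inputs = list(inputs)


def children(node):
    """Names of the direct inputs of the head node."""
    return [i.name for i in node.inputs]


def internals(node):
    """Names of all nodes reachable from the head, inputs before users."""
    order, seen = [], set()

    def visit(n):
        if id(n) in seen:
            return
        seen.add(id(n))
        for i in n.inputs:
            visit(i)
        order.append(n.name)   # emitted after all of its inputs

    visit(node)
    return order


x, y, z = GraphNode('x'), GraphNode('y'), GraphNode('z')
a = GraphNode('a_plus', [y, z])    # a = y + z
b = GraphNode('b_plus', [x, a])    # b = x + a

direct = children(b)
everything = internals(b)
```

Applying `children` again to node `a` yields `y` and `z`, mirroring the chained `get_children()` call in the example.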

list_arguments()[source]¶

Lists all the arguments in the symbol.

Example

>>> a = mx.sym.var('a')
>>> b = mx.sym.var('b')
>>> c = a + b
>>> c.list_arguments()
['a', 'b']

Returns:	args – List containing the names of all the arguments required to compute the symbol.
Return type:	list of string

list_outputs()[source]¶

Lists all the outputs in the symbol.

Example

>>> a = mx.sym.var('a')
>>> b = mx.sym.var('b')
>>> c = a + b
>>> c.list_outputs()
['_plus12_output']

Returns:	List of all the outputs. For most symbols, this list contains only the name of this symbol. For symbol groups, this is a list with the names of all symbols in the group.
Return type:	list of str

list_auxiliary_states()[source]¶

Lists all the auxiliary states in the symbol.

Example

>>> a = mx.sym.var('a')
>>> b = mx.sym.var('b')
>>> c = a + b
>>> c.list_auxiliary_states()
[]

Example of auxiliary states in BatchNorm.

>>> data = mx.symbol.Variable('data')
>>> weight = mx.sym.Variable(name='fc1_weight')
>>> fc1  = mx.symbol.FullyConnected(data = data, weight=weight, name='fc1', num_hidden=128)
>>> fc2 = mx.symbol.BatchNorm(fc1, name='batchnorm0')
>>> fc2.list_auxiliary_states()
['batchnorm0_moving_mean', 'batchnorm0_moving_var']

Returns:	aux_states – List of the auxiliary states in input symbol.
Return type:	list of str

Notes

Auxiliary states are special states of symbols that do not correspond to an argument and are not updated by gradient descent. Common examples of auxiliary states include the moving_mean and moving_var in BatchNorm. Most operators do not have auxiliary states.
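
As a rough picture of why such states sit outside gradient descent, the sketch below updates running statistics directly from batch statistics. It is plain Python, not MXNet code: `update_moving_stat` is a hypothetical helper, and both the exponential-moving-average formula and the momentum value of 0.9 are assumptions made for illustration.

```python
def batch_mean(values):
    """Mean of the values in one batch."""
    return sum(values) / len(values)


def batch_var(values):
    """Population variance of the values in one batch."""
    m = batch_mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def update_moving_stat(moving, batch_stat, momentum=0.9):
    """Exponential moving average (assumed form); no gradient is involved."""
    return momentum * moving + (1.0 - momentum) * batch_stat


# Auxiliary states are kept separately from the learnable arguments.
aux_states = {'moving_mean': 0.0, 'moving_var': 1.0}

batch = [1.0, 2.0, 3.0]
aux_states['moving_mean'] = update_moving_stat(aux_states['moving_mean'],
                                               batch_mean(batch))
aux_states['moving_var'] = update_moving_stat(aux_states['moving_var'],
                                              batch_var(batch))
```

The values change as a side effect of processing data, whereas arguments such as weights change only when an optimizer applies gradients.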

list_inputs()[source]¶

Lists all arguments and auxiliary states of this Symbol.

Returns:	inputs – List of all inputs.
Return type:	list of str

Examples

>>> bn = mx.sym.BatchNorm(name='bn')
>>> bn.list_arguments()
['bn_data', 'bn_gamma', 'bn_beta']
>>> bn.list_auxiliary_states()
['bn_moving_mean', 'bn_moving_var']
>>> bn.list_inputs()
['bn_data', 'bn_gamma', 'bn_beta', 'bn_moving_mean', 'bn_moving_var']

infer_type(*args, **kwargs)[source]¶

Infers the type of all arguments and all outputs, given the known types for some arguments.

This function takes the known types of some arguments, given either positionally or as keyword arguments. It returns a tuple of None values if there is not enough information to deduce the missing types.

Inconsistencies in the known types will cause an error to be raised.

Example

>>> a = mx.sym.var('a')
>>> b = mx.sym.var('b')
>>> c = a + b
>>> arg_types, out_types, aux_types = c.infer_type(a='float32')
>>> arg_types
[<type 'numpy.float32'>, <type 'numpy.float32'>]
>>> out_types
[<type 'numpy.float32'>]
>>> aux_types
[]

Parameters:

*args – Type of known arguments in a positional way. Unknown type can be marked as None.
**kwargs – Keyword arguments of known types.

Returns:

arg_types (list of numpy.dtype or None) – List of argument types. The order is same as the order of list_arguments().
out_types (list of numpy.dtype or None) – List of output types. The order is same as the order of list_outputs().
aux_types (list of numpy.dtype or None) – List of auxiliary state types. The order is same as the order of list_auxiliary_states().

infer_shape(*args, **kwargs)[source]¶

Infers the shapes of all arguments and all outputs given the known shapes of some arguments.

This function takes the known shapes of some arguments, given either positionally or as keyword arguments. It returns a tuple of None values if there is not enough information to deduce the missing shapes.

Example

>>> a = mx.sym.var('a')
>>> b = mx.sym.var('b')
>>> c = a + b
>>> arg_shapes, out_shapes, aux_shapes = c.infer_shape(a=(3,3))
>>> arg_shapes
[(3L, 3L), (3L, 3L)]
>>> out_shapes
[(3L, 3L)]
>>> aux_shapes
[]
>>> c.infer_shape(a=(0,3)) # 0s in a shape mean unknown dimensions, so this returns None values.
(None, None, None)

Inconsistencies in the known shapes will cause an error to be raised. See the following example:

>>> data = mx.sym.Variable('data')
>>> out = mx.sym.FullyConnected(data=data, name='fc1', num_hidden=1000)
>>> out = mx.sym.Activation(data=out, act_type='relu')
>>> out = mx.sym.FullyConnected(data=out, name='fc2', num_hidden=10)
>>> weight_shape= (1, 100)
>>> data_shape = (100, 100)
>>> out.infer_shape(data=data_shape, fc1_weight=weight_shape)
Error in operator fc1: Shape inconsistent, Provided=(1,100), inferred shape=(1000,100)

Parameters:

*args – Shape of arguments in a positional way. Unknown shape can be marked as None.
**kwargs – Keyword arguments of the known shapes.

Returns:

arg_shapes (list of tuple or None) – List of argument shapes. The order is same as the order of list_arguments().
out_shapes (list of tuple or None) – List of output shapes. The order is same as the order of list_outputs().
aux_shapes (list of tuple or None) – List of auxiliary state shapes. The order is same as the order of list_auxiliary_states().
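
The kind of rule that shape inference applies can be sketched for a single fully connected layer. The code below is plain Python, not MXNet code: `infer_fc_shapes` is a hypothetical helper, and it assumes 2-D input data of shape (batch, in_dim). It derives the weight, bias, and output shapes from the data shape, and raises an error when a provided weight shape contradicts the inferred one, as in the inconsistency example above.

```python
def infer_fc_shapes(data_shape, num_hidden, weight_shape=None):
    """Infer (arg_shapes, out_shapes) for a toy fully connected layer.

    Hypothetical helper for illustration; assumes 2-D data of shape
    (batch, in_dim) and computes Y = X W^T + b.
    """
    batch, in_dim = data_shape
    inferred_weight = (num_hidden, in_dim)

    # A known shape that disagrees with the inferred one is an error.
    if weight_shape is not None and tuple(weight_shape) != inferred_weight:
        raise ValueError("Shape inconsistent, provided=%s, inferred=%s"
                         % (tuple(weight_shape), inferred_weight))

    arg_shapes = [tuple(data_shape), inferred_weight, (num_hidden,)]
    out_shapes = [(batch, num_hidden)]
    return arg_shapes, out_shapes


arg_shapes, out_shapes = infer_fc_shapes((10, 64), num_hidden=128)
```

Calling `infer_fc_shapes((100, 100), num_hidden=1000, weight_shape=(1, 100))` raises `ValueError`, because the inferred weight shape is (1000, 100).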

infer_shape_partial(*args, **kwargs)[source]¶

Infers the shape partially.

This function works the same way as infer_shape, except that it can return partial results.

In the following example, information about fc2 is not available. So, infer_shape will return a tuple of None values but infer_shape_partial will return partial values.

Example

>>> data = mx.sym.Variable('data')
>>> prev = mx.sym.Variable('prev')
>>> fc1  = mx.sym.FullyConnected(data=data, name='fc1', num_hidden=128)
>>> fc2  = mx.sym.FullyConnected(data=prev, name='fc2', num_hidden=128)
>>> out  = mx.sym.Activation(data=mx.sym.elemwise_add(fc1, fc2), act_type='relu')
>>> out.list_arguments()
['data', 'fc1_weight', 'fc1_bias', 'prev', 'fc2_weight', 'fc2_bias']
>>> out.infer_shape(data=(10,64))
(None, None, None)
>>> out.infer_shape_partial(data=(10,64))
([(10L, 64L), (128L, 64L), (128L,), (), (), ()], [(10L, 128L)], [])
>>> # infers shape if you give information about fc2
>>> out.infer_shape(data=(10,64), prev=(10,128))
([(10L, 64L), (128L, 64L), (128L,), (10L, 128L), (128L, 128L), (128L,)], [(10L, 128L)], [])

Parameters:

*args – Shape of arguments in a positional way. Unknown shape can be marked as None
**kwargs – Keyword arguments of known shapes.

Returns:

arg_shapes (list of tuple or None) – List of argument shapes. The order is same as the order of list_arguments().
out_shapes (list of tuple or None) – List of output shapes. The order is same as the order of list_outputs().
aux_shapes (list of tuple or None) – List of auxiliary state shapes. The order is same as the order of list_auxiliary_states().
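
One way to read a partial result is to pair each shape with its argument name and pick out the entries that are still unknown. The sketch below uses plain Python and values copied from the example above (with the Python 2 `L` suffixes dropped). It assumes, as in that output, that an empty tuple marks a shape that could not be inferred; other versions may mark unknown shapes differently.

```python
# Values copied from the infer_shape_partial example above.
arg_names = ['data', 'fc1_weight', 'fc1_bias', 'prev', 'fc2_weight', 'fc2_bias']
arg_shapes = [(10, 64), (128, 64), (128,), (), (), ()]

# The order of arg_shapes matches the order of list_arguments().
shape_by_name = dict(zip(arg_names, arg_shapes))

# Assumption: an empty tuple means "not inferred", as in the output above.
unknown = [name for name in arg_names if shape_by_name[name] == ()]
print(unknown)  # ['prev', 'fc2_weight', 'fc2_bias']
```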

debug_str()[source]¶

Gets a debug string of symbol.

It contains Symbol output, variables and operators in the computation graph with their inputs, variables and attributes.

Returns:	Debug string of the symbol.
Return type:	string

Examples

>>> a = mx.sym.Variable('a')
>>> b = mx.sym.sin(a)
>>> c = 2 * a + b
>>> d = mx.sym.FullyConnected(data=c, num_hidden=10)
>>> print(d.debug_str())
Symbol Outputs:
        output[0]=fullyconnected0(0)
Variable:a
--------------------
Op:_mul_scalar, Name=_mulscalar0
Inputs:
        arg[0]=a(0) version=0
Attrs:
        scalar=2
--------------------
Op:sin, Name=sin0
Inputs:
        arg[0]=a(0) version=0
--------------------
Op:elemwise_add, Name=_plus0
Inputs:
        arg[0]=_mulscalar0(0)
        arg[1]=sin0(0)
Variable:fullyconnected0_weight
Variable:fullyconnected0_bias
--------------------
Op:FullyConnected, Name=fullyconnected0
Inputs:
        arg[0]=_plus0(0)
        arg[1]=fullyconnected0_weight(0) version=0
        arg[2]=fullyconnected0_bias(0) version=0
Attrs:
        num_hidden=10

save(fname)[source]¶

Saves symbol to a file.

You can also use pickle to do the job if you work only in Python. The advantage of the load/save functions is that the file contents are language agnostic: a model saved by one language binding can be loaded by a different language binding of MXNet. You can also load from and save to cloud storage (S3, HDFS) directly.

Parameters:

fname (str) –

The name of the file.

“s3://my-bucket/path/my-s3-symbol”
“hdfs://my-bucket/path/my-hdfs-symbol”
“/path-to/my-local-symbol”

See also

symbol.load_json(): Used to load symbol from JSON string.

simple_bind(ctx, grad_req='write', type_dict=None, stype_dict=None, group2ctx=None, shared_arg_names=None, shared_exec=None, shared_buffer=None, **kwargs)[source]¶

Binds the current symbol to get an executor, allocating all the arguments needed. Allows specifying data types.

This function simplifies the binding procedure. You need to specify only input data shapes. Before binding the executor, the function allocates arguments and auxiliary states that were not explicitly specified.

Example

>>> x = mx.sym.Variable('x')
>>> y = mx.sym.FullyConnected(x, num_hidden=4)
>>> exe = y.simple_bind(mx.cpu(), x=(5,4), grad_req='null')
>>> exe.forward()
[<NDArray 5x4 @cpu(0)>]
>>> exe.outputs[0].asnumpy()
array([[ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.]], dtype=float32)
>>> exe.arg_arrays
[<NDArray 5x4 @cpu(0)>, <NDArray 4x4 @cpu(0)>, <NDArray 4 @cpu(0)>]
>>> exe.grad_arrays
[<NDArray 5x4 @cpu(0)>, <NDArray 4x4 @cpu(0)>, <NDArray 4 @cpu(0)>]

Parameters:

ctx (Context) – The device context the generated executor will run on.
grad_req (string, or list of str, or dict of str to str, optional) – One of {‘write’, ‘add’, ‘null’}. Specifies how the gradient is updated in args_grad. ‘write’ means the gradient is written to the specified args_grad NDArray every time. ‘add’ means the gradient is added to the specified NDArray every time. ‘null’ means no action is taken, and the gradient may not be calculated.
type_dict (Dict of str->numpy.dtype) – Input type dictionary, name->dtype.
stype_dict (Dict of str->str) – Input storage type dictionary, name->storage_type.
group2ctx (Dict of string to mx.Context) – The dict mapping the ctx_group attribute to the context assignment.
shared_arg_names (List of string) – The argument names whose NDArray of shared_exec can be reused for initializing the current executor.
shared_exec (Executor) – The executor whose arg_arrays, grad_arrays, and aux_arrays can be reused for initializing the current executor.
shared_buffer (Dict of string to NDArray) – The dict mapping argument names to the NDArray that can be reused for initializing the current executor. This buffer is checked for reuse if an argument name of the current executor is not found in shared_arg_names. The NDArrays are expected to have the default storage type.
kwargs (Dict of str->shape) – Input shape dictionary, name->shape.

Returns:	executor – The generated executor
Return type:	mxnet.Executor

bind(ctx, args, args_grad=None, grad_req='write', aux_states=None, group2ctx=None, shared_exec=None)[source]¶

Binds the current symbol to an executor and returns it.

We first declare the computation and then bind it to the data to run. This function returns an executor, which provides a forward() method for evaluation and an outputs attribute to get all the results.

Example

>>> a = mx.sym.Variable('a')
>>> b = mx.sym.Variable('b')
>>> c = a + b

>>> ex = c.bind(ctx=mx.cpu(), args={'a' : mx.nd.ones([2,3]), 'b' : mx.nd.ones([2,3])})
>>> ex.forward()
[<NDArray 2x3 @cpu(0)>]
>>> ex.outputs[0].asnumpy()
[[ 2.  2.  2.]
[ 2.  2.  2.]]

Parameters:

ctx (Context) – The device context the generated executor will run on.
args (list of NDArray or dict of str to NDArray) – Input arguments to the symbol. If the input type is a list of NDArray, the order should be the same as the order of list_arguments(). If the input type is a dict of str to NDArray, then it maps the names of arguments to the corresponding NDArray. In either case, all the arguments must be provided.
args_grad (list of NDArray or dict of str to NDArray, optional) – When specified, args_grad provides NDArrays to hold the gradient values computed in backward. If the input type is a list of NDArray, the order should be the same as the order of list_arguments(). If the input type is a dict of str to NDArray, then it maps the names of arguments to the corresponding NDArray. When the type is a dict of str to NDArray, you only need to provide entries for the argument gradients you require. Only the specified argument gradients will be calculated.
grad_req ({'write', 'add', 'null'}, or list of str or dict of str to str, optional) – Specifies how the gradient is updated in args_grad. ‘write’ means the gradient is written to the specified args_grad NDArray every time. ‘add’ means the gradient is added to the specified NDArray every time. ‘null’ means no action is taken, and the gradient may not be calculated.
aux_states (list of NDArray, or dict of str to NDArray, optional) – Input auxiliary states to the symbol, only needed when the output of list_auxiliary_states() is not empty. If the input type is a list of NDArray, the order should be the same as the order of list_auxiliary_states(). If the input type is a dict of str to NDArray, then it maps the names of auxiliary states to the corresponding NDArray. In either case, all the auxiliary states need to be provided.
group2ctx (Dict of string to mx.Context) – The dict mapping the ctx_group attribute to the context assignment.
shared_exec (mx.executor.Executor) – Executor to share memory with. This is intended for runtime reshaping, variable length sequences, etc. The returned executor shares state with shared_exec, and should not be used in parallel with it.

Returns:	executor – The generated executor
Return type:	Executor

Notes

Auxiliary states are the special states of symbols that do not correspond to an argument, and do not have gradient but are still useful for the specific operations. Common examples of auxiliary states include the moving_mean and moving_variance states in BatchNorm. Most operators do not have auxiliary states and in those cases, this parameter can be safely ignored.

You can skip unneeded gradients by passing a dict as args_grad that contains only the gradients you are interested in.
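
The three grad_req modes described above differ only in how a new gradient is combined with the existing gradient buffer. The plain-Python sketch below illustrates the difference over two backward passes. `apply_grad` is a hypothetical helper written for this illustration; it models the documented behavior and is not how MXNet implements it.

```python
def apply_grad(buffer, grad, grad_req):
    """Toy model of a gradient buffer update (hypothetical helper)."""
    if grad_req == 'write':
        # Overwrite the buffer with the new gradient.
        return list(grad)
    if grad_req == 'add':
        # Accumulate the new gradient into the buffer.
        return [b + g for b, g in zip(buffer, grad)]
    if grad_req == 'null':
        # Leave the buffer untouched.
        return buffer
    raise ValueError("unknown grad_req: %r" % (grad_req,))


buf_write = [0.0, 0.0]
buf_add = [0.0, 0.0]
buf_null = [0.0, 0.0]
for grad in ([1.0, 2.0], [3.0, 4.0]):   # two successive backward passes
    buf_write = apply_grad(buf_write, grad, 'write')
    buf_add = apply_grad(buf_add, grad, 'add')
    buf_null = apply_grad(buf_null, grad, 'null')

print(buf_write)  # [3.0, 4.0]  (last gradient only)
print(buf_add)    # [4.0, 6.0]  (sum of both gradients)
print(buf_null)   # [0.0, 0.0]  (unchanged)
```

With 'add', the buffer keeps accumulating until it is reset by the caller, which is why it is typically paired with explicit zeroing between updates.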

gradient(wrt)[source]¶

Gets the autodiff of current symbol.

This function can only be used if current symbol is a loss function.

Note

This function is currently not implemented.

Parameters:	wrt (Array of String) – Names of the symbol's arguments with respect to which the gradients are taken.
Returns:	grad – A gradient Symbol whose outputs are the corresponding gradients.
Return type:	Symbol

eval(ctx=None, **kwargs)[source]¶

Evaluates a symbol given arguments.

The eval method combines a call to bind (which returns an executor) with a call to forward (executor method). For the common use case, where you might repeatedly evaluate with same arguments, eval is slow. In that case, you should call bind once and then repeatedly call forward. This function allows simpler syntax for less cumbersome introspection.

Example

>>> a = mx.sym.Variable('a')
>>> b = mx.sym.Variable('b')
>>> c = a + b
>>> ex = c.eval(ctx = mx.cpu(), a = mx.nd.ones([2,3]), b = mx.nd.ones([2,3]))
>>> ex
[<NDArray 2x3 @cpu(0)>]
>>> ex[0].asnumpy()
array([[ 2.,  2.,  2.],
       [ 2.,  2.,  2.]], dtype=float32)

Parameters:

ctx (Context) – The device context the generated executor to run on.
kwargs (Keyword arguments of type NDArray) – Input arguments to the symbol. All the arguments must be provided.

Returns:

result (list of NDArray) – The values taken by each symbol when evaluated on the given args. When called on a single symbol (not a group), the result is a list with one element.

reshape(*args, **kwargs)[source]¶

Convenience fluent method for reshape().

The arguments are the same as for reshape(), with this array as data.
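
The fluent methods listed in this section all follow the same delegation pattern: the method forwards its arguments to the operator function of the same name and passes the object itself as data. The sketch below shows that pattern with a toy class. `ToySymbol` and the two module-level functions are hypothetical stand-ins that only build an expression string; they are not MXNet's Symbol or operators.

```python
class ToySymbol(object):
    """Hypothetical stand-in for Symbol; it only records an expression string."""

    def __init__(self, expr):
        self.expr = expr

    def reshape(self, *args, **kwargs):
        # Fluent form: forward everything to the function of the same name,
        # with this object as the data argument.
        return reshape(self, *args, **kwargs)

    def sin(self, *args, **kwargs):
        return sin(self, *args, **kwargs)


def reshape(data, shape):
    """Toy operator function: records a reshape of `data`."""
    return ToySymbol('reshape(%s, shape=%s)' % (data.expr, shape))


def sin(data):
    """Toy operator function: records a sine of `data`."""
    return ToySymbol('sin(%s)' % data.expr)


x = ToySymbol('x')
fluent = x.reshape(shape=(2, 3)).sin()      # reads left to right
nested = sin(reshape(x, shape=(2, 3)))      # reads inside out
print(fluent.expr)  # sin(reshape(x, shape=(2, 3)))
print(fluent.expr == nested.expr)  # True
```

Both spellings build the same expression; the fluent form simply lets a chain of operations be written in the order they are applied.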

reshape_like(*args, **kwargs)[source]¶

Convenience fluent method for reshape_like().

The arguments are the same as for reshape_like(), with this array as data.

astype(*args, **kwargs)[source]¶

Convenience fluent method for cast().

The arguments are the same as for cast(), with this array as data.

zeros_like(*args, **kwargs)[source]¶

Convenience fluent method for zeros_like().

The arguments are the same as for zeros_like(), with this array as data.

ones_like(*args, **kwargs)[source]¶

Convenience fluent method for ones_like().

The arguments are the same as for ones_like(), with this array as data.

broadcast_axes(*args, **kwargs)[source]¶

Convenience fluent method for broadcast_axes().

The arguments are the same as for broadcast_axes(), with this array as data.

repeat(*args, **kwargs)[source]¶

Convenience fluent method for repeat().

The arguments are the same as for repeat(), with this array as data.

pad(*args, **kwargs)[source]¶

Convenience fluent method for pad().

The arguments are the same as for pad(), with this array as data.

swapaxes(*args, **kwargs)[source]¶

Convenience fluent method for swapaxes().

The arguments are the same as for swapaxes(), with this array as data.

split(*args, **kwargs)[source]¶

Convenience fluent method for split().

The arguments are the same as for split(), with this array as data.

slice(*args, **kwargs)[source]¶

Convenience fluent method for slice().

The arguments are the same as for slice(), with this array as data.

slice_axis(*args, **kwargs)[source]¶

Convenience fluent method for slice_axis().

The arguments are the same as for slice_axis(), with this array as data.

take(*args, **kwargs)[source]¶

Convenience fluent method for take().

The arguments are the same as for take(), with this array as data.

one_hot(*args, **kwargs)[source]¶

Convenience fluent method for one_hot().

The arguments are the same as for one_hot(), with this array as data.

pick(*args, **kwargs)[source]¶

Convenience fluent method for pick().

The arguments are the same as for pick(), with this array as data.

sort(*args, **kwargs)[source]¶

Convenience fluent method for sort().

The arguments are the same as for sort(), with this array as data.

topk(*args, **kwargs)[source]¶

Convenience fluent method for topk().

The arguments are the same as for topk(), with this array as data.

argsort(*args, **kwargs)[source]¶

Convenience fluent method for argsort().

The arguments are the same as for argsort(), with this array as data.

argmax(*args, **kwargs)[source]¶

Convenience fluent method for argmax().

The arguments are the same as for argmax(), with this array as data.

argmax_channel(*args, **kwargs)[source]¶

Convenience fluent method for argmax_channel().

The arguments are the same as for argmax_channel(), with this array as data.

argmin(*args, **kwargs)[source]¶

Convenience fluent method for argmin().

The arguments are the same as for argmin(), with this array as data.

clip(*args, **kwargs)[source]¶

Convenience fluent method for clip().

The arguments are the same as for clip(), with this array as data.

abs(*args, **kwargs)[source]¶

Convenience fluent method for abs().

The arguments are the same as for abs(), with this array as data.

sign(*args, **kwargs)[source]¶

Convenience fluent method for sign().

The arguments are the same as for sign(), with this array as data.

flatten(*args, **kwargs)[source]¶

Convenience fluent method for flatten().

The arguments are the same as for flatten(), with this array as data.

expand_dims(*args, **kwargs)[source]¶

Convenience fluent method for expand_dims().

The arguments are the same as for expand_dims(), with this array as data.

broadcast_to(*args, **kwargs)[source]¶

Convenience fluent method for broadcast_to().

The arguments are the same as for broadcast_to(), with this array as data.

tile(*args, **kwargs)[source]¶

Convenience fluent method for tile().

The arguments are the same as for tile(), with this array as data.

transpose(*args, **kwargs)[source]¶

Convenience fluent method for transpose().

The arguments are the same as for transpose(), with this array as data.

flip(*args, **kwargs)[source]¶

Convenience fluent method for flip().

The arguments are the same as for flip(), with this array as data.

sum(*args, **kwargs)[source]¶

Convenience fluent method for sum().

The arguments are the same as for sum(), with this array as data.

nansum(*args, **kwargs)[source]¶

Convenience fluent method for nansum().

The arguments are the same as for nansum(), with this array as data.

prod(*args, **kwargs)[source]¶

Convenience fluent method for prod().

The arguments are the same as for prod(), with this array as data.

nanprod(*args, **kwargs)[source]¶

Convenience fluent method for nanprod().

The arguments are the same as for nanprod(), with this array as data.

mean(*args, **kwargs)[source]¶

Convenience fluent method for mean().

The arguments are the same as for mean(), with this array as data.

max(*args, **kwargs)[source]¶

Convenience fluent method for max().

The arguments are the same as for max(), with this array as data.

min(*args, **kwargs)[source]¶

Convenience fluent method for min().

The arguments are the same as for min(), with this array as data.

norm(*args, **kwargs)[source]¶

Convenience fluent method for norm().

The arguments are the same as for norm(), with this array as data.

round(*args, **kwargs)[source]¶

Convenience fluent method for round().

The arguments are the same as for round(), with this array as data.

rint(*args, **kwargs)[source]¶

Convenience fluent method for rint().

The arguments are the same as for rint(), with this array as data.

fix(*args, **kwargs)[source]¶

Convenience fluent method for fix().

The arguments are the same as for fix(), with this array as data.

floor(*args, **kwargs)[source]¶

Convenience fluent method for floor().

The arguments are the same as for floor(), with this array as data.

ceil(*args, **kwargs)[source]¶

Convenience fluent method for ceil().

The arguments are the same as for ceil(), with this array as data.

trunc(*args, **kwargs)[source]¶

Convenience fluent method for trunc().

The arguments are the same as for trunc(), with this array as data.

sin(*args, **kwargs)[source]¶

Convenience fluent method for sin().

The arguments are the same as for sin(), with this array as data.

cos(*args, **kwargs)[source]¶

Convenience fluent method for cos().

The arguments are the same as for cos(), with this array as data.

tan(*args, **kwargs)[source]¶

Convenience fluent method for tan().

The arguments are the same as for tan(), with this array as data.

arcsin(*args, **kwargs)[source]¶

Convenience fluent method for arcsin().

The arguments are the same as for arcsin(), with this array as data.

arccos(*args, **kwargs)[source]¶

Convenience fluent method for arccos().

The arguments are the same as for arccos(), with this array as data.

arctan(*args, **kwargs)[source]¶

Convenience fluent method for arctan().

The arguments are the same as for arctan(), with this array as data.

degrees(*args, **kwargs)[source]¶

Convenience fluent method for degrees().

The arguments are the same as for degrees(), with this array as data.

radians(*args, **kwargs)[source]¶

Convenience fluent method for radians().

The arguments are the same as for radians(), with this array as data.

sinh(*args, **kwargs)[source]¶

Convenience fluent method for sinh().

The arguments are the same as for sinh(), with this array as data.

cosh(*args, **kwargs)[source]¶

Convenience fluent method for cosh().

The arguments are the same as for cosh(), with this array as data.

tanh(*args, **kwargs)[source]¶

Convenience fluent method for tanh().

The arguments are the same as for tanh(), with this array as data.

arcsinh(*args, **kwargs)[source]¶

Convenience fluent method for arcsinh().

The arguments are the same as for arcsinh(), with this array as data.

arccosh(*args, **kwargs)[source]¶

Convenience fluent method for arccosh().

The arguments are the same as for arccosh(), with this array as data.

arctanh(*args, **kwargs)[source]¶

Convenience fluent method for arctanh().

The arguments are the same as for arctanh(), with this array as data.

exp(*args, **kwargs)[source]¶

Convenience fluent method for exp().

The arguments are the same as for exp(), with this array as data.

expm1(*args, **kwargs)[source]¶

Convenience fluent method for expm1().

The arguments are the same as for expm1(), with this array as data.

log(*args, **kwargs)[source]¶

Convenience fluent method for log().

The arguments are the same as for log(), with this array as data.

log10(*args, **kwargs)[source]¶

Convenience fluent method for log10().

The arguments are the same as for log10(), with this array as data.

log2(*args, **kwargs)[source]¶

Convenience fluent method for log2().

The arguments are the same as for log2(), with this array as data.

log1p(*args, **kwargs)[source]¶

Convenience fluent method for log1p().

The arguments are the same as for log1p(), with this array as data.

sqrt(*args, **kwargs)[source]¶

Convenience fluent method for sqrt().

The arguments are the same as for sqrt(), with this array as data.

rsqrt(*args, **kwargs)[source]¶

Convenience fluent method for rsqrt().

The arguments are the same as for rsqrt(), with this array as data.

cbrt(*args, **kwargs)[source]¶

Convenience fluent method for cbrt().

The arguments are the same as for cbrt(), with this array as data.

rcbrt(*args, **kwargs)[source]¶

Convenience fluent method for rcbrt().

The arguments are the same as for rcbrt(), with this array as data.

square(*args, **kwargs)[source]¶

Convenience fluent method for square().

The arguments are the same as for square(), with this array as data.

reciprocal(*args, **kwargs)[source]¶

Convenience fluent method for reciprocal().

The arguments are the same as for reciprocal(), with this array as data.

relu(*args, **kwargs)[source]¶

Convenience fluent method for relu().

The arguments are the same as for relu(), with this array as data.

sigmoid(*args, **kwargs)[source]¶

Convenience fluent method for sigmoid().

The arguments are the same as for sigmoid(), with this array as data.

softmax(*args, **kwargs)[source]¶

Convenience fluent method for softmax().

The arguments are the same as for softmax(), with this array as data.

log_softmax(*args, **kwargs)[source]¶

Convenience fluent method for log_softmax().

The arguments are the same as for log_softmax(), with this array as data.

Symbol API of MXNet.

mxnet.symbol.Activation(data=None, act_type=_Null, name=None, attr=None, out=None, **kwargs)¶

Applies an activation function element-wise to the input.

The following activation functions are supported:

relu: Rectified Linear Unit, \(y = max(x, 0)\)
sigmoid: \(y = \frac{1}{1 + exp(-x)}\)
tanh: Hyperbolic tangent, \(y = \frac{exp(x) - exp(-x)}{exp(x) + exp(-x)}\)
softrelu: Soft ReLU, or SoftPlus, \(y = log(1 + exp(x))\)

Defined in src/operator/activation.cc:L91

Parameters:

data (Symbol) – Input array to activation function.
act_type ({'relu', 'sigmoid', 'softrelu', 'tanh'}, required) – Activation function to be applied.
name (string, optional.) – Name of the resulting symbol.

Returns:	The result symbol.
Return type:	Symbol

Examples

A one-hidden-layer MLP with ReLU activation:

>>> data = Variable('data')
>>> mlp = FullyConnected(data=data, num_hidden=128, name='proj')
>>> mlp = Activation(data=mlp, act_type='relu', name='activation')
>>> mlp = FullyConnected(data=mlp, num_hidden=10, name='mlp')
>>> mlp
<Symbol mlp>

ReLU activation

>>> test_suites = [
...     ('relu', lambda x: np.maximum(x, 0)),
...     ('sigmoid', lambda x: 1 / (1 + np.exp(-x))),
...     ('tanh', lambda x: np.tanh(x)),
...     ('softrelu', lambda x: np.log(1 + np.exp(x)))
... ]
>>> x = test_utils.random_arrays((2, 3, 4))
>>> for act_type, numpy_impl in test_suites:
...     op = Activation(act_type=act_type, name='act')
...     y = test_utils.simple_forward(op, act_data=x)
...     y_np = numpy_impl(x)
...     print('%s: %s' % (act_type, test_utils.almost_equal(y, y_np)))
relu: True
sigmoid: True
tanh: True
softrelu: True

mxnet.symbol.BatchNorm(data=None, gamma=None, beta=None, moving_mean=None, moving_var=None, eps=_Null, momentum=_Null, fix_gamma=_Null, use_global_stats=_Null, output_mean_var=_Null, axis=_Null, cudnn_off=_Null, name=None, attr=None, out=None, **kwargs)¶

Batch normalization.

Normalizes a data batch by mean and variance, and applies a scale gamma as well as offset beta.

Assume the input has more than one dimension and we normalize along axis 1. We first compute the mean and variance along this axis:

\[\begin{split}data\_mean[i] = mean(data[:,i,:,...]) \\ data\_var[i] = var(data[:,i,:,...])\end{split}\]

Then compute the normalized output, which has the same shape as input, as following:

\[out[:,i,:,...] = \frac{data[:,i,:,...] - data\_mean[i]}{\sqrt{data\_var[i]+\epsilon}} * gamma[i] + beta[i]\]

Both mean and var return a scalar by treating the input as a vector.

Assume the input has size k on axis 1, then both gamma and beta have shape (k,). If output_mean_var is set to true, the operator also outputs data_mean and data_var, which are needed for the backward pass.
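
The formula above can be checked with a short NumPy sketch. This is an illustration of the math only, not MXNet's implementation: `batch_norm_forward` is a hypothetical helper, and it assumes the population variance (NumPy's default, `ddof=0`) and the documented default `eps=0.001`.

```python
import numpy as np


def batch_norm_forward(data, gamma, beta, eps=1e-3, axis=1):
    """NumPy sketch of the training-mode formula (hypothetical helper)."""
    axis = axis % data.ndim                      # allow axis=-1
    reduce_axes = tuple(i for i in range(data.ndim) if i != axis)
    # Per-channel statistics: reduce over every axis except the channel axis.
    data_mean = data.mean(axis=reduce_axes)
    data_var = data.var(axis=reduce_axes)
    # Reshape the (k,) vectors so they broadcast along the channel axis.
    bshape = [1] * data.ndim
    bshape[axis] = data.shape[axis]
    normalized = (data - data_mean.reshape(bshape)) / np.sqrt(
        data_var.reshape(bshape) + eps)
    out = normalized * gamma.reshape(bshape) + beta.reshape(bshape)
    return out, data_mean, data_var


rng = np.random.RandomState(0)
data = rng.randn(8, 3, 4, 4)                     # k = 3 channels on axis 1
out, mean, var = batch_norm_forward(data, np.ones(3), np.zeros(3))
print(out.shape, mean.shape, var.shape)          # (8, 3, 4, 4) (3,) (3,)
```

With gamma set to ones and beta to zeros, each channel of the output has a mean close to 0 and a variance slightly below 1 (because of eps).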

Besides the inputs and the outputs, this operator accepts two auxiliary states, moving_mean and moving_var, which are k-length vectors. They are global statistics for the whole dataset, which are updated by:

moving_mean = moving_mean * momentum + data_mean * (1 - momentum)
moving_var = moving_var * momentum + data_var * (1 - momentum)

If use_global_stats is set to be true, then moving_mean and moving_var are used instead of data_mean and data_var to compute the output. It is often used during inference.
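
The moving-average update rule above can be written out directly in NumPy. The helper name `update_moving_stats` and the sample values are hypothetical, chosen only to make the arithmetic easy to follow; the default momentum of 0.9 is the one documented for this operator.

```python
import numpy as np


def update_moving_stats(moving_mean, moving_var, data_mean, data_var,
                        momentum=0.9):
    """Apply the documented update rule once (hypothetical helper)."""
    new_mean = moving_mean * momentum + data_mean * (1 - momentum)
    new_var = moving_var * momentum + data_var * (1 - momentum)
    return new_mean, new_var


# Start from zero mean and unit variance for k = 3 channels.
moving_mean, moving_var = np.zeros(3), np.ones(3)
batch_mean = np.array([1.0, 2.0, 3.0])
batch_var = np.array([4.0, 4.0, 4.0])

moving_mean, moving_var = update_moving_stats(
    moving_mean, moving_var, batch_mean, batch_var)
print(moving_mean)  # approximately [0.1, 0.2, 0.3]
print(moving_var)   # approximately [1.3, 1.3, 1.3]
```

A momentum close to 1 makes the moving statistics change slowly, so a single batch shifts them by only a tenth of the difference in this example.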

The parameter axis specifies which axis of the input shape denotes the ‘channel’ (separately normalized groups). The default is 1. Specifying -1 sets the channel axis to be the last item in the input shape.

Both gamma and beta are learnable parameters. If fix_gamma is true, however, gamma is set to 1 and its gradient to 0.

Defined in src/operator/batch_norm.cc:L399

Parameters:

data (Symbol) – Input data to batch normalization.
gamma (Symbol) – Gamma array.
beta (Symbol) – Beta array.
moving_mean (Symbol) – Running mean of the input.
moving_var (Symbol) – Running variance of the input.
eps (double, optional, default=0.001) – Epsilon to prevent division by zero. Must be no less than CUDNN_BN_MIN_EPSILON defined in cudnn.h when using cuDNN (usually 1e-5).
momentum (float, optional, default=0.9) – Momentum for the moving average.
fix_gamma (boolean, optional, default=1) – Fix gamma while training.
use_global_stats (boolean, optional, default=0) – Whether to use global moving statistics instead of local batch statistics. This forces batch-norm into a scale-shift operator.
output_mean_var (boolean, optional, default=0) – Output the mean and var in addition to the normalized output.
axis (int, optional, default='1') – Specifies which axis of the input shape is the channel axis.
cudnn_off (boolean, optional, default=0) – Do not select the CUDNN operator, if available.
name (string, optional.) – Name of the resulting symbol.

Returns:	The result symbol.
Return type:	Symbol

mxnet.symbol.BatchNorm_v1(data=None, gamma=None, beta=None, eps=_Null, momentum=_Null, fix_gamma=_Null, use_global_stats=_Null, output_mean_var=_Null, name=None, attr=None, out=None, **kwargs)¶