ndarray.sparse
Sparse NDArray API of MXNet.
Functions
|
Creates a CSRNDArray, a 2D array with compressed sparse row (CSR) format. |
|
Creates a RowSparseNDArray, a multidimensional row sparse array with a set of tensor slices at given indices. |
|
Returns element-wise sum of the input arrays with broadcasting. |
|
Returns element-wise difference of the input arrays with broadcasting. |
|
Returns element-wise product of the input arrays with broadcasting. |
|
Returns element-wise division of the input arrays with broadcasting. |
|
Adds all input arguments element-wise. |
|
Maps integer indices to vector representations (embeddings). |
|
Applies a linear transformation: \(Y = XW^T + b\). |
|
Computes and optimizes for squared loss during backward propagation. |
|
Applies a logistic function to the input. |
|
Computes mean absolute error of the input. |
|
Returns element-wise absolute value of the input. |
|
Update function for AdaGrad optimizer. |
|
Update function for Adam optimizer. |
|
Adds all input arguments element-wise. |
|
Returns element-wise inverse cosine of the input array. |
|
Returns the inverse hyperbolic cosine of the input array, computed element-wise. |
|
Returns element-wise inverse sine of the input array. |
|
Returns the inverse hyperbolic sine of the input array, computed element-wise. |
|
Returns element-wise inverse tangent of the input array. |
|
Returns the inverse hyperbolic tangent of the input array, computed element-wise. |
|
Returns element-wise sum of the input arrays with broadcasting. |
|
Returns element-wise division of the input arrays with broadcasting. |
|
Returns element-wise difference of the input arrays with broadcasting. |
|
Returns element-wise product of the input arrays with broadcasting. |
|
Returns element-wise sum of the input arrays with broadcasting. |
|
Returns element-wise difference of the input arrays with broadcasting. |
|
Casts tensor storage type to the new type. |
|
Returns element-wise cube-root value of the input. |
|
Returns element-wise ceiling of the input. |
|
Clips (limits) the values in an array. |
|
Joins input arrays along a given axis. |
|
Computes the element-wise cosine of the input array. |
|
Returns the hyperbolic cosine of the input array, computed element-wise. |
|
Converts each element of the input array from radians to degrees. |
|
Dot product of two arrays. |
|
Adds arguments element-wise. |
|
Divides arguments element-wise. |
|
Multiplies arguments element-wise. |
|
Subtracts arguments element-wise. |
|
Returns element-wise exponential value of the input. |
|
Returns |
|
Returns element-wise rounded value to the nearest integer towards zero of the input. |
|
Returns element-wise floor of the input. |
|
Update function for Ftrl optimizer. |
|
Returns the gamma function (extension of the factorial function to the reals), computed element-wise on the input array. |
|
Returns element-wise log of the absolute value of the gamma function of the input. |
|
Returns element-wise Natural logarithmic value of the input. |
|
Returns element-wise Base-10 logarithmic value of the input. |
|
Returns element-wise |
|
Returns element-wise Base-2 logarithmic value of the input. |
|
Make your own loss function in network construction. |
|
Computes the mean of array elements over given axes. |
|
Numerical negative of the argument, element-wise. |
|
Computes the norm on an NDArray. |
|
Converts each element of the input array from degrees to radians. |
|
Computes rectified linear activation. |
|
Pick rows specified by user input index array from a row sparse matrix and save them in the output sparse matrix. |
|
Returns element-wise rounded value to the nearest integer of the input. |
|
Returns element-wise rounded value to the nearest integer of the input. |
|
Returns element-wise inverse square-root value of the input. |
|
Momentum update function for Stochastic Gradient Descent (SGD) optimizer. |
|
Update function for Stochastic Gradient Descent (SGD) optimizer. |
|
Computes sigmoid of x element-wise. |
|
Returns element-wise sign of the input. |
|
Computes the element-wise sine of the input array. |
|
Returns the hyperbolic sine of the input array, computed element-wise. |
|
Slices a region of the array. |
|
Returns element-wise square-root value of the input. |
|
Returns element-wise squared value of the input. |
|
Stops gradient computation. |
|
Computes the sum of array elements over given axes. |
|
Computes the element-wise tangent of the input array. |
|
Returns the hyperbolic tangent of the input array, computed element-wise. |
|
Return the element-wise truncated value of the input. |
|
Return the elements, either from x or y, depending on the condition. |
|
Return an array of zeros with the same shape, type and storage type as the input array. |
Classes
|
The base class of an NDArray stored in a sparse storage format. |
|
A sparse representation of 2D NDArray in the Compressed Sparse Row format. |
|
A sparse representation of a set of NDArray row slices at given indices. |
-
mxnet.ndarray.sparse.csr_matrix(arg1, shape=None, ctx=None, dtype=None)
Creates a CSRNDArray, a 2D array with compressed sparse row (CSR) format.
The CSRNDArray can be instantiated in several ways:
- csr_matrix(D)
  to construct a CSRNDArray with a dense 2D array D
  - D (array_like) - An object exposing the array interface, an object whose __array__ method returns an array, or any (nested) sequence.
  - ctx (Context, optional) - Device context (default is the current default context).
  - dtype (str or numpy.dtype, optional) - The data type of the output array. The default dtype is D.dtype if D is an NDArray or numpy.ndarray, float32 otherwise.
- csr_matrix(S)
  to construct a CSRNDArray with a sparse 2D array S
  - S (CSRNDArray or scipy.sparse.csr.csr_matrix) - A sparse matrix.
  - ctx (Context, optional) - Device context (default is the current default context).
  - dtype (str or numpy.dtype, optional) - The data type of the output array. The default dtype is S.dtype.
- csr_matrix((M, N))
  to construct an empty CSRNDArray with shape (M, N)
  - M (int) - Number of rows in the matrix
  - N (int) - Number of columns in the matrix
  - ctx (Context, optional) - Device context (default is the current default context).
  - dtype (str or numpy.dtype, optional) - The data type of the output array. The default dtype is float32.
- csr_matrix((data, indices, indptr))
  to construct a CSRNDArray based on the definition of compressed sparse row format using three separate arrays, where the column indices for row i are stored in indices[indptr[i]:indptr[i+1]] and their corresponding values are stored in data[indptr[i]:indptr[i+1]]. The column indices for a given row are expected to be sorted in ascending order. Duplicate column entries for the same row are not allowed.
  - data (array_like) - An object exposing the array interface, which holds all the non-zero entries of the matrix in row-major order.
  - indices (array_like) - An object exposing the array interface, which stores the column index for each non-zero element in data.
  - indptr (array_like) - An object exposing the array interface, which stores the offset into data of the first non-zero element of each row of the matrix.
  - shape (tuple of int, optional) - The shape of the array. The default shape is inferred from the indices and indptr arrays.
  - ctx (Context, optional) - Device context (default is the current default context).
  - dtype (str or numpy.dtype, optional) - The data type of the output array. The default dtype is data.dtype if data is an NDArray or numpy.ndarray, float32 otherwise.
- csr_matrix((data, (row, col)))
  to construct a CSRNDArray based on the COOrdinate format using three separate arrays, where row[i] is the row index of the element, col[i] is the column index of the element and data[i] is the data corresponding to the element. All the missing elements in the input are taken to be zeroes.
  - data (array_like) - An object exposing the array interface, which holds all the non-zero entries of the matrix in COO format.
  - row (array_like) - An object exposing the array interface, which stores the row index for each non-zero element in data.
  - col (array_like) - An object exposing the array interface, which stores the column index for each non-zero element in data.
  - shape (tuple of int, optional) - The shape of the array. The default shape is inferred from the row and col arrays.
  - ctx (Context, optional) - Device context (default is the current default context).
  - dtype (str or numpy.dtype, optional) - The data type of the output array. The default dtype is float32.
- Parameters
arg1 (tuple of int, tuple of array_like, array_like, CSRNDArray, scipy.sparse.csr_matrix or scipy.sparse.coo_matrix) – The argument to help instantiate the csr matrix. See above for further details.
shape (tuple of int, optional) – The shape of the csr matrix.
ctx (Context, optional) – Device context (default is the current default context).
dtype (str or numpy.dtype, optional) – The data type of the output array.
- Returns
A CSRNDArray with the csr storage representation.
- Return type
Example
>>> a = mx.nd.sparse.csr_matrix(([1, 2, 3], [1, 0, 2], [0, 1, 2, 2, 3]), shape=(4, 3))
>>> a.asnumpy()
array([[ 0., 1., 0.],
       [ 2., 0., 0.],
       [ 0., 0., 0.],
       [ 0., 0., 3.]], dtype=float32)
See also
CSRNDArray()
MXNet NDArray in compressed sparse row format.
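The (data, indices, indptr) and (data, (row, col)) layouts described above can be traced by hand. The following is a minimal pure-Python sketch that does not call MXNet; csr_to_dense and coo_to_csr are hypothetical helper names introduced only for illustration, and the COO helper assumes there are no duplicate (row, col) pairs.

```python
def csr_to_dense(data, indices, indptr, shape):
    """Expand CSR arrays into a dense list of rows (illustration only)."""
    num_rows, num_cols = shape
    dense = [[0.0] * num_cols for _ in range(num_rows)]
    for i in range(num_rows):
        # Entries of row i live in the half-open slice indptr[i]:indptr[i+1].
        for k in range(indptr[i], indptr[i + 1]):
            dense[i][indices[k]] = float(data[k])
    return dense


def coo_to_csr(data, row, col, num_rows):
    """Convert COO triples to CSR arrays (assumes no duplicate coordinates)."""
    # Sort entries by row, then by column, so column indices ascend per row.
    order = sorted(range(len(data)), key=lambda k: (row[k], col[k]))
    csr_data = [data[k] for k in order]
    csr_indices = [col[k] for k in order]
    # Count the entries per row, then accumulate the counts into offsets.
    indptr = [0] * (num_rows + 1)
    for r in row:
        indptr[r + 1] += 1
    for i in range(num_rows):
        indptr[i + 1] += indptr[i]
    return csr_data, csr_indices, indptr


# Same arrays as the csr_matrix example above.
dense = csr_to_dense([1, 2, 3], [1, 0, 2], [0, 1, 2, 2, 3], (4, 3))
# The same matrix given as unordered COO triples.
csr = coo_to_csr([3, 1, 2], [3, 0, 1], [2, 1, 0], num_rows=4)
```

Here dense reproduces the 4x3 matrix from the example, and csr holds the same three arrays that were passed to csr_matrix there.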
-
mxnet.ndarray.sparse.row_sparse_array(arg1, shape=None, ctx=None, dtype=None)
Creates a RowSparseNDArray, a multidimensional row sparse array with a set of tensor slices at given indices.
The RowSparseNDArray can be instantiated in several ways:
- row_sparse_array(D)
  to construct a RowSparseNDArray with a dense ndarray D
  - D (array_like) - An object exposing the array interface, an object whose __array__ method returns an array, or any (nested) sequence.
  - ctx (Context, optional) - Device context (default is the current default context).
  - dtype (str or numpy.dtype, optional) - The data type of the output array. The default dtype is D.dtype if D is an NDArray or numpy.ndarray, float32 otherwise.
- row_sparse_array(S)
  to construct a RowSparseNDArray with a sparse ndarray S
  - S (RowSparseNDArray) - A sparse ndarray.
  - ctx (Context, optional) - Device context (default is the current default context).
  - dtype (str or numpy.dtype, optional) - The data type of the output array. The default dtype is S.dtype.
- row_sparse_array((D0, D1 .. Dn))
  to construct an empty RowSparseNDArray with shape (D0, D1, ... Dn)
  - D0, D1 .. Dn (int) - The shape of the ndarray
  - ctx (Context, optional) - Device context (default is the current default context).
  - dtype (str or numpy.dtype, optional) - The data type of the output array. The default dtype is float32.
- row_sparse_array((data, indices))
  to construct a RowSparseNDArray based on the definition of row sparse format using two separate arrays, where indices stores the indices of the row slices with non-zeros, while the values are stored in data. The corresponding NDArray dense represented by RowSparseNDArray rsp has
  dense[rsp.indices[i], :, :, :, ...] = rsp.data[i, :, :, :, ...]
  The row indices are expected to be sorted in ascending order.
  - data (array_like) - An object exposing the array interface, which holds all the non-zero row slices of the array.
  - indices (array_like) - An object exposing the array interface, which stores the row index for each row slice with non-zero elements.
  - shape (tuple of int, optional) - The shape of the array. The default shape is inferred from the data and indices arrays.
  - ctx (Context, optional) - Device context (default is the current default context).
  - dtype (str or numpy.dtype, optional) - The data type of the output array. The default dtype is float32.
- Parameters
arg1 (NDArray, numpy.ndarray, RowSparseNDArray, tuple of int or tuple of array_like) – The argument to help instantiate the row sparse ndarray. See above for further details.
shape (tuple of int, optional) – The shape of the row sparse ndarray. (Default value = None)
ctx (Context, optional) – Device context (default is the current default context).
dtype (str or numpy.dtype, optional) – The data type of the output array. (Default value = None)
- Returns
A RowSparseNDArray with the row_sparse storage representation.
- Return type
Examples
>>> a = mx.nd.sparse.row_sparse_array(([[1, 2], [3, 4]], [1, 4]), shape=(6, 2))
>>> a.asnumpy()
array([[ 0., 0.],
       [ 1., 2.],
       [ 0., 0.],
       [ 0., 0.],
       [ 3., 4.],
       [ 0., 0.]], dtype=float32)
See also
RowSparseNDArray()
MXNet NDArray in row sparse format.
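The relation dense[rsp.indices[i], ...] = rsp.data[i, ...] can be sketched in pure Python for the 2D case. This is an illustration that does not call MXNet; row_sparse_to_dense is a hypothetical helper name, and only two-dimensional data is handled.

```python
def row_sparse_to_dense(data, indices, shape):
    """Scatter stored row slices into a dense 2D list (illustration only)."""
    num_rows, num_cols = shape
    dense = [[0.0] * num_cols for _ in range(num_rows)]
    for i, row_index in enumerate(indices):
        # data[i] is the row slice that belongs at position indices[i].
        dense[row_index] = [float(v) for v in data[i]]
    return dense


# Same arguments as the row_sparse_array example above.
dense = row_sparse_to_dense([[1, 2], [3, 4]], [1, 4], (6, 2))
```

Rows 1 and 4 of dense carry the stored slices, and every other row stays zero, which matches the array printed in the example.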
-
class mxnet.ndarray.sparse.BaseSparseNDArray(handle, writable=True)
Bases: mxnet.ndarray.ndarray.NDArray
The base class of an NDArray stored in a sparse storage format.
See CSRNDArray and RowSparseNDArray for more details.
Methods
asnumpy() | Return a dense numpy.ndarray object with value copied from this array. |
astype(dtype[, copy]) | Return a copy of the array after casting to a specified type. |
check_format([full_check]) | Check whether the NDArray format is valid. |
copyto(other) | Copies the value of this array to another array. |
reshape(*shape, **kwargs) | Returns a view of this array with a new shape without altering any data. |
Attributes
Number of elements in the array.
-
astype(dtype, copy=True)
Return a copy of the array after casting to a specified type.
- Parameters
dtype (numpy.dtype or str) – The type of the returned array.
copy (bool) – Default True. By default, astype always returns a newly allocated ndarray on the same context. If this is set to False, and the dtype requested is the same as the ndarray’s dtype, the ndarray is returned instead of a copy.
Examples
>>> x = mx.nd.sparse.zeros('row_sparse', (2,3), dtype='float32')
>>> y = x.astype('int32')
>>> y.dtype
<type 'numpy.int32'>
-
check_format(full_check=True)
Check whether the NDArray format is valid.
- Parameters
full_check (bool, optional) – If True, rigorous check, O(N) operations. Otherwise basic check, O(1) operations (default True).
-
copyto(other)
Copies the value of this array to another array.
- Parameters
other (NDArray or CSRNDArray or RowSparseNDArray or Context) – The destination array or context.
- Returns
The copied array.
- Return type
-
reshape(*shape, **kwargs)
Returns a view of this array with a new shape without altering any data.
- Parameters
shape (tuple of int, or n ints) –
The new shape should not change the array size, namely np.prod(new_shape) should be equal to np.prod(self.shape). Some dimensions of the shape can take special values from the set {0, -1, -2, -3, -4}. The significance of each is explained below:
- 0 copy this dimension from the input to the output shape.
  Example:
  - input shape = (2,3,4), shape = (4,0,2), output shape = (4,3,2)
  - input shape = (2,3,4), shape = (2,0,0), output shape = (2,3,4)
- -1 infers the dimension of the output shape by using the remainder of the input dimensions keeping the size of the new array same as that of the input array. At most one dimension of shape can be -1.
  Example:
  - input shape = (2,3,4), shape = (6,1,-1), output shape = (6,1,4)
  - input shape = (2,3,4), shape = (3,-1,8), output shape = (3,1,8)
  - input shape = (2,3,4), shape = (-1,), output shape = (24,)
- -2 copy all/remainder of the input dimensions to the output shape.
  Example:
  - input shape = (2,3,4), shape = (-2,), output shape = (2,3,4)
  - input shape = (2,3,4), shape = (2,-2), output shape = (2,3,4)
  - input shape = (2,3,4), shape = (-2,1,1), output shape = (2,3,4,1,1)
- -3 use the product of two consecutive dimensions of the input shape as the output dimension.
  Example:
  - input shape = (2,3,4), shape = (-3,4), output shape = (6,4)
  - input shape = (2,3,4,5), shape = (-3,-3), output shape = (6,20)
  - input shape = (2,3,4), shape = (0,-3), output shape = (2,12)
  - input shape = (2,3,4), shape = (-3,-2), output shape = (6,4)
- -4 split one dimension of the input into two dimensions passed subsequent to -4 in shape (can contain -1).
  Example:
  - input shape = (2,3,4), shape = (-4,1,2,-2), output shape = (1,2,3,4)
  - input shape = (2,3,4), shape = (2,-4,-1,3,-2), output shape = (2,1,3,4)
If the argument reverse is set to 1, then the special values are inferred from right to left.
Example:
- without reverse=1, for input shape = (10,5,4), shape = (-1,0), output shape would be (40,5).
- with reverse=1, output shape will be (50,4).
reverse (bool, default False) – If true then the special values are inferred from right to left. Only supported as keyword argument.
- Returns
An array with desired shape that shares data with this array.
- Return type
Examples
>>> x = mx.nd.arange(0,6).reshape(2,3)
>>> x.asnumpy()
array([[ 0., 1., 2.],
       [ 3., 4., 5.]], dtype=float32)
>>> y = x.reshape(3,2)
>>> y.asnumpy()
array([[ 0., 1.],
       [ 2., 3.],
       [ 4., 5.]], dtype=float32)
>>> y = x.reshape(3,-1)
>>> y.asnumpy()
array([[ 0., 1.],
       [ 2., 3.],
       [ 4., 5.]], dtype=float32)
>>> y = x.reshape(-3)
>>> y.asnumpy()
array([ 0., 1., 2., 3., 4., 5.], dtype=float32)
>>> y[:] = -1
>>> x.asnumpy()
array([[-1., -1., -1.],
       [-1., -1., -1.]], dtype=float32)
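The special values 0, -1, -2, -3 and -4 can be followed step by step with a small shape calculator. The sketch below is pure Python and does not call MXNet; infer_reshape is a hypothetical helper written from the rules listed above, and modelling reverse by reversing both lists is an assumption that has only been checked against the single reverse example given there.

```python
def _prod(dims):
    result = 1
    for d in dims:
        result *= d
    return result


def infer_reshape(in_shape, spec, reverse=False):
    """Return the output shape for `spec` applied to `in_shape` (illustration only)."""
    in_shape, spec = list(in_shape), list(spec)
    if reverse:
        # Assumption: right-to-left inference is modelled by reversing both lists.
        in_shape.reverse()
        spec.reverse()
    out = []
    src = 0          # next input dimension to consume
    infer_at = None  # position in `out` of the single -1, if any
    i = 0
    while i < len(spec):
        s = spec[i]
        if s > 0:        # explicit size
            out.append(s)
            src += 1
        elif s == 0:     # copy this input dimension
            out.append(in_shape[src])
            src += 1
        elif s == -1:    # infer later from the total size
            if infer_at is not None:
                raise ValueError("at most one dimension can be -1")
            infer_at = len(out)
            out.append(1)
            src += 1
        elif s == -2:    # copy all remaining input dimensions
            out.extend(in_shape[src:])
            src = len(in_shape)
        elif s == -3:    # merge two consecutive input dimensions
            out.append(in_shape[src] * in_shape[src + 1])
            src += 2
        elif s == -4:    # split one input dimension into the next two values
            d0 = in_shape[src]
            d1, d2 = spec[i + 1], spec[i + 2]
            if d1 == -1 and d2 == -1:
                raise ValueError("only one split size can be -1")
            if d1 == -1:
                d1 = d0 // d2
            elif d2 == -1:
                d2 = d0 // d1
            if d1 * d2 != d0:
                raise ValueError("split sizes do not match the input dimension")
            out.extend([d1, d2])
            src += 1
            i += 2       # skip the two split sizes
        else:
            raise ValueError("unsupported special value: %d" % s)
        i += 1
    if infer_at is not None:
        out[infer_at] = _prod(in_shape) // _prod(out)
    if _prod(out) != _prod(in_shape):
        raise ValueError("new shape changes the array size")
    if reverse:
        out.reverse()
    return tuple(out)


merged = infer_reshape((2, 3, 4), (0, -3))           # copy, then merge two dims
split = infer_reshape((2, 3, 4), (2, -4, -1, 3, -2))  # split the middle dim
```

Calling the helper with the shapes from the parameter description is a quick way to check one's reading of a special value before applying reshape to a real array.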
-
property size
Number of elements in the array.
Equivalent to the product of the array’s dimensions.
Examples
>>> import numpy as np
>>> x = mx.nd.zeros((3, 5, 2))
>>> x.size
30
>>> np.prod(x.shape)
30
-
-
class mxnet.ndarray.sparse.CSRNDArray(handle, writable=True)
Bases: mxnet.ndarray.sparse.BaseSparseNDArray
A sparse representation of 2D NDArray in the Compressed Sparse Row format.
A CSRNDArray represents an NDArray as three separate arrays: data, indptr and indices. It uses the CSR representation where the column indices for row i are stored in indices[indptr[i]:indptr[i+1]] and their corresponding values are stored in data[indptr[i]:indptr[i+1]].
The column indices for a given row are expected to be sorted in ascending order. Duplicate column entries for the same row are not allowed.
Example
>>> a = mx.nd.array([[0, 1, 0], [2, 0, 0], [0, 0, 0], [0, 0, 3]])
>>> a = a.tostype('csr')
>>> a.data.asnumpy()
array([ 1., 2., 3.], dtype=float32)
>>> a.indices.asnumpy()
array([1, 0, 2])
>>> a.indptr.asnumpy()
array([0, 1, 2, 2, 3])
Methods
asscipy() | Returns a scipy.sparse.csr.csr_matrix object with value copied from this array. |
copyto(other) | Copies the value of this array to another array. |
tostype(stype) | Return a copy of the array with chosen storage type. |
Attributes
A deep copy NDArray of the data array of the CSRNDArray.
A deep copy NDArray of the indices array of the CSRNDArray.
A deep copy NDArray of the indptr array of the CSRNDArray.
See also
csr_matrix
Several ways to construct a CSRNDArray
-
asscipy()
Returns a scipy.sparse.csr.csr_matrix object with value copied from this array.
Examples
>>> x = mx.nd.sparse.zeros('csr', (2,3))
>>> y = x.asscipy()
>>> type(y)
<type 'scipy.sparse.csr.csr_matrix'>
>>> y
<2x3 sparse matrix of type '<type 'numpy.float32'>'
with 0 stored elements in Compressed Sparse Row format>
-
copyto(other)
Copies the value of this array to another array.
If other is an NDArray or CSRNDArray object, then other.shape and self.shape should be the same. This function copies the value from self to other.
If other is a context, a new CSRNDArray will be first created on the target context, and the value of self is copied.
- Parameters
other (NDArray or CSRNDArray or Context) – The destination array or context.
- Returns
The copied array. If other is an NDArray or CSRNDArray, then the return value and other will point to the same NDArray or CSRNDArray.
- Return type
-
property data
A deep copy NDArray of the data array of the CSRNDArray. This generates a deep copy of the data of the current csr matrix.
- Returns
This CSRNDArray’s data array.
- Return type
-
property indices
A deep copy NDArray of the indices array of the CSRNDArray. This generates a deep copy of the column indices of the current csr matrix.
- Returns
This CSRNDArray’s indices array.
- Return type
-
property indptr
A deep copy NDArray of the indptr array of the CSRNDArray. This generates a deep copy of the indptr of the current csr matrix.
- Returns
This CSRNDArray’s indptr array.
- Return type
-
class mxnet.ndarray.sparse.RowSparseNDArray(handle, writable=True)
Bases: mxnet.ndarray.sparse.BaseSparseNDArray
A sparse representation of a set of NDArray row slices at given indices.
A RowSparseNDArray represents a multidimensional NDArray using two separate arrays: data and indices. The number of dimensions has to be at least 2.
data: an NDArray of any dtype with shape [D0, D1, …, Dn].
indices: a 1-D int64 NDArray with shape [D0] with values sorted in ascending order.
Methods
copyto(other) | Copies the value of this array to another array. |
retain(*args, **kwargs) | Convenience fluent method for retain(). |
tostype(stype) | Return a copy of the array with chosen storage type. |
Attributes
A deep copy NDArray of the data array of the RowSparseNDArray.
A deep copy NDArray of the indices array of the RowSparseNDArray.
The indices array stores the indices of the row slices with non-zeros, while the values are stored in data. The corresponding NDArray dense represented by RowSparseNDArray rsp has
dense[rsp.indices[i], :, :, :, ...] = rsp.data[i, :, :, :, ...]
>>> dense.asnumpy()
array([[ 1., 2., 3.],
       [ 0., 0., 0.],
       [ 4., 0., 5.],
       [ 0., 0., 0.],
       [ 0., 0., 0.]], dtype=float32)
>>> rsp = dense.tostype('row_sparse')
>>> rsp.indices.asnumpy()
array([0, 2], dtype=int64)
>>> rsp.data.asnumpy()
array([[ 1., 2., 3.],
       [ 4., 0., 5.]], dtype=float32)
A RowSparseNDArray is typically used to represent non-zero row slices of a large NDArray of shape [LARGE0, D1, .. , Dn] where LARGE0 >> D0 and most row slices are zeros.
RowSparseNDArray is used principally in the definition of gradients for operations that have sparse gradients (e.g. sparse dot and sparse embedding).
See also
row_sparse_array
Several ways to construct a RowSparseNDArray
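Going the other way, from a dense array to (data, indices), amounts to keeping only the rows that contain a non-zero entry. The sketch below is pure Python and does not call MXNet; dense_to_row_sparse is a hypothetical helper name, and dropping all-zero rows is an assumption read off the tostype('row_sparse') example in the class description.

```python
def dense_to_row_sparse(dense):
    """Collect the non-zero rows of a 2D list and their row indices (illustration only)."""
    data, indices = [], []
    for i, row in enumerate(dense):
        # A row is stored only if at least one of its entries is non-zero.
        if any(v != 0 for v in row):
            indices.append(i)
            data.append(list(row))
    return data, indices


dense = [[1, 2, 3], [0, 0, 0], [4, 0, 5], [0, 0, 0], [0, 0, 0]]
data, indices = dense_to_row_sparse(dense)
```

For this input, indices lists rows 0 and 2 in ascending order and data holds the two corresponding row slices, including the zero inside row 2.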
-
copyto(other)
Copies the value of this array to another array.
If other is an NDArray or RowSparseNDArray object, then other.shape and self.shape should be the same. This function copies the value from self to other.
If other is a context, a new RowSparseNDArray will be first created on the target context, and the value of self is copied.
- Parameters
other (NDArray or RowSparseNDArray or Context) – The destination array or context.
- Returns
The copied array. If other is an NDArray or RowSparseNDArray, then the return value and other will point to the same NDArray or RowSparseNDArray.
- Return type
-
property data
A deep copy NDArray of the data array of the RowSparseNDArray. This generates a deep copy of the data of the current row_sparse matrix.
- Returns
This RowSparseNDArray’s data array.
- Return type
-
property indices
A deep copy NDArray of the indices array of the RowSparseNDArray. This generates a deep copy of the row indices of the current row_sparse matrix.
- Returns
This RowSparseNDArray’s indices array.
- Return type
-
retain(*args, **kwargs)
Convenience fluent method for retain().
The arguments are the same as for retain(), with this array as data.
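As summarized in the function list, retain picks the rows named by an index array out of a row sparse matrix. A rough pure-Python model of that selection is sketched below; it does not call MXNet, retain_rows is a hypothetical helper name, and silently skipping requested indices that are not stored is an assumption rather than documented behaviour.

```python
def retain_rows(data, indices, keep):
    """Keep only the stored row slices whose row index appears in `keep` (illustration only)."""
    keep_set = set(keep)
    kept_data, kept_indices = [], []
    for row_index, row in zip(indices, data):
        if row_index in keep_set:
            kept_indices.append(row_index)
            kept_data.append(list(row))
    return kept_data, kept_indices


# A row sparse array with stored rows 0, 1 and 3; keep rows 0 and 3.
kept_data, kept_indices = retain_rows([[1, 2], [3, 4], [5, 6]], [0, 1, 3], [0, 3])
```

The output keeps the ascending order of the stored indices because it walks the stored rows rather than the requested list.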
-
mxnet.ndarray.sparse.add(lhs, rhs)
Returns element-wise sum of the input arrays with broadcasting.
Equivalent to lhs + rhs, mx.nd.broadcast_add(lhs, rhs) and mx.nd.broadcast_plus(lhs, rhs) when shapes of lhs and rhs do not match. If lhs.shape == rhs.shape, this is equivalent to mx.nd.elemwise_add(lhs, rhs)
Note
If the corresponding dimensions of two arrays have the same size or one of them has size 1, then the arrays are broadcastable to a common shape.
- Parameters
lhs (scalar or mxnet.ndarray.sparse.array) – First array to be added.
rhs (scalar or mxnet.ndarray.sparse.array) – Second array to be added. If lhs.shape != rhs.shape, they must be broadcastable to a common shape.
- Returns
The element-wise sum of the input arrays.
- Return type
Examples
>>> a = mx.nd.ones((2,3)).tostype('csr')
>>> b = mx.nd.ones((2,3)).tostype('csr')
>>> a.asnumpy()
array([[ 1., 1., 1.],
       [ 1., 1., 1.]], dtype=float32)
>>> b.asnumpy()
array([[ 1., 1., 1.],
       [ 1., 1., 1.]], dtype=float32)
>>> (a+b).asnumpy()
array([[ 2., 2., 2.],
       [ 2., 2., 2.]], dtype=float32)
>>> c = mx.nd.ones((2,3)).tostype('row_sparse')
>>> d = mx.nd.ones((2,3)).tostype('row_sparse')
>>> c.asnumpy()
array([[ 1., 1., 1.],
       [ 1., 1., 1.]], dtype=float32)
>>> d.asnumpy()
array([[ 1., 1., 1.],
       [ 1., 1., 1.]], dtype=float32)
>>> (c+d).asnumpy()
array([[ 2., 2., 2.],
       [ 2., 2., 2.]], dtype=float32)
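The broadcasting rule in the note (matching sizes, or one of them equal to 1) can be written as a small shape check. The sketch below is pure Python and does not call MXNet; broadcast_shape is a hypothetical helper name, and aligning shapes of different rank from the trailing dimension is a NumPy-style assumption, not something the note states.

```python
def broadcast_shape(lhs_shape, rhs_shape):
    """Return the common shape of two broadcastable shapes (illustration only)."""
    ndim = max(len(lhs_shape), len(rhs_shape))
    # Assumption: shorter shapes are padded with leading 1s (NumPy-style alignment).
    lhs = (1,) * (ndim - len(lhs_shape)) + tuple(lhs_shape)
    rhs = (1,) * (ndim - len(rhs_shape)) + tuple(rhs_shape)
    out = []
    for a, b in zip(lhs, rhs):
        if a == b or b == 1:
            out.append(a)
        elif a == 1:
            out.append(b)
        else:
            raise ValueError("shapes %r and %r are not broadcastable"
                             % (tuple(lhs_shape), tuple(rhs_shape)))
    return tuple(out)


same = broadcast_shape((2, 3), (2, 3))    # identical shapes
column = broadcast_shape((2, 3), (2, 1))  # size-1 dimension is stretched
```

A pair such as (2, 3) and (4, 3) raises ValueError here, since the first dimensions differ and neither is 1.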
-
mxnet.ndarray.sparse.subtract(lhs, rhs)
Returns element-wise difference of the input arrays with broadcasting.
Equivalent to lhs - rhs, mx.nd.broadcast_sub(lhs, rhs) and mx.nd.broadcast_minus(lhs, rhs) when shapes of lhs and rhs do not match. If lhs.shape == rhs.shape, this is equivalent to mx.nd.elemwise_sub(lhs, rhs)
Note
If the corresponding dimensions of two arrays have the same size or one of them has size 1, then the arrays are broadcastable to a common shape.
- Parameters
lhs (scalar or mxnet.ndarray.sparse.array) – First array to be subtracted.
rhs (scalar or mxnet.ndarray.sparse.array) – Second array to be subtracted. If lhs.shape != rhs.shape, they must be broadcastable to a common shape.
- Returns
The element-wise difference of the input arrays.
- Return type
Examples
>>> a = mx.nd.ones((2,3)).tostype('csr')
>>> b = mx.nd.ones((2,3)).tostype('csr')
>>> a.asnumpy()
array([[ 1., 1., 1.],
       [ 1., 1., 1.]], dtype=float32)
>>> b.asnumpy()
array([[ 1., 1., 1.],
       [ 1., 1., 1.]], dtype=float32)
>>> (a-b).asnumpy()
array([[ 0., 0., 0.],
       [ 0., 0., 0.]], dtype=float32)
>>> c = mx.nd.ones((2,3)).tostype('row_sparse')
>>> d = mx.nd.ones((2,3)).tostype('row_sparse')
>>> c.asnumpy()
array([[ 1., 1., 1.],
       [ 1., 1., 1.]], dtype=float32)
>>> d.asnumpy()
array([[ 1., 1., 1.],
       [ 1., 1., 1.]], dtype=float32)
>>> (c-d).asnumpy()
array([[ 0., 0., 0.],
       [ 0., 0., 0.]], dtype=float32)
-
mxnet.ndarray.sparse.multiply(lhs, rhs)
Returns element-wise product of the input arrays with broadcasting.
Equivalent to lhs * rhs and mx.nd.broadcast_mul(lhs, rhs) when shapes of lhs and rhs do not match. If lhs.shape == rhs.shape, this is equivalent to mx.nd.elemwise_mul(lhs, rhs)
Note
If the corresponding dimensions of two arrays have the same size or one of them has size 1, then the arrays are broadcastable to a common shape.
- Parameters
lhs (scalar or mxnet.ndarray.sparse.array) – First array to be multiplied.
rhs (scalar or mxnet.ndarray.sparse.array) – Second array to be multiplied. If lhs.shape != rhs.shape, they must be broadcastable to a common shape.
- Returns
The element-wise multiplication of the input arrays.
- Return type
Examples
>>> x = mx.nd.ones((2,3)).tostype('csr')
>>> y = mx.nd.arange(2).reshape((2,1))
>>> z = mx.nd.arange(3)
>>> x.asnumpy()
array([[ 1., 1., 1.],
       [ 1., 1., 1.]], dtype=float32)
>>> y.asnumpy()
array([[ 0.],
       [ 1.]], dtype=float32)
>>> z.asnumpy()
array([ 0., 1., 2.], dtype=float32)
>>> (x*2).asnumpy()
array([[ 2., 2., 2.],
       [ 2., 2., 2.]], dtype=float32)
>>> (x*y).asnumpy()
array([[ 0., 0., 0.],
       [ 1., 1., 1.]], dtype=float32)
>>> mx.nd.sparse.multiply(x, y).asnumpy()
array([[ 0., 0., 0.],
       [ 1., 1., 1.]], dtype=float32)
>>> (x*z).asnumpy()
array([[ 0., 1., 2.],
       [ 0., 1., 2.]], dtype=float32)
>>> mx.nd.sparse.multiply(x, z).asnumpy()
array([[ 0., 1., 2.],
       [ 0., 1., 2.]], dtype=float32)
>>> z = z.reshape((1, 3))
>>> z.asnumpy()
array([[ 0., 1., 2.]], dtype=float32)
>>> (x*z).asnumpy()
array([[ 0., 1., 2.],
       [ 0., 1., 2.]], dtype=float32)
>>> mx.nd.sparse.multiply(x, z).asnumpy()
array([[ 0., 1., 2.],
       [ 0., 1., 2.]], dtype=float32)
-
mxnet.ndarray.sparse.divide(lhs, rhs)
Returns element-wise division of the input arrays with broadcasting.
Equivalent to lhs / rhs and mx.nd.broadcast_div(lhs, rhs) when shapes of lhs and rhs do not match. If lhs.shape == rhs.shape, this is equivalent to mx.nd.elemwise_div(lhs, rhs)
Note
If the corresponding dimensions of two arrays have the same size or one of them has size 1, then the arrays are broadcastable to a common shape.
- Parameters
lhs (scalar or mxnet.ndarray.sparse.array) – First array in division.
rhs (scalar or mxnet.ndarray.sparse.array) – Second array in division. If lhs.shape != rhs.shape, they must be broadcastable to a common shape.
- Returns
The element-wise division of the input arrays.
- Return type
Examples
>>> x = (mx.nd.ones((2,3))*6).tostype('csr')
>>> y = mx.nd.arange(2).reshape((2,1)) + 1
>>> z = mx.nd.arange(3) + 1
>>> x.asnumpy()
array([[ 6., 6., 6.],
       [ 6., 6., 6.]], dtype=float32)
>>> y.asnumpy()
array([[ 1.],
       [ 2.]], dtype=float32)
>>> z.asnumpy()
array([ 1., 2., 3.], dtype=float32)
>>> x/2
<NDArray 2x3 @cpu(0)>
>>> (x/3).asnumpy()
array([[ 2., 2., 2.],
       [ 2., 2., 2.]], dtype=float32)
>>> (x/y).asnumpy()
array([[ 6., 6., 6.],
       [ 3., 3., 3.]], dtype=float32)
>>> mx.nd.sparse.divide(x,y).asnumpy()
array([[ 6., 6., 6.],
       [ 3., 3., 3.]], dtype=float32)
>>> (x/z).asnumpy()
array([[ 6., 3., 2.],
       [ 6., 3., 2.]], dtype=float32)
>>> mx.nd.sparse.divide(x,z).asnumpy()
array([[ 6., 3., 2.],
       [ 6., 3., 2.]], dtype=float32)
>>> z = z.reshape((1,3))
>>> z.asnumpy()
array([[ 1., 2., 3.]], dtype=float32)
>>> (x/z).asnumpy()
array([[ 6., 3., 2.],
       [ 6., 3., 2.]], dtype=float32)
>>> mx.nd.sparse.divide(x,z).asnumpy()
array([[ 6., 3., 2.],
       [ 6., 3., 2.]], dtype=float32)
-
mxnet.ndarray.sparse.ElementWiseSum(*args, **kwargs)
Adds all input arguments element-wise.
\[add\_n(a_1, a_2, ..., a_n) = a_1 + a_2 + ... + a_n\]
add_n is potentially more efficient than calling add n times.
The storage type of the add_n output depends on the storage types of the inputs:
- add_n(row_sparse, row_sparse, ..) = row_sparse
- add_n(default, csr, default) = default
- add_n(any input combinations longer than 4 (>4) with at least one default type) = default
- otherwise, add_n falls all inputs back to default storage and generates default storage
Defined in src/operator/tensor/elemwise_sum.cc:L155
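Read together, the storage rules above say that the output is row_sparse only when every input is row_sparse, and default in every other listed case. The sketch below encodes that reading in pure Python without calling MXNet; add_n_output_stype and add_n_lists are hypothetical helper names, and the storage rule is an interpretation of the list above rather than a statement about the operator's implementation.

```python
def add_n_output_stype(input_stypes):
    """Output storage type of add_n as read from the rules above (illustration only)."""
    if input_stypes and all(s == "row_sparse" for s in input_stypes):
        return "row_sparse"
    # Every other combination listed above ends in default storage.
    return "default"


def add_n_lists(*arrays):
    """Element-wise sum a_1 + a_2 + ... + a_n of equally long flat lists."""
    return [sum(values) for values in zip(*arrays)]


all_sparse = add_n_output_stype(["row_sparse", "row_sparse", "row_sparse"])
mixed = add_n_output_stype(["default", "csr", "default"])
total = add_n_lists([1, 2], [10, 20], [100, 200])
```

The second helper only illustrates the formula for add_n on plain lists; it says nothing about how the operator gains its efficiency.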
-
mxnet.ndarray.sparse.Embedding(data=None, weight=None, input_dim=_Null, output_dim=_Null, dtype=_Null, sparse_grad=_Null, out=None, name=None, **kwargs)
Maps integer indices to vector representations (embeddings).
This operator maps words to real-valued vectors in a high-dimensional space, called word embeddings. These embeddings can capture semantic and syntactic properties of the words. For example, it has been noted that in the learned embedding spaces, similar words tend to be close to each other and dissimilar words far apart.
For an input array of shape (d1, …, dK), the shape of an output array is (d1, …, dK, output_dim). All the input values should be integers in the range [0, input_dim).
If the input_dim is ip0 and output_dim is op0, then shape of the embedding weight matrix must be (ip0, op0).
When “sparse_grad” is False, if any index mentioned is too large, it is replaced by the index that addresses the last vector in an embedding matrix. When “sparse_grad” is True, an error will be raised if invalid indices are found.
Examples:
input_dim = 4
output_dim = 5

// Each row in weight matrix y represents a word. So, y = (w0,w1,w2,w3)
y = [[ 0., 1., 2., 3., 4.],
     [ 5., 6., 7., 8., 9.],
     [ 10., 11., 12., 13., 14.],
     [ 15., 16., 17., 18., 19.]]

// Input array x represents n-grams(2-gram). So, x = [(w1,w3), (w0,w2)]
x = [[ 1., 3.],
     [ 0., 2.]]

// Mapped input x to its vector representation y.
Embedding(x, y, 4, 5) = [[[ 5., 6., 7., 8., 9.],
                          [ 15., 16., 17., 18., 19.]],

                         [[ 0., 1., 2., 3., 4.],
                          [ 10., 11., 12., 13., 14.]]]
The storage type of weight can be either row_sparse or default.
Note
If “sparse_grad” is set to True, the storage type of gradient w.r.t weights will be “row_sparse”. Only a subset of optimizers support sparse gradients, including SGD, AdaGrad and Adam. Note that by default lazy updates are turned on, which may perform differently from standard updates. For more details, please check the Optimization API at: https://mxnet.incubator.apache.org/api/python/optimization/optimization.html
Defined in src/operator/tensor/indexing_op.cc:L597
- Parameters
data (NDArray) – The input array to the embedding operator.
weight (NDArray) – The embedding weight matrix.
input_dim (int, required) – Vocabulary size of the input indices.
output_dim (int, required) – Dimension of the embedding vectors.
dtype ({'bfloat16', 'float16', 'float32', 'float64', 'int32', 'int64', 'int8', 'uint8'}, optional, default='float32') – Data type of weight.
sparse_grad (boolean, optional, default=0) – Compute row sparse gradient in the backward calculation. If set to True, the grad’s storage type is row_sparse.
out (NDArray, optional) – The output NDArray to hold the result.
- Returns
out – The output of this function.
- Return type
NDArray or list of NDArrays
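The lookup in the example above can be reproduced with a minimal pure-Python sketch. The helper name embedding_lookup is hypothetical and is not part of MXNet. It assumes every index is valid and does not model the out-of-range handling or the sparse_grad option.

```python
def embedding_lookup(indices, weight):
    """Replace every index in a (possibly nested) list with the matching weight row."""
    if isinstance(indices, list):
        return [embedding_lookup(i, weight) for i in indices]
    return list(weight[int(indices)])


# Weight matrix of shape (input_dim, output_dim) = (4, 5), as in the example above.
weight = [[float(5 * r + c) for c in range(5)] for r in range(4)]
x = [[1., 3.], [0., 2.]]          # input of shape (2, 2)

out = embedding_lookup(x, weight)  # output of shape (2, 2, 5)
for pair in out:
    print(pair)
```

The output shape is the input shape with output_dim appended, which matches the shape rule stated for this operator.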
-
mxnet.ndarray.sparse.
FullyConnected
(data=None, weight=None, bias=None, num_hidden=_Null, no_bias=_Null, flatten=_Null, out=None, name=None, **kwargs)¶ Applies a linear transformation: \(Y = XW^T + b\).
If flatten is set to be true, then the shapes are:
data: (batch_size, x1, x2, …, xn)
weight: (num_hidden, x1 * x2 * … * xn)
bias: (num_hidden,)
out: (batch_size, num_hidden)
If flatten is set to be false, then the shapes are:
data: (x1, x2, …, xn, input_dim)
weight: (num_hidden, input_dim)
bias: (num_hidden,)
out: (x1, x2, …, xn, num_hidden)
The learnable parameters include both weight and bias.
If no_bias is set to be true, then the bias term is ignored.
Note
The sparse support for FullyConnected is limited to forward evaluation with row_sparse weight and bias, where the length of weight.indices and bias.indices must be equal to num_hidden. This could be useful for model inference with row_sparse weights trained with importance sampling or noise contrastive estimation.
To compute linear transformation with ‘csr’ sparse data, sparse.dot is recommended instead of sparse.FullyConnected.
Defined in src/operator/nn/fully_connected.cc:L286
- Parameters
data (NDArray) – Input data.
weight (NDArray) – Weight matrix.
bias (NDArray) – Bias parameter.
num_hidden (int, required) – Number of hidden nodes of the output.
no_bias (boolean, optional, default=0) – Whether to disable bias parameter.
flatten (boolean, optional, default=1) – Whether to collapse all but the first axis of the input data tensor.
out (NDArray, optional) – The output NDArray to hold the result.
- Returns
out – The output of this function.
- Return type
NDArray or list of NDArrays
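As a rough illustration of the flatten=True shape rule, the sketch below computes Y = XW^T + b on nested Python lists. The helper names flatten_sample and fully_connected are hypothetical, and the code covers dense inputs only. It is not how MXNet implements the operator.

```python
def flatten_sample(sample):
    """Flatten an arbitrarily nested list of numbers into one flat list."""
    if not isinstance(sample, list):
        return [sample]
    flat = []
    for item in sample:
        flat.extend(flatten_sample(item))
    return flat


def fully_connected(data, weight, bias=None):
    """Y = X W^T + b where every sample is flattened first (flatten=True)."""
    out = []
    for sample in data:
        x = flatten_sample(sample)
        row = []
        for j, w_row in enumerate(weight):
            value = sum(xi * wi for xi, wi in zip(x, w_row))
            if bias is not None:
                value += bias[j]
            row.append(value)
        out.append(row)
    return out


data = [[[1., 2.], [3., 4.]], [[0., 1.], [0., 1.]]]               # (batch_size=2, 2, 2)
weight = [[1., 0., 0., 0.], [1., 1., 1., 1.], [0., 0., 0., 2.]]   # (num_hidden=3, 2 * 2)
bias = [0.5, 0., -1.]                                             # (num_hidden,)

y = fully_connected(data, weight, bias)                           # (batch_size, num_hidden)
print(y)
```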
-
mxnet.ndarray.sparse.
LinearRegressionOutput
(data=None, label=None, grad_scale=_Null, out=None, name=None, **kwargs)¶ Computes and optimizes for squared loss during backward propagation. Just outputs data during forward propagation.
If \(\hat{y}_i\) is the predicted value of the i-th sample, and \(y_i\) is the corresponding target value, then the squared loss estimated over \(n\) samples is defined as
\(\text{SquaredLoss}(\textbf{Y}, \hat{\textbf{Y}} ) = \frac{1}{n} \sum_{i=0}^{n-1} \lVert \textbf{y}_i - \hat{\textbf{y}}_i \rVert_2\)
Note
Use the LinearRegressionOutput as the final output layer of a net.
The storage type of label can be default or csr
LinearRegressionOutput(default, default) = default
LinearRegressionOutput(default, csr) = default
By default, gradients of this loss function are scaled by factor 1/m, where m is the number of regression outputs of a training example. The parameter grad_scale can be used to change this scale to grad_scale/m.
Defined in src/operator/regression_output.cc:L92
- Parameters
- Returns
out – The output of this function.
- Return type
NDArray or list of NDArrays
-
mxnet.ndarray.sparse.
LogisticRegressionOutput
(data=None, label=None, grad_scale=_Null, out=None, name=None, **kwargs)¶ Applies a logistic function to the input.
The logistic function, also known as the sigmoid function, is computed as \(\frac{1}{1+exp(-\textbf{x})}\).
Commonly, the sigmoid is used to squash the real-valued output of a linear model \(w^Tx+b\) into the [0,1] range so that it can be interpreted as a probability. It is suitable for binary classification or probability prediction tasks.
Note
Use the LogisticRegressionOutput as the final output layer of a net.
The storage type of label can be default or csr
LogisticRegressionOutput(default, default) = default
LogisticRegressionOutput(default, csr) = default
The loss function used is the Binary Cross Entropy Loss:
\(-{(y\log(p) + (1 - y)\log(1 - p))}\)
Where y is the ground truth probability of positive outcome for a given example, and p the probability predicted by the model. By default, gradients of this loss function are scaled by factor 1/m, where m is the number of regression outputs of a training example. The parameter grad_scale can be used to change this scale to grad_scale/m.
Defined in src/operator/regression_output.cc:L152
- Parameters
- Returns
out – The output of this function.
- Return type
NDArray or list of NDArrays
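The two formulas in this entry, the logistic function and the binary cross entropy loss, can be checked numerically with a few lines of standard-library Python. The function names are illustrative only. The sketch works on scalars and ignores grad_scale.

```python
import math


def sigmoid(x):
    """Logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def binary_cross_entropy(y, p):
    """-(y*log(p) + (1 - y)*log(1 - p)) for one example with 0 < p < 1."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


p = sigmoid(0.0)                       # a linear-model output of 0 maps to 0.5
loss = binary_cross_entropy(1.0, p)    # equals log(2) for p = 0.5
print(p, loss)
```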
-
mxnet.ndarray.sparse.
MAERegressionOutput
(data=None, label=None, grad_scale=_Null, out=None, name=None, **kwargs)¶ Computes mean absolute error of the input.
MAE is a risk metric corresponding to the expected value of the absolute error.
If \(\hat{y}_i\) is the predicted value of the i-th sample, and \(y_i\) is the corresponding target value, then the mean absolute error (MAE) estimated over \(n\) samples is defined as
\(\text{MAE}(\textbf{Y}, \hat{\textbf{Y}} ) = \frac{1}{n} \sum_{i=0}^{n-1} \lVert \textbf{y}_i - \hat{\textbf{y}}_i \rVert_1\)
Note
Use the MAERegressionOutput as the final output layer of a net.
The storage type of label can be default or csr
MAERegressionOutput(default, default) = default
MAERegressionOutput(default, csr) = default
By default, gradients of this loss function are scaled by factor 1/m, where m is the number of regression outputs of a training example. The parameter grad_scale can be used to change this scale to grad_scale/m.
Defined in src/operator/regression_output.cc:L120
- Parameters
- Returns
out – The output of this function.
- Return type
NDArray or list of NDArrays
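To make the MAE definition concrete, here is a small sketch that averages the L1 norm of the per-sample error over n samples. The function name mean_absolute_error is an assumption chosen for illustration. It only evaluates the formula and does not model the operator's forward or backward pass.

```python
def mean_absolute_error(targets, predictions):
    """(1/n) * sum_i ||y_i - y_hat_i||_1 over n samples given as lists of numbers."""
    n = len(targets)
    total = 0.0
    for y, y_hat in zip(targets, predictions):
        total += sum(abs(t - p) for t, p in zip(y, y_hat))
    return total / n


targets = [[1.0, 2.0], [3.0, 4.0]]
predictions = [[1.5, 2.0], [2.0, 6.0]]

# Per-sample L1 errors are 0.5 and 3.0, so the mean is 1.75.
mae = mean_absolute_error(targets, predictions)
print(mae)
```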
-
mxnet.ndarray.sparse.
abs
(data=None, out=None, name=None, **kwargs)¶ Returns element-wise absolute value of the input.
Example:
abs([-2, 0, 3]) = [2, 0, 3]
The storage type of abs output depends upon the input storage type:
abs(default) = default
abs(row_sparse) = row_sparse
abs(csr) = csr
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L720
-
mxnet.ndarray.sparse.
adagrad_update
(weight=None, grad=None, history=None, lr=_Null, epsilon=_Null, wd=_Null, rescale_grad=_Null, clip_gradient=_Null, out=None, name=None, **kwargs)¶ Update function for AdaGrad optimizer.
Referenced from Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, and available at http://www.jmlr.org/papers/volume12/duchi11a/duchi11a.pdf.
Updates are applied by:
rescaled_grad = clip(grad * rescale_grad, clip_gradient)
history = history + square(rescaled_grad)
w = w - learning_rate * rescaled_grad / sqrt(history + epsilon)
Note that non-zero values for the weight decay option are not supported.
Defined in src/operator/optimizer_op.cc:L908
- Parameters
weight (NDArray) – Weight
grad (NDArray) – Gradient
history (NDArray) – History
lr (float, required) – Learning rate
epsilon (float, optional, default=1.00000001e-07) – epsilon
wd (float, optional, default=0) – weight decay
rescale_grad (float, optional, default=1) – Rescale gradient to grad = rescale_grad*grad.
clip_gradient (float, optional, default=-1) – Clip gradient to the range of [-clip_gradient, clip_gradient]. If clip_gradient <= 0, gradient clipping is turned off. grad = max(min(grad, clip_gradient), -clip_gradient).
out (NDArray, optional) – The output NDArray to hold the result.
- Returns
out – The output of this function.
- Return type
NDArray or list of NDArrays
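The three update lines for AdaGrad translate almost directly into Python. The sketch below works on flat lists of floats. The function name adagrad_step and the in-place update style are choices made for illustration, and weight decay is left out because the entry states that non-zero values are not supported.

```python
import math


def adagrad_step(weight, grad, history, lr, epsilon=1e-7,
                 rescale_grad=1.0, clip_gradient=-1.0):
    """Apply one AdaGrad update in place to flat lists of floats."""
    for i in range(len(weight)):
        g = grad[i] * rescale_grad
        if clip_gradient > 0:                      # clipping is off when <= 0
            g = max(min(g, clip_gradient), -clip_gradient)
        history[i] += g * g
        weight[i] -= lr * g / math.sqrt(history[i] + epsilon)
    return weight, history


w, h = adagrad_step([1.0], [2.0], [0.0], lr=0.1)
print(w, h)

# With clipping, a gradient of 10 is treated as 1.
w_clip, h_clip = adagrad_step([1.0], [10.0], [0.0], lr=0.1, clip_gradient=1.0)
print(w_clip, h_clip)
```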
-
mxnet.ndarray.sparse.
adam_update
(weight=None, grad=None, mean=None, var=None, lr=_Null, beta1=_Null, beta2=_Null, epsilon=_Null, wd=_Null, rescale_grad=_Null, clip_gradient=_Null, lazy_update=_Null, out=None, name=None, **kwargs)¶ Update function for Adam optimizer. Adam is seen as a generalization of AdaGrad.
Adam update consists of the following steps, where g represents gradient and m, v are 1st and 2nd order moment estimates (mean and variance).
\[\begin{split}g_t = \nabla J(W_{t-1})\\ m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t\\ v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2\\ W_t = W_{t-1} - \alpha \frac{ m_t }{ \sqrt{ v_t } + \epsilon }\end{split}\]
It updates the weights using:
m = beta1*m + (1-beta1)*grad
v = beta2*v + (1-beta2)*(grad**2)
w += - learning_rate * m / (sqrt(v) + epsilon)
However, if grad’s storage type is row_sparse, lazy_update is True and the storage type of weight is the same as those of m and v, only the row slices whose indices appear in grad.indices are updated (for w, m and v):
for row in grad.indices:
    m[row] = beta1*m[row] + (1-beta1)*grad[row]
    v[row] = beta2*v[row] + (1-beta2)*(grad[row]**2)
    w[row] += - learning_rate * m[row] / (sqrt(v[row]) + epsilon)
Defined in src/operator/optimizer_op.cc:L687
- Parameters
weight (NDArray) – Weight
grad (NDArray) – Gradient
mean (NDArray) – Moving mean
var (NDArray) – Moving variance
lr (float, required) – Learning rate
beta1 (float, optional, default=0.899999976) – The decay rate for the 1st moment estimates.
beta2 (float, optional, default=0.999000013) – The decay rate for the 2nd moment estimates.
epsilon (float, optional, default=9.99999994e-09) – A small constant for numerical stability.
wd (float, optional, default=0) – Weight decay augments the objective function with a regularization term that penalizes large weights. The penalty scales with the square of the magnitude of each weight.
rescale_grad (float, optional, default=1) – Rescale gradient to grad = rescale_grad*grad.
clip_gradient (float, optional, default=-1) – Clip gradient to the range of [-clip_gradient, clip_gradient]. If clip_gradient <= 0, gradient clipping is turned off. grad = max(min(grad, clip_gradient), -clip_gradient).
lazy_update (boolean, optional, default=1) – If true, lazy updates are applied if gradient’s stype is row_sparse and all of w, m and v have the same stype.
out (NDArray, optional) – The output NDArray to hold the result.
- Returns
out – The output of this function.
- Return type
NDArray or list of NDArrays
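The lazy update described for row_sparse gradients touches only the rows that appear in the gradient. A minimal sketch follows, assuming the gradient is represented as a plain dict from row index to row values. That dict is a hypothetical stand-in for a RowSparseNDArray, and the function name adam_lazy_step is not part of MXNet. Weight decay, rescaling and clipping are omitted.

```python
import math


def adam_lazy_step(w, m, v, grad_rows, lr, beta1=0.9, beta2=0.999, epsilon=1e-8):
    """Update w, m and v in place, but only for rows present in grad_rows."""
    for row, g_row in grad_rows.items():
        for j, g in enumerate(g_row):
            m[row][j] = beta1 * m[row][j] + (1 - beta1) * g
            v[row][j] = beta2 * v[row][j] + (1 - beta2) * g * g
            w[row][j] -= lr * m[row][j] / (math.sqrt(v[row][j]) + epsilon)


w = [[1.0, 1.0] for _ in range(3)]
m = [[0.0, 0.0] for _ in range(3)]
v = [[0.0, 0.0] for _ in range(3)]

# Only row 1 has a gradient, so rows 0 and 2 keep their weights and moments.
adam_lazy_step(w, m, v, {1: [1.0, -1.0]}, lr=0.1)
print(w)
```

This is where lazy and standard updates can differ: a standard update would also decay m and v for rows 0 and 2, even though their gradient is zero.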
-
mxnet.ndarray.sparse.
add_n
(*args, **kwargs)¶ Adds all input arguments element-wise.
\[add\_n(a_1, a_2, ..., a_n) = a_1 + a_2 + ... + a_n\]
add_n is potentially more efficient than calling add by n times.
The storage type of add_n output depends on storage types of inputs:
add_n(row_sparse, row_sparse, ..) = row_sparse
add_n(default, csr, default) = default
add_n(any input combinations longer than 4 (>4) with at least one default type) = default
otherwise, add_n falls all inputs back to default storage and generates default storage
Defined in src/operator/tensor/elemwise_sum.cc:L155
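The rule add_n(row_sparse, row_sparse, ..) = row_sparse can be pictured with a toy model in which each row_sparse array is a dict from row index to that row's values. This representation and the name add_n_row_sparse are assumptions made for illustration and do not reflect MXNet's internal layout.

```python
def add_n_row_sparse(*arrays):
    """Sum row-sparse arrays given as {row_index: row_values} dicts."""
    out = {}
    for arr in arrays:
        for row, values in arr.items():
            if row in out:
                out[row] = [a + b for a, b in zip(out[row], values)]
            else:
                out[row] = list(values)
    return dict(sorted(out.items()))   # keep row indices in ascending order


a = {0: [1.0, 2.0]}
b = {0: [1.0, 1.0], 2: [3.0, 3.0]}
c = {2: [1.0, 0.0]}

total = add_n_row_sparse(a, b, c)
print(total)
```

Only rows that are stored in at least one input appear in the result, so the sum can stay in row-sparse form.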
-
mxnet.ndarray.sparse.
arccos
(data=None, out=None, name=None, **kwargs)¶ Returns element-wise inverse cosine of the input array.
The input should be in range [-1, 1]. The output is in the closed interval \([0, \pi]\).
\[arccos([-1, -.707, 0, .707, 1]) = [\pi, 3\pi/4, \pi/2, \pi/4, 0]\]
The storage type of arccos output is always dense
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L233
-
mxnet.ndarray.sparse.
arccosh
(data=None, out=None, name=None, **kwargs)¶ Returns the element-wise inverse hyperbolic cosine of the input array, computed element-wise.
The storage type of arccosh output is always dense
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L535
-
mxnet.ndarray.sparse.
arcsin
(data=None, out=None, name=None, **kwargs)¶ Returns element-wise inverse sine of the input array.
The input should be in the range [-1, 1]. The output is in the closed interval of [\(-\pi/2\), \(\pi/2\)].
\[arcsin([-1, -.707, 0, .707, 1]) = [-\pi/2, -\pi/4, 0, \pi/4, \pi/2]\]
The storage type of arcsin output depends upon the input storage type:
arcsin(default) = default
arcsin(row_sparse) = row_sparse
arcsin(csr) = csr
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L187
-
mxnet.ndarray.sparse.
arcsinh
(data=None, out=None, name=None, **kwargs)¶ Returns the element-wise inverse hyperbolic sine of the input array, computed element-wise.
The storage type of arcsinh output depends upon the input storage type:
arcsinh(default) = default
arcsinh(row_sparse) = row_sparse
arcsinh(csr) = csr
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L494
-
mxnet.ndarray.sparse.
arctan
(data=None, out=None, name=None, **kwargs)¶ Returns element-wise inverse tangent of the input array.
The output is in the closed interval \([-\pi/2, \pi/2]\).
\[arctan([-1, 0, 1]) = [-\pi/4, 0, \pi/4]\]
The storage type of arctan output depends upon the input storage type:
arctan(default) = default
arctan(row_sparse) = row_sparse
arctan(csr) = csr
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L282
-
mxnet.ndarray.sparse.
arctanh
(data=None, out=None, name=None, **kwargs)¶ Returns the element-wise inverse hyperbolic tangent of the input array, computed element-wise.
The storage type of arctanh output depends upon the input storage type:
arctanh(default) = default
arctanh(row_sparse) = row_sparse
arctanh(csr) = csr
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L579
-
mxnet.ndarray.sparse.
broadcast_add
(lhs=None, rhs=None, out=None, name=None, **kwargs)¶ Returns element-wise sum of the input arrays with broadcasting.
broadcast_plus is an alias to the function broadcast_add.
Example:
x = [[ 1., 1., 1.],
     [ 1., 1., 1.]]
y = [[ 0.],
     [ 1.]]
broadcast_add(x, y) = [[ 1., 1., 1.],
                       [ 2., 2., 2.]]
broadcast_plus(x, y) = [[ 1., 1., 1.],
                        [ 2., 2., 2.]]
Supported sparse operations:
broadcast_add(csr, dense(1D)) = dense
broadcast_add(dense(1D), csr) = dense
Defined in src/operator/tensor/elemwise_binary_broadcast_op_basic.cc:L57
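Broadcasting in the example above stretches the size-1 axis of y across the columns of x. The sketch below implements that restricted 2-D case on nested lists. The helper name broadcast_add_2d is hypothetical, it performs no shape validation, and it says nothing about how the sparse variants are computed.

```python
def broadcast_add_2d(x, y):
    """Add two 2-D nested lists, stretching any axis whose length is 1."""
    rows = max(len(x), len(y))
    cols = max(len(x[0]), len(y[0]))

    def at(a, i, j):
        # An axis of length 1 is read at index 0 no matter which i or j is asked for.
        return a[i if len(a) > 1 else 0][j if len(a[0]) > 1 else 0]

    return [[at(x, i, j) + at(y, i, j) for j in range(cols)] for i in range(rows)]


x = [[1., 1., 1.], [1., 1., 1.]]   # shape (2, 3)
y = [[0.], [1.]]                   # shape (2, 1)

result = broadcast_add_2d(x, y)    # shape (2, 3)
print(result)
```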
-
mxnet.ndarray.sparse.
broadcast_div
(lhs=None, rhs=None, out=None, name=None, **kwargs)¶ Returns element-wise division of the input arrays with broadcasting.
Example:
x = [[ 6., 6., 6.],
     [ 6., 6., 6.]]
y = [[ 2.],
     [ 3.]]
broadcast_div(x, y) = [[ 3., 3., 3.],
                       [ 2., 2., 2.]]
Supported sparse operations:
broadcast_div(csr, dense(1D)) = csr
Defined in src/operator/tensor/elemwise_binary_broadcast_op_basic.cc:L186
-
mxnet.ndarray.sparse.
broadcast_minus
(lhs=None, rhs=None, out=None, name=None, **kwargs)¶ Returns element-wise difference of the input arrays with broadcasting.
broadcast_minus is an alias to the function broadcast_sub.
Example:
x = [[ 1., 1., 1.],
     [ 1., 1., 1.]]
y = [[ 0.],
     [ 1.]]
broadcast_sub(x, y) = [[ 1., 1., 1.],
                       [ 0., 0., 0.]]
broadcast_minus(x, y) = [[ 1., 1., 1.],
                         [ 0., 0., 0.]]
Supported sparse operations:
broadcast_sub/minus(csr, dense(1D)) = dense
broadcast_sub/minus(dense(1D), csr) = dense
Defined in src/operator/tensor/elemwise_binary_broadcast_op_basic.cc:L105
-
mxnet.ndarray.sparse.
broadcast_mul
(lhs=None, rhs=None, out=None, name=None, **kwargs)¶ Returns element-wise product of the input arrays with broadcasting.
Example:
x = [[ 1., 1., 1.],
     [ 1., 1., 1.]]
y = [[ 0.],
     [ 1.]]
broadcast_mul(x, y) = [[ 0., 0., 0.],
                       [ 1., 1., 1.]]
Supported sparse operations:
broadcast_mul(csr, dense(1D)) = csr
Defined in src/operator/tensor/elemwise_binary_broadcast_op_basic.cc:L145
-
mxnet.ndarray.sparse.
broadcast_plus
(lhs=None, rhs=None, out=None, name=None, **kwargs)¶ Returns element-wise sum of the input arrays with broadcasting.
broadcast_plus is an alias to the function broadcast_add.
Example:
x = [[ 1., 1., 1.],
     [ 1., 1., 1.]]
y = [[ 0.],
     [ 1.]]
broadcast_add(x, y) = [[ 1., 1., 1.],
                       [ 2., 2., 2.]]
broadcast_plus(x, y) = [[ 1., 1., 1.],
                        [ 2., 2., 2.]]
Supported sparse operations:
broadcast_add(csr, dense(1D)) = dense
broadcast_add(dense(1D), csr) = dense
Defined in src/operator/tensor/elemwise_binary_broadcast_op_basic.cc:L57
-
mxnet.ndarray.sparse.
broadcast_sub
(lhs=None, rhs=None, out=None, name=None, **kwargs)¶ Returns element-wise difference of the input arrays with broadcasting.
broadcast_minus is an alias to the function broadcast_sub.
Example:
x = [[ 1., 1., 1.],
     [ 1., 1., 1.]]
y = [[ 0.],
     [ 1.]]
broadcast_sub(x, y) = [[ 1., 1., 1.],
                       [ 0., 0., 0.]]
broadcast_minus(x, y) = [[ 1., 1., 1.],
                         [ 0., 0., 0.]]
Supported sparse operations:
broadcast_sub/minus(csr, dense(1D)) = dense
broadcast_sub/minus(dense(1D), csr) = dense
Defined in src/operator/tensor/elemwise_binary_broadcast_op_basic.cc:L105
-
mxnet.ndarray.sparse.
cast_storage
(data=None, stype=_Null, out=None, name=None, **kwargs)¶ Casts tensor storage type to the new type.
When an NDArray with default storage type is cast to csr or row_sparse storage, the result is compact, which means:
for csr, zero values will not be retained
for row_sparse, row slices of all zeros will not be retained
The storage type of cast_storage output depends on stype parameter:
cast_storage(csr, ‘default’) = default
cast_storage(row_sparse, ‘default’) = default
cast_storage(default, ‘csr’) = csr
cast_storage(default, ‘row_sparse’) = row_sparse
cast_storage(csr, ‘csr’) = csr
cast_storage(row_sparse, ‘row_sparse’) = row_sparse
Example:
dense = [[ 0., 1., 0.],
         [ 2., 0., 3.],
         [ 0., 0., 0.],
         [ 0., 0., 0.]]

# cast to row_sparse storage type
rsp = cast_storage(dense, 'row_sparse')
rsp.indices = [0, 1]
rsp.values = [[ 0., 1., 0.],
              [ 2., 0., 3.]]

# cast to csr storage type
csr = cast_storage(dense, 'csr')
csr.indices = [1, 0, 2]
csr.values = [ 1., 2., 3.]
csr.indptr = [0, 1, 3, 3, 3]
Defined in src/operator/tensor/cast_storage.cc:L71
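To show where the indices, values and indptr arrays in the example come from, the following pure-Python sketch converts a dense nested list into both compact layouts. The function names are hypothetical and the code is only a model of the result, not of MXNet's conversion kernels.

```python
def dense_to_csr(dense):
    """Return (values, indices, indptr), dropping every zero entry."""
    values, indices, indptr = [], [], [0]
    for row in dense:
        for col, value in enumerate(row):
            if value != 0:
                values.append(value)
                indices.append(col)
        indptr.append(len(values))     # running count of stored values after each row
    return values, indices, indptr


def dense_to_row_sparse(dense):
    """Return (values, indices), dropping every all-zero row."""
    indices = [i for i, row in enumerate(dense) if any(v != 0 for v in row)]
    values = [list(dense[i]) for i in indices]
    return values, indices


dense = [[0., 1., 0.],
         [2., 0., 3.],
         [0., 0., 0.],
         [0., 0., 0.]]

csr_values, csr_indices, csr_indptr = dense_to_csr(dense)
rsp_values, rsp_indices = dense_to_row_sparse(dense)
print(csr_values, csr_indices, csr_indptr)
print(rsp_values, rsp_indices)
```

Row i of the CSR form occupies positions indptr[i] to indptr[i+1] of values and indices, which is why the two empty rows repeat the final count of 3.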
-
mxnet.ndarray.sparse.
cbrt
(data=None, out=None, name=None, **kwargs)¶ Returns element-wise cube-root value of the input.
\[cbrt(x) = \sqrt[3]{x}\]
Example:
cbrt([1, 8, -125]) = [1, 2, -5]
The storage type of cbrt output depends upon the input storage type:
cbrt(default) = default
cbrt(row_sparse) = row_sparse
cbrt(csr) = csr
Defined in src/operator/tensor/elemwise_unary_op_pow.cc:L270
-
mxnet.ndarray.sparse.
ceil
(data=None, out=None, name=None, **kwargs)¶ Returns element-wise ceiling of the input.
The ceil of the scalar x is the smallest integer i, such that i >= x.
Example:
ceil([-2.1, -1.9, 1.5, 1.9, 2.1]) = [-2., -1., 2., 2., 3.]
The storage type of ceil output depends upon the input storage type:
ceil(default) = default
ceil(row_sparse) = row_sparse
ceil(csr) = csr
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L817
-
mxnet.ndarray.sparse.
clip
(data=None, a_min=_Null, a_max=_Null, out=None, name=None, **kwargs)¶ Clips (limits) the values in an array. Given an interval, values outside the interval are clipped to the interval edges. Clipping x between a_min and a_max would be:
\[clip(x, a\_min, a\_max) = \max(\min(x, a\_max), a\_min)\]
Example:
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
clip(x,1,8) = [ 1., 1., 2., 3., 4., 5., 6., 7., 8., 8.]
The storage type of clip output depends on storage types of inputs and the a_min, a_max parameter values:
clip(default) = default
clip(row_sparse, a_min <= 0, a_max >= 0) = row_sparse
clip(csr, a_min <= 0, a_max >= 0) = csr
clip(row_sparse, a_min < 0, a_max < 0) = default
clip(row_sparse, a_min > 0, a_max > 0) = default
clip(csr, a_min < 0, a_max < 0) = csr
clip(csr, a_min > 0, a_max > 0) = csr
Defined in src/operator/tensor/matrix_op.cc:L676
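One way to read the storage rules above is that zeros stay zero whenever 0 lies inside [a_min, a_max], so a sparse layout remains compact, while an interval that excludes 0 turns every zero into a non-zero value. This is an interpretation rather than a statement about MXNet internals. The sketch below shows the effect on a flat Python list.

```python
def clip(values, a_min, a_max):
    """Element-wise max(min(x, a_max), a_min) over a flat list."""
    return [max(min(v, a_max), a_min) for v in values]


print(clip(list(range(10)), 1, 8))

mostly_zero = [0, 0, 5, 0, -3]
kept = clip(mostly_zero, -1, 4)      # 0 is inside the interval: zeros stay zero
shifted = clip(mostly_zero, 1, 4)    # 0 is below a_min: every zero becomes 1
print(kept, shifted)
```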
-
mxnet.ndarray.sparse.
concat
(*data, **kwargs)¶ Joins input arrays along a given axis.
Note
Concat is deprecated. Use concat instead.
The dimensions of the input arrays should be the same except the axis along which they will be concatenated. The dimension of the output array along the concatenated axis will be equal to the sum of the corresponding dimensions of the input arrays.
The storage type of concat output depends on storage types of inputs:
concat(csr, csr, …, csr, dim=0) = csr
otherwise, concat generates output with default storage
Example:
x = [[1,1],[2,2]]
y = [[3,3],[4,4],[5,5]]
z = [[6,6], [7,7],[8,8]]

concat(x,y,z,dim=0) = [[ 1., 1.],
                       [ 2., 2.],
                       [ 3., 3.],
                       [ 4., 4.],
                       [ 5., 5.],
                       [ 6., 6.],
                       [ 7., 7.],
                       [ 8., 8.]]

Note that you cannot concat x,y,z along dimension 1 since dimension 0 is not the same for all the input arrays.

concat(y,z,dim=1) = [[ 3., 3., 6., 6.],
                     [ 4., 4., 7., 7.],
                     [ 5., 5., 8., 8.]]
Defined in src/operator/nn/concat.cc:L384
-
mxnet.ndarray.sparse.
cos
(data=None, out=None, name=None, **kwargs)¶ Computes the element-wise cosine of the input array.
The input should be in radians (\(2\pi\) rad equals 360 degrees).
\[cos([0, \pi/4, \pi/2]) = [1, 0.707, 0]\]
The storage type of cos output is always dense
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L90
-
mxnet.ndarray.sparse.
cosh
(data=None, out=None, name=None, **kwargs)¶ Returns the hyperbolic cosine of the input array, computed element-wise.
\[cosh(x) = 0.5\times(exp(x) + exp(-x))\]
The storage type of cosh output is always dense
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L409
-
mxnet.ndarray.sparse.
degrees
(data=None, out=None, name=None, **kwargs)¶ Converts each element of the input array from radians to degrees.
\[degrees([0, \pi/2, \pi, 3\pi/2, 2\pi]) = [0, 90, 180, 270, 360]\]
The storage type of degrees output depends upon the input storage type:
degrees(default) = default
degrees(row_sparse) = row_sparse
degrees(csr) = csr
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L332
-
mxnet.ndarray.sparse.
dot
(lhs=None, rhs=None, transpose_a=_Null, transpose_b=_Null, forward_stype=_Null, out=None, name=None, **kwargs)¶ Dot product of two arrays.
dot’s behavior depends on the input array dimensions:
1-D arrays: inner product of vectors
2-D arrays: matrix multiplication
N-D arrays: a sum product over the last axis of the first input and the first axis of the second input
For example, given 3-D x with shape (n,m,k) and y with shape (k,r,s), the result array will have shape (n,m,r,s). It is computed by:
dot(x,y)[i,j,a,b] = sum(x[i,j,:]*y[:,a,b])
Example:
x = reshape([0,1,2,3,4,5,6,7], shape=(2,2,2))
y = reshape([7,6,5,4,3,2,1,0], shape=(2,2,2))
dot(x,y)[0,0,1,1] = 0
sum(x[0,0,:]*y[:,1,1]) = 0
The storage type of dot output depends on storage types of inputs, transpose option and forward_stype option for output storage type. Implemented sparse operations include:
dot(default, default, transpose_a=True/False, transpose_b=True/False) = default
dot(csr, default, transpose_a=True) = default
dot(csr, default, transpose_a=True) = row_sparse
dot(csr, default) = default
dot(csr, row_sparse) = default
dot(default, csr) = csr (CPU only)
dot(default, csr, forward_stype=’default’) = default
dot(default, csr, transpose_b=True, forward_stype=’default’) = default
If the combination of input storage types and forward_stype does not match any of the above patterns, dot will fall back and generate output with default storage.
Note
If the storage type of the lhs is “csr”, the storage type of gradient w.r.t rhs will be “row_sparse”. Only a subset of optimizers support sparse gradients, including SGD, AdaGrad and Adam. Note that by default lazy updates are turned on, which may perform differently from standard updates. For more details, please check the Optimization API at: https://mxnet.incubator.apache.org/api/python/optimization/optimization.html
Defined in src/operator/tensor/dot.cc:L77
- Parameters
lhs (NDArray) – The first input
rhs (NDArray) – The second input
transpose_a (boolean, optional, default=0) – If true then transpose the first input before dot.
transpose_b (boolean, optional, default=0) – If true then transpose the second input before dot.
forward_stype ({None, 'csr', 'default', 'row_sparse'}, optional, default='None') – The desired storage type of the forward output given by user. If the combination of input storage types and this hint does not match any implemented ones, the dot operator will perform fallback operation and still produce an output of the desired storage type.
out (NDArray, optional) – The output NDArray to hold the result.
- Returns
out – The output of this function.
- Return type
NDArray or list of NDArrays
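For the dot(csr, default) = default case, the product can be computed by walking only the stored entries of the CSR operand. The sketch below does this on plain Python lists. The name csr_dot_dense and the list-based layout are assumptions for illustration, and the real operator's kernels are certainly more involved.

```python
def csr_dot_dense(values, indices, indptr, dense):
    """Multiply a CSR matrix (values, indices, indptr) by a dense nested list."""
    n_cols = len(dense[0])
    out = []
    for r in range(len(indptr) - 1):
        row = [0.0] * n_cols
        # Only the stored (non-zero) entries of row r contribute to the product.
        for k in range(indptr[r], indptr[r + 1]):
            for c in range(n_cols):
                row[c] += values[k] * dense[indices[k]][c]
        out.append(row)
    return out


# CSR form of [[0, 1, 0], [2, 0, 3], [0, 0, 0], [0, 0, 0]], a 4 x 3 matrix.
values, indices, indptr = [1., 2., 3.], [1, 0, 2], [0, 1, 3, 3, 3]
rhs = [[1., 2.], [3., 4.], [5., 6.]]          # dense 3 x 2 matrix

product = csr_dot_dense(values, indices, indptr, rhs)   # dense 4 x 2 result
print(product)
```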
-
mxnet.ndarray.sparse.
elemwise_add
(lhs=None, rhs=None, out=None, name=None, **kwargs)¶ Adds arguments element-wise.
The storage type of elemwise_add output depends on storage types of inputs:
elemwise_add(row_sparse, row_sparse) = row_sparse
elemwise_add(csr, csr) = csr
elemwise_add(default, csr) = default
elemwise_add(csr, default) = default
elemwise_add(default, rsp) = default
elemwise_add(rsp, default) = default
otherwise, elemwise_add generates output with default storage
-
mxnet.ndarray.sparse.
elemwise_div
(lhs=None, rhs=None, out=None, name=None, **kwargs)¶ Divides arguments element-wise.
The storage type of elemwise_div output is always dense
-
mxnet.ndarray.sparse.
elemwise_mul
(lhs=None, rhs=None, out=None, name=None, **kwargs)¶ Multiplies arguments element-wise.
The storage type of elemwise_mul output depends on storage types of inputs:
elemwise_mul(default, default) = default
elemwise_mul(row_sparse, row_sparse) = row_sparse
elemwise_mul(default, row_sparse) = row_sparse
elemwise_mul(row_sparse, default) = row_sparse
elemwise_mul(csr, csr) = csr
otherwise, elemwise_mul generates output with default storage
-
mxnet.ndarray.sparse.
elemwise_sub
(lhs=None, rhs=None, out=None, name=None, **kwargs)¶ Subtracts arguments element-wise.
The storage type of elemwise_sub output depends on storage types of inputs:
elemwise_sub(row_sparse, row_sparse) = row_sparse
elemwise_sub(csr, csr) = csr
elemwise_sub(default, csr) = default
elemwise_sub(csr, default) = default
elemwise_sub(default, rsp) = default
elemwise_sub(rsp, default) = default
otherwise, elemwise_sub generates output with default storage
-
mxnet.ndarray.sparse.
exp
(data=None, out=None, name=None, **kwargs)¶ Returns element-wise exponential value of the input.
\[exp(x) = e^x \approx 2.718^x\]
Example:
exp([0, 1, 2]) = [1., 2.71828175, 7.38905621]
The storage type of exp output is always dense
Defined in src/operator/tensor/elemwise_unary_op_logexp.cc:L64
-
mxnet.ndarray.sparse.
expm1
(data=None, out=None, name=None, **kwargs)¶ Returns exp(x) - 1 computed element-wise on the input.
This function provides greater precision than exp(x) - 1 for small values of x.
The storage type of expm1 output depends upon the input storage type:
expm1(default) = default
expm1(row_sparse) = row_sparse
expm1(csr) = csr
Defined in src/operator/tensor/elemwise_unary_op_logexp.cc:L244
-
mxnet.ndarray.sparse.
fix
(data=None, out=None, name=None, **kwargs)¶ Returns element-wise rounded value to the nearest integer towards zero of the input.
Example:
fix([-2.1, -1.9, 1.9, 2.1]) = [-2., -1., 1., 2.]
The storage type of fix output depends upon the input storage type:
fix(default) = default
fix(row_sparse) = row_sparse
fix(csr) = csr
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L874
-
mxnet.ndarray.sparse.
floor
(data=None, out=None, name=None, **kwargs)¶ Returns element-wise floor of the input.
The floor of the scalar x is the largest integer i, such that i <= x.
Example:
floor([-2.1, -1.9, 1.5, 1.9, 2.1]) = [-3., -2., 1., 1., 2.]
The storage type of floor output depends upon the input storage type:
floor(default) = default
floor(row_sparse) = row_sparse
floor(csr) = csr
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L836
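The difference between floor, ceil and fix is easiest to see side by side on the same inputs. The sketch below uses Python's math module as a stand-in for the element-wise operators. It assumes that math.trunc matches the "round towards zero" behaviour described for fix.

```python
import math

values = [-2.1, -1.9, 1.5, 1.9, 2.1]

floors = [float(math.floor(v)) for v in values]   # largest integer i with i <= x
ceils = [float(math.ceil(v)) for v in values]     # smallest integer i with i >= x
fixed = [float(math.trunc(v)) for v in values]    # nearest integer towards zero

print(floors)
print(ceils)
print(fixed)
```

For negative inputs fix agrees with ceil, and for positive inputs it agrees with floor.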
-
mxnet.ndarray.sparse.
ftrl_update
(weight=None, grad=None, z=None, n=None, lr=_Null, lamda1=_Null, beta=_Null, wd=_Null, rescale_grad=_Null, clip_gradient=_Null, out=None, name=None, **kwargs)¶ Update function for Ftrl optimizer. Referenced from Ad Click Prediction: a View from the Trenches, available at http://dl.acm.org/citation.cfm?id=2488200.
It updates the weights using:
rescaled_grad = clip(grad * rescale_grad, clip_gradient)
z += rescaled_grad - (sqrt(n + rescaled_grad**2) - sqrt(n)) * weight / learning_rate
n += rescaled_grad**2
w = (sign(z) * lamda1 - z) / ((beta + sqrt(n)) / learning_rate + wd) * (abs(z) > lamda1)
If w, z and n are all of row_sparse storage type, only the row slices whose indices appear in grad.indices are updated (for w, z and n):
for row in grad.indices:
    rescaled_grad[row] = clip(grad[row] * rescale_grad, clip_gradient)
    z[row] += rescaled_grad[row] - (sqrt(n[row] + rescaled_grad[row]**2) - sqrt(n[row])) * weight[row] / learning_rate
    n[row] += rescaled_grad[row]**2
    w[row] = (sign(z[row]) * lamda1 - z[row]) / ((beta + sqrt(n[row])) / learning_rate + wd) * (abs(z[row]) > lamda1)
Defined in src/operator/optimizer_op.cc:L875
- Parameters
weight (NDArray) – Weight
grad (NDArray) – Gradient
z (NDArray) – z
n (NDArray) – Square of grad
lr (float, required) – Learning rate
lamda1 (float, optional, default=0.00999999978) – The L1 regularization coefficient.
beta (float, optional, default=1) – Per-Coordinate Learning Rate beta.
wd (float, optional, default=0) – Weight decay augments the objective function with a regularization term that penalizes large weights. The penalty scales with the square of the magnitude of each weight.
rescale_grad (float, optional, default=1) – Rescale gradient to grad = rescale_grad*grad.
clip_gradient (float, optional, default=-1) – Clip gradient to the range of [-clip_gradient, clip_gradient]. If clip_gradient <= 0, gradient clipping is turned off. grad = max(min(grad, clip_gradient), -clip_gradient).
out (NDArray, optional) – The output NDArray to hold the result.
- Returns
out – The output of this function.
- Return type
NDArray or list of NDArrays
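The dense Ftrl update can be traced by hand with a scalar-by-scalar sketch. The function name ftrl_step is hypothetical. The `(abs(z) > lamda1)` mask from the formula is written as an explicit branch, which is intended to be equivalent, and the sketch covers dense lists only.

```python
import math


def ftrl_step(w, z, n, grad, lr, lamda1=0.01, beta=1.0, wd=0.0,
              rescale_grad=1.0, clip_gradient=-1.0):
    """Apply one Ftrl update in place to flat lists of floats."""
    for i in range(len(w)):
        g = grad[i] * rescale_grad
        if clip_gradient > 0:                      # clipping is off when <= 0
            g = max(min(g, clip_gradient), -clip_gradient)
        z[i] += g - (math.sqrt(n[i] + g * g) - math.sqrt(n[i])) * w[i] / lr
        n[i] += g * g
        if abs(z[i]) > lamda1:
            sign = 1.0 if z[i] > 0 else -1.0
            w[i] = (sign * lamda1 - z[i]) / ((beta + math.sqrt(n[i])) / lr + wd)
        else:
            w[i] = 0.0                             # the mask zeroes the weight


w, z, n = [0.0], [0.0], [0.0]
ftrl_step(w, z, n, grad=[1.0], lr=0.1)
print(w, z, n)     # z = 1, n = 1, w = (0.01 - 1) / 20

w_small, z_small, n_small = [0.0], [0.0], [0.0]
ftrl_step(w_small, z_small, n_small, grad=[0.005], lr=0.1)
print(w_small)     # |z| stays below lamda1, so the weight is held at zero
```

The second call shows the L1 effect of lamda1: weights whose accumulated z stays small are kept at exactly zero.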
-
mxnet.ndarray.sparse.
gamma
(data=None, out=None, name=None, **kwargs)¶ Returns the gamma function (extension of the factorial function to the reals), computed element-wise on the input array.
The storage type of gamma output is always dense
-
mxnet.ndarray.sparse.
gammaln
(data=None, out=None, name=None, **kwargs)¶ Returns element-wise log of the absolute value of the gamma function of the input.
The storage type of gammaln output is always dense
-
mxnet.ndarray.sparse.
log
(data=None, out=None, name=None, **kwargs)¶ Returns element-wise Natural logarithmic value of the input.
The natural logarithm is logarithm in base e, so that log(exp(x)) = x
The storage type of log output is always dense
Defined in src/operator/tensor/elemwise_unary_op_logexp.cc:L77
-
mxnet.ndarray.sparse.
log10
(data=None, out=None, name=None, **kwargs)¶ Returns element-wise Base-10 logarithmic value of the input.
10**log10(x) = x
The storage type of log10 output is always dense
Defined in src/operator/tensor/elemwise_unary_op_logexp.cc:L94
-
mxnet.ndarray.sparse.
log1p
(data=None, out=None, name=None, **kwargs)¶ Returns element-wise log(1 + x) value of the input.
This function is more accurate than log(1 + x) for small x so that \(1+x\approx 1\)
The storage type of log1p output depends upon the input storage type:
log1p(default) = default
log1p(row_sparse) = row_sparse
log1p(csr) = csr
Defined in src/operator/tensor/elemwise_unary_op_logexp.cc:L199
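The accuracy claim can be seen with Python's own math.log1p, used here as a stand-in for the MXNet operator. Python floats are double precision, whereas MXNet arrays default to float32, so the exact threshold at which the naive form breaks down will differ. The effect is the same in kind.

```python
import math

x = 1e-18

naive = math.log(1 + x)     # 1 + x rounds to exactly 1.0, so the result is 0.0
accurate = math.log1p(x)    # computed without forming 1 + x explicitly

print(naive, accurate)

# log1p(0) is exactly 0, so zero entries stay zero under this function.
print(math.log1p(0.0))
```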
-
mxnet.ndarray.sparse.
log2
(data=None, out=None, name=None, **kwargs)¶ Returns element-wise Base-2 logarithmic value of the input.
2**log2(x) = x
The storage type of log2 output is always dense
Defined in src/operator/tensor/elemwise_unary_op_logexp.cc:L106
-
mxnet.ndarray.sparse.
make_loss
(data=None, out=None, name=None, **kwargs)¶ Make your own loss function in network construction.
This operator accepts a customized loss function symbol as a terminal loss and the symbol should be an operator with no backward dependency. The output of this function is the gradient of loss with respect to the input data.
For example, suppose you are making a cross entropy loss function. Assume out is the predicted output and label is the true label, then the cross entropy can be defined as:
cross_entropy = label * log(out) + (1 - label) * log(1 - out)
loss = make_loss(cross_entropy)
We will need to use make_loss when we are creating our own loss function or we want to combine multiple loss functions. Also we may want to stop some variables’ gradients from backpropagation. See more detail in BlockGrad or stop_gradient.
The storage type of make_loss output depends upon the input storage type:
make_loss(default) = default
make_loss(row_sparse) = row_sparse
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L358
-
mxnet.ndarray.sparse.
mean
(data=None, axis=_Null, keepdims=_Null, exclude=_Null, out=None, name=None, **kwargs)¶ Computes the mean of array elements over given axes.
Defined in src/operator/tensor/./broadcast_reduce_op.h:L83
- Parameters
data (NDArray) – The input
axis (Shape or None, optional, default=None) –
The axis or axes along which to perform the reduction.
The default, axis=(), will compute over all elements into a scalar array with shape (1,).
If axis is int, a reduction is performed on a particular axis.
If axis is a tuple of ints, a reduction is performed on all the axes specified in the tuple.
If exclude is true, reduction will be performed on the axes that are NOT in axis instead.
Negative values mean indexing from right to left.
keepdims (boolean, optional, default=0) – If this is set to True, the reduced axes are left in the result as dimensions with size one.
exclude (boolean, optional, default=0) – Whether to perform the reduction on the axes that are NOT in axis instead.
out (NDArray, optional) – The output NDArray to hold the result.
- Returns
out – The output of this function.
- Return type
NDArray or list of NDArrays
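The interaction of axis, exclude and keepdims described above can be sketched in plain Python. This is only an illustration of the documented rules, not the operator's implementation. `reduced_axes` and `output_shape` are hypothetical helper names, and treating an empty axis as "reduce everything" even when exclude is set is an assumption of this sketch.

```python
def reduced_axes(ndim, axis=(), exclude=False):
    """Return the sorted tuple of axes that the reduction would collapse."""
    if isinstance(axis, int):
        axis = (axis,)
    for a in axis:
        if not -ndim <= a < ndim:
            raise ValueError("axis %d out of range for %d dimensions" % (a, ndim))
    if not axis:
        # Documented default: axis=() reduces over all elements.
        return tuple(range(ndim))
    chosen = {a % ndim for a in axis}   # negative values index from right to left
    if exclude:
        chosen = set(range(ndim)) - chosen
    return tuple(sorted(chosen))

def output_shape(shape, axis=(), keepdims=False, exclude=False):
    """Shape of the reduced result under the rules listed in the parameters."""
    axes = reduced_axes(len(shape), axis, exclude)
    if keepdims:
        return tuple(1 if i in axes else d for i, d in enumerate(shape))
    kept = tuple(d for i, d in enumerate(shape) if i not in axes)
    return kept or (1,)   # a full reduction is documented as shape (1,)

print(output_shape((2, 3, 4), axis=1))
print(output_shape((2, 3, 4), axis=(0,), exclude=True))
```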
-
mxnet.ndarray.sparse.
negative
(data=None, out=None, name=None, **kwargs)¶ Numerical negative of the argument, element-wise.
The storage type of negative output depends upon the input storage type:
negative(default) = default
negative(row_sparse) = row_sparse
negative(csr) = csr
-
mxnet.ndarray.sparse.
norm
(data=None, ord=_Null, axis=_Null, out_dtype=_Null, keepdims=_Null, out=None, name=None, **kwargs)¶ Computes the norm on an NDArray.
This operator computes the norm on an NDArray with the specified axis, depending on the value of the ord parameter. By default, it computes the L2 norm on the entire array. Currently only ord=2 supports sparse ndarrays.
Examples:
x = [[[1, 2], [3, 4]], [[2, 2], [5, 6]]]
norm(x, ord=2, axis=1) = [[3.1622777 4.472136 ]
                          [5.3851647 6.3245554]]
norm(x, ord=1, axis=1) = [[4., 6.],
                          [7., 8.]]
rsp = x.cast_storage('row_sparse')
norm(rsp) = [5.47722578]
csr = x.cast_storage('csr')
norm(csr) = [5.47722578]
Defined in src/operator/tensor/broadcast_reduce_norm_value.cc:L88
- Parameters
data (NDArray) – The input
ord (int, optional, default='2') – Order of the norm. Currently ord=1 and ord=2 are supported.
axis (Shape or None, optional, default=None) –
The axis or axes along which to perform the reduction.
The default, axis=(), will compute over all elements into a scalar array with shape (1,). If axis is int, a reduction is performed on a particular axis. If axis is a 2-tuple, it specifies the axes that hold 2-D matrices, and the matrix norms of these matrices are computed.
out_dtype ({None, 'float16', 'float32', 'float64', 'int32', 'int64', 'int8'},optional, default='None') – The data type of the output.
keepdims (boolean, optional, default=0) – If this is set to True, the reduced axis is left in the result as dimension with size one.
out (NDArray, optional) – The output NDArray to hold the result.
- Returns
out – The output of this function.
- Return type
NDArray or list of NDArrays
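The axis=1 results in the example above can be reproduced with a short pure-Python sketch. It handles only 3-D nested lists reduced over axis 1 and is meant to show what the reduction computes, not how the operator is implemented. `norm_axis1` is a hypothetical name.

```python
import math

def norm_axis1(x, ord=2):
    """Reduce a 3-D nested list over axis 1 with an L1 or L2 norm."""
    if ord not in (1, 2):
        raise ValueError("this sketch handles only ord=1 and ord=2")
    result = []
    for block in x:                   # iterate over axis 0
        columns = list(zip(*block))   # group entries sharing the same axis-2 index
        if ord == 1:
            result.append([float(sum(abs(v) for v in col)) for col in columns])
        else:
            result.append([math.sqrt(sum(v * v for v in col)) for col in columns])
    return result

x = [[[1, 2], [3, 4]], [[2, 2], [5, 6]]]
l2 = norm_axis1(x, ord=2)
l1 = norm_axis1(x, ord=1)
print(l2)
print(l1)
```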
-
mxnet.ndarray.sparse.
radians
(data=None, out=None, name=None, **kwargs)¶ Converts each element of the input array from degrees to radians.
\[radians([0, 90, 180, 270, 360]) = [0, \pi/2, \pi, 3\pi/2, 2\pi]\]
The storage type of radians output depends upon the input storage type:
radians(default) = default
radians(row_sparse) = row_sparse
radians(csr) = csr
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L351
-
mxnet.ndarray.sparse.
relu
(data=None, out=None, name=None, **kwargs)¶ Computes rectified linear activation.
\[max(features, 0)\]
The storage type of relu output depends upon the input storage type:
relu(default) = default
relu(row_sparse) = row_sparse
relu(csr) = csr
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L85
-
mxnet.ndarray.sparse.
retain
(data=None, indices=None, out=None, name=None, **kwargs)¶ Pick rows specified by user input index array from a row sparse matrix and save them in the output sparse matrix.
Example:
data = [[1, 2], [3, 4], [5, 6]]
indices = [0, 1, 3]
shape = (4, 2)
rsp_in = row_sparse_array(data, indices)
to_retain = [0, 3]
rsp_out = retain(rsp_in, to_retain)
rsp_out.data = [[1, 2], [5, 6]]
rsp_out.indices = [0, 3]
The storage type of retain output depends on storage types of inputs:
retain(row_sparse, default) = row_sparse
otherwise, retain is not supported
Defined in src/operator/tensor/sparse_retain.cc:L53
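The retain example can be mirrored in plain Python by modelling a row-sparse array as a pair of lists (stored rows and their row indices). This is a sketch under that assumption. `retain_rows` is a hypothetical helper, and silently dropping requested indices that are not stored in the input is a choice made here, not something the reference states.

```python
def retain_rows(data, indices, to_retain):
    """Keep only the stored rows whose row index appears in to_retain."""
    wanted = set(to_retain)
    kept = [(idx, row) for idx, row in zip(indices, data) if idx in wanted]
    out_indices = [idx for idx, _ in kept]
    out_data = [list(row) for _, row in kept]
    return out_data, out_indices

out_data, out_indices = retain_rows([[1, 2], [3, 4], [5, 6]], [0, 1, 3], [0, 3])
print(out_data, out_indices)
```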
-
mxnet.ndarray.sparse.
rint
(data=None, out=None, name=None, **kwargs)¶ Returns element-wise rounded value to the nearest integer of the input.
Note
For input n.5 rint returns n while round returns n+1.
For input -n.5 both rint and round return -n-1.
Example:
rint([-1.5, 1.5, -1.9, 1.9, 2.1]) = [-2., 1., -2., 2., 2.]
The storage type of rint output depends upon the input storage type:
rint(default) = default
rint(row_sparse) = row_sparse
rint(csr) = csr
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L798
-
mxnet.ndarray.sparse.
round
(data=None, out=None, name=None, **kwargs)¶ Returns element-wise rounded value to the nearest integer of the input.
Example:
round([-1.5, 1.5, -1.9, 1.9, 2.1]) = [-2., 2., -2., 2., 2.]
The storage type of round output depends upon the input storage type:
round(default) = default
round(row_sparse) = row_sparse
round(csr) = csr
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L777
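The difference between rint and round on exact ties can be modelled in plain Python. The two helpers below, `rint_like` and `round_like`, are hypothetical names written to match the tie behaviour described in the rint note and the two documented examples. They are not taken from the operator source. Python's built-in `round` is deliberately avoided because it rounds ties to the nearest even integer, which matches neither operator.

```python
import math

def rint_like(x):
    """Round to the nearest integer; exact ties (n.5) go toward negative infinity."""
    lower, upper = math.floor(x), math.ceil(x)
    return float(lower if (x - lower) <= (upper - x) else upper)

def round_like(x):
    """Round to the nearest integer; exact ties (n.5) go away from zero."""
    return math.copysign(math.floor(abs(x) + 0.5), x)

values = [-1.5, 1.5, -1.9, 1.9, 2.1]
rint_out = [rint_like(v) for v in values]
round_out = [round_like(v) for v in values]
print(rint_out)
print(round_out)
```

The two results differ only at 1.5, the positive tie.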
-
mxnet.ndarray.sparse.
rsqrt
(data=None, out=None, name=None, **kwargs)¶ Returns element-wise inverse square-root value of the input.
\[rsqrt(x) = 1/\sqrt{x}\]
Example:
rsqrt([4,9,16]) = [0.5, 0.33333334, 0.25]
The storage type of rsqrt output is always dense.
Defined in src/operator/tensor/elemwise_unary_op_pow.cc:L221
-
mxnet.ndarray.sparse.
sgd_mom_update
(weight=None, grad=None, mom=None, lr=_Null, momentum=_Null, wd=_Null, rescale_grad=_Null, clip_gradient=_Null, lazy_update=_Null, out=None, name=None, **kwargs)¶ Momentum update function for Stochastic Gradient Descent (SGD) optimizer.
Momentum update has better convergence rates on neural networks. Mathematically it looks like below:
\[\begin{split}v_1 = -\alpha * \nabla J(W_0)\\ v_t = \gamma v_{t-1} - \alpha * \nabla J(W_{t-1})\\ W_t = W_{t-1} + v_t\end{split}\]
It updates the weights using:
v = momentum * v - learning_rate * gradient
weight += v
Where the parameter momentum is the decay rate of momentum estimates at each epoch.
However, if grad’s storage type is row_sparse, lazy_update is True and weight’s storage type is the same as momentum’s storage type, only the row slices whose indices appear in grad.indices are updated (for both weight and momentum):
for row in gradient.indices:
    v[row] = momentum * v[row] - learning_rate * gradient[row]
    weight[row] += v[row]
Defined in src/operator/optimizer_op.cc:L564
- Parameters
weight (NDArray) – Weight
grad (NDArray) – Gradient
mom (NDArray) – Momentum
lr (float, required) – Learning rate
momentum (float, optional, default=0) – The decay rate of momentum estimates at each epoch.
wd (float, optional, default=0) – Weight decay augments the objective function with a regularization term that penalizes large weights. The penalty scales with the square of the magnitude of each weight.
rescale_grad (float, optional, default=1) – Rescale gradient to grad = rescale_grad*grad.
clip_gradient (float, optional, default=-1) – Clip gradient to the range of [-clip_gradient, clip_gradient]. If clip_gradient <= 0, gradient clipping is turned off. grad = max(min(grad, clip_gradient), -clip_gradient).
lazy_update (boolean, optional, default=1) – If true, lazy updates are applied if gradient’s stype is row_sparse and both weight and momentum have the same stype.
out (NDArray, optional) – The output NDArray to hold the result.
- Returns
out – The output of this function.
- Return type
NDArray or list of NDArrays
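The lazy momentum update can be sketched in plain Python to show which rows are touched. This is an illustration under simplifying assumptions: the row-sparse gradient is modelled as a dict from row index to row values, weight decay, rescaling and clipping are left out, and `lazy_sgd_mom_update` and `grad_rows` are hypothetical names.

```python
def lazy_sgd_mom_update(weight, mom, grad_rows, lr, momentum=0.0):
    """Update weight and mom in place, touching only rows present in grad_rows.

    grad_rows maps a row index to that row's gradient values; it stands in
    for the indices and data of a row_sparse gradient.
    """
    for row, g_row in grad_rows.items():
        mom[row] = [momentum * v - lr * g for v, g in zip(mom[row], g_row)]
        weight[row] = [w + v for w, v in zip(weight[row], mom[row])]

weight = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
mom = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
lazy_sgd_mom_update(weight, mom, {1: [1.0, -1.0]}, lr=0.1, momentum=0.9)
print(weight)
print(mom)
```

Rows 0 and 2 keep both their weights and their momentum unchanged. A non-lazy update would still decay their momentum and move their weights.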
-
mxnet.ndarray.sparse.
sgd_update
(weight=None, grad=None, lr=_Null, wd=_Null, rescale_grad=_Null, clip_gradient=_Null, lazy_update=_Null, out=None, name=None, **kwargs)¶ Update function for Stochastic Gradient Descent (SGD) optimizer.
It updates the weights using:
weight = weight - learning_rate * (gradient + wd * weight)
However, if gradient is of row_sparse storage type and lazy_update is True, only the row slices whose indices appear in grad.indices are updated:
for row in gradient.indices:
    weight[row] = weight[row] - learning_rate * (gradient[row] + wd * weight[row])
Defined in src/operator/optimizer_op.cc:L523
- Parameters
weight (NDArray) – Weight
grad (NDArray) – Gradient
lr (float, required) – Learning rate
wd (float, optional, default=0) – Weight decay augments the objective function with a regularization term that penalizes large weights. The penalty scales with the square of the magnitude of each weight.
rescale_grad (float, optional, default=1) – Rescale gradient to grad = rescale_grad*grad.
clip_gradient (float, optional, default=-1) – Clip gradient to the range of [-clip_gradient, clip_gradient]. If clip_gradient <= 0, gradient clipping is turned off. grad = max(min(grad, clip_gradient), -clip_gradient).
lazy_update (boolean, optional, default=1) – If true, lazy updates are applied if gradient’s stype is row_sparse.
out (NDArray, optional) – The output NDArray to hold the result.
- Returns
out – The output of this function.
- Return type
NDArray or list of NDArrays
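How the parameters combine in a lazy SGD step can be sketched in plain Python. Several points here are assumptions rather than documented behaviour: the gradient is rescaled before it is clipped, weight decay is applied only to the rows being updated, and the row-sparse gradient is modelled as a dict. `prepare_grad` and `lazy_sgd_update` are hypothetical names.

```python
def prepare_grad(g, rescale_grad=1.0, clip_gradient=-1.0):
    """Rescale one gradient value, then clip it if clipping is enabled."""
    g = rescale_grad * g
    if clip_gradient > 0:   # clip_gradient <= 0 turns clipping off
        g = max(min(g, clip_gradient), -clip_gradient)
    return g

def lazy_sgd_update(weight, grad_rows, lr, wd=0.0,
                    rescale_grad=1.0, clip_gradient=-1.0):
    """Apply weight = weight - lr * (grad + wd * weight) to the listed rows only."""
    for row, g_row in grad_rows.items():
        weight[row] = [
            w - lr * (prepare_grad(g, rescale_grad, clip_gradient) + wd * w)
            for w, g in zip(weight[row], g_row)
        ]

weight = [[1.0, 2.0], [3.0, 4.0]]
lazy_sgd_update(weight, {0: [10.0, -10.0]}, lr=0.1, wd=0.5, clip_gradient=1.0)
print(weight)
```

Row 1 has no gradient entry, so it is left untouched, including by weight decay.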
-
mxnet.ndarray.sparse.
sigmoid
(data=None, out=None, name=None, **kwargs)¶ Computes sigmoid of x element-wise.
\[y = 1 / (1 + exp(-x))\]
The storage type of sigmoid output is always dense.
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L119
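The reference does not say why some operators always return dense output. One plausible reading, stated here as an assumption rather than documented behaviour, is that a sparse result only makes sense when the function maps zero to zero, so the implicit zeros can stay implicit. The check below uses Python's `math` functions as stand-ins for the operators. `preserves_zero` is a hypothetical helper.

```python
import math

def preserves_zero(fn):
    """True if fn maps 0.0 to exactly 0.0, so implicit zeros could stay implicit."""
    try:
        return fn(0.0) == 0.0
    except (ValueError, ZeroDivisionError):
        # log(0) and 1/sqrt(0) are undefined at zero
        return False

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

checks = {
    "sigmoid": preserves_zero(sigmoid),
    "log": preserves_zero(math.log),
    "rsqrt": preserves_zero(lambda x: 1.0 / math.sqrt(x)),
    "log1p": preserves_zero(math.log1p),
    "sin": preserves_zero(math.sin),
    "sqrt": preserves_zero(math.sqrt),
}
for name, ok in checks.items():
    print(name, "keeps zeros" if ok else "does not keep zeros")
```

Under this reading, the operators documented as always dense (log, rsqrt, sigmoid) are the ones that fail the check.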
-
mxnet.ndarray.sparse.
sign
(data=None, out=None, name=None, **kwargs)¶ Returns element-wise sign of the input.
Example:
sign([-2, 0, 3]) = [-1, 0, 1]
The storage type of sign output depends upon the input storage type:
sign(default) = default
sign(row_sparse) = row_sparse
sign(csr) = csr
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L758
-
mxnet.ndarray.sparse.
sin
(data=None, out=None, name=None, **kwargs)¶ Computes the element-wise sine of the input array.
The input should be in radians (\(2\pi\) rad equals 360 degrees).
\[sin([0, \pi/4, \pi/2]) = [0, 0.707, 1]\]
The storage type of sin output depends upon the input storage type:
sin(default) = default
sin(row_sparse) = row_sparse
sin(csr) = csr
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L47
-
mxnet.ndarray.sparse.
sinh
(data=None, out=None, name=None, **kwargs)¶ Returns the hyperbolic sine of the input array, computed element-wise.
\[sinh(x) = 0.5\times(exp(x) - exp(-x))\]
The storage type of sinh output depends upon the input storage type:
sinh(default) = default
sinh(row_sparse) = row_sparse
sinh(csr) = csr
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L371
-
mxnet.ndarray.sparse.
slice
(data=None, begin=_Null, end=_Null, step=_Null, out=None, name=None, **kwargs)¶ Slices a region of the array.
Note
crop is deprecated. Use slice instead.
This function returns a sliced array between the indices given by begin and end with the corresponding step.
For an input array of shape=(d_0, d_1, ..., d_n-1), slice operation with begin=(b_0, b_1, ..., b_m-1), end=(e_0, e_1, ..., e_m-1), and step=(s_0, s_1, ..., s_m-1), where m <= n, results in an array with the shape (|e_0-b_0|/|s_0|, ..., |e_m-1-b_m-1|/|s_m-1|, d_m, ..., d_n-1).
The resulting array’s k-th dimension contains elements from the k-th dimension of the input array starting from index b_k (inclusive) with step s_k until reaching e_k (exclusive).
If the k-th elements are None in the sequence of begin, end, and step, the following rule will be used to set default values. If s_k is None, set s_k=1. If s_k > 0, set b_k=0, e_k=d_k; else, set b_k=d_k-1, e_k=-1.
The storage type of slice output depends on storage types of inputs:
slice(csr) = csr
otherwise, slice generates output with default storage
Note
When input data storage type is csr, it only supports step=(), or step=(None,), or step=(1,) to generate a csr output. For other step parameter values, it falls back to slicing a dense tensor.
Example:
x = [[  1.,   2.,   3.,   4.],
     [  5.,   6.,   7.,   8.],
     [  9.,  10.,  11.,  12.]]
slice(x, begin=(0,1), end=(2,4)) = [[ 2.,  3.,  4.],
                                    [ 6.,  7.,  8.]]
slice(x, begin=(None, 0), end=(None, 3), step=(-1, 2)) = [[9., 11.],
                                                          [5.,  7.],
                                                          [1.,  3.]]
Defined in src/operator/tensor/matrix_op.cc:L481
- Parameters
data (NDArray) – Source input
begin (Shape(tuple), required) – starting indices for the slice operation, supports negative indices.
end (Shape(tuple), required) – ending indices for the slice operation, supports negative indices.
step (Shape(tuple), optional, default=[]) – step for the slice operation, supports negative values.
out (NDArray, optional) – The output NDArray to hold the result.
- Returns
out – The output of this function.
- Return type
NDArray or list of NDArrays
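The begin/end/step rules, including the None defaults, line up with Python's own slice objects, so both documented examples can be reproduced on nested lists. This is a sketch for illustration only. `slice_nested` is a hypothetical helper and is not how the operator is implemented.

```python
def slice_nested(x, begin, end, step=()):
    """Slice a nested list along its leading axes, one (begin, end, step) per axis."""
    # Missing step entries default to None, which Python treats as step 1.
    step = tuple(step) + (None,) * (len(begin) - len(step))

    def apply(arr, k):
        if k == len(begin):
            return arr              # axes beyond m are left untouched
        picked = arr[slice(begin[k], end[k], step[k])]
        return [apply(sub, k + 1) for sub in picked]

    return apply(x, 0)

x = [[1., 2., 3., 4.],
     [5., 6., 7., 8.],
     [9., 10., 11., 12.]]
first = slice_nested(x, begin=(0, 1), end=(2, 4))
second = slice_nested(x, begin=(None, 0), end=(None, 3), step=(-1, 2))
print(first)
print(second)
```

For a negative step, the end must be passed as None rather than -1 in this sketch, because Python reads an explicit -1 as the last element rather than "one before index 0".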
-
mxnet.ndarray.sparse.
sqrt
(data=None, out=None, name=None, **kwargs)¶ Returns element-wise square-root value of the input.
\[\textrm{sqrt}(x) = \sqrt{x}\]
Example:
sqrt([4, 9, 16]) = [2, 3, 4]
The storage type of sqrt output depends upon the input storage type:
sqrt(default) = default
sqrt(row_sparse) = row_sparse
sqrt(csr) = csr
Defined in src/operator/tensor/elemwise_unary_op_pow.cc:L170
-
mxnet.ndarray.sparse.
square
(data=None, out=None, name=None, **kwargs)¶ Returns element-wise squared value of the input.
\[square(x) = x^2\]
Example:
square([2, 3, 4]) = [4, 9, 16]
The storage type of square output depends upon the input storage type:
square(default) = default
square(row_sparse) = row_sparse
square(csr) = csr
Defined in src/operator/tensor/elemwise_unary_op_pow.cc:L119
-
mxnet.ndarray.sparse.
stop_gradient
(data=None, out=None, name=None, **kwargs)¶ Stops gradient computation.
Stops the accumulated gradient of the inputs from flowing through this operator in the backward direction. In other words, this operator prevents the contribution of its inputs from being taken into account when computing gradients.
Example:
v1 = [1, 2]
v2 = [0, 1]
a = Variable('a')
b = Variable('b')
b_stop_grad = stop_gradient(3 * b)
loss = MakeLoss(b_stop_grad + a)
executor = loss.simple_bind(ctx=cpu(), a=(1,2), b=(1,2))
executor.forward(is_train=True, a=v1, b=v2)
executor.outputs
[ 1.  5.]
executor.backward()
executor.grad_arrays
[ 0.  0.]
[ 1.  1.]
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L325
-
mxnet.ndarray.sparse.
sum
(data=None, axis=_Null, keepdims=_Null, exclude=_Null, out=None, name=None, **kwargs)¶ Computes the sum of array elements over given axes.
Note
sum and sum_axis are equivalent. For ndarray of csr storage type summation along axis 0 and axis 1 is supported. Setting keepdims or exclude to True will cause a fallback to dense operator.
Example:
data = [[[1, 2], [2, 3], [1, 3]],
        [[1, 4], [4, 3], [5, 2]],
        [[7, 1], [7, 2], [7, 3]]]
sum(data, axis=1)
[[  4.   8.]
 [ 10.   9.]
 [ 21.   6.]]
sum(data, axis=[1,2])
[ 12.  19.  27.]
data = [[1, 2, 0],
        [3, 0, 1],
        [4, 1, 0]]
csr = cast_storage(data, 'csr')
sum(csr, axis=0)
[ 8.  3.  1.]
sum(csr, axis=1)
[ 3.  4.  5.]
Defined in src/operator/tensor/broadcast_reduce_sum_value.cc:L66
- Parameters
data (NDArray) – The input
axis (Shape or None, optional, default=None) –
The axis or axes along which to perform the reduction.
The default, axis=(), will compute over all elements into a scalar array with shape (1,).
If axis is int, a reduction is performed on a particular axis.
If axis is a tuple of ints, a reduction is performed on all the axes specified in the tuple.
If exclude is true, reduction will be performed on the axes that are NOT in axis instead.
Negative values mean indexing from right to left.
keepdims (boolean, optional, default=0) – If this is set to True, the reduced axes are left in the result as dimensions with size one.
exclude (boolean, optional, default=0) – Whether to perform the reduction on the axes that are NOT in axis instead.
out (NDArray, optional) – The output NDArray to hold the result.
- Returns
out – The output of this function.
- Return type
NDArray or list of NDArrays
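Summation of a CSR matrix along axis 0 and axis 1 can be sketched directly on the three CSR arrays (values, column indices, row pointers) without densifying. This is an illustrative pure-Python version, not the operator's implementation. `csr_sum` and its argument names are hypothetical, and the arrays below encode the 3x3 matrix from the example above.

```python
def csr_sum(data, indices, indptr, num_cols, axis):
    """Sum a CSR matrix along axis 0 (down columns) or axis 1 (across rows)."""
    num_rows = len(indptr) - 1
    if axis == 0:
        out = [0.0] * num_cols
        for value, col in zip(data, indices):
            out[col] += value       # each stored value adds to its column total
        return out
    if axis == 1:
        # Row r owns the stored values in data[indptr[r]:indptr[r + 1]].
        return [float(sum(data[indptr[r]:indptr[r + 1]])) for r in range(num_rows)]
    raise ValueError("axis must be 0 or 1")

# CSR encoding of [[1, 2, 0], [3, 0, 1], [4, 1, 0]]
data = [1.0, 2.0, 3.0, 1.0, 4.0, 1.0]
indices = [0, 1, 0, 2, 0, 1]
indptr = [0, 2, 4, 6]

col_sums = csr_sum(data, indices, indptr, num_cols=3, axis=0)
row_sums = csr_sum(data, indices, indptr, num_cols=3, axis=1)
print(col_sums)
print(row_sums)
```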
-
mxnet.ndarray.sparse.
tan
(data=None, out=None, name=None, **kwargs)¶ Computes the element-wise tangent of the input array.
The input should be in radians (\(2\pi\) rad equals 360 degrees).
\[tan([0, \pi/4, \pi/2]) = [0, 1, -inf]\]
The storage type of tan output depends upon the input storage type:
tan(default) = default
tan(row_sparse) = row_sparse
tan(csr) = csr
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L140
-
mxnet.ndarray.sparse.
tanh
(data=None, out=None, name=None, **kwargs)¶ Returns the hyperbolic tangent of the input array, computed element-wise.
\[tanh(x) = sinh(x) / cosh(x)\]
The storage type of tanh output depends upon the input storage type:
tanh(default) = default
tanh(row_sparse) = row_sparse
tanh(csr) = csr
Defined in src/operator/tensor/elemwise_unary_op_trig.cc:L451
-
mxnet.ndarray.sparse.
trunc
(data=None, out=None, name=None, **kwargs)¶ Return the element-wise truncated value of the input.
The truncated value of the scalar x is the nearest integer i which is closer to zero than x is. In short, the fractional part of the signed number x is discarded.
Example:
trunc([-2.1, -1.9, 1.5, 1.9, 2.1]) = [-2., -1., 1., 1., 2.]
The storage type of trunc output depends upon the input storage type:
trunc(default) = default
trunc(row_sparse) = row_sparse
trunc(csr) = csr
Defined in src/operator/tensor/elemwise_unary_op_basic.cc:L856
-
mxnet.ndarray.sparse.
where
(condition=None, x=None, y=None, out=None, name=None, **kwargs)¶ Return the elements, either from x or y, depending on the condition.
Given three ndarrays, condition, x, and y, return an ndarray with the elements from x or y, depending on whether the elements from condition are true or false. x and y must have the same shape. If condition has the same shape as x, each element in the output array is from x if the corresponding element in the condition is true, and from y if false.
If condition does not have the same shape as x, it must be a 1D array whose size is the same as x’s first dimension size. Each row of the output array is from x’s row if the corresponding element from condition is true, and from y’s row if false.
Note that all non-zero values are interpreted as True in condition.
Examples:
x = [[1, 2], [3, 4]]
y = [[5, 6], [7, 8]]
cond = [[0, 1], [-1, 0]]
where(cond, x, y) = [[5, 2], [3, 8]]
csr_cond = cast_storage(cond, 'csr')
where(csr_cond, x, y) = [[5, 2], [3, 8]]
Defined in src/operator/tensor/control_flow_op.cc:L56
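Both selection modes described above (a same-shape condition and a 1D per-row condition) can be sketched for 2-D nested lists in plain Python. `where_2d` is a hypothetical helper meant only to illustrate the documented semantics.

```python
def where_2d(condition, x, y):
    """Pick from x where condition is non-zero, else from y (2-D nested lists)."""
    if len(condition) != len(x):
        raise ValueError("condition must match the first dimension of x")
    out = []
    for c, x_row, y_row in zip(condition, x, y):
        if isinstance(c, (list, tuple)):
            # Same-shape condition: choose element by element.
            out.append([xv if cv != 0 else yv
                        for cv, xv, yv in zip(c, x_row, y_row)])
        else:
            # 1D condition: choose a whole row at a time.
            out.append(list(x_row) if c != 0 else list(y_row))
    return out

x = [[1, 2], [3, 4]]
y = [[5, 6], [7, 8]]
elementwise = where_2d([[0, 1], [-1, 0]], x, y)
rowwise = where_2d([1, 0], x, y)
print(elementwise)
print(rowwise)
```

The value -1 in the condition selects from x, since every non-zero value counts as true.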
-
mxnet.ndarray.sparse.
zeros_like
(data=None, out=None, name=None, **kwargs)¶ Return an array of zeros with the same shape, type and storage type as the input array.
The storage type of zeros_like output depends on the storage type of the input:
zeros_like(row_sparse) = row_sparse
zeros_like(csr) = csr
zeros_like(default) = default
Examples:
x = [[ 1.,  1.,  1.],
     [ 1.,  1.,  1.]]
zeros_like(x) = [[ 0.,  0.,  0.],
                 [ 0.,  0.,  0.]]