Symbol and Automatic Differentiation

NDArray is the basic computational unit in MXNet, but building neural networks also requires a way to compose operations. MXNet provides a symbolic interface, named Symbol, for this purpose. Symbol combines flexibility with efficiency.

Basic Composition of Symbols

The following code creates a two-layer perceptron network:
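
The listing for this network is not reproduced here. A minimal sketch of what it might look like, assuming the mxnet R package is installed and using mx.symbol.Activation and mx.symbol.SoftmaxOutput as the activation and output layers (the layer names and the second layer's size are illustrative choices, not taken from the original):

```r
library(mxnet)

# Input placeholder (a free variable of the graph).
net <- mx.symbol.Variable("data")
# First fully connected layer with 128 hidden units.
net <- mx.symbol.FullyConnected(data = net, name = "fc1", num_hidden = 128)
# Non-linearity between the two layers (assumed: ReLU).
net <- mx.symbol.Activation(data = net, name = "relu1", act_type = "relu")
# Second fully connected layer (assumed size: 64).
net <- mx.symbol.FullyConnected(data = net, name = "fc2", num_hidden = 64)
# Output layer (assumed: softmax).
net <- mx.symbol.SoftmaxOutput(data = net, name = "out")

# Inspect the class of the resulting object.
class(net)
```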

## [1] "Rcpp_MXSymbol"
## attr(,"package")
## [1] "mxnet"

Each symbol takes a (unique) string name. Variable typically defines the inputs, or free variables. Other symbols take a symbol as their input (data) and may accept additional hyperparameters, such as the number of hidden neurons (num_hidden) or the activation type (act_type).

We can also specify the otherwise automatically generated names explicitly, for example by passing our own weight variable:

data <- mx.symbol.Variable("data")
w <- mx.symbol.Variable("myweight")
net <- mx.symbol.FullyConnected(data=data, weight=w, name="fc1", num_hidden=128)
arguments(net)
## [1] "data"     "myweight" "fc1_bias"

More Complicated Composition of Symbols

MXNet provides well-optimized symbols for commonly used layers in deep learning. You can also define new operators in Python. The following example first performs an element-wise addition of two symbols, then feeds the result to the fully connected operator:

lhs <- mx.symbol.Variable("data1")
rhs <- mx.symbol.Variable("data2")
net <- mx.symbol.FullyConnected(data=lhs + rhs, name="fc1", num_hidden=128)
arguments(net)
## [1] "data1"      "data2"      "fc1_weight" "fc1_bias"

We can construct a symbol more flexibly than by using the single forward composition, for example:

net <- mx.symbol.Variable("data")
net <- mx.symbol.FullyConnected(data=net, name="fc1", num_hidden=128)
net2 <- mx.symbol.Variable("data2")
net2 <- mx.symbol.FullyConnected(data=net2, name="net2", num_hidden=128)
composed.net <- mx.apply(net, data=net2, name="compose")
arguments(composed.net)
## [1] "data2"       "net2_weight" "net2_bias"   "fc1_weight"  "fc1_bias"

In this example, net is used as a function and applied to the existing symbol net2. In the resulting symbol, composed.net, the original argument data is replaced by net2.

Training a Neural Net

The model API is a thin wrapper around the symbolic executors to support neural net training.
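
As a rough illustration of how the model API wraps a symbol, the sketch below trains a tiny classifier with mx.model.FeedForward.create. The toy data, layer sizes, and training settings are hypothetical and chosen only to keep the example short; it assumes the mxnet R package is installed.

```r
library(mxnet)

# Hypothetical toy data: 100 samples, 10 features, 2 classes.
set.seed(0)
train.x <- matrix(rnorm(100 * 10), nrow = 100, ncol = 10)
train.y <- sample(0:1, 100, replace = TRUE)

# A single fully connected layer followed by a softmax output.
data <- mx.symbol.Variable("data")
fc1 <- mx.symbol.FullyConnected(data = data, name = "fc1", num_hidden = 2)
net <- mx.symbol.SoftmaxOutput(data = fc1, name = "out")

# The model API binds the symbol, initializes parameters, and runs SGD.
mx.set.seed(0)
model <- mx.model.FeedForward.create(
  net, X = train.x, y = train.y,
  ctx = mx.cpu(), num.round = 5, array.batch.size = 20,
  learning.rate = 0.1, momentum = 0.9,
  eval.metric = mx.metric.accuracy,
  array.layout = "rowmajor"  # rows of train.x are samples
)

# Class probabilities for each sample.
preds <- predict(model, train.x, array.layout = "rowmajor")
```

The array.layout argument is set explicitly because the R package otherwise has to guess whether samples are stored in rows or columns.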

How Efficient Is the Symbolic API?

The Symbolic API brings the efficient C++ operations in powerful toolkits, such as CXXNet and Caffe, together with the flexible dynamic NDArray operations. All of the memory and computation resources are allocated statically during bind operations, to maximize runtime performance and memory utilization.
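
To make the role of binding more concrete, the following sketch infers the argument shapes of a small symbol and then binds it to an executor with mx.simple.bind. The input shape is a hypothetical choice, and the example assumes the mxnet R package is installed; it is an illustration of the idea rather than a description of how MXNet allocates memory internally.

```r
library(mxnet)

data <- mx.symbol.Variable("data")
net <- mx.symbol.FullyConnected(data = data, name = "fc1", num_hidden = 128)

# Shapes in the R package are column-major: c(features, batch_size).
# Given the input shape, all other argument shapes can be inferred.
shapes <- mx.symbol.infer.shape(net, data = c(10, 32))
str(shapes$arg.shapes)

# Binding uses the known shapes to set up the arrays the executor needs
# before any computation runs.
exec <- mx.simple.bind(net, ctx = mx.cpu(), data = c(10, 32))

# Run a forward pass on the bound executor.
mx.exec.forward(exec, is.train = FALSE)
```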

The coarse-grained operators are equivalent to CXXNet layers, which are extremely efficient. We also provide fine-grained operators for more flexible composition. Because MXNet does more in-place memory allocation, it can be more memory efficient than CXXNet while achieving the same runtime performance with greater flexibility.