Executor and Executor Manager¶
The executor and executor manager are internal classes for managing symbolic graph execution. This document is intended as a reference for advanced users only.
Note: Direct interactions with the executor and executor manager are dangerous and not recommended.
Executor¶
Executor | Executor is the object providing efficient symbolic graph execution and optimization.
Executor Manager¶
DataParallelExecutorGroup | A group of executors living on different devices, for data parallelization.
DataParallelExecutorManager | Helper class to manage multiple executors for data parallelism.
API Reference¶
Symbolic Executor component of MXNet.
class mxnet.executor.Executor(handle, symbol, ctx, grad_req, group2ctx)[source]¶
Executor is the object providing efficient symbolic graph execution and optimization.
Examples
>>> # typical approach to create an executor is to bind symbol
>>> a = mx.sym.Variable('a')
>>> b = mx.sym.Variable('b')
>>> c = 2 * a + b
>>> texec = c.bind(mx.cpu(), {'a': mx.nd.array([1,2]), 'b': mx.nd.array([2,3])})
forward(is_train=False, **kwargs)[source]¶
Calculate the outputs specified by the bound symbol.
Parameters: - is_train (bool, optional) – Whether this forward pass is part of training. If True, a backward call is expected to follow.
- **kwargs – Additional specification of input arguments.
Examples
>>> # doing forward by specifying data
>>> texec.forward(is_train=True, data=mydata)
>>> # doing forward without specifying data, but copying it to the executor beforehand
>>> mydata.copyto(texec.arg_dict['data'])
>>> texec.forward(is_train=True)
>>> # doing forward by specifying data and getting outputs
>>> outputs = texec.forward(is_train=True, data=mydata)
>>> print(outputs[0].asnumpy())
backward(out_grads=None, is_train=True)[source]¶
Do a backward pass to compute the gradients of the arguments.
Parameters: - out_grads (NDArray or list of NDArray or dict of str to NDArray, optional) – Gradient on the outputs to be propagated back. This parameter is only needed when bind is called on outputs that are not a loss function.
- is_train (bool, default True) – Whether this backward pass is for training or inference. Note that in rare cases you may want to call backward with is_train=False to get gradients during inference.
Examples
>>> # Example for binding on loss function symbol, which gives the loss value of the model.
>>> # Equivalently it gives the head gradient for backward pass.
>>> # In this example the built-in SoftmaxOutput is used as loss function.
>>> # MakeLoss can be used to define customized loss function symbol.
>>> net = mx.sym.Variable('data')
>>> net = mx.sym.FullyConnected(net, name='fc', num_hidden=6)
>>> net = mx.sym.Activation(net, name='relu', act_type="relu")
>>> net = mx.sym.SoftmaxOutput(net, name='softmax')
>>> args = {'data': mx.nd.ones((1, 4)), 'fc_weight': mx.nd.ones((6, 4)),
>>>         'fc_bias': mx.nd.array((1, 4, 4, 4, 5, 6)), 'softmax_label': mx.nd.ones((1))}
>>> args_grad = {'fc_weight': mx.nd.zeros((6, 4)), 'fc_bias': mx.nd.zeros((6))}
>>> texec = net.bind(ctx=mx.cpu(), args=args, args_grad=args_grad)
>>> out = texec.forward(is_train=True)[0].copy()
>>> print(out.asnumpy())
[[ 0.00378404 0.07600445 0.07600445 0.07600445 0.20660152 0.5616011 ]]
>>> texec.backward()
>>> print(texec.grad_arrays[1].asnumpy())
[[ 0.00378404 0.00378404 0.00378404 0.00378404]
 [-0.92399555 -0.92399555 -0.92399555 -0.92399555]
 [ 0.07600445 0.07600445 0.07600445 0.07600445]
 [ 0.07600445 0.07600445 0.07600445 0.07600445]
 [ 0.20660152 0.20660152 0.20660152 0.20660152]
 [ 0.5616011 0.5616011 0.5616011 0.5616011 ]]
>>>
>>> # Example for binding on non-loss function symbol.
>>> # Here the binding symbol is neither built-in loss function
>>> # nor customized loss created by MakeLoss.
>>> # As a result the head gradient is not automatically provided.
>>> a = mx.sym.Variable('a')
>>> b = mx.sym.Variable('b')
>>> # c is not a loss function symbol
>>> c = 2 * a + b
>>> args = {'a': mx.nd.array([1,2]), 'b': mx.nd.array([2,3])}
>>> args_grad = {'a': mx.nd.zeros((2)), 'b': mx.nd.zeros((2))}
>>> texec = c.bind(ctx=mx.cpu(), args=args, args_grad=args_grad)
>>> out = texec.forward(is_train=True)[0].copy()
>>> print(out.asnumpy())
[ 4. 7.]
>>> # out_grads is the head gradient in backward pass.
>>> # Here we define 'c' as loss function.
>>> # Then 'out' is passed as head gradient of backward pass.
>>> texec.backward(out)
>>> print(texec.grad_arrays[0].asnumpy())
[ 8. 14.]
>>> print(texec.grad_arrays[1].asnumpy())
[ 4. 7.]
set_monitor_callback(callback, monitor_all=False)[source]¶
Install a callback for monitoring.
Parameters: - callback (function) – Takes a string and an NDArrayHandle.
- monitor_all (bool, default False) – If True, monitor both inputs and outputs; otherwise monitor outputs only.
Examples
>>> def mon_callback(*args, **kwargs):
>>>     print("Do your stuff here.")
>>>
>>> texec.set_monitor_callback(mon_callback)
arg_dict¶
Get dictionary representation of argument arrays.
Returns: arg_dict – The dictionary that maps the names of arguments to NDArrays.
Return type: dict of str to NDArray
Raises: ValueError – If there are duplicated names in the arguments.
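For example, reusing the small a/b/c symbol from the binding example above (a minimal sketch; the values are illustrative), arg_dict lets you look up and update the bound argument arrays by name:
>>> a = mx.sym.Variable('a')
>>> b = mx.sym.Variable('b')
>>> c = 2 * a + b
>>> texec = c.bind(mx.cpu(), {'a': mx.nd.array([1, 2]), 'b': mx.nd.array([2, 3])})
>>> sorted(texec.arg_dict.keys())
['a', 'b']
>>> texec.arg_dict['a'][:] = mx.nd.array([10, 20])  # in-place update of a bound input
>>> texec.forward()[0].asnumpy()   # expected: [ 22.  43.], since b is [2, 3]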
grad_dict¶
Get dictionary representation of gradient arrays.
Returns: grad_dict – The dictionary that maps names of arguments to gradient arrays.
Return type: dict of str to NDArray
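For example (a sketch reusing the same small symbol), gradients can be read by argument name instead of by position in grad_arrays:
>>> a = mx.sym.Variable('a')
>>> b = mx.sym.Variable('b')
>>> c = 2 * a + b
>>> args = {'a': mx.nd.array([1, 2]), 'b': mx.nd.array([2, 3])}
>>> args_grad = {'a': mx.nd.zeros((2,)), 'b': mx.nd.zeros((2,))}
>>> texec = c.bind(mx.cpu(), args=args, args_grad=args_grad)
>>> out = texec.forward(is_train=True)[0].copy()
>>> texec.backward(out)   # use the output itself as head gradient, as in the backward example
>>> texec.grad_dict['a'].asnumpy()   # expected: [ 8.  14.], same array as grad_arrays[0]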
aux_dict¶
Get dictionary representation of auxiliary state arrays.
Returns: aux_dict – The dictionary that maps names of auxiliary states to NDArrays.
Return type: dict of str to NDArray
Raises: ValueError – If there are duplicated names in the auxiliary states.
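For example (a sketch; BatchNorm is used here simply because it carries auxiliary states, and simple_bind allocates those states automatically):
>>> data = mx.sym.Variable('data')
>>> net = mx.sym.BatchNorm(data, name='bn')
>>> texec = net.simple_bind(mx.cpu(), data=(2, 3))
>>> sorted(texec.aux_dict.keys())   # typically ['bn_moving_mean', 'bn_moving_var']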
output_dict¶
Get dictionary representation of output arrays.
Returns: output_dict – The dictionary that maps output names to NDArrays.
Return type: dict of str to NDArray
Raises: ValueError – If there are duplicated names in the outputs.
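For example (a sketch with illustrative shapes), output_dict lets you fetch outputs by name; output names are derived from the symbol name, typically by appending _output:
>>> net = mx.sym.FullyConnected(mx.sym.Variable('data'), name='fc', num_hidden=2)
>>> args = {'data': mx.nd.ones((1, 4)), 'fc_weight': mx.nd.ones((2, 4)), 'fc_bias': mx.nd.zeros((2,))}
>>> texec = net.bind(mx.cpu(), args)
>>> texec.forward()
>>> list(texec.output_dict.keys())   # typically ['fc_output']
>>> texec.output_dict['fc_output'].asnumpy()   # expected: [[ 4.  4.]]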
copy_params_from(arg_params, aux_params=None, allow_extra_params=False)[source]¶
Copy parameters from arg_params and aux_params into the executor's internal arrays.
Parameters: - arg_params (dict of str to NDArray) – Parameters, dict of name to NDArray of arguments.
- aux_params (dict of str to NDArray, optional) – Parameters, dict of name to NDArray of auxiliary states.
- allow_extra_params (boolean, optional) – Whether to allow extra parameters that are not needed by the symbol. If True, no error will be thrown when arg_params or aux_params contain extra parameters that are not needed by the executor.
Raises: ValueError – If there are additional parameters in the dict but allow_extra_params=False.
Examples
>>> # set parameters with existing model checkpoint
>>> model_prefix = 'mx_mlp'
>>> sym, arg_params, aux_params = mx.model.load_checkpoint(model_prefix, 0)
>>> texec.copy_params_from(arg_params, aux_params)
reshape(partial_shaping=False, allow_up_sizing=False, **kwargs)[source]¶
Return a new executor with the same symbol and shared memory, but different input/output shapes. Useful for runtime reshaping, variable-length sequences, etc. The returned executor shares state with the current one and cannot be used in parallel with it.
Parameters: - partial_shaping (bool) – Whether to allow changing the shape of unspecified arguments.
- allow_up_sizing (bool) – Whether to allow allocating new NDArrays that are larger than the original ones.
- kwargs (dict of string to tuple of int) – New shape for arguments.
Returns: exec – A new executor that shares memory with self.
Return type: Executor
Examples
>>> a = mx.sym.Variable('a')
>>> b = mx.sym.Variable('b')
>>> c = 2 * a + b
>>> texec = c.bind(mx.cpu(), {'a': mx.nd.zeros((2, 1)), 'b': mx.nd.ones((2, 1))})
>>> new_shape = {'a': (4, 2), 'b': (4, 2)}
>>> texec.reshape(allow_up_sizing=True, **new_shape)
debug_str()[source]¶
Get a debug string about the internal execution plan.
Returns: debug_str – Debug string of the executor.
Return type: string
Examples
>>> a = mx.sym.Variable('a')
>>> b = mx.sym.sin(a)
>>> c = 2 * a + b
>>> texec = c.bind(mx.cpu(), {'a': mx.nd.array([1,2]), 'b': mx.nd.array([2,3])})
>>> print(texec.debug_str())
Symbol Outputs:
    output[0]=_plus0(0)
Variable:a
--------------------
Op:_mul_scalar, Name=_mulscalar0
Inputs:
    arg[0]=a(0) version=0
Attrs:
    scalar=2
--------------------
Op:sin, Name=sin0
Inputs:
    arg[0]=a(0) version=0
--------------------
Op:elemwise_add, Name=_plus0
Inputs:
    arg[0]=_mulscalar0(0)
    arg[1]=sin0(0)
Total 0 MB allocated
Total 11 TempSpace resource requested
Executor manager.
class mxnet.executor_manager.DataParallelExecutorGroup(sym, arg_names, param_names, ctx, slices, train_data, shared_group=None)[source]¶
A group of executors living on different devices, for data parallelization.
Parameters: - sym (Symbol) – The network configuration.
- arg_names (list of str) – Equals sym.list_arguments()
- param_names (list of str) – List of names of all trainable parameters.
- ctx (list of Context) – List of devices for training (data parallelization).
- slices (list of int) – Describes how the data parallelization splits data into different devices.
- train_data (DataIter (or DataBatch)) – The dataset for training. It could be any object with provide_data and provide_label properties. Loading of actual data is not necessarily needed at this stage.
- shared_group (DataParallelExecutorGroup, optional) – An existing executor group with which to share parameters, if desired.
class mxnet.executor_manager.DataParallelExecutorManager(symbol, ctx, train_data, arg_names, param_names, aux_names, work_load_list=None, logger=None, sym_gen=None)[source]¶
Helper class to manage multiple executors for data parallelism.
Parameters: - symbol (Symbol) – Output symbol.
- ctx (list of Context) – Devices to run on.
- param_names (list of str) – Name of all trainable parameters of the network.
- arg_names (list of str) – Name of all arguments of the network.
- aux_names (list of str) – Name of all auxiliary states of the network.
- train_data (DataIter) – Training data iterator.
- work_load_list (list of float or int, optional) – The list of work load for different devices, in the same order as ctx.
- logger (logging logger, optional) – When not specified, the default logger will be used.
- sym_gen (function, optional) – A function that generates new Symbols depending on different input shapes. Used only for bucketing.
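A hedged construction sketch follows; the small network, the NDArrayIter data, and the use of two CPU contexts are all illustrative choices (in practice this class is driven internally by the model-training code rather than constructed directly):
>>> from mxnet.executor_manager import DataParallelExecutorManager
>>> data = mx.sym.Variable('data')
>>> net = mx.sym.FullyConnected(data, name='fc', num_hidden=10)
>>> net = mx.sym.SoftmaxOutput(net, name='softmax')
>>> train_iter = mx.io.NDArrayIter(data=mx.nd.ones((8, 4)), label=mx.nd.zeros((8,)), batch_size=4)
>>> arg_names = net.list_arguments()
>>> aux_names = net.list_auxiliary_states()
>>> param_names = [n for n in arg_names if n not in ('data', 'softmax_label')]
>>> exec_mgr = DataParallelExecutorManager(symbol=net, ctx=[mx.cpu(0), mx.cpu(1)],
>>>                                        train_data=train_iter, arg_names=arg_names,
>>>                                        param_names=param_names, aux_names=aux_names)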
set_params(arg_params, aux_params)[source]¶
Set parameter and aux values.
Parameters: - arg_params (list of NDArray) – Source parameter arrays
- aux_params (list of NDArray) – Source aux arrays.
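For example (a sketch, assuming arg_params and aux_params are the name-to-NDArray mappings returned by mx.model.load_checkpoint, mirroring the copy_params_from example above; 'mx_mlp' is an illustrative prefix and exec_mgr is the manager from the sketch above):
>>> sym, arg_params, aux_params = mx.model.load_checkpoint('mx_mlp', 0)
>>> exec_mgr.set_params(arg_params, aux_params)   # pushes the values to every device copy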
copy_to(arg_params, aux_params)[source]¶
Copy data from each executor to arg_params and aux_params.
Parameters: - arg_params (list of NDArray) – Target parameter arrays.
- aux_params (list of NDArray) – Target aux arrays.
Notes
- This function updates the NDArrays in arg_params and aux_params in place.
param_arrays¶
Shared parameter arrays.
grad_arrays¶
Shared gradient arrays.
aux_arrays¶
Shared aux states.
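As a hedged sketch of how these shared arrays are typically consumed (assuming each entry of param_arrays and grad_arrays holds one NDArray per device, exec_mgr is the manager from the sketch above, and cross-device gradient aggregation is ignored for brevity), a plain SGD step could look like:
>>> lr = 0.01
>>> for weights, grads in zip(exec_mgr.param_arrays, exec_mgr.grad_arrays):
>>>     for w, g in zip(weights, grads):
>>>         w[:] = w - lr * g   # in-place update of the shared parameter array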