Symbol definition.
Input data names.
Input label names.
Default is cpu().
Default is None, indicating uniform workload.
Default is None, indicating no network parameters are fixed.
Backward computation.
Gradient on the outputs to be propagated back. This parameter is only needed when bind is called on outputs that are not a loss function.
Bind the symbols to construct executors. This is necessary before one can perform computation with the module.
Typically this is dataIter.provideData.
Typically this is data_iter.provide_label.
Default is true. Whether the executors should be bound for training.
Default is false. Whether the gradients to the input data need to be computed. Typically this is not needed, but it might be needed when implementing composition of modules.
Default is false. This function does nothing if the executors are already bound, but with this set to true, the executors will be forced to rebind.
Default is None. This is used in bucketing. When not None, the shared module essentially corresponds to a different bucket -- a module with a different symbol but with the same sets of parameters (e.g. unrolled RNNs with different lengths).
Requirement for gradient accumulation (globally). Can be 'write', 'add', or 'null' (defaults to 'write').
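The three gradient requirements can be illustrated with a minimal sketch. This is plain Python standing in for NDArray buffers, not the MXNet implementation: 'write' overwrites the stored gradient, 'add' accumulates it across backward passes, and 'null' skips the update entirely.

```python
# Illustrative sketch of grad_req semantics on a gradient buffer.
def apply_grad(buffer, new_grad, grad_req="write"):
    """Update a gradient buffer according to grad_req."""
    if grad_req == "write":
        return new_grad                                   # overwrite
    elif grad_req == "add":
        return [b + g for b, g in zip(buffer, new_grad)]  # accumulate
    elif grad_req == "null":
        return buffer                                     # leave untouched
    raise ValueError("grad_req must be 'write', 'add', or 'null'")

buf = [0.0, 0.0]
buf = apply_grad(buf, [1.0, 2.0], "add")    # buf == [1.0, 2.0]
buf = apply_grad(buf, [1.0, 2.0], "add")    # buf == [2.0, 4.0]
buf = apply_grad(buf, [5.0, 5.0], "write")  # buf == [5.0, 5.0]
```

The 'add' mode is what enables gradient accumulation over several backward passes before a single optimizer update.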
Borrow optimizer from a shared module. Used in bucketing, where exactly the same optimizer (esp. kvstore) is used.
Input data names.
Train the module parameters.
If not None, this will be used as the validation set, and the performance will be evaluated after each epoch.
Number of epochs to run training.
Extra parameters for training.
Forward computation.
Input data.
Default is None, which means is_train takes the value of for_training.
Get the gradients to the inputs, computed in the previous backward computation. In the case when data-parallelism is used, the grads will be collected from multiple devices. The results will look like [[grad1_dev1, grad1_dev2], [grad2_dev1, grad2_dev2]], and those NDArray might live on different devices.
Get the gradients to the inputs, computed in the previous backward computation. In the case when data-parallelism is used, the grads will be merged from multiple devices, so they look like they came from a single executor. The results will look like [grad1, grad2].
Get outputs of the previous forward computation. In the case when data-parallelism is used, the outputs will be collected from multiple devices. The results will look like [[out1_dev1, out1_dev2], [out2_dev1, out2_dev2]], and those NDArray might live on different devices.
Get outputs of the previous forward computation. In the case when data-parallelism is used, the outputs will be merged from multiple devices, so they look like they came from a single executor. The results will look like [out1, out2].
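The difference between the per-device and the merged layouts described above can be sketched with plain Python lists standing in for NDArrays (this mirrors the structure only, not the actual MXNet merging code): without merging, each output keeps one slice per device; with merging, the per-device slices are concatenated so the result looks like the output of a single executor.

```python
# Per-device layout: [[out1_dev1, out1_dev2], [out2_dev1, out2_dev2]]
per_device = [
    [[1, 2], [3, 4]],   # out1: slice on dev1, slice on dev2
    [[5], [6]],         # out2: slice on dev1, slice on dev2
]

def merge_outputs(collected):
    """Concatenate each output's per-device slices into one list."""
    return [sum(slices, []) for slices in collected]

merged = merge_outputs(per_device)
# merged == [[1, 2, 3, 4], [5, 6]]  -- i.e. [out1, out2]
```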
Get current parameters. Returns (arg_params, aux_params), each a dictionary mapping names to parameter values (NDArray).
Install and initialize optimizers.
Default is True, indicating whether we should set rescaleGrad and idx2name for the optimizer according to the executorGroup.
Default is False, indicating whether we should force re-initializing the optimizer in the case an optimizer is already installed.
Initialize the parameters and auxiliary states.
Called to initialize parameters if needed.
If not None, should be a dictionary of existing arg_params. Initialization will be copied from that.
If not None, should be a dictionary of existing aux_params. Initialization will be copied from that.
If true, params could contain missing values, and the initializer will be called to fill those missing params.
If true, will force re-initialize even if already initialized.
Whether to allow extra parameters that are not needed by the symbol. If this is True, no error will be thrown when argParams or auxParams contain extra parameters that are not needed by the executor.
A list of (name, shape) pairs specifying the label inputs to this module. If this module does not accept labels -- either it is a module without a loss function, or it is not bound for training -- then this should return an empty list [].
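The (name, shape) convention above can be sketched in plain Python. The label name and shape here are made up for illustration; they are not taken from any particular network.

```python
# A module bound for training exposes its label inputs as (name, shape)
# pairs; a module without a loss, or not bound for training, exposes none.
label_shapes = [("softmax_label", (32,))]   # hypothetical: batch size 32
no_label_shapes = []                        # no loss / not bound for training

for name, shape in label_shapes:
    assert isinstance(name, str) and isinstance(shape, tuple)
```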
Load optimizer (updater) state from file.
Path to input states file.
Load model parameters from file.
Path to input param file.
IOException if the param file is invalid.
Run prediction and collect the outputs.
Default is -1, indicating running all the batches in the data iterator.
Default is True, indicating whether we should reset the data iter before starting prediction.
The return value will be a list [out1, out2, out3], where each element is the concatenation of the outputs for all the mini-batches.
Run prediction and collect the outputs.
Default is -1, indicating running all the batches in the data iterator.
Default is True, indicating whether we should reset the data iter before starting prediction.
The return value will be a nested list like [[out1_batch1, out2_batch1, ...], [out1_batch2, out2_batch2, ...]].
This mode is useful because in some cases (e.g. bucketing),
the module does not necessarily produce the same number of outputs.
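The two return layouts of prediction can be sketched with plain Python lists standing in for NDArrays (the merging shown is simple list concatenation, a stand-in for the actual NDArray concatenation): the merged mode concatenates the i-th output across all mini-batches, while the nested mode keeps one inner list per batch, which is what makes it safe when batches may produce different numbers of outputs.

```python
# Each batch produces a list of outputs: [out1, out2].
batches = [
    [[1, 2], [10]],    # batch 1
    [[3, 4], [20]],    # batch 2
]

def merge_batches(per_batch):
    """Concatenate the i-th output across all batches."""
    n_out = len(per_batch[0])
    return [sum((b[i] for b in per_batch), []) for i in range(n_out)]

merged = merge_batches(batches)   # [[1, 2, 3, 4], [10, 20]]
nested = batches                  # [[out1_batch1, out2_batch1], ...]
```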
Reshapes the module for new input shapes.
Typically this is dataIter.provideData.
Typically this is dataIter.provideLabel.
Save current progress to checkpoint. Use mx.callback.module_checkpoint as epoch_end_callback to save during training.
The file prefix to checkpoint to.
The current epoch number.
Whether to save optimizer states for continued training.
Save optimizer (updater) state to file.
Path to output states file.
Save model parameters to file.
Run prediction on eval_data and evaluate the performance according to eval_metric.
eval_data : DataIter
eval_metric : EvalMetric
Number of batches to run. Default is Integer.MAX_VALUE, indicating run until the DataIter finishes.
Could also be a list of functions.
Default is True, indicating whether we should reset eval_data before starting evaluation.
Default is 0. For compatibility, this will be passed to callbacks (if any). During training, this will correspond to the training epoch number.
Assign parameter and aux state values.
argParams : dict
Dictionary of name to value (NDArray) mapping.
auxParams : dict
Dictionary of name to value (NDArray) mapping.
allowMissing : bool
If true, params could contain missing values, and the initializer will be
called to fill those missing params.
forceInit : bool
If true, will force re-initialize even if already initialized.
allowExtra : bool
Whether to allow extra parameters that are not needed by the symbol. If this is True, no error will be thrown when argParams or auxParams contain extra parameters that are not needed by the executor.
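The interaction of allowMissing and allowExtra can be sketched with plain dicts standing in for NDArray parameters. This is an illustrative mock, not the actual MXNet implementation; the initializer here is a hypothetical stand-in that fills missing values with zeros.

```python
# Sketch of parameter assignment with allow_missing / allow_extra checks.
def set_params(expected, arg_params, allow_missing=False, allow_extra=False,
               initializer=lambda name: 0.0):
    extra = set(arg_params) - set(expected)
    if extra and not allow_extra:
        raise ValueError("extra parameters: %s" % sorted(extra))
    result = {}
    for name in expected:
        if name in arg_params:
            result[name] = arg_params[name]     # take the provided value
        elif allow_missing:
            result[name] = initializer(name)    # fill from initializer
        else:
            raise ValueError("missing parameter: %s" % name)
    return result

params = set_params(["w", "b"], {"w": 1.5}, allow_missing=True)
# params == {"w": 1.5, "b": 0.0}
```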
Evaluate and accumulate evaluation metric on outputs of the last forward computation.
Module is a basic module that wraps a Symbol. It is functionally the same as the FeedForward model, except implemented under the module API.