Backward computation.
Gradient on the outputs to be propagated back. This parameter is only needed when bind is called on outputs that are not a loss function.
Bind the symbols to construct executors. This is necessary before one can perform computation with the module.
Typically is DataIter.provideData.
Typically is DataIter.provideLabel.
Default is True. Whether the executors should be bound for training.
Default is False. Whether the gradients to the input data need to be computed. Typically this is not needed, but it might be needed when implementing composition of modules.
Default is False. This function does nothing if the executors are already bound; but if this is True, the executors will be forced to rebind.
Default is None. This is used in bucketing. When not None, the shared module essentially corresponds to a different bucket -- a module with a different symbol but the same set of parameters (e.g. unrolled RNNs with different lengths).
Requirement for gradient accumulation (globally). Can be 'write', 'add', or 'null' (defaults to 'write').
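The three gradient-accumulation modes can be illustrated with a toy sketch. This is plain Python, not the MXNet implementation; the helper name apply_grad_req is hypothetical, and plain lists stand in for NDArrays:

```python
def apply_grad_req(grad_buf, new_grad, grad_req):
    # Toy model of the three gradient-accumulation modes.
    if grad_req == "write":      # overwrite the buffer on each backward pass
        grad_buf[:] = new_grad
    elif grad_req == "add":      # accumulate gradients across backward passes
        grad_buf[:] = [g + n for g, n in zip(grad_buf, new_grad)]
    elif grad_req == "null":     # do not compute/store a gradient at all
        pass
    else:
        raise ValueError("gradReq must be 'write', 'add', or 'null'")
    return grad_buf

buf = [0.0, 0.0, 0.0]
apply_grad_req(buf, [1.0, 1.0, 1.0], "add")
apply_grad_req(buf, [1.0, 1.0, 1.0], "add")   # buf == [2.0, 2.0, 2.0]
```

The 'add' mode is what makes gradient accumulation across several backward passes possible; with 'write' each pass discards the previous gradients, and 'null' skips gradient storage entirely.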
Forward computation.
Could be anything with a similar API implemented.
Default is None, which means isTrain takes the value of this.forTraining.
Get the gradients to the inputs, computed in the previous backward computation.
In the case when data-parallelism is used, the gradients will be collected from multiple devices. The results will look like [[grad1_dev1, grad1_dev2], [grad2_dev1, grad2_dev2]]; those NDArrays might live on different devices.
Get the gradients to the inputs, computed in the previous backward computation.
In the case when data-parallelism is used, the gradients will be merged from multiple devices, so they look like they came from a single executor. The results will look like [grad1, grad2].
Get outputs of the previous forward computation.
In the case when data-parallelism is used, the outputs will be collected from multiple devices. The results will look like [[out1_dev1, out1_dev2], [out2_dev1, out2_dev2]]; those NDArrays might live on different devices.
Get outputs of the previous forward computation.
In the case when data-parallelism is used, the outputs will be merged from multiple devices, so they look like they came from a single executor. The results will look like [out1, out2].
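The difference between the per-device ("collected") layout and the merged layout amounts to concatenating each output's per-device parts along the batch axis. A minimal sketch in plain Python (lists stand in for NDArrays; merge_multi_device is a hypothetical helper, not the MXNet API):

```python
def merge_multi_device(collected):
    # collected[i][d] holds output i as computed on device d, e.g.
    # [[out1_dev1, out1_dev2], [out2_dev1, out2_dev2]].
    # Merging concatenates each output's per-device parts along the
    # batch axis, yielding the single-executor view [out1, out2].
    return [[x for part in per_device for x in part]
            for per_device in collected]

collected = [[[1, 2], [3, 4]],      # out1, split across two devices
             [[5, 6], [7, 8]]]      # out2, split across two devices
merged = merge_multi_device(collected)   # [[1, 2, 3, 4], [5, 6, 7, 8]]
```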
Get parameters; these are potentially copies of the actual parameters used to do computation on the device.
(argParams, auxParams), a pair of dictionaries of name-to-value mappings.
Initialize the parameters and auxiliary states.
: Initializer
Called to initialize parameters if needed.
argParams : dict
If not None, should be a dictionary of existing argParams. Initialization will be copied from that.
auxParams : dict
If not None, should be a dictionary of existing auxParams. Initialization will be copied from that.
allowMissing : bool
If true, params could contain missing values, and the initializer will be called to fill those missing params.
forceInit : bool
If true, will force re-initialization even if already initialized.
allowExtra : bool
Whether to allow extra parameters that are not needed by the symbol. If this is True, no error will be thrown when argParams or auxParams contain extra parameters that are not needed by the executor.
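The interaction of the existing-parameter dictionary with allowMissing, forceInit, and allowExtra can be sketched as a small pure-Python routine. This is an illustrative reading of the documented semantics, not the MXNet implementation; all names here are hypothetical:

```python
def init_params(needed, initializer, arg_params=None,
                allow_missing=False, force_init=False,
                allow_extra=False, already=None):
    # needed: parameter names the symbol requires.
    # already: previously initialized values, if any.
    if already and not force_init:
        return dict(already)          # nothing to do unless forced
    if arg_params is None:
        # No existing values supplied: initialize everything from scratch.
        return {name: initializer(name) for name in needed}
    extra = set(arg_params) - set(needed)
    if extra and not allow_extra:
        raise ValueError("extra parameters not needed by the symbol: %s"
                         % sorted(extra))
    params = {}
    for name in needed:
        if name in arg_params:        # copy from the provided dictionary
            params[name] = arg_params[name]
        elif allow_missing:           # fill the gap with the initializer
            params[name] = initializer(name)
        else:
            raise KeyError("missing parameter: " + name)
    return params
```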
A list of (name, shape) pairs specifying the label inputs to this module.
If this module does not accept labels -- either it is a module without a loss function, or it is not bound for training -- then this should return an empty list [].
Evaluate and accumulate evaluation metric on outputs of the last forward computation.
Typically DataBatch.label.
Bind the symbols to construct executors. This is necessary before one can perform computation with the module.
Default is True. Whether the executors should be bound for training.
Default is False. Whether the gradients to the input data need to be computed. Typically this is not needed, but it might be needed when implementing composition of modules.
Default is False. This function does nothing if the executors are already bound; but if this is True, the executors will be forced to rebind.
Typically is DataIter.provideData.
Train the module parameters.
If not None, will be used as a validation set on which to evaluate the performance after each epoch.
Number of epochs to run training.
Extra parameters for training.
Forward computation.
A batch of data.
Whether it is for training or not.
Load model parameters from file.
Path to input param file.
Throws IOException if the param file is invalid.
Run prediction and collect the outputs.
DataIter to do the inference on.
Default is -1, indicating running all the batches in the data iterator.
Default is True, indicating whether we should reset the data iter before starting prediction.
The return value will be a list [out1, out2, out3], where each element is the concatenation of the outputs for all the mini-batches. The concatenation process will look like:
outputBatches = [
  [a1, a2, a3],  // batch a
  [b1, b2, b3]   // batch b
]
result = [
  NDArray,  // [a1, b1]
  NDArray,  // [a2, b2]
  NDArray   // [a3, b3]
]
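The concatenation step above amounts to transposing the batch-major nested list into output-major lists and concatenating each one. A minimal sketch in plain Python (lists stand in for NDArrays; the helper name is hypothetical):

```python
def concat_predict_outputs(output_batches):
    # output_batches[b][i] is output i of batch b, e.g.
    # [[a1, a2, a3],   # batch a
    #  [b1, b2, b3]]   # batch b
    # The result has one entry per output, each the concatenation of
    # that output across all batches: [a1+b1, a2+b2, a3+b3]
    # (here '+' is list concatenation, standing in for NDArray concat).
    return [[x for batch in per_output for x in batch]
            for per_output in zip(*output_batches)]

batches = [[[1], [2], [3]],     # batch a: out1, out2, out3
           [[10], [20], [30]]]  # batch b: out1, out2, out3
result = concat_predict_outputs(batches)   # [[1, 10], [2, 20], [3, 30]]
```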
Run prediction and collect the outputs.
Default is -1, indicating running all the batches in the data iterator.
Default is True, indicating whether we should reset the data iter before starting prediction.
The return value will be a nested list like
[[out1_batch1, out2_batch1, ...], [out1_batch2, out2_batch2, ...]]
This mode is useful because in some cases (e.g. bucketing) the module does not necessarily produce the same number of outputs for every batch.
Save model parameters to file.
Path to output param file.
Run prediction on eval_data and evaluate the performance according to eval_metric.
: DataIter
: EvalMetric
Number of batches to run. Default is Integer.MAX_VALUE, indicating to run until the DataIter finishes.
Could also be a list of functions.
Default True, indicating whether we should reset eval_data before starting evaluation.
Default 0. For compatibility, this will be passed to callbacks (if any). During training, this will correspond to the training epoch number.
Assign parameter and aux state values.
argParams : dict
Dictionary of name to value (NDArray) mapping.
auxParams : dict
Dictionary of name to value (NDArray) mapping.
allowMissing : bool
If true, params could contain missing values, and the initializer will be called to fill those missing params.
forceInit : bool
If true, will force re-initialization even if already initialized.
allowExtra : bool
Whether to allow extra parameters that are not needed by the symbol. If this is True, no error will be thrown when argParams or auxParams contain extra parameters that are not needed by the executor.
The base class of a module. A module represents a computation component. The design purpose of a module is to abstract a computation "machine" that one can run forward, backward, update parameters on, and so on. We aim to make the APIs easy to use, especially in the case when we need to use the imperative API to work with multiple modules (e.g. stochastic depth networks).
A module has several states:
- Initial state. Memory is not allocated yet; the module is not ready for computation.
- Binded. Shapes for inputs, outputs, and parameters are all known, memory is allocated, and the module is ready for computation.
- Parameters initialized. For modules with parameters, doing computation before initializing the parameters might result in undefined outputs.
- Optimizer installed. An optimizer can be installed to a module. After this, the parameters of the module can be updated according to the optimizer after gradients are computed (forward-backward).
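The states above form a simple lifecycle that implementations must enforce: bind before initializing parameters, initialize parameters before installing an optimizer, and install an optimizer before updating. A toy Python state machine (purely illustrative; the names are hypothetical, not the MXNet API) makes the ordering constraints explicit:

```python
class ToyModule:
    # Illustrates the legal ordering of module lifecycle transitions:
    # bind -> init_params -> init_optimizer -> update.
    def __init__(self):
        self.binded = False
        self.params_initialized = False
        self.optimizer_initialized = False

    def bind(self):
        # Shapes become known and memory is "allocated" here.
        self.binded = True

    def init_params(self):
        assert self.binded, "must bind before initializing parameters"
        self.params_initialized = True

    def init_optimizer(self):
        assert self.params_initialized, "must init params before the optimizer"
        self.optimizer_initialized = True

    def update(self):
        assert self.optimizer_initialized, "must install an optimizer first"
```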
In order for a module to interact with others, it should be able to report the following information in its raw stage (before being bound):
- data_names: list of strings indicating the names of required data.
- output_names: list of strings indicating the names of required outputs.
And also the following richer information after being bound:
- binded: bool, indicating whether the memory buffers needed for computation have been allocated.
- forTraining: whether the module is bound for training (if bound).
- paramsInitialized: bool, indicating whether the parameters of this module have been initialized.
- optimizerInitialized: bool, indicating whether an optimizer is defined and initialized.
- inputsNeedGrad: bool, indicating whether gradients with respect to the input data are needed. Might be useful when implementing composition of modules.
- dataShapes: a list of (name, shape). In theory, since the memory is allocated, we could directly provide the data arrays. But in the case of data parallelization, the data arrays might not be of the same shape as viewed from the external world.
- labelShapes: a list of (name, shape). This might be [] if the module does not need labels (e.g. it does not contain a loss function at the top), or if the module is not bound for training.
- outputShapes: a list of (name, shape) for the outputs of the module.
- getParams(): return a tuple (argParams, auxParams). Each of those is a dictionary of name-to-NDArray mappings. Those NDArrays always live on the CPU. The actual parameters used for computing might live on other devices (GPUs); this function will retrieve (a copy of) the latest parameters. Therefore, modifying the returned dictionaries does not change the parameters on the devices; use setParams for that.
- setParams(argParams, auxParams): assign parameters to the devices doing the computation.
- initParams(...): a more flexible interface to assign or initialize the parameters.
- bind(): prepare environment for computation.
- initOptimizer(): install an optimizer for parameter updating.
- forward(dataBatch): forward operation.
- backward(outGrads=None): backward operation.
- update(): update parameters according to the installed optimizer.
- getOutputs(): get outputs of the previous forward operation.
- getInputGrads(): get the gradients with respect to the inputs computed in the previous backward operation.
- updateMetric(metric, labels): update performance metric for the results of the previous forward computation.
- symbol: the underlying symbolic graph for this module (if any). This property is not necessarily constant. For example, for BucketingModule, this property is simply the *current* symbol being used. For other modules, this value might not be well defined.
When those intermediate-level APIs are implemented properly, the following high-level APIs will be automatically available for a module:
- fit: train the module parameters on a data set.
- predict: run prediction on a data set and collect outputs.
- score: run prediction on a data set and evaluate performance.
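The high-level fit is essentially a loop over the intermediate-level calls. A schematic in Python against a hypothetical duck-typed module (the snake_case method names mirror the intermediate API listed above; this is a sketch of the control flow, not the MXNet implementation):

```python
def fit(module, train_iter, metric, num_epoch):
    # Schematic of the high-level 'fit' in terms of the intermediate API:
    # each epoch runs forward/backward/update per batch, then reports the
    # accumulated metric. 'module' is any object exposing the methods below.
    for epoch in range(num_epoch):
        metric.reset()
        for batch in train_iter:
            module.forward(batch, is_train=True)       # compute outputs
            module.backward()                          # compute gradients
            module.update()                            # optimizer step
            module.update_metric(metric, batch.label)  # accumulate metric
        print("epoch %d: %s" % (epoch, metric.get()))
        train_iter.reset()   # rewind the data iterator for the next epoch
```

predict and score follow the same pattern with the backward/update steps dropped, which is why implementing the intermediate API is enough to get all three for free.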