{"nbformat": 4, "cells": [{"source": "# Linear Regression\n\nIn this tutorial we'll walk through how to implement *linear regression* using MXNet APIs.\n\nThe function we are trying to learn is: *y = x_{1} + 2x_{2}*, where *(x_{1},x_{2})* are input features and *y* is the corresponding label.\n\n## Prerequisites\n\nTo complete this tutorial, we need: \n\n- MXNet. See the instructions for your operating system in [Setup and Installation](http://mxnet.io/install/index.html). \n\n- [Jupyter Notebook](http://jupyter.org/index.html).\n\n```\n$ pip install jupyter\n```\n\nTo begin, the following code imports the packages we'll need for this exercise.", "cell_type": "markdown", "metadata": {}}, {"source": "import mxnet as mx\nimport numpy as np\n\nimport logging\nlogging.getLogger().setLevel(logging.DEBUG)", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": "## Preparing the Data\n\nIn MXNet, data is input via **Data Iterators**. Here we will illustrate\nhow to encode a dataset into an iterator that MXNet can use. The data used in the example is made up of 2D data points with corresponding real-valued labels. ", "cell_type": "markdown", "metadata": {}}, {"source": "# Training data: 100 random 2D points labelled with y = x1 + 2*x2\ntrain_data = np.random.uniform(0, 1, [100, 2])\ntrain_label = np.array([train_data[i][0] + 2 * train_data[i][1] for i in range(100)])\nbatch_size = 1\n\n# Evaluation data\neval_data = np.array([[7,2],[6,10],[12,2]])\neval_label = np.array([11,26,16])", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": "Once we have the data ready, we need to put it into an iterator and specify\nparameters such as `batch_size` and `shuffle`. 
`batch_size` specifies the number\nof examples shown to the model each time we update its parameters, and `shuffle`\ntells the iterator to randomize the order in which examples are shown to the model.", "cell_type": "markdown", "metadata": {}}, {"source": "train_iter = mx.io.NDArrayIter(train_data, train_label, batch_size, shuffle=True, label_name='lin_reg_label')\neval_iter = mx.io.NDArrayIter(eval_data, eval_label, batch_size, shuffle=False)", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": "In the above example, we have made use of `NDArrayIter`, which is useful for iterating\nover both numpy ndarrays and MXNet NDArrays. In general, there are different types of iterators in\nMXNet and you can use one based on the type of data you are processing.\nDocumentation for iterators can be found [here](http://mxnet.io/api/python/io/io.html).\n\n## MXNet Classes\n\n1. **IO:** As we already saw, the IO class works on the data and carries out\n operations such as feeding data in batches and shuffling.\n \n2. **Symbol:** The actual MXNet neural network is composed using symbols. MXNet has\n different types of symbols, including variable placeholders for input data,\n neural network layers, and operators that manipulate NDArrays.\n\n3. **Module:** The module class in MXNet is used to define the overall computation.\n\tIt is initialized with the model we want to train and the names of its inputs (data and labels).\n\tAdditional parameters such as the learning rate and the optimization\n\talgorithm to use are supplied when training starts.\n\n## Defining the Model\n\nMXNet uses **Symbols** for defining a model. Symbols are the building blocks \nand make up various components of the model. Symbols are used to define:\n\n1. **Variables:** A variable is a placeholder for future data. This symbol is used\n to define a spot which will be filled with training data/labels in the future\n when we commence training.\n2. 
**Neural Network Layers:** The layers of a network or any other type of model are\n also defined by Symbols. Such a symbol takes one or more previous symbols as\n inputs, performs some transformations on them, and creates one or more outputs.\n One such example is the `FullyConnected` symbol, which specifies a fully connected\n layer of a neural network.\n3. **Outputs:** Output symbols are MXNet's way of defining a loss. They are\n suffixed with the word \"Output\" (e.g., the `SoftmaxOutput` layer). You can also\n [create your own loss function](https://github.com/dmlc/mxnet/blob/master/docs/tutorials/r/CustomLossFunction.md#how-to-use-your-own-loss-function).\n Some examples of existing losses are: `LinearRegressionOutput`, which computes\n the l2-loss between its input symbol and the labels provided to it;\n `SoftmaxOutput`, which computes the categorical cross-entropy.\n\nThese and other symbols are chained together, with the output of\none symbol serving as input to the next, to build the network topology. More information\nabout the different types of symbols can be found [here](http://mxnet.io/api/python/symbol/symbol.html).", "cell_type": "markdown", "metadata": {}}, {"source": "X = mx.sym.Variable('data')\nY = mx.symbol.Variable('lin_reg_label')\nfully_connected_layer = mx.sym.FullyConnected(data=X, name='fc1', num_hidden=1)\nlro = mx.sym.LinearRegressionOutput(data=fully_connected_layer, label=Y, name=\"lro\")", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": "The above network uses the following layers:\n\n1. `FullyConnected`: The fully connected symbol represents a fully connected layer\n of a neural network (without any activation being applied), which, in essence,\n is just a linear regression on the input attributes. 
It takes the following\n parameters:\n\n - `data`: Input to the layer (specifies the symbol whose output should be fed here)\n - `num_hidden`: Number of hidden neurons in the layer, which is the same as the dimensionality\n of the layer's output\n\n2. `LinearRegressionOutput`: Output layers in MXNet compute training loss, which is\n\ta measure of inaccuracy in the model's predictions. The goal of training is to minimize the\n\ttraining loss. In our example, the `LinearRegressionOutput` layer computes the *l2* loss between\n\tits input and the labels provided to it. The parameters to this layer are:\n\n - `data`: Input to this layer (specifies the symbol whose output should be fed here)\n - `label`: The training labels against which we will compare the input to the layer for calculation of l2 loss\n\n**Note on naming convention:** the label variable's name should be the same as the\n`label_name` parameter passed to your training data iterator. The default value of\nthis is `softmax_label`, but we have updated it to `lin_reg_label` in this\ntutorial as you can see in `Y = mx.symbol.Variable('lin_reg_label')` and\n`train_iter = mx.io.NDArrayIter(..., label_name='lin_reg_label')`.\n\nFinally, the network is passed to a *Module*, where we specify the symbol\nwhose output needs to be minimized (in our case, `lro`) and the names of its data\nand label variables. The learning rate to be used during optimization and the number\nof epochs we want to train our model for are specified later, when we call `fit()`.", "cell_type": "markdown", "metadata": {}}, {"source": "model = mx.mod.Module(\n    symbol=lro,  # network structure\n    data_names=['data'],\n    label_names=['lin_reg_label']\n)", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": "We can visualize the network we created by plotting it:", "cell_type": "markdown", "metadata": {}}, {"source": "mx.viz.plot_network(symbol=lro)", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": "## Training the model\n\nOnce we have 
defined the model structure, the next step is to train the\nparameters of the model to fit the training data. This is accomplished using the\n`fit()` method of the `Module` class.", "cell_type": "markdown", "metadata": {}}, {"source": "model.fit(train_iter, eval_iter,\n          optimizer_params={'learning_rate': 0.005, 'momentum': 0.9},\n          num_epoch=20,\n          eval_metric='mse',\n          batch_end_callback=mx.callback.Speedometer(batch_size, 2))", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": "## Using a trained model (Testing and Inference)\n\nOnce we have a trained model, we can do a couple of things with it: we can either\nuse it for inference or we can evaluate the trained model on test data. Inference is shown below:", "cell_type": "markdown", "metadata": {}}, {"source": "model.predict(eval_iter).asnumpy()", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": "We can also evaluate our model according to some metric. In this example, we are\nevaluating our model's mean squared error (MSE) on the evaluation data.", "cell_type": "markdown", "metadata": {}}, {"source": "metric = mx.metric.MSE()\n# score() returns a list of (metric name, value) pairs; take the MSE value\nmse = model.score(eval_iter, metric)[0][1]\nassert mse < 0.01001, \"Achieved MSE (%f) is larger than expected (0.01001)\" % mse", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": "Let us try adding some noise to the evaluation labels and see how the MSE changes:", "cell_type": "markdown", "metadata": {}}, {"source": "eval_data = np.array([[7,2],[6,10],[12,2]])\neval_label = np.array([11.1,26.1,16.1])  # Adding 0.1 to each of the labels\neval_iter = mx.io.NDArrayIter(eval_data, eval_label, batch_size, shuffle=False)\nmodel.score(eval_iter, metric)", "cell_type": "code", "execution_count": null, "outputs": [], "metadata": {}}, {"source": "\nWe can also create a custom metric and use it to evaluate a model. 
More\ninformation on metrics can be found in the [API documentation](http://mxnet.io/api/python/model.html#evaluation-metric-api-reference).\n\n\n\n", "cell_type": "markdown", "metadata": {}}], "metadata": {"display_name": "", "name": "", "language": "python"}, "nbformat_minor": 2}