# How to visualize Gradient Descent using Contour plot in Python

Linear Regression is typically the introductory chapter of Machine Learning, and Gradient Descent is in all probability the first optimization algorithm anybody learns. Most of the time, the instructor uses a Contour Plot to explain the path of the Gradient Descent optimization algorithm. I used to wonder how to create those Contour plots. Today I will attempt to show how to visualize Gradient Descent using a Contour plot in Python.

A Contour Plot is like a 3D surface plot, where the 3rd dimension (Z) gets plotted as constant slices (contours) on a 2-dimensional surface. The left plot in the image below shows a 3D plot and the right one is the Contour plot of the same 3D plot. You can see how the 3rd dimension (Y here) has been converted to contours of colors (and lines). The important part is that the value of Y is always the same along a contour line, for all the values of X1 & X2. Before jumping into gradient descent, let's understand how to actually plot a Contour plot using Python. Here we will be using Python's most popular data visualization library, matplotlib.

## Data Preparation:

I will create two vectors (numpy arrays) using the np.linspace function, spreading 100 points evenly between -100 and +100.
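A minimal sketch of this step:

```python
import numpy as np

# 100 evenly spaced points between -100 and +100 for each axis
x1 = np.linspace(-100, 100, 100)
x2 = np.linspace(-100, 100, 100)
```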

If we simply make a scatter plot using x1 and x2, it will look like the following. Now, in order to create a contour plot, we will use np.meshgrid to convert x1 and x2 from (1 x 100) vectors to (100 x 100) matrices.

## np.meshgrid():

Let's look at what np.meshgrid() actually does. It takes 2 parameters; in this case we will pass 2 vectors. So let's create a 1 x 3 vector and invoke the np.meshgrid() function. By the way, it returns 2 matrices, not just one.
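A small self-contained sketch of that call:

```python
import numpy as np

v = np.array([1, 2, 3])      # a 1 x 3 vector
a1, a2 = np.meshgrid(v, v)   # returns TWO 3 x 3 matrices

# a1 repeats the vector along the rows,
# a2 repeats it along the columns.
print(a1)
print(a2)
```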

If you look at a1 and a2, you will see that they are both now 3 x 3 matrices: a1 has repeated rows and a2 has repeated columns. The np.meshgrid() function creates a grid of values where every intersection is a combination of the two input values.

To understand this visually, if you look at the 3D plot in the first image, we have now created the bottom plane of that 3D plot: a mesh/grid.

Once the mesh/grid values have been created, we can now create the data for the 3rd (virtual) dimension. Here I am simply using an ellipse function. Y will also be a 100 x 100 matrix.

\[
y = x_1^2 + x_2^2
\]
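Computed on the grid, that formula is a one-liner:

```python
import numpy as np

x1 = np.linspace(-100, 100, 100)
x2 = np.linspace(-100, 100, 100)
X1, X2 = np.meshgrid(x1, x2)   # both 100 x 100

# Evaluate y = x1^2 + x2^2 at every grid point; Y is also 100 x 100
Y = X1 ** 2 + X2 ** 2
```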

Before even creating a proper contour plot, if we simply plot the values of X1 & X2 and choose the color scale according to the values of Y, we can easily visualize the graph as follows:

## plt.contour() and plt.contourf():

We will use matplotlib's contour() and contourf() functions to create the contour plot. We just need to call the function, passing three matrices. You can see that the scatter plot and the contour plot look kind of similar; however, we get much more control when creating the Contour plot than with the scatter plot.
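A self-contained sketch of both calls, including the optional levels argument discussed below (the specific level values, colors, and line widths here are illustrative choices, not from the original post):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")   # non-interactive backend, safe for scripts
import matplotlib.pyplot as plt

x1 = np.linspace(-100, 100, 100)
x2 = np.linspace(-100, 100, 100)
X1, X2 = np.meshgrid(x1, x2)
Y = X1 ** 2 + X2 ** 2

# Pin the contours at chosen Y values via the levels parameter
levels = [500, 2000, 5000, 10000, 15000]
plt.contourf(X1, X2, Y, levels=levels, extend="both")              # filled
cs = plt.contour(X1, X2, Y, levels=levels, colors="black",
                 linewidths=0.5)                                   # lines on top
plt.savefig("contour_demo.png")
```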

### Fill Contour Plot:

The contourf() function can be used to fill the contour plot. We can also change the line style and width. Please refer to matplotlib's developer documentation for other available options.

### Choose custom ranges:

We will look at one more important feature of the plotting library. We can define the levels where we want to draw the contour lines using the levels (4th) parameter of both the contour() and contourf() functions. The code below sets fixed levels at different Y values.

• We will be using the Advertising data for our demo here.
• We will load the data first using the pandas library.
• The sales column will be the response/target variable.
• TV and radio will be the predictors.
• We use StandardScaler to normalize the data (μ = 0 and σ = 1).
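The steps above can be sketched as follows. The Advertising.csv file name is an assumption, and a tiny inline DataFrame stands in for the real CSV here so the snippet is self-contained:

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# With the real data set: df = pd.read_csv("Advertising.csv")  # assumed file name
# Stand-in frame with the same assumed columns:
df = pd.DataFrame({
    "TV":    [230.1, 44.5, 17.2, 151.5, 180.8],
    "radio": [37.8, 39.3, 45.9, 41.3, 10.8],
    "sales": [22.1, 10.4, 9.3, 18.5, 12.9],
})

X = df[["TV", "radio"]].values   # predictors
y = df["sales"].values           # response/target

# Normalize each predictor column to mu = 0, sigma = 1
X = StandardScaler().fit_transform(X)
```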

## Calculate Gradient and MSE:

We use the following function to calculate the MSE and its derivative w.r.t. w.
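The post's exact function is not shown here, but for a linear model y_hat = Xw a standard implementation looks like this (the name gradient_mse is my own):

```python
import numpy as np

def gradient_mse(w, X, y):
    """Return (gradient of MSE w.r.t. w, MSE) for a linear model y_hat = X @ w."""
    error = X @ w - y
    mse = (error ** 2).mean()
    grad = 2 * (X.T @ error) / len(y)   # d(mse)/dw
    return grad, mse
```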

Next, we choose a starting point for w, set the learning rate hyper-parameter to 0.1, and set the convergence tolerance to 1e-3.

We also create two more arrays, one for storing all of the intermediate w values and one for the MSE values.

## Gradient Descent Loop:

Below is the loop for Gradient Descent, where we update w based on the learning rate. We also capture the w and MSE values at every 10th iteration.
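A self-contained sketch of the setup and the loop. The starting point and the synthetic stand-in data (used here in place of the Advertising set) are assumptions for illustration:

```python
import numpy as np

# Stand-in for the normalized predictors/target (illustrative only)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = X @ np.array([3.9, 2.8]) + rng.normal(scale=0.5, size=100)

w = np.array([-40.0, -40.0])   # assumed starting point
alpha = 0.1                    # learning rate
tolerance = 1e-3               # convergence tolerance
old_w, errors = [], []         # intermediate w and mse values

for i in range(1000):
    error = X @ w - y
    mse = (error ** 2).mean()
    grad = 2 * (X.T @ error) / len(y)   # derivative of mse w.r.t. w
    w_new = w - alpha * grad            # the gradient descent update
    if i % 10 == 0:                     # capture every 10th iteration
        old_w.append(w_new)
        errors.append(mse)
    if np.all(np.abs(w_new - w) < tolerance):   # converged
        w = w_new
        break
    w = w_new

print(w)   # converged weights
```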

That's all; you can see that w converges to the following values.

Note: You can refer to my other tutorial on gradient descent, where I have explained the math and the program step by step.

Univariate Linear Regression using Octave – Machine Learning Step by Step

Before we start writing the code for the Contour plot, we need to take care of a few things. First, convert the list (old_w) to a numpy array.

Then I am adding 5 extra levels manually, just to make the Contour plot look better. You can skip them.

Finally, convert the errors list to a numpy array, sort it, and save it as the levels variable. We need to sort the level values in ascending order, since that is the way the contour() function expects them.
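Those three steps can be sketched as follows; the captured old_w and errors values, and the 5 extra levels, are illustrative stand-ins:

```python
import numpy as np

# Stand-in values captured during the descent (illustrative only)
old_w = [np.array([-40.0, -40.0]), np.array([-7.0, -5.8]), np.array([2.1, 1.9])]
errors = [732.0, 256.0, 205.0]

old_w = np.array(old_w)                 # list -> (n, 2) array

extra = [1.0, 5.0, 20.0, 50.0, 100.0]   # 5 hand-picked extra levels, cosmetic
levels = np.sort(np.array(errors + extra))   # contour() expects ascending levels
```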

It's always helpful to see the result first before going through the code. Here is the plot of our gradient descent algorithm that we will be creating next.

## Prepare Axis (w0, w1)

As we did earlier, we need to create the w0 and w1 (X1 and X2) vectors (1 x 100). Last time we used the np.linspace() function and picked some arbitrary values; here we will use the converged values of w to create a space around them.

Our w0 array will be 100 equally spaced values between -w * 5 and +w * 5. The same goes for w1.

The mse_vals variable is just a placeholder.
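A sketch of this setup, assuming illustrative converged weights:

```python
import numpy as np

w = np.array([3.9, 2.8])   # assumed converged weights, for illustration

# 100 equally spaced values around each converged weight
w0 = np.linspace(-w[0] * 5, w[0] * 5, 100)
w1 = np.linspace(-w[1] * 5, w[1] * 5, 100)

mse_vals = np.zeros((100, 100))   # placeholder for the 3rd dimension
```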

Last time we used the ellipse formula to create the 3rd dimension; however, here we need to manually calculate the MSE for each combination of w0 and w1.

Note: There is a shortcut available for the code below; however, I wanted to keep it this way since it's easy to see what is happening.

## Prepare the 3rd Dimension:

We will loop through each value of w0 and w1, then calculate the MSE for every combination. This way we will populate our 100 x 100 mse_vals matrix.

This time we are not using meshgrid; however, the concept is the same.
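A sketch of the double loop, again with stand-in data in place of the Advertising set:

```python
import numpy as np

# Stand-in normalized predictors/target (illustrative only)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = X @ np.array([3.9, 2.8])

w0 = np.linspace(-20, 20, 100)
w1 = np.linspace(-15, 15, 100)
mse_vals = np.zeros((100, 100))

# No meshgrid here: iterate over every (w0, w1) combination directly.
# Row index follows w1, column index follows w0, matching contour()'s layout.
for i, a in enumerate(w0):
    for j, b in enumerate(w1):
        pred = X @ np.array([a, b])
        mse_vals[j, i] = ((pred - y) ** 2).mean()
```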

## Final Plot:

We have w0, w1 and mse_vals (the 3rd dimension); now it's fairly easy to create the contour plot as we saw earlier.

• Use the contourf() function first. Pass the levels we created earlier.
• Plot two axis lines at w0=0 and w1=1.
• Call the plt.annotate() function in a loop to create the arrows which show the convergence path of the gradient descent. We will use the saved w values for this; the MSE for these w values has already been calculated.
• Invoke the contour() function for the contour line plot.
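Putting the steps above together, a hedged sketch (the MSE surface, descent path, and level values are all illustrative stand-ins):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt

# Stand-in surface and descent path (illustrative values only)
w0 = np.linspace(-20, 20, 100)
w1 = np.linspace(-15, 15, 100)
W0, W1 = np.meshgrid(w0, w1)
mse_vals = W0 ** 2 + W1 ** 2
old_w = np.array([[-15.0, -10.0], [-6.0, -4.0], [-1.0, -0.5], [0.3, 0.2]])
errors = [732.0, 256.0, 205.0, 180.0]
levels = np.sort(np.array(errors + [1.0, 5.0, 20.0, 50.0, 100.0]))

plt.contourf(W0, W1, mse_vals, levels=levels, alpha=0.7)   # filled contours first
plt.axhline(0, color="black", linewidth=0.5)               # axis lines
plt.axvline(0, color="black", linewidth=0.5)

# Arrows along the captured descent path
for k in range(len(old_w) - 1):
    plt.annotate("", xy=tuple(old_w[k + 1]), xytext=tuple(old_w[k]),
                 arrowprops=dict(arrowstyle="->", color="red"))

cs = plt.contour(W0, W1, mse_vals, levels=levels, colors="black",
                 linewidths=0.5)                            # contour lines on top
plt.xlabel("w0")
plt.ylabel("w1")
plt.savefig("gd_contour.png")
```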

Notice the MSE values decreasing from 732 -> 256 -> 205 -> … and so on. Gradient Descent has converged nicely here.

Contour plots are very helpful for visualizing complex structures in a simple way. Later we will use this same approach for Ridge and Lasso regression.

I hope this "How to visualize Gradient Descent using Contour plot in Python" tutorial will help you build much more complex visualizations.
