Introduction
When training your model, you are bound to get errors – after all, that's the whole point of training: to find those errors and fix them. But how exactly do we fix them so the model gets it right next time? How do we know what to change, and by how much, to get the correct result? In neural networks, the answer is backpropagation. This is the concept covered in Cassie Kozyrkov's video from the course, Making Friends with Machine Learning. But before we get into backpropagation, we need a little bit of background knowledge on neural networks…
Background Knowledge
Backpropagation is used in neural networks to correct errors during the training process. To understand it, we need a little information on how neural networks work. A neural network consists of several layers, each with several nodes, and the nodes of each layer are linked to the nodes of the next layer. Each of these links carries a weight that scales the value passing from one node to the next. When you feed in a data point, each node it reaches puts its inputs through a simple calculation (typically a weighted sum passed through an activation function, much like linear or logistic regression) and returns a value. On the way to the next node, that value is multiplied by the weight on the link between the two nodes, and the result is fed as input to the next node. This continues until the output layer is reached, where the final prediction is produced and shown to the user. Alright, now that we have the background out of the way, we can start to talk about what backpropagation is. If you are looking for a more in-depth overview of neural networks, here is an article on the subject.
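To make that flow of values concrete, here is a minimal sketch in Python of a forward pass through a tiny network. The network shape, the random weights, and the sigmoid activation are illustrative assumptions on my part, not details from the video:

```python
import numpy as np

def sigmoid(x):
    # Squash values into (0, 1); one common choice of activation function.
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, W1, W2):
    # Hidden layer: each node takes a weighted sum of the inputs
    # and passes it through the activation function.
    hidden = sigmoid(W1 @ x)
    # Output layer: weighted sum of the hidden values, activated again.
    output = sigmoid(W2 @ hidden)
    return hidden, output

# A tiny 2-input, 3-hidden-node, 1-output network with made-up weights.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 2))   # weights on the links input -> hidden
W2 = rng.normal(size=(1, 3))   # weights on the links hidden -> output

x = np.array([0.5, -1.2])      # one data point
_, y_hat = forward(x, W1, W2)
print(y_hat)                   # the network's prediction
```

Each layer is just "multiply by the link weights, then apply the node's function," repeated until the output pops out at the end.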
Summary of video
First, Cassie starts out by talking about forward propagation, which is when you feed a data point into a neural network and it passes through all the nodes and weights until it finally reaches an output value. In other words, forward propagation is just a fancy way of saying "running the neural network." But let's say you get an error in the training process. You now have to adjust the weights to try to get it right next time. Only one problem (well, two actually): which weights do I adjust, and by how much? Doing this manually would take a lot of time and effort. So the backpropagation algorithm does this in a fast and efficient manner. It works by starting from the error and working its way backward through the network to find where the error came from, and it then adjusts the corresponding weights by an adequate amount. For those of you interested in the technicality behind how it works, it is essentially repeated application of the chain rule from calculus. For those who want to know more, here is an article that explains it in more detail – but be warned, it's very technical (Link to article). Then the model gets fed another instance; if it still returns an error, the process repeats, and if it gets it right, it simply moves on to the next instance until the training is complete.
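To give a feel for the mechanics, here is a minimal sketch of one backpropagation step for the same toy network as above. The squared-error loss, the sigmoid derivative, and the learning_rate value are illustrative assumptions, not anything specified in the video:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Same illustrative 2-3-1 network as before.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 2))
W2 = rng.normal(size=(1, 3))

x = np.array([0.5, -1.2])   # one training instance
target = np.array([1.0])    # the answer it should have produced
learning_rate = 0.1         # how big a step to take (a user-set knob)

# Forward propagation: run the network and measure the error.
hidden = sigmoid(W1 @ x)
output = sigmoid(W2 @ hidden)
error = output - target     # gradient of squared-error loss w.r.t. output

# Backpropagation: walk the error backward with the chain rule.
# Gradient at the output node, using sigmoid'(z) = s * (1 - s).
delta_out = error * output * (1 - output)
grad_W2 = np.outer(delta_out, hidden)        # blame for each hidden -> output weight

# Push the error one layer further back.
delta_hidden = (W2.T @ delta_out) * hidden * (1 - hidden)
grad_W1 = np.outer(delta_hidden, x)          # blame for each input -> hidden weight

# Adjust every weight a little in the direction that shrinks the error.
W2 -= learning_rate * grad_W2
W1 -= learning_rate * grad_W1
```

Notice how the gradient at each layer is built from the gradient of the layer after it: that chaining is exactly the "working backward through the network" the video describes, and the learning_rate is what controls how much each weight gets nudged.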
My Take
Overall, I really liked this video. It was concise yet informative, technical yet understandable, and funny yet professional. The video helped me understand backpropagation – albeit at a high level. It helped me understand not only what it is and how it works, but also where it fits into the broader field of machine learning. I always like it when videos and articles help you contextualize a concept, because then you know how it fits into the whole picture and where it gets used. I also liked the visuals in the video. Some were funny and others informative, but all of them made the talk more fun and understandable, adding a little informality as a break in the middle of all the technical content. I was also fascinated by the overall concept and how it really is the thing that drives neural networks – and deep learning – at their most fundamental level. This is what allows the models to learn! Without it, there would be no neural networks, or deep learning for that matter. One thing I would like to know more about is how exactly the model figures out how much to change the weights each time. I am aware that in some software you can set it, but is that it? Is it just the user saying to change it by this much? Or is there any leeway for the model to decide how much it thinks would be an optimal change to the weights?
Conclusion
All in all, I really liked this video. It was extremely informative while still being concise. It also helped me put the idea in context – something not a lot of videos do. I also really enjoyed the topic and how fundamental it is to neural networks and deep learning. I highly recommend you watch this video when you get the chance!