"99 little bugs in the code, 99 little bugs. Take one down, patch it around, 117 little bugs in the code."
In this blog post we dig into the heart of machine learning: the optimization process. Along the way we will touch on the concepts of loss and gradient descent.
In the previous post, we discussed the way data is managed when training an AI model. In this post I would like to return to the machine learning process itself and go into detail on a key aspect: optimization. This is the mechanism that adjusts a model's parameters to improve its accuracy, and it is likely what comes to mind when you think of the "learning" in machine learning.
At the core of every model are its parameters: the variables the model adjusts during training in order to make more accurate predictions. Adjusting these parameters begins with what is known as a loss function. The loss function measures the difference between the model's predictions and the actual values in the training data; this measurement is known as the loss. The goal of optimization is to minimize the loss, because a lower loss means a smaller gap between the model's predictions and the actual values, which in turn means a more accurate model.
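To make this concrete, here is a minimal sketch of one common loss function, mean squared error (MSE). The function name and the sample values are just made up for illustration:

```python
import numpy as np

def mse_loss(predictions, targets):
    """Mean squared error: the average of the squared
    differences between predictions and actual values."""
    return np.mean((predictions - targets) ** 2)

# Predictions close to the targets give a small loss.
y_pred = np.array([2.5, 0.0, 2.1])
y_true = np.array([3.0, -0.5, 2.0])
print(mse_loss(y_pred, y_true))  # ~0.17
```

Squaring the differences keeps errors in either direction from canceling out and penalizes large misses more heavily than small ones.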
The optimization process relies on the model's predictions and the loss function. During training the model makes predictions on the training data, and the loss function calculates the loss. Then, using the loss as a reference point, the optimizer adjusts the model's parameters with the goal of lowering the loss on the next pass. This process is repeated many times until the loss is minimized and the model's predictions are as accurate as possible. One of the fundamental algorithms for doing this is known as gradient descent.
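Before getting to gradient descent, here is a deliberately naive sketch of that predict, measure, adjust loop. This is not how real optimizers work; it simply nudges a single weight in each direction and keeps whichever candidate lowers the loss. The toy data and step size are assumptions chosen for the example:

```python
import numpy as np

def mse_loss(w, x, y):
    return np.mean((w * x - y) ** 2)

# Toy data generated from y = 2x, so the ideal weight is 2.0.
x = np.array([1.0, 2.0, 3.0])
y = 2.0 * x

w = 0.0      # initial guess for the parameter
step = 0.5   # size of each trial adjustment
for _ in range(100):
    # Try nudging the parameter in both directions and
    # keep whichever candidate gives the lowest loss.
    candidates = [w - step, w, w + step]
    w = min(candidates, key=lambda c: mse_loss(c, x, y))

print(w)  # converges to 2.0
```

This guess-and-check approach works for one parameter, but it scales terribly to the millions of parameters in a real model, which is exactly the problem gradient descent solves.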
Gradient descent is an algorithm that uses derivatives from calculus to find better parameters. The gradient of the loss with respect to a parameter points in the direction of steepest ascent, so by moving each parameter a small step in the opposite direction (new value = old value - learning rate x gradient), the optimizer descends toward a minimum of the loss.
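Here is a minimal sketch of gradient descent fitting a straight line y = w*x + b with the MSE loss. The learning rate, iteration count, and toy data are assumptions chosen for the example:

```python
import numpy as np

# Toy data generated from the line y = 2x + 1.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

w, b = 0.0, 0.0        # parameters to learn
learning_rate = 0.05

for _ in range(1000):
    y_pred = w * x + b
    error = y_pred - y
    # Gradients of the MSE loss with respect to w and b.
    grad_w = 2 * np.mean(error * x)
    grad_b = 2 * np.mean(error)
    # Step against the gradient: the direction of steepest descent.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(w, b)  # approaches w = 2.0, b = 1.0
```

Each iteration computes the gradients analytically and takes a small step downhill; a smaller learning rate makes the steps more cautious at the cost of slower convergence, while one that is too large can overshoot the minimum entirely.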
This is just a small glimpse into the world of machine learning, but it sets the foundation for future concepts. Stay tuned for my next post, in which we will look into neural networks. Be sure to leave any feedback in the comments below!