Polynomial Curve Fitting in Machine Learning
In this article, we will attempt polynomial curve fitting from scratch in Python. All the required code is included in the article, and the GitHub repository for it is linked at the end.
INTRODUCTION
So, what is polynomial curve fitting? Basically, we will try to fit a polynomial function to a custom dataset and check the results. The custom dataset, which we will create in a moment, will be non-linear, and we will try to fit a degree-3 polynomial to it. We will start by importing some of the required modules.
import numpy as np
import matplotlib.pyplot as plt
import math
import time
CREATING THE DATASET
We all know the sin(x) function. We will use sin(x) as the base of our dataset: for each data point, we take the sine of the input and add some random noise to it. We are creating a dataset of 20 data points. Since the data is generated from it, sin(x) approximates our target function well, so it will be the benchmark for us. Below is the code for creating the dataset.
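A minimal sketch, with a few assumptions on my part: the input range (one period of the sine), the noise scale (0.2), and the fixed random seed may differ from what is in the notebook linked at the end.

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(0)                              # assumed seed, for reproducibility

N = 20                                         # number of data points
x = np.linspace(0, 2 * np.pi, N)               # assumed input range: one period of sin
y = np.sin(x) + np.random.normal(0, 0.2, N)    # sine of each input plus Gaussian noise

# Plot the noisy data points against the sin(x) benchmark
xs = np.linspace(0, 2 * np.pi, 200)
plt.scatter(x, y, color="blue", label="data")
plt.plot(xs, np.sin(xs), color="green", label="sin(x)")
plt.legend()
plt.show()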

The resulting plot shows the dataset: the blue points are the data points, and the green curve represents sin(x) plotted against x (the exact plotting code is in the GitHub link given at the end). As we discussed, sin(x) is the benchmark for this problem.
THEORY
Let us dive into a bit of theory before moving into the code. I couldn't include LaTeX code to typeset the mathematical equations, so the fully typeset theory is in the GitHub link given at the end; a brief plain-notation version follows below.
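We model the data with a degree-3 polynomial whose coefficients form the weight vector w:

y(x, w) = w0 + w1*x + w2*x^2 + w3*x^3

Given the N = 20 data points (x_n, t_n), the quality of a fit is measured by the Root Mean Squared Error,

L(w) = sqrt((1/N) * sum_n (y(x_n, w) - t_n)^2),

and the goal is to find the w that makes this loss as small as possible.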

Basically, this is the theory. Now, we have to find the optimal values of w for the given data; that will give us the best-fit curve. We will talk about optimizing the w vector later, but first we need some helper functions.
HELPER FUNCTIONS
We will need a few functions to help us with the optimization process. First, we need a function to predict the output for a given x, following the theory above. Next, we need a loss function, which tells us how far our predictions are from the actual outputs. We are using the Root Mean Squared Error for this: it takes the differences between the actual and predicted values, squares them, and returns the square root of their mean. Finally, we will need a function to find the gradient, i.e. the derivative of the loss with respect to any variable. It will use the formal definition of a derivative,

f'(x) = lim (h -> 0) [f(x + h) - f(x)] / h,

approximated numerically with a small fixed h.
Below is the code for the helper functions:
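A minimal sketch, assuming the names predict, loss, and grad (the notebook may use different ones) and a small forward-difference step h:

import numpy as np

def predict(x, w):
    # Evaluate the degree-3 polynomial y(x, w) = w[0] + w[1]*x + w[2]*x**2 + w[3]*x**3
    return sum(w_j * x ** j for j, w_j in enumerate(w))

def loss(w, x, t):
    # Root Mean Squared Error between predictions and targets
    yhat = predict(x, w)
    return np.sqrt(np.mean((yhat - t) ** 2))

def grad(f, w, i, h=1e-6):
    # Numerical derivative of f with respect to w[i], using a
    # forward-difference approximation to the definition above
    w_plus = w.copy()
    w_plus[i] += h
    return (f(w_plus) - f(w)) / h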
OPTIMIZATION
Now comes the most important part: finding the set of values of w for which the loss is least. We will use the Gradient Descent optimizer. Gradient descent works as follows: first, a random set of values is assigned to w, and a learning rate and a number of epochs are chosen. Then, in each epoch, a prediction is made, the gradient of the loss function with respect to each of the w's is calculated and stored in a list called grads, and each w is updated by subtracting the learning rate times its corresponding gradient from its current value. This continues for the number of epochs assigned initially. The code for the same would look like this:
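A sketch under the same assumed helper names, reusing x and y from the dataset section; the initialization is an assumption on my part:

lr = 1e-6                # learning rate
epochs = 100000          # number of epochs

w = np.random.randn(4)   # random initial weights for the degree-3 polynomial

for epoch in range(epochs):
    # gradient of the loss with respect to each w[i]
    grads = [grad(lambda v: loss(v, x, y), w, i) for i in range(len(w))]
    # update: w[i] <- w[i] - lr * grads[i]
    for i in range(len(w)):
        w[i] -= lr * grads[i]

l = loss(w, x, y)        # final loss
yhat = predict(x, w)     # final predictions

This naive numerical-gradient loop is slow, but it keeps the update rule transparent.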
I have chosen the learning rate to be 1e-6, i.e. 10 raised to the power of -6, and the process runs for 100,000 epochs. l is the final loss and yhat holds the predictions. Now we have an optimal set of values for w, so let us check our predictions.
CHECKING THE PREDICTIONS
So, here we finally check our predictions (stored in yhat earlier) by plotting them. The black points are the actual data points, the green curve is the sine curve (the benchmark for the dataset), the blue line is our predicted curve, and the red dots show the actual location of each prediction. Here is the code, and the graph is below as well:
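A sketch that reuses x, y, w, yhat, and predict from the sections above:

xs = np.linspace(0, 2 * np.pi, 200)
plt.scatter(x, y, color="black", label="data")             # actual data points
plt.plot(xs, np.sin(xs), color="green", label="sin(x)")    # benchmark sine curve
plt.plot(xs, predict(xs, w), color="blue", label="fit")    # fitted polynomial
plt.scatter(x, yhat, color="red", label="predictions")     # predictions at each data point
plt.legend()
plt.show()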

Now, let us analyze our curve with respect to the sine curve. It doesn't look much like a sine curve, yet it has generalized well: if we compare data point by data point, we find that the error is actually quite small. The curve even shows some sine-like behavior, only with much smaller crests and troughs. It would have performed better if we had either more data or a higher-degree polynomial.
GITHUB LINK
Here is the GitHub link for the repository. Before heading there, I must mention a few things. The file is a notebook (a .ipynb file). It contains everything this article contains and more: I have used another loss function in it, in addition to the one mentioned here, and the full code, its outputs, and some theory are all present there.
The same problem solved using the Julia language, by Deeptendu Santra, can be found here.
Thank you for reading this article. I hope it helped you get an idea of polynomial curve fitting.
Don’t forget to follow The Lean Programmer Publication for more such articles, and subscribe to our newsletter tinyletter.com/TheLeanProgrammer