Let us try to implement logistic regression from scratch in Python.
Recommended to be read after the Neural Networks release.
Importing necessary libraries
import numpy as np                # linear algebra
import pandas as pd               # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt   # plotting
The dataset we will be using is the Pima Indians Diabetes Database, whose objective is to diagnostically predict whether or not a patient has diabetes.
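Before separating the label, the dataset needs to be loaded into a pandas DataFrame named data. A minimal sketch, assuming the CSV has been downloaded locally as diabetes.csv (the filename is an assumption):

# load the dataset; the filename below is assumed
data = pd.read_csv("diabetes.csv")
print(data.head())  # quick look at the features and the Outcome column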
y = data.Outcome.values
x_data = data.drop(["Outcome"], axis=1)
After this, the data is divided into y, which is the desired classification column (the label), and x_data, which contains the various features of the dataset.
Normalization of the data
The value of the column DiabetesPedigreeFunction varies from 0.08 to 2.48, while Insulin varies from 0 to 848. We normalize the data so that both columns get equal weightage.
x = (x_data - np.min(x_data)) / (np.max(x_data) - np.min(x_data)).values
Upon normalization, the data is converted to the range of 0 – 1.
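As a quick sanity check (a small sketch, assuming x is the normalized DataFrame from above), we can confirm that every column now lies between 0 and 1:

# every column's minimum should now be 0 and its maximum 1
print(x.min())
print(x.max())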
After this, the data is split into training and testing datasets.
from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.25, random_state = 42)
# transpose so that the features lie along the rows, as the functions below expect
x_train = x_train.T
x_test = x_test.T
We do this using the built-in train_test_split method from the sklearn library.
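Since everything that follows assumes the features lie along the rows, it can help to verify the shapes after transposing; a small sketch:

print("x_train shape:", x_train.shape)  # (number_of_features, number_of_training_samples)
print("x_test shape:", x_test.shape)    # (number_of_features, number_of_test_samples)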
Defining Necessary Functions
Initialize the weights and biases
The weights and biases are called the parameters of the model. Each feature in the training dataset is given a certain weight. Let's start by assigning them a small initial value, 0.01.
def initialize_weights_and_bias(dimension):
    w = np.full((dimension, 1), 0.01)  # one weight per feature, all set to 0.01
    b = 0.0                            # scalar bias
    return w, b
Sigmoid function
Logistic regression predicts the label by making a linear decision boundary, but it would make no sense to interpret an output greater than 1 or less than 0 as a probability. Here the sigmoid function comes into play. This function ranges between 0 and 1 and is stated as sigmoid(z) = 1 / (1 + e^(-z)).
def sigmoid(z):
    y_head = 1 / (1 + np.exp(-z))
    return y_head
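A quick illustration of that 0 to 1 range (a sketch; the printed values are approximate):

print(sigmoid(0))    # 0.5, exactly on the decision boundary
print(sigmoid(10))   # ~0.99995, confident prediction of class 1
print(sigmoid(-10))  # ~0.00005, confident prediction of class 0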
To get the value of the parameter z used in the sigmoid function, we use the formula z = w^T x + b, where x is the features array, w the weights, and b the bias. Now, to make our model learn, we need to penalize it for wrong predictions. This is done by the loss function, which is stated as loss = -(y*log(y_head) + (1 - y)*log(1 - y_head)).
And the cost function is effectively the loss summed over all the training examples and divided by their number, i.e. the average loss.
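To make the penalty concrete, here is a small sketch (the probabilities are made-up values) showing how the loss grows as the prediction for a diabetic patient (y = 1) drifts away from 1:

# loss for a single example with true label y = 1
for y_head in [0.9, 0.5, 0.1]:
    loss = -(1 * np.log(y_head) + (1 - 1) * np.log(1 - y_head))
    print(y_head, round(loss, 3))  # prints 0.105, 0.693 and 2.303 respectively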
This process of running the inputs through the model to produce predictions and a cost is known as forward propagation.
If we recall, we allotted initial weights to the parameters; they will now be updated based on our loss and cost functions. To minimize the cost function, we use gradient descent, which is: w = w - alpha * (d(cost)/dw)
Here w denotes the weights and alpha denotes the stepsize, i.e. the factor by which we scale the gradient while searching for the local minimum, and it is multiplied by the derivative of the cost function. To sum it all up, the algorithm works as follows:
We start from a random point on the cost surface and calculate its slope; then we find the direction in which the loss function decreases and update the weights using the gradient descent formula. This part is known as backpropagation. We then move to the new point by taking a step of size alpha and repeat the entire process again.
Alpha is sometimes also referred to as the learning rate.
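The same idea can be seen on a toy one-dimensional problem, a sketch unrelated to our dataset: minimizing f(w) = w**2, whose derivative is 2*w, with a learning rate of 0.1:

w = 4.0              # arbitrary starting point
alpha = 0.1          # stepsize / learning rate
for step in range(5):
    gradient = 2 * w            # derivative of w**2 at the current point
    w = w - alpha * gradient    # gradient descent update
    print(step, round(w, 4))    # w steadily moves toward the minimum at 0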
Finally, we write the code for forward and backward propagation combined, as backward propagation also uses the same z found in forward propagation.
def forward_backward_propagation(w, b, x_train, y_train):
    # forward propagation
    z = np.dot(w.T, x_train) + b
    y_head = sigmoid(z)
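    # the lines below are a sketch of the remaining steps, inferred from the loss and
    # cost definitions above and from what the update() function further down expects
    loss = -y_train * np.log(y_head) - (1 - y_train) * np.log(1 - y_head)
    cost = np.sum(loss) / x_train.shape[1]  # average loss over all training samples
    # backward propagation: gradients of the cost with respect to the weights and bias
    derivative_weight = np.dot(x_train, (y_head - y_train).T) / x_train.shape[1]
    derivative_bias = np.sum(y_head - y_train) / x_train.shape[1]
    gradients = {"derivative_weight": derivative_weight, "derivative_bias": derivative_bias}
    return cost, gradients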
After we have calculated the cost and the gradients, we need to update the initially assigned weights and bias. To do this, we use another update function.
def update(w, b, x_train, y_train, learning_rate, number_of_iterations):
    cost_list = []
    cost_list2 = []
    index = []
    # updating (learning) the parameters number_of_iterations times
    for i in range(number_of_iterations):
        # make forward and backward propagation and find cost and gradients
        cost, gradients = forward_backward_propagation(w, b, x_train, y_train)
        cost_list.append(cost)
        # lets update
        w = w - learning_rate * gradients["derivative_weight"]
        b = b - learning_rate * gradients["derivative_bias"]
        if i % 10 == 0:
            cost_list2.append(cost)
            index.append(i)
            print("Cost after iteration %i: %f" % (i, cost))
    parameters = {"weight": w, "bias": b}
    return parameters, gradients, cost_list
After this function is called, our model has successfully calculated the weight and bias values using the forward and backward propagation methods. This process is also used in neural networks, i.e. they repeatedly update weights using forward and backward propagation.
Now on to the most awaited part: generating predictions. We do that by defining a predict function as follows.
def predict(w, b, x_test):
    # x_test is the input for forward propagation
    z = sigmoid(np.dot(w.T, x_test) + b)
    Y_prediction = np.zeros((1, x_test.shape[1]))
    # if z is bigger than 0.5, our prediction is one, meaning the patient has diabetes (y_head = 1);
    # if z is 0.5 or smaller, our prediction is zero, meaning the patient does not have diabetes (y_head = 0)
    for i in range(z.shape[1]):
        if z[0, i] <= 0.5:
            Y_prediction[0, i] = 0
        else:
            Y_prediction[0, i] = 1
    return Y_prediction
Combining all the functions
def logistic_regression(x_train, y_train, x_test, y_test, learning_rate, num_iterations):
    # initialize
    dimension = x_train.shape[0]
    w, b = initialize_weights_and_bias(dimension)
    parameters, gradients, cost_list = update(w, b, x_train, y_train, learning_rate, num_iterations)
    y_prediction_test = predict(parameters["weight"], parameters["bias"], x_test)
    # Print train/test Errors
    print("test accuracy: {} %".format(100 - np.mean(np.abs(y_prediction_test - y_test)) * 100))
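With all the pieces in place, we can train and evaluate the model with a single call. A minimal sketch; the learning rate and number of iterations below are illustrative choices, not tuned values:

logistic_regression(x_train, y_train, x_test, y_test, learning_rate = 1, num_iterations = 100)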
For comparison, here is the same task solved with sklearn's built-in LogisticRegression:

from sklearn.linear_model import LogisticRegression

lr = LogisticRegression()
lr.fit(x_train.T, y_train.T)
print("Test Accuracy {}".format(lr.score(x_test.T, y_test.T)))
That was all about implementing logistic regression without Python's inbuilt machine learning libraries (sklearn).