
4) An Overview on Logistic Regression in ML


Introduction:

                               Logistic regression is a supervised ML algorithm, best known for solving binary classification problems. Binary means two and classification means categories, which means we use it to predict which of two categories an input belongs to. In simple words, here we have two classes: true (presence) or false (absence), or 1 and 0. Like the linear regression problems discussed in the earlier blogs, we give an input x and the model (discussed later) predicts an output y. One more point: the false category is called the negative class and the true category is called the positive class.

How to achieve Logistic Regression:

                             To achieve logistic regression we use the sigmoid function, which squashes any input into a value between 0 and 1. We then apply a condition to turn this into a 0-or-1 prediction: if the output f(x) is greater than 0.5 the predicted class is 1, and if it is less than 0.5 the predicted class is 0. We cannot achieve this with a linear function, as in linear regression problems, because a linear function can output a wide range of numbers, which makes it difficult to obtain classification labels like 0 and 1.

Sigmoid Formula:

                         g( z ) = 1 / ( 1 + e^(-z) )

Sigmoid Result
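
For a quick feel of what the sigmoid does, here is a minimal sketch (my own snippet, separate from the full code further below):

import numpy as np

def sigmoid(z):
    # Squash any real number into the range (0, 1)
    return 1/(1+np.exp(-z))

# Very negative inputs land near 0, very positive near 1, and 0 lands at 0.5
print(sigmoid(np.array([-5, 0, 5])))  # approximately [0.0067 0.5 0.9933]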





An overview of Logistic Regression with a single neuron:


Logistic regression neuron


As you can see, the input x is multiplied by the weight (w) and added to the bias (b), and sigma is the sigmoid activation function; after applying the sigmoid we get the output y. After getting the output y we compare this output with the actual output: if both are the same, or close, then good. We test this using the loss function: if the loss is small then good, otherwise we change the values of the weights and bias using gradient descent. We repeat this until we reach the closest values.

Loss Function Formula:

                         L( f_w,b( x[i] ), y[i] ) = -y[i] * log( f_w,b( x[i] ) ) - ( 1 - y[i] ) * log( 1 - f_w,b( x[i] ) )


Loss function Logistic regression
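
For intuition, a quick worked example: if the actual label is y[i] = 1 and the model outputs f_w,b( x[i] ) = 0.9, the loss is -log( 0.9 ) ≈ 0.105, which is small because the prediction is confident and correct. If instead the model outputs 0.1 for that same positive example, the loss is -log( 0.1 ) ≈ 2.303, which is large because the prediction is confidently wrong.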



Cost Function Formula (minimized using Gradient Descent):
                         J( w, b ) = ( 1 / m ) * sigma( L( f_w,b( x[i] ), y[i] ) )

Gradient Descent Logistic regression
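
The parameters are then updated by gradient descent; a standard form of the update rules, which the code below implements, is:

                         w = w - alpha * ( 1 / m ) * sigma( ( f_w,b( x[i] ) - y[i] ) * x[i] )
                         b = b - alpha * ( 1 / m ) * sigma( f_w,b( x[i] ) - y[i] )

where alpha is the learning rate.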



Notes:

Logistic Regression


Code:

import numpy as np
import matplotlib.pyplot as py

x_train=np.array([2,3,5,6,9,1])
y_train=np.array([0,0,1,1,1,0])

def calculate(x,w,b):
    # Linear part of the model: f(x) = w*x + b
    fx=np.dot(x,w)+b
    return fx

def sigmoid(z):
    # Sigmoid activation: squashes z into the range (0, 1)
    fx=1/(1+np.exp(-z))
    return fx

def cost(x,y,w,b):
    # Average logistic loss over all m training examples
    m=len(x)
    sum_cost=0
    for i in range(m):
        fx=sigmoid(np.dot(x[i],w)+b)
        # Clip to avoid log(0) when the model becomes very confident
        fx=np.clip(fx,1e-10,1-1e-10)
        t_cost=-(y[i]*np.log(fx))-((1-y[i])*np.log(1-fx))
        sum_cost+=t_cost
    total_cost=(1/m)*sum_cost
    return total_cost

def gradientLoss(x,y,w,b):
    # Gradients of the cost with respect to w and b
    m=len(x)
    sum_loss_w=0
    sum_loss_b=0
    for i in range(m):
        fx=sigmoid(np.dot(x[i],w)+b)
        loss=fx-y[i]
        sum_loss_w+=loss*x[i]
        sum_loss_b+=loss
    sum_loss_w=(1/m)*sum_loss_w
    sum_loss_b=(1/m)*sum_loss_b
    return sum_loss_w,sum_loss_b

def gradient(x,y,w,b,lr,iterations):
    # Gradient descent: step w and b against the gradient, keeping a history
    historyW=np.zeros(iterations)
    historyB=np.zeros(iterations)
    for i in range(iterations):
        gradientW,gradientB=gradientLoss(x,y,w,b)
        w=w-(lr*gradientW)
        b=b-(lr*gradientB)
        historyW[i]=w
        historyB[i]=b
    return historyW,historyB

def predict(x,w,b):
    # w*x + b > 0 is equivalent to sigmoid(w*x + b) > 0.5
    result=np.zeros(len(x))
    for i in range(len(x)):
        output=calculate(x[i],w,b)
        if output>0:
            result[i]=1
        else:
            result[i]=0
    return result

w=0.1
b=1
lr=0.1
historyW,historyB=gradient(x_train,y_train,w,b,lr,1000)
print("Weights are ",historyW)
print("Bias are ",historyB)
w=historyW[-1]
b=historyB[-1]
total_cost=cost(x_train,y_train,w,b)
print("Cost is ",float(total_cost))

output=predict(x_train,w,b)
print("Output is ",output)
output2=sigmoid(np.dot(x_train,w)+b)
print("Sigmoid output is ",output2)
py.scatter(output,output2,c='r')
py.title("Comparison")
py.xlabel("predict")
py.ylabel("sigmoid")
py.show()

    

    
    

3) Multiple Linear Regression


Here we have multiple features such as (x1, x2, x3, ...) instead of one feature (x1), and after feeding these features to the model we get the output (y). Before proceeding we should understand some of the terminology used in ML. Let's assume a matrix entry a[i][j]: with the help of j we move across columns (features), and with the help of i we move across rows (training examples). Furthermore, n is the total number of features.
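
A minimal NumPy sketch of this indexing convention (the array values are made up for illustration):

import numpy as np

# 2 training examples (rows i), 3 features (columns j), so n = 3
a=np.array([[2,3,5],
            [6,9,1]])
print(a[0][2])  # row i=0, column j=2 -> 5
print(a[1][0])  # row i=1, column j=0 -> 6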

How the Model Finds Values for Multiple Features:

Let's assume we have 3 features x1, x2 and x3 with three weights w1, w2, w3 and one bias b. Using these values we can find the output f(x).

f(x) = ( x1*w1 ) + ( x2*w2 ) + ( x3*w3 ) + b

After getting the f(x) value we go ahead by finding the loss with the help of the loss function, and then apply gradient descent to get the right parameters (weights and bias) for the model. We will understand all of this with time, but for now we should have a little understanding of the basic terminology, methods and approaches.

=> We have bias b, weights [w1,w2,w3...] and features [x1,x2,x3...]
=> Model: f(x) = pred_y[i] = w1*x1 + w2*x2 + ... + w(n)*x(n) + b
=> Cost function: J(w,b) = ( sigma( ( pred_y[i] - actual_y[i] )² ) ) / 2m
=> Update weight: w[j] = w[j] - learning rate * dJ/dw[j], where dJ/dw[j] = ( 1/m ) * sigma( ( pred_y[i] - actual_y[i] ) * x[i][j] )
=> Update bias: b = b - learning rate * dJ/db, where dJ/db = ( 1/m ) * sigma( pred_y[i] - actual_y[i] )

Note: Vectorization is a better approach for performing these operations than using loops, as the sketch below shows.
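
A minimal sketch of the vectorized model versus the loop version (the feature and weight values here are made up for illustration):

import numpy as np

x=np.array([1.0, 2.0, 3.0])   # features x1, x2, x3
w=np.array([0.5, 1.5, 2.5])   # weights w1, w2, w3
b=4.0

# Loop version: accumulate w[j]*x[j] one term at a time
f_loop=b
for j in range(len(x)):
    f_loop+=w[j]*x[j]

# Vectorized version: one dot product does the same work
f_vec=np.dot(w,x)+b
print(f_loop, f_vec)  # both print 15.0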

Here we have a model named f(x), and with the help of the model we predict our output (y). After predicting the value (y) we find the error using the cost function, and minimize it using gradient descent. Our parameters are perfectly tuned when our cost function is close to zero. With the help of the cost function we find the cost between actual_y (output) and predicted_y (output). If our error is high then we update the weight and bias to bring the cost close to 0.

Feature Scaling:

In simple words, this is normalization of the independent features. It helps make gradient descent fast. We use normalization when the ranges of the features differ too much, either too small or too large. In normalization we usually scale the values between 0 and 1. We can take the example of images: while training a model on images, we normalize them to make training faster. There are many types of normalization, and the two most common are given below.

1) Z-Score Normalization: Here the data (features) are normalized so that the mean of each feature becomes 0 and its standard deviation becomes 1.

               Formula:          x1 = ( x1 - u1 ) / s1,   where u1 is the mean and s1 is the standard deviation of the feature

2) Mean Normalization: Here we normalize the data with respect to the mean and the range.
             
               Formula:          x1 = ( x1 - u1 ) / ( max - min )
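
A minimal NumPy sketch of both normalizations (the feature values are made up for illustration):

import numpy as np

x1=np.array([100.0, 200.0, 300.0, 400.0])

# Z-score normalization: mean 0, standard deviation 1
z=(x1-x1.mean())/x1.std()
print(z)  # mean(z) is 0, std(z) is 1

# Mean normalization: centered on the mean, scaled by the range
m=(x1-x1.mean())/(x1.max()-x1.min())
print(m)  # values fall between -0.5 and 0.5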



Note: Nowadays feature engineering is one of the most useful things in ML, because with its help we identify new features for the model.


An overview Diagram of Linear Regression:

Linear Regression



Notes for Linear Regression:


Linear Regression formulas



I made a very simplified version of the linear regression code (Python), which is given below:

import numpy as np
import matplotlib.pyplot as plt

xFeatures=np.array([2,3])
yFeatures=np.array([330,400])
# plt.scatter(xFeatures,yFeatures,marker='x',c='r')
# plt.title("Actual Values")
# plt.xlabel("Size")
# plt.ylabel("Price")
# plt.show()
lr=0.1
m=len(xFeatures)

def calculate(x,w,b):
    # Model: f(x) = w*x + b
    y=w*x+b
    return y

def cost(x,y,w,b):
    # Squared error cost: J = (1/(2m)) * sum((f(x[i]) - y[i])^2)
    JSumWeight=0
    m=len(x)
    for i in range(m):
        f_wb=(w*x[i])+b
        JSumWeight=JSumWeight+np.power((f_wb-y[i]),2)
    totalCostW=(1/(2*m))*JSumWeight
    return totalCostW

def gDecent(x,y,w,b):
    # Gradients of the cost with respect to w and b
    sumWeights=0
    sumBias=0
    for i in range(len(x)):
        f_wb=(w*x[i])+b
        sumWeights+=(f_wb-y[i])*x[i]
        sumBias+=(f_wb-y[i])
    sumWeights=(1/m)*sumWeights
    sumBias=(1/m)*sumBias
    return sumWeights,sumBias

def resultGDecent(x,y,w,b,iterations):
    # Gradient descent loop: keep a history of w and b at every step
    recordW=np.zeros(iterations)
    recordB=np.zeros(iterations)
    for i in range(iterations):
        errorW,errorB=gDecent(x,y,w,b)
        w=w-(lr*errorW)
        b=b-(lr*errorB)
        recordW[i]=w
        recordB[i]=b
    return recordW,recordB

bias=1
weight=0.25
w,b=resultGDecent(xFeatures,yFeatures,weight,bias,1000)
print("1st Weight is ",w)
print("1st Bias is ",b)

getCost=cost(xFeatures,yFeatures,w[-1],b[-1])
print("Cost is ",getCost)
# plt.scatter(w,b,marker='x',c='r')
# plt.title("Weights and Biases over Training")
# plt.xlabel("Weight")
# plt.ylabel("Bias")
# plt.show()
print("Our predicted value (y) is ",xFeatures*w[-1]+b[-1])




Output:
After updating the weights and bias we get the predicted values below.
Our predicted value (y) is [328.882757 400.78734259]
You can check the actual y above in the code. For the features x, our model predicts these values of y after updating the weights and bias with the help of gradient descent.
For now this is enough. Today we learned some of the terminology and functions which help us train a model and update the weights and bias. We also saw the basic code for linear regression.

2) Linear Regression Model


This is a type of supervised learning algorithm. Here we predict a value after receiving some inputs. We have some features x as input (if you have multiple features then x becomes x1, x2, x3 and so on). We take these features and feed them to a trained model, and our model predicts a value as output. This is how the linear regression model works. Now let's try to go into the depths of the model.
If we break the model down at the initial stage then it looks like this:
Feature (x) --------> Model (f(x)) --------> Prediction (y)
Here f(x) = w*x + b. Maybe you are wondering what this is, so let me explain: w is a weight, x is a feature and b is a bias. Let's assume you bought 3 eggs (the egg count is the feature) and the price of one egg is 10 rupees (here the price is the weight, and the bias is 0; weight and bias can be high or low according to the situation). Now let's predict the price of the eggs as output by putting these details into the equation which we named the model:
w=10; x=3; b=0
f(x) = w*x + b --------> 10 * 3 + 0 --------> 30 (this is the prediction, which we call y)
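
The same worked example as a minimal Python sketch (the function name is mine, just for illustration):

def model(x, w, b):
    # Linear regression model: f(x) = w*x + b
    return w*x+b

# 3 eggs at 10 rupees each, bias 0
print(model(x=3, w=10, b=0))  # prints 30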


Our actual price of eggs (actual-y) is 30. This means that we have two y's: the first one is the predicted y and the second one is the actual y. Predicted y denotes the price which we predict on our own, with the help of our model, using the information we have. Actual y denotes the actual or real price of eggs according to the market. After comparing the actual y with the predicted y: if both are near to each other, meaning about the same, then our model's prediction is good; if they are not the same or near to each other then our model's prediction is not good. Like other mathematical scenarios we should have some equation to measure this error between actual-y and predicted-y, and here it is: to measure this error we have a method called the cost function. The formula is given below.

                         J( w, b ) = ( 1 / 2n ) * sigma( ( Y^[i] - Y[i] )² )

Here n is the number of training examples, Y[i] is the actual value and Y^[i] is the predicted value. We put the values into this function: if the output is close to zero then our predictions are accurate, otherwise we have to change the parameters (weight and bias) to make the predictions accurate. Maybe you are confused about how the parameters change (we use the word parameter for w and b). Actually, during training, when we try to train the model with the help of the features x, we check the actual-y against the predicted-y at every iteration; if the error is high then we change, or update, the weight and bias according to the situation. There are lots of techniques to update the weights and bias; for now just take in the overview of the concept. We will cover the complete process of how a model trains itself in upcoming tutorials. At this stage you should be familiar with the Gradient Descent technique for updating the weights and bias. The formula to update the weight and bias using gradient descent is given below.

Gradient Descent:

                         w = w - alpha * dJ/dw
                         b = b - alpha * dJ/db

Here alpha is the learning rate, which is also a parameter. You have to set the learning rate carefully. If the learning rate is too small then the weight and bias will move only slowly toward the target weight and bias (the target weight and bias are the values where your model predicts the right output). If the learning rate alpha has a very large value then we may overshoot and never reach the target weight and bias. So you have to set the right value for the learning rate so that our model learns accurately and predicts the right value. The small sketch after this figure illustrates all three cases.
Learning rate
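
A minimal sketch of this effect, minimizing the simple toy cost J(w) = w², whose gradient is dJ/dw = 2w (my own example, not the blog's model):

def step(w, lr, iterations):
    # Gradient descent on J(w) = w^2, whose gradient is 2*w
    for _ in range(iterations):
        w=w-lr*2*w
    return w

print(step(w=10.0, lr=0.01, iterations=10))  # too small: moves slowly, w is still about 8.2
print(step(w=10.0, lr=0.1,  iterations=10))  # reasonable: heads toward 0, about 1.1
print(step(w=10.0, lr=1.1,  iterations=10))  # too large: overshoots and diverges, about 62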

Example of Linear Regression using Single Neuron:

Linear Regression

This diagram shows how the model predicts a value. After today's blog, your understanding of how values are predicted should be clear.








1) Machine Learning


Introduction:

Machine Learning is a subset of Artificial Intelligence. Nowadays it is a real buzzword: if you ask almost anyone what they want to become, most people answer Machine Learning Engineer. It is not as easy as it sounds; you need a strong grip on math and statistics to become a good ML Engineer. In practice, these are programs which help us predict labels, recommend videos to people according to their search data, predict whether an email is spam or not, and so on. Many things use Machine Learning, such as self-driving cars, speech recognition systems and Google Maps. There is another buzzword called General AI, which aims to make AI act like humans.

Machine Learning Types:

ML is the field that gives computers the ability to learn without being explicitly programmed; e.g. while playing checkers, an ML algorithm works better the more data it gets. There are some types of Machine Learning algorithms, which are given below:

1) Supervised Learning (the most widely used in real-life applications)
2) Unsupervised Learning
3) Reinforcement Learning
4) Recommendation Systems

We choose the right tool for a specific problem: after identifying the problem, we can select the right algorithm.

1) Supervised Learning:

About 99% of the problems being solved nowadays use supervised learning, e.g.

                    => email     ------------->        spam(0/1)       ---------->              spam filtering
                    => audio     ------------->        text transcripts         --->              speech recognition
                    => English  ------------->        Spanish        ------------>              Machine Translation
                    => ad, user info    ------>        click(0/1)         --------->              online advertising
                    => image, radar info  -->       position of other cars  ->              self driving cars
                    => image of phone  ---->       defect(0/1)      ---------->              visual inspection

Regression and classification are the types of supervised learning.
  => Classification problems predict categories (not arbitrary numbers). For example, you are classifying emails into two categories: spam or not spam. You can say class A is spam and class B is not spam. In a classification problem your model will be able to tell the difference between these two classes after training. Don't assume the categories are limited to two; I chose the example of binary classification because there we deal with 2 classes, spam (1) and not spam (0), but there can be multiple classes in a classification problem. (Here I have one of the most important points: your model can only understand numbers. You have to convert everything into numbers, e.g. images and text.) We also have to convert the categories into numbers like 0 and 1; if you have more than two categories then convert them into 0, 1, 2, 3 and so on, as in the sketch below. This is really fun.
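
A minimal sketch of turning categories into numbers (the email list here is made up for illustration):

# Map each category to an integer label the model can understand
categories=["not spam", "spam"]          # not spam -> 0, spam -> 1
label_of={name: i for i, name in enumerate(categories)}

emails=["not spam", "spam", "spam", "not spam"]
y=[label_of[e] for e in emails]
print(y)  # [0, 1, 1, 0]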

Classification

  => Regression problems predict from infinitely many possible numbers. This is also one of the famous types of supervised learning. Here we can take the example of a house price prediction system, where you have to estimate the price of a house, and this can be any of infinitely many numbers. Here you are not dealing with categories or classes; you are dealing with numbers. You feed the training features x (features are attributes of the house, like location, size, rooms, bathrooms etc.) to the model along with the training target y, the price. After feeding the data, your model starts finding the pattern by adjusting the parameters, weight and bias (we will see the whole mechanism later in the blogs). The weights and bias adjust in such a way that when the features x are multiplied by the weight (w) and added to the bias (b), you get your predicted price (y = w*x + b), as in the sketch below. I know this is a little hard to understand at this stage, but don't worry, we will cover everything in detail later. We will also look at the model training mechanism, testing mechanism and so on, and I will provide code to help with understanding.
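
A minimal sketch of that prediction step (the feature values and already-trained parameters here are my own, just for illustration):

import numpy as np

# Hypothetical house: [size, rooms, bathrooms]
x=np.array([10.0, 3.0, 2.0])
# Hypothetical already-trained weights and bias
w=np.array([5.0, 2.0, 1.0])
b=3.0

predicted_price=np.dot(x,w)+b
print(predicted_price)  # 61.0 -- a number, not a category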

Regression



2) Unsupervised Learning:

Here we find something interesting in unlabeled data, e.g. with clustering algorithms such as k-means.
Note: Google also uses clustering in Google News to show related articles, by finding similar tags or words.
Here you do not have input x paired with output y to train the model, so instead we find structure in the data through clustering, anomaly detection (finding unusual data points) and dimensionality reduction (compressing a big data set into a smaller one using fewer numbers). A clustering sketch follows below.
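
A minimal clustering sketch, assuming scikit-learn is installed (the data points are made up for illustration):

import numpy as np
from sklearn.cluster import KMeans

# Unlabeled 2-D points forming two loose groups -- no y is given
X=np.array([[1, 2], [1, 4], [2, 3],
            [8, 8], [9, 10], [8, 9]])

# k-means searches for 2 clusters on its own
labels=KMeans(n_clusters=2, n_init=10).fit_predict(X)
print(labels)  # e.g. [0 0 0 1 1 1] -- points in the same group share a cluster id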

Supervised vs Unsupervised


Note: The remaining two types of algorithms will be covered in upcoming tutorials.