Machine Learning through Predictive Analysis using Multi Linear Regression in R with an example

Rajnilari2015
Posted in R Language category for Beginner level

In the current topic, we will learn how to perform Machine Learning through Predictive Analysis using Multi Linear Regression in R with an example.



Introduction

In the previous article, we saw how to use Machine Learning through Predictive Analysis using simple Linear Regression in R with an example. This article extends that approach to Multi Linear Regression, where the prediction depends on more than one variable.

Multi Linear Regression can be defined as follows:

Multiple linear regression attempts to model the relationship between two or more explanatory variables and a response variable by fitting a linear equation to observed data. Every value of the independent variable x is associated with a value of the dependent variable y.

We must bear in mind that multiple linear regression has more than one independent variable. What this means will be clarified below.

We will use RStudio for this purpose.

Let's start with an example

Suppose we have a set of car data in the file "Car Sample Data.csv", including the columns Make, Price, CylinderVolume, Year and MileagePerKM.

Let's say Price is the response variable (y), and CylinderVolume (x1), Year (x2) and MileagePerKM (x3) are the predictor variables. This is a linear relationship between one response variable and multiple predictor variables, of the form

 y = a + b1x1 + b2x2 + b3x3 + ... + bnxn .

where,

	y - dependent/response variable
	x1/x2/x3..xn - independent/predictor variable(s)
	a,b1,b2,b3...bn - a is the intercept and b1..bn are the coefficients.
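
As a minimal sketch of this formula in R, the helper below evaluates a + b1x1 + ... + bnxn for a vector of coefficients and a vector of predictor values. The function name linear_predict and all the numbers are assumptions made for illustration; they do not come from the car data.

```r
# Hypothetical helper: evaluate y = a + b1*x1 + ... + bn*xn.
# a: intercept, b: numeric vector of coefficients, x: numeric vector of predictor values.
linear_predict <- function(a, b, x) {
  stopifnot(length(b) == length(x))
  a + sum(b * x)
}

# Made-up intercept, coefficients and inputs, for illustration only
y <- linear_predict(a = 10, b = c(2, 0.5, -1), x = c(3, 4, 5))
print(y)  # 10 + 6 + 2 - 5 = 13
```

Training a model amounts to finding the values of a and b1..bn that fit the historical data best; prediction is then just this evaluation with new x values.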

What are we going to solve?

The data presented above is a set of training data (historical data). Using this training data, we have to train our predictive model with the Multiple Linear Regression algorithm. Once the model is trained, i.e. the machine has learnt the relationship, we will predict y for new values of the predictor variables.

Straight to experiment

Open RStudio. First we will establish a relationship model between x1/x2/x3 (the predictor variables) and y (the response variable) and obtain the coefficient values. For this we will use R's lm function, which fits a linear model relating the predictors to the response variable.

# Load data from the CSV file
input <- read.csv('d:/Car Sample Data.csv', sep = ',', quote = "\"", check.names = FALSE)

# Create the relationship model
model <- lm(Price ~ CylinderVolume + Year + MileagePerKM, data = input)

# Print the intercept and coefficients
print(model)

From the intercept and coefficient values printed by the model, we obtain the equation of the Multi Linear Regression model:

y = -4.683e+07 + x1 * 2.534e+02 + x2 * 2.339e+04 + x3 * -2.854e-01
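
Rather than copying the numbers from the printed output, the intercept and coefficients can also be read from the fitted model with coef(). The sketch below shows this on a small made-up data frame so that it runs without "Car Sample Data.csv". The name sample_cars and every value in it are assumptions: the prices are generated from a known linear rule, not taken from the real data.

```r
# Hypothetical, noise-free data that follows a known linear rule,
# used only so the example runs without "Car Sample Data.csv".
sample_cars <- data.frame(
  CylinderVolume = c(1200, 1500, 1800, 2000, 2400, 1600),
  Year           = c(2010, 2012, 2015, 2011, 2018, 2014),
  MileagePerKM   = c(80000, 60000, 30000, 90000, 20000, 50000)
)
sample_cars$Price <- -900000 + 200 * sample_cars$CylinderVolume +
  500 * sample_cars$Year - 1.5 * sample_cars$MileagePerKM

fit <- lm(Price ~ CylinderVolume + Year + MileagePerKM, data = sample_cars)

# coef() returns a named numeric vector: (Intercept) first, then one entry per predictor
b <- coef(fit)
print(b)

# Rebuild the equation by hand for the first row and compare with the fitted value
manual <- b[["(Intercept)"]] +
  b[["CylinderVolume"]] * sample_cars$CylinderVolume[1] +
  b[["Year"]]           * sample_cars$Year[1] +
  b[["MileagePerKM"]]   * sample_cars$MileagePerKM[1]
print(all.equal(manual, unname(fitted(fit)[1])))
```

Because the made-up prices contain no noise, the fitted coefficients should match the rule used to generate them; on real data they are estimates.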

So far we have trained our model using the training dataset, which means our machine has learnt the relationship. The next step is to predict. Say we would like to predict the price of a TOYOTA car whose CylinderVolume = 2000, Year = 2020 and MileagePerKM = 90000.

Using the equation obtained above directly would be misleading, because that model was fitted on cars of every make. To get a model specific to TOYOTA, we first filter the records by Make, fit the model on those records, and then apply the predict function as shown below.

# Load all data from the CSV file
input <- read.csv('d:/Car Sample Data.csv', sep = ',', quote = "\"", check.names = FALSE)

# Filter data based on Make
filterRecord <- input[input$Make == 'TOYOTA', ]

# Print the filtered result
print(filterRecord)

# Create the relationship model
RelationModel <- lm(Price ~ CylinderVolume + Year + MileagePerKM, data = filterRecord)

# Print the intercept and coefficients
print(RelationModel)

Thus we obtain the equation for TOYOTA cars, which is

y = -4.838e+07 + x1 * 3.630e+02 + x2 * 2.415e+04 + x3 * -1.432e+00
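
One detail of the filtering step is worth a short sketch. Logical indexing such as input[input$Make == 'TOYOTA',] returns an all-NA row for every record whose Make is missing, whereas subset() drops those records. The data frame below is made up for illustration, and it is only an assumption that the real CSV could contain missing Make values.

```r
# Hypothetical records; the NA in Make stands in for a missing value in the CSV
records <- data.frame(
  Make  = c("TOYOTA", "HONDA", NA, "TOYOTA"),
  Price = c(500000, 450000, 300000, 650000),
  stringsAsFactors = FALSE
)

# Logical indexing keeps an all-NA row for every NA in the condition
by_index <- records[records$Make == "TOYOTA", ]
print(nrow(by_index))   # 3 rows, one of them all NA

# subset() treats NA in the condition as FALSE and drops those rows
by_subset <- subset(records, Make == "TOYOTA")
print(nrow(by_subset))  # 2 rows
```

With R's default settings, lm omits rows that contain NA in the model variables, so the fit itself is usually unaffected; the difference mainly shows up when printing or counting the filtered records.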

To obtain the predicted value, let us run the program below.

# Load all data from the CSV file
input <- read.csv('d:/Car Sample Data.csv', sep = ',', quote = "\"", check.names = FALSE)

# Filter data based on Make
filterRecord <- input[input$Make == 'TOYOTA', ]

# Create the relationship model
RelationModel <- lm(Price ~ CylinderVolume + Year + MileagePerKM, data = filterRecord)

# New data: a TOYOTA car with CylinderVolume = 2000, Year = 2020, MileagePerKM = 90000
PriceY <- data.frame(CylinderVolume = 2000, Year = 2020, MileagePerKM = 90000)

# Display the predicted price
print(predict(RelationModel, PriceY))

The predicted price is 989,693.7.
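
As a rough cross-check, the rounded coefficients from the TOYOTA equation can be plugged into the formula by hand. The sketch below does this and gives about 1,000,120 rather than the value returned by predict. The most likely reason is that the printed coefficients are rounded to about four significant digits, and a rounding error of a few units in the Year coefficient is multiplied by 2020. The variable names here are chosen for illustration.

```r
# Rounded coefficients as printed for the TOYOTA model
a  <- -4.838e+07   # intercept
b1 <-  3.630e+02   # CylinderVolume
b2 <-  2.415e+04   # Year
b3 <- -1.432e+00   # MileagePerKM

# Inputs from the example: CylinderVolume = 2000, Year = 2020, MileagePerKM = 90000
manual_price <- a + b1 * 2000 + b2 * 2020 + b3 * 90000
print(manual_price)  # about 1000120
```

For this reason it is safer to call predict on the fitted model, which uses the full-precision coefficients, than to retype the printed equation.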

Conclusion

In this article we have learnt Machine Learning through Predictive Analysis using the Multi Linear Regression methodology in the R language, with a simple example. Hope this helps. Thanks for reading.


About the Author

Rajnilari2015
Full Name: Niladri Biswas (RNA Team)