regression analysis 2 in Excel with extended multiple regression analysis example

regression analysis 2 in Excel with extended multiple regression analysis example


We take the model that was presented in the first video and run the linear regression in excel. We look at the p values and do hypothesis testing. We then extend the model to a multiple regression model.
Closed Caption:

ok next next I'd like to continue with
our example problem and insert our data
recall that we had hours of instruction
and units produced in the regression
analysis one video so we'll go ahead and
put in our data and the units produced
five 4687 and on a previous video we
calculated the regression equation by
hand but now we're going to do it with
Xcel first thing you need to ensure is
that you've got your analysis toolpak
added and if you don't have it you need
to go to adams and click on that so that
it brings it in
so let's click on data analysis and
we'll go down to regression analysis and
say okay and this window pops here and
it's saying input the wide range while
the wider range is your dependent
variable so will just highlight that
and then we'll say and put your ex range
which is your independent variable hours
of instruction and let's see we want the
labels i suppose know let's let's have
any worksheet get some residuals
residual plots line fit plus ok that's
pretty good and you say okay and here's
the new sheet is our line fit plot and
saw this before and there's a residual
plot that we saw on the first video
ok again the important thing is to make
sure it looks linear which does
approximately and look at the residual
plots to make sure that your errors are
independent from each other so there's
no patterns here and make sure that you
got variants approximately normally
distributed
ok the things that you want to look at
here first of all the regression
equation itself there's the intercept
3.6 and there's the slope . eight so y
hat is equal to 3.6 plus point eight
times X ok second thing you probably
want to look at is your R squared guide
our scribe is right there . 64 that's
the coefficient of determination
coefficient of determination is always
between 0 and 1 and defined as that's
the percentage of variation in Y that's
explained by X so in this case
sixty-four percent of the variation in Y
is which is the the units produced can
be explained by hours of instruction if
you take the square root of the
r-squared you get the correlation
coefficient correlation coefficients are
between negative 1 positive 1 so if
you've got a negative slope then it's a
negative correlation
catback attach a negative sign to it
next thing you're probably going to want
to look at is the key or the P ok
because we want to do a hypothesis test
to see if this is statistically
significant
we got the p-value right there so on the
previous video we said you reject the
null hypothesis and conclude that there
is a statistically significant
relationship if the p is less than your
healthy loving see that he is . one of
four it's higher than ten percent higher
than five percent higher than 1% so here
we have to claim that no this is not
statistically significant
there's a over ten percent probability
ten percent chance of finding this
result given that there's no
relationship between these variables so
that's probability of that happening by
chance which is pretty high so it's not
statistically significant
ok let's go back to our sheet and try
this
I think the reason wasn't statistically
significant is because our sample size
is so small there's only a sample size
five
so let's so let's try it again except
now let's use a sample of size 10
ok we'll use the same exact data will
just pop it in there and let's run it
again so we'll go to tools data analysis
again regression ok
we need to redefine our dependent
variable include all 10 data points are
independent variable the same
and let's see we want the residuals
again the worksheet say ok there's a new
one
ok notice that the residual plot and the
light get that look exactly the same as
expected because the same exact date
just in there twice
notice that the the intercept and the
slope are exactly the same as we expect
the r-squared is exactly 7 64 percent of
the variation in Y is explained by X
what's different now is the feedbag look
at the value here and that higher cheese
statistic value is point zero five it's
a about a half of one percent so now we
can conclude that this model with the
sample size can't is statistically
significant because the probability of
seeing this by chance given there's no
relationship between these variables is
only a half percent so this key value
point 05 was less than 101
it's less than 145 it's less than $MONEY
. know all the common levels of
significance that you might use alpha
levels so therefore we can conclude that
yes there is a relation servant
statistically significant relationship
between hours of instruction and units
produced and you could go ahead and
you see what okay let's
change this model into a multiple
regression and on the earlier video I
said what we do is include another
independent variable
let's just say that we got the age of
the people just randomly getting some
ages
ok now we got to do independent
variables and still one dependent
variable and let's see what happens
go to analysis again regression say ok
the dependent variable is the same but
now the we're going to have two
independent variables select the range
so when you're doing this at home you
might want to put your independent
variables together on the worksheet so
you can just swept the range and let's
say okay so there's a residual plots
line plots again you're gonna want to
take a look
those but there's just there's your
intercept and then the two slopes for 14
h and the other four hours of
instruction notice that the r-squared
went up a little bit . 66 now whenever
you're throwing a new variable the
r-squared is always going to go up so
what you want to do is look at the
adjusted r-squared we're talking about
multiple regression and compare that the
previous model notice the adjusted
r-square went down from . it was
originally points explored announced
that . 569 it's gone down so if that's
the first indicator that this isn't a
very good model the next indicator my
first look at the overall f the f test
is now important here with multiple
regression and that's the p-value for
the f-test because that test whether
data one is you look beta 2 is equal to
0 so all of your betas are equal 0 vs it
the alternative hypothesis that this one
is different so that's significant at
the five percent level and the ten
percent level but not at the one so
something significant here and if you go
to look at the p-values you can see that
the p-value on age is quite 489 so
there's a 48.9 percent chance of this
happening by chance given the personal
relationship so clearly age is not a
good variable but still hours of
instruction is still a good variable
it's significant at the one
% level so we would reject the
hypothesis on hours of instruction we
would not reject the null hypothesis on
age so what you want to do in this case
is dump eh and go back to your original
model so your objective is to to use as
few variables as possible but just keep
the good ones

Video Length: 09:55
Uploaded By: pomscm
View Count: 109,100

Related Software Products
Multiple Regression Analysis and Forecasting
Multiple Regression Analysis and Forecasting

Published By:
Business Spreadsheets

Description:
The Excel Multiple Regression Analysis and Forecasting template enables the confident identification of value drivers and forecasting of business plan or scientific data. The multiple regression process utilizes commonly employed statistical measures to test the validity of the analysis and results are summarized in textual form to be easily understood and interpreted. Feature selection automatically identifies the best variables for predictive analytics and accurate forecasting results. br ...


Related Videos
Multiple Regression - Dummy variables and interactions - example in Excel
Multiple Regression - Dummy variables and interactions - example in Excel

In this video, I present an example of a multiple regression analysis of website visit duration data using both quantitative and qualitative variables. Variables used include gender, browser, mobile/non-mobile, and years of education. Gender and mobile each require a single dummy variable, while browser requires several dummy variables. I also present models that include interactions between the dummy variables and years of education to analyze intercept effects, slope effects, and fully ...
Video Length: 30:31
Uploaded By: Jason Delaney
View Count: 156,361

How to Run a Multiple Regression in Excel 2007
How to Run a Multiple Regression in Excel 2007

I take some pulse rate data and relate it to incline and walking speed on a treadmill via a multiple regression analysis. In addition to showing you how to get the multiple regression output, I interpret the results. hr / bClosed Caption:/b hello this is a second video in a series on multiple regression in this video I'll show you how to run the multiple regression using Excel 2007 we're using the same data is in the first video pulse ...
Video Length: 08:22
Uploaded By: ProfTDub
View Count: 155,149

Copyright © 2025, Ivertech. All rights reserved.