- Descriptive statistics of the two samples:
and
Then and
Based on the above value we see that z = 1.33 < 1.96 (the critical value at α = 0.05), which means we cannot reject the null hypothesis that the driving distances of the current balls and the new balls are equal.
P-value = 2P(Z > 1.33) = 2 × (0.5 − 0.4082) = 2 × 0.0918 = 0.184 > 0.05 (α) → we cannot reject the null hypothesis that the driving distances of the current balls and the new balls are equal.
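The two-tailed P-value above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `two_sided_p_value` is illustrative, and z = 1.33 and α = 0.05 are taken from the text.

```python
import math

def two_sided_p_value(z: float) -> float:
    """Two-tailed P-value for a standard normal test statistic z."""
    # P(Z > |z|) = 0.5 * erfc(|z| / sqrt(2)); doubling it covers both tails.
    return math.erfc(abs(z) / math.sqrt(2))

alpha = 0.05                      # significance level used in the report
p_value = two_sided_p_value(1.33)  # z statistic reported in the text
reject_h0 = p_value < alpha

print(f"P-value = {p_value:.3f}, reject H0: {reject_h0}")
```

The result agrees with the table-based value of about 0.184, so the null hypothesis is not rejected at the 5% level.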
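- 95% CI for the difference between the two population means:
(x̄1 − x̄2) ± z(α/2) × SE = 2.775 ± 1.96 × SE, where SE is the standard error of the difference (about 2.089, as implied by the interval)
= (-1.319; 6.869)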
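The interval for the difference can be reproduced with a short sketch. The difference of 2.775 and the critical value 1.96 come from the text; the standard error of 2.089 is an assumption back-solved from the width of the reported interval, not a figure stated in the report, and the function name `diff_ci` is illustrative.

```python
def diff_ci(diff: float, se: float, z_crit: float = 1.96) -> tuple[float, float]:
    """Confidence interval for a difference of means: diff ± z_crit * se."""
    margin = z_crit * se
    return diff - margin, diff + margin

# diff = 2.775 is reported in the text.
# se = 2.089 is an ASSUMPTION recovered from the reported interval width.
lo, hi = diff_ci(diff=2.775, se=2.089)
print(f"({lo:.3f}; {hi:.3f})")
```

Because the interval contains 0, it is consistent with not rejecting the null hypothesis of equal mean driving distances.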
- 95% CI for the population mean of each model:
Current model: x̄ ± t(n−1, α/2) × s/√n
Here t(n−1, α/2) = 2.02; substituting the values, we have
= (267.479; 273.071)
New model: x̄ ± t(n−1, α/2) × s/√n
Here t(n−1, α/2) = 2.02; substituting the values, we have
= (264.339; 270.661)
Case 2: Skaff Appliance Company
Skaff Appliance is considering opening several additional stores in other large metropolitan areas. Paul Skaff, the president, would like to study the relationship between the sales at existing locations and several factors regarding the existing stores or their regions. The factors are the population and the unemployment in the region, and the advertising expense of the stores. Another variable considered is “mall”, which indicates whether the existing store is located in an enclosed shopping mall: a “1” indicates a mall location and a “0” indicates the store is not located in a mall. A random sample of 20 stores is selected.
The purpose of this managerial report is to provide a thorough statistical analysis that Paul Skaff can rely on.
- CI for the mean sales of the stores
We use the following formula to find the CI for the mean of sales:
x̄ ± t(n−1, α/2) × s/√n
Here t(n−1, α/2) = t(19, 0.025) = 2.09; substituting the values, we have
= (7.290; 9.476)
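The t-based interval for a mean can be sketched as follows. Here n = 20 and t = 2.09 come from the text, and the sample mean 8.383 is the midpoint of the reported interval; the sample standard deviation 2.339 is an assumption back-solved from the interval width, since the report does not state it in this section. The function name `t_interval` is illustrative.

```python
import math

def t_interval(xbar: float, s: float, n: int, t_crit: float) -> tuple[float, float]:
    """Confidence interval for a mean: xbar ± t_crit * s / sqrt(n)."""
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin

# n and t_crit are from the text; xbar is the midpoint of the reported interval.
# s = 2.339 is an ASSUMPTION recovered from the interval width.
lo, hi = t_interval(xbar=8.383, s=2.339, n=20, t_crit=2.09)
print(f"({lo:.3f}; {hi:.3f})")
```

With these inputs the sketch returns an interval that matches the reported (7.290; 9.476) to three decimals.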
- Difference in mean sales between stores
- Difference in mean Advertising Expense between stores
- CI for proportion of the stores having Adv. Expense greater than or equal to USD 50,000
We use the following formula to find the CI:
Here t(n−1, α/2) = t(9, 0.025) = 2.26; substituting the values, we have
= (58.043; 61.417)
- Linear regression equation
- The population regression model:
Y = β0 + β1X1 + β2X2 + β3X3 + β4X4 + ε
Where: Y: Estimated or predicted value
β0: Y-intercept
β1, β2, β3, β4: Population slopes
ε: Random error
For this case I replace the dependent variable (Y) with Sales and the independent variables (X1) with Population, (X2) with Percent Unemployed, (X3) with Adv.Expense, and (X4) with Mall Location, which gives the multiple regression equation as follows:
Sales = β0 + β1(Population) + β2(%Unemployed) + β3(Adv.Expense) + β4(Mall) + ε
Substituting the coefficient values from the above ANOVA table into the multiple regression model, we have:
- Interpretation of the coefficients and the coefficient of determination
b1 = 0.458513: Sales will increase, on average, by 0.458513 for each 1,000,000-person increase in Population, net of the effects of changes due to %Unemployed, Adv.Expense and Mall.
b2 = 0.3115: Sales will increase, on average, by 0.3115 for each 1% increase in %Unemployed, net of the effects of changes due to Population, Adv.Expense and Mall.
b3 = 0.001679: Sales will increase, on average, by 0.001679 for each $1,000 increase in Adv.Expense, net of the effects of changes due to Population, %Unemployed and Mall.
b4 = 0.382228: Sales are, on average, 0.382228 higher for a store located in a mall than for a store that is not, net of the effects of changes due to Population, %Unemployed and Adv.Expense.
Coefficient of determination: R² = 0.7452
74.52% of the variation in sales is explained by the variation in Population, %Unemployed, Adv.Expense and Mall Location.
- Which independent variables are not significant in the model?
We use hypothesis testing to evaluate the significance of each independent variable.
Null Hypothesis H0: β1 = 0 (Population is not significant)
Alternative Hypothesis H1: β1 ≠ 0 (Population is significant)
From the above ANOVA table’s P-value we can see that:
The P-value of Population is 0.00004049837 < 0.05, so we reject the null hypothesis; Population is significant.
Null Hypothesis H0: β2 = 0 (%Unemployed is not significant)
Alternative Hypothesis H1: β2 ≠ 0 (%Unemployed is significant)
From the above ANOVA table’s P-value we can see that:
The P-value of %Unemployed is 0.1258 > 0.05, so we cannot reject the null hypothesis; %Unemployed is not significant.
Null Hypothesis H0: β3 = 0 (Adv.Expense is not significant)
Alternative Hypothesis H1: β3 ≠ 0 (Adv.Expense is significant)
From the above ANOVA table’s P-value we can see that:
The P-value of Adv.Expense is 0.9325 > 0.05, so we cannot reject the null hypothesis; Adv.Expense is not significant.
Null Hypothesis H0: β4 = 0 (Mall Location is not significant)
Alternative Hypothesis H1: β4 ≠ 0 (Mall Location is significant)
From the above ANOVA table’s P-value we can see that:
The P-value of Mall Location is 0.5621 > 0.05, so we cannot reject the null hypothesis; Mall Location is not significant.
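The decision rule applied to each coefficient (reject H0 when the P-value is below α) can be sketched in a few lines. The P-values and α = 0.05 are those quoted in the text; the names `p_values` and `classify` are illustrative.

```python
ALPHA = 0.05  # significance level used throughout the report

# P-values of the individual t-tests, as quoted in the text.
p_values = {
    "Population": 0.00004049837,
    "%Unemployed": 0.1258,
    "Adv.Expense": 0.9325,
    "Mall Location": 0.5621,
}

def classify(p_values: dict[str, float], alpha: float = ALPHA) -> dict[str, bool]:
    """Map each variable to True if H0 (slope = 0) is rejected at level alpha."""
    return {name: p < alpha for name, p in p_values.items()}

decisions = classify(p_values)
for name, significant in decisions.items():
    verdict = "significant" if significant else "not significant"
    print(f"{name}: p = {p_values[name]:.4g} -> {verdict}")
```

Only Population passes the test at the 5% level; %Unemployed, Adv.Expense and Mall Location do not.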
- Testing for overall significance of the model
Hypothesis:
Null Hypothesis H0: β1 = β2 = β3 = β4 = 0 (no linear relationship)
Alternative Hypothesis H1: at least one βi ≠ 0 (at least one independent variable affects Y)
We can use the F-test or the P-value to evaluate the significance of the overall model.
In this case there are K = 4 numerator and n − K − 1 = 20 − 5 = 15 denominator degrees of freedom.
The decision rule is:
Reject H0 if F > F(0.05; 4, 15) = 3.06
From the above ANOVA table’s value we can see that:
F(0.05; 4, 15) = 3.06 < F = 10.24 → reject H0. There is evidence that at least one independent variable affects Y.
We can also use the P-value (Significance F) to evaluate the significance of the overall model.
From the above ANOVA table’s value we can see that:
P-value (Significance F) = 0.0004 < 0.05 → reject H0
- Does Mall Location affect the sales of the stores?
From the above ANOVA table’s P-value we can see that:
The t-value for Mall Location is 0.59389586 with a P-value of 0.5621 > 0.05, so we cannot reject the null hypothesis: Mall Location is not significant, and there is no evidence that it affects the sales of the stores.
- What is the estimated value of sales of a store which is located in a mall, in a region with 4% unemployment and a population of 18,000,000, and which invests USD 40,000 in advertising?
From the multiple regression model:
Substituting the values, we have
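The substitution can be sketched as below. The four slopes are the coefficients quoted earlier in the report; the intercept is not stated in this section, so it is left as a parameter rather than guessed. Input units follow the coefficient interpretations (population in millions, advertising expense in thousands of USD), and the function name `predict_sales` is illustrative.

```python
# Slopes as quoted in the report's coefficient interpretations.
B_POPULATION = 0.458513   # per 1,000,000 people
B_UNEMPLOYED = 0.3115     # per 1% unemployment
B_ADV_EXPENSE = 0.001679  # per $1,000 of advertising expense
B_MALL = 0.382228         # 1 if the store is in a mall, else 0

def predict_sales(intercept: float, population_millions: float,
                  pct_unemployed: float, adv_expense_thousands: float,
                  mall: int) -> float:
    """Estimated sales from the fitted multiple regression equation."""
    return (intercept
            + B_POPULATION * population_millions
            + B_UNEMPLOYED * pct_unemployed
            + B_ADV_EXPENSE * adv_expense_thousands
            + B_MALL * mall)

# The intercept is NOT given in this section: 0.0 is a placeholder, so the
# value below is only the contribution of the four slopes.
slope_part = predict_sales(0.0, population_millions=18, pct_unemployed=4,
                           adv_expense_thousands=40, mall=1)
print(f"Contribution of the slopes: {slope_part:.6f}")
```

Adding the intercept from the regression output to this slope contribution gives the estimated sales for the store described in the question.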