Econometrics Ordinary Least Squares Coursework (year 2)
Econometrics
COURSEWORK 1
Module Code: ECON22100
Level: 2
Deadline for submitting: 16th April 2010
1. (12.5%) Explain in detail the difference between the following concepts.
a) Population Regression Function vs Sample Regression Function.
The Population Regression Function (PRF) describes the model that is thought to generate the actual data; it represents the true relationship between the variables. The PRF embodies the true values of $\beta_1$ and $\beta_2$, and is expressed as:
$$Y_i = \beta_1 + \beta_2 X_i + u_i$$
where $Y_i$ is the actual value, obtained by adding the error term $u_i$; or as
$$E(Y \mid X_i) = \beta_1 + \beta_2 X_i$$
where $E(Y \mid X_i)$ may be regarded as the average or expected value of Y for a given value of X. The PRF tells us how the mean of Y varies for different values of X.
The population is the total collection of all objects to be studied. The population may be either finite or infinite, while a sample is a selection of just some items from the population. In general either all of the observations for the entire population will not be available, or they may be so many in number that it is infeasible to work with them, in which case a sample of data is taken for analysis. (Brooks, 2002, pg.112)
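The Sample Regression Function (SRF) is the counterpart of the PRF estimated from a sample:
$$\hat{Y}_i = \hat{\beta}_1 + \hat{\beta}_2 X_i$$
It allows us to calculate the estimated value of Y for a given value of X.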
b) Error terms vs residuals.
The error term $u_i$ is the disturbance term of the PRF. It represents factors other than X that affect Y, and it is defined as the difference between the actual data point and the expected value of Y given by the PRF:
$$u_i = Y_i - E(Y \mid X_i) = Y_i - \beta_1 - \beta_2 X_i$$
The residual $\hat{u}_i$ is the difference between the actual value of Y and its fitted value from the SRF:
$$\hat{u}_i = Y_i - \hat{Y}_i = Y_i - \hat{\beta}_1 - \hat{\beta}_2 X_i$$
In most cases we cannot observe the parameters of the PRF, so we cannot calculate the error term; this is why the residual is used in empirical studies.
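The distinction between error terms and residuals can be illustrated with a small simulation. The sketch below is only an illustration: the "true" parameters (2.0 and 0.5), the sample size and the random seed are made-up assumptions, and the helper name `ols` is hypothetical. In the simulation the error terms are known because we generate them ourselves; with real data only the residuals can be computed.

```python
import random

def ols(x, y):
    """Two-variable OLS: return (intercept, slope) estimated from the sample."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

random.seed(0)                      # assumption: fixed seed for reproducibility
BETA1, BETA2 = 2.0, 0.5             # hypothetical "true" PRF parameters
x = [float(i) for i in range(1, 21)]
errors = [random.gauss(0.0, 1.0) for _ in x]           # error terms of the PRF
y = [BETA1 + BETA2 * xi + ui for xi, ui in zip(x, errors)]

b1_hat, b2_hat = ols(x, y)          # SRF estimated from this one sample
residuals = [yi - (b1_hat + b2_hat * xi) for xi, yi in zip(x, y)]

print("true parameters:     ", BETA1, BETA2)
print("estimated parameters:", round(b1_hat, 4), round(b2_hat, 4))
print("first error term:    ", round(errors[0], 4))
print("first residual:      ", round(residuals[0], 4))
```

The residuals sum to zero by construction of OLS with an intercept, while the simulated error terms generally do not.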
c) Regression coefficients vs estimators.
An estimator, also known as a statistic, is simply a rule, formula or method that tells us how to estimate a population parameter from the information provided by the sample at hand; for example, $\hat{\beta}_1$ is an estimator of $\beta_1$ and $\hat{\beta}_2$ is an estimator of $\beta_2$ (Gujarati, 2003, pg.49).
$\beta_1$, $\beta_2$, $\hat{\beta}_1$ and $\hat{\beta}_2$ are regression coefficients, also known as intercept and slope coefficients.
Estimators are like proxies for the real values, as opposed to coefficients, which are the parameters of an equation.
2. (12.5%) Consider the following model:
$$Y_i = \beta_1 + \beta_2 D_{2i} + \beta_3 D_{3i} + \beta_4 (D_{2i} D_{3i}) + \beta_5 X_i + u_i$$
where Y = the annual salary of a college teacher; X = years of teaching experience;
D2 = 1 if male and 0 otherwise; D3 = 1 if white and 0 otherwise.
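a) What does the term ($D_{2i} D_{3i}$) mean?
The product $D_{2i} D_{3i}$ is a new variable that takes the interaction effect into account: it captures the multiplicative effect of the two dummy variables $D_2$ and $D_3$ on mean Y. It equals 1 only for white males and 0 otherwise.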
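b) What is the meaning of $\beta_4$?
$\beta_4$ is the differential effect of being a white male: it measures how much the expected annual salary of a white male college teacher differs from what the separate effects of being male ($\beta_2$) and of being white ($\beta_3$) would imply on their own, holding teaching experience constant.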
c) Find $E(Y_i \mid D_2 = 1, D_3 = 1, X_i)$ and interpret it.
$$E(Y_i \mid D_2 = 1, D_3 = 1, X_i) = \beta_1 + \beta_2(1) + \beta_3(1) + \beta_4(1 \cdot 1) + \beta_5 X_i = \beta_1 + \beta_2 + \beta_3 + \beta_4 + \beta_5 X_i$$
$(\beta_1 + \beta_2 + \beta_3 + \beta_4 + \beta_5 X_i)$ is the expected annual salary of a white male college teacher i with $X_i$ years of teaching experience. On average it differs by $(\beta_2 + \beta_3 + \beta_4)$ from the annual salary of a non-white female college teacher with the same number of years of teaching experience, which is the reference salary $\beta_1 + \beta_5 X_i$.
[Diagram: a possible combination of parameters; for simplicity, all parameters are assumed to be positive.]
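The four conditional means implied by the model can also be tabulated numerically. The sketch below is a minimal illustration: the parameter values are hypothetical (all positive, as assumed above) and the function name `expected_salary` is made up for this example.

```python
def expected_salary(params, male, white, experience):
    """E(Y | D2, D3, X) for the model with a male x white interaction term."""
    b1, b2, b3, b4, b5 = params
    return b1 + b2 * male + b3 * white + b4 * (male * white) + b5 * experience

# Hypothetical parameter values (beta1 ... beta5), chosen only for illustration.
params = (20000.0, 3000.0, 2000.0, 1000.0, 500.0)
years = 10  # same teaching experience for every group

for male in (0, 1):
    for white in (0, 1):
        label = ("male" if male else "female") + ", " + ("white" if white else "non-white")
        print(label, expected_salary(params, male, white, years))

# Gap between a white male and the reference group (non-white female):
gap = expected_salary(params, 1, 1, years) - expected_salary(params, 0, 0, years)
print("gap =", gap, "= beta2 + beta3 + beta4 =", params[1] + params[2] + params[3])
```

The gap does not depend on the number of years of experience, because the slope on X is the same for every group in this model.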
3. (20%) Consider the following regression
$$\hat{Y}_i = 50 - 0.1 X_i$$
se = (10.7509) ( ? )
|t| = ( ? ) (18.73)
$r^2$ = 0.935, n = 17
Fill in the missing numbers and establish a 95% confidence interval for $\beta_2$. Give a proper economic interpretation. Would you reject the null hypothesis that the true $\beta_2$ is zero at $\alpha$ = 5%? Tell whether you are using a one-tailed or two-tailed test and why. (Critical values: $t_{0.025,15} = 2.131$ and $t_{0.05,15} = 1.753$.)
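Let the estimated intercept (50) be $\hat{\beta}_1$ and the estimated slope (-0.1) be $\hat{\beta}_2$.
To find the t-value for $\hat{\beta}_1$ the following formula will be used:
$$t = \frac{\hat{\beta}_1 - \beta_1}{se(\hat{\beta}_1)}$$
Assuming $\beta_1$ to be 0 under the null hypothesis we obtain:
$$t = \frac{50 - 0}{10.7509} \approx 4.6508$$
To find the standard error of $\hat{\beta}_2$ we rearrange the formula. Because the standard error cannot be negative and the slope estimate is negative, the t-value for $\hat{\beta}_2$ must be negative, t = -18.73:
$$se(\hat{\beta}_2) = \frac{\hat{\beta}_2 - 0}{t} = \frac{-0.1}{-18.73} \approx 0.00534$$
In order to establish a 95% confidence interval we need to use the following formula:
$$\hat{\beta}_2 \pm t_{\alpha/2,\,n-2} \cdot se(\hat{\beta}_2)$$
where $\alpha$ = 5%; n - 2 = 15; $t_{0.025,15} = 2.131$; $se(\hat{\beta}_2) \approx 0.00534$.
Replacing these values in the above formula we obtain
$$-0.1 \pm 2.131 \times 0.00534 = -0.1 \pm 0.0114$$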
The 95% confidence interval for $\beta_2$ is therefore [-0.1114; -0.0886]. The true $\beta_2$ is a fixed number, so the 95% refers to the procedure: in repeated sampling, 95 out of 100 intervals constructed in this way would contain the true $\beta_2$.
Null hypothesis H0: $\beta_2 = 0$
Alternative hypothesis H1: $\beta_2 \neq 0$
A two-tailed test is used in this situation because the alternative hypothesis (H1) has two possibilities: $\beta_2 < 0$ or $\beta_2 > 0$.
Now we need to compare the absolute value of the t-statistic, |t| = 18.73, with the critical value $t_{0.025,15} = 2.131$.
18.73 > 2.131, so we reject the null hypothesis, which means that $\beta_2$ is statistically significant at the 5% level.
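The arithmetic for this question can be checked with a few lines of Python. This is a sketch using only the numbers quoted in the question; the critical value 2.131 is taken from the text rather than computed, and the variable names are our own.

```python
# Values given in the question.
B1_HAT = 50.0        # estimated intercept
B2_HAT = -0.1        # estimated slope
SE_B1 = 10.7509      # standard error of the intercept
ABS_T_B2 = 18.73     # absolute t-value of the slope
T_CRIT = 2.131       # two-tailed 5% critical value with 15 df, as quoted in the text

# Missing numbers: t-value of the intercept and standard error of the slope.
t_b1 = B1_HAT / SE_B1
se_b2 = abs(B2_HAT) / ABS_T_B2

# 95% confidence interval for the slope.
half_width = T_CRIT * se_b2
ci_low, ci_high = B2_HAT - half_width, B2_HAT + half_width

# Two-tailed test of H0: beta2 = 0.
reject_h0 = ABS_T_B2 > T_CRIT

print("t-value for the intercept:", round(t_b1, 4))
print("se of the slope:", round(se_b2, 5))
print("95% CI for the slope:", (round(ci_low, 4), round(ci_high, 4)))
print("reject H0 (beta2 = 0):", reject_h0)
```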
4. (25%) Based on 20 annual observations, the following regressions were obtained
Model A: $\hat{Y}_t = 4.00 - 0.5 X_t$, $r^2$ = 0.70
se = (0.1216) (0.1140)
Model B: $\widehat{\ln Y_t} = 1.78 - 0.37 \ln X_t$, $r^2$ = 0.7448
se = (0.0152) (0.0494)
where Y= the cups of tea consumed per person per day and X= the price of tea in pounds per kilo.
a) Interpret the slope coefficients in the two models.
The slope in Model A is -0.5: when the price of tea goes up by £1 per kilo, the consumption of tea goes down by 0.5 cups per person per day on average. Model B is a log-log model, so its slope has a different meaning: the slope of -0.37 means that a 1% increase in the price of tea results, on average, in a 0.37% decrease in the number of cups of tea consumed per person per day.
b) You are told that $\bar{Y}$ = 5 and $\bar{X}$ = 3.6. At these mean values, estimate the price elasticity for model A.
Price elasticity is the relative change in Y divided by the relative change in X. Because the slope gives the change in Y when X changes by one unit, the price elasticity at the mean values is:
$$E = \frac{dY}{dX} \cdot \frac{\bar{X}}{\bar{Y}} = -0.5 \times \frac{3.6}{5} = -0.36$$
c) What is the price elasticity for model B?
For model B price elasticity is equal to the slope because the logarithms are used for both X and Y. Price elasticity is -0.37.
d) From the estimated elasticities, can you say that the demand for tea is price inelastic?
Yes, the demand for tea is price inelastic because the price elasticities in both models are less than 1 in absolute value.
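The two elasticities can be compared with a short calculation. The sketch below assumes that the two mean values quoted in part b) are the mean of Y (5) and the mean of X (3.6), in that order; the function name `point_elasticity` is our own.

```python
def point_elasticity(slope, x_bar, y_bar):
    """Elasticity of Y with respect to X for a linear model, evaluated at the means."""
    return slope * x_bar / y_bar

SLOPE_A = -0.5            # Model A (linear): slope dY/dX
SLOPE_B = -0.37           # Model B (log-log): the slope is itself the elasticity
Y_BAR, X_BAR = 5.0, 3.6   # assumption: mean of Y = 5, mean of X = 3.6

elasticity_a = point_elasticity(SLOPE_A, X_BAR, Y_BAR)
elasticity_b = SLOPE_B

for name, value in (("Model A", elasticity_a), ("Model B", elasticity_b)):
    kind = "inelastic" if abs(value) < 1 else "elastic"
    print(name, round(value, 4), kind)
```

Demand is classified as price inelastic when the elasticity is below 1 in absolute value, which is the comparison made in part d).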
e) Since the r2 of Model B is larger than that of Model A, Model B is preferable to Model A. Comment on this statement.
The statement is not justified. $r^2$, the measure of goodness of fit of a regression, differs between the two models, but the dependent variables are also different (Y in Model A, ln Y in Model B), so the two $r^2$ values cannot be compared directly and we cannot conclude from them that one model is preferable to the other.
5. (30%) You are given the following data based on 20 pairs of observations on Y and X.
Assuming all the assumptions of CLRM are fulfilled, obtain
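a) $\hat{\beta}_1$ and $\hat{\beta}_2$
Based on the assumptions of the Classical Linear Regression Model we can write the general form of the function:
$$Y_i = \beta_1 + \beta_2 X_i + u_i$$
For calculating the regression coefficients we use:
$$\hat{\beta}_2 = \frac{\sum x_i y_i}{\sum x_i^2}, \qquad \hat{\beta}_1 = \bar{Y} - \hat{\beta}_2 \bar{X}$$
where $x_i = X_i - \bar{X}$ and $y_i = Y_i - \bar{Y}$.
Mean of Y is:
$$\bar{Y} = \frac{\sum Y_i}{n}$$
Mean of X is:
$$\bar{X} = \frac{\sum X_i}{n}$$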
b) standard error of
(5.b.1)(Gujarati, pg. 58)
(5.b.2)
Now we combine (5.b.2) with (5.b.1):
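The calculations in parts a) and b) can be sketched in code. The example below uses the standard textbook formulas for two-variable OLS, which are not necessarily the exact expressions labelled (5.b.1) and (5.b.2); the six data points are made up purely for illustration and are not the coursework data, and the function name `simple_ols` is hypothetical.

```python
import math

def simple_ols(x, y):
    """Two-variable OLS estimates and their standard errors."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)                       # sum of x_i^2 (deviations)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b2 = sxy / sxx                                                 # slope estimate
    b1 = y_bar - b2 * x_bar                                        # intercept estimate
    rss = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))    # residual sum of squares
    sigma2 = rss / (n - 2)                                         # estimated error variance
    se_b2 = math.sqrt(sigma2 / sxx)
    se_b1 = math.sqrt(sigma2 * sum(xi ** 2 for xi in x) / (n * sxx))
    return {"b1": b1, "b2": b2, "se_b1": se_b1, "se_b2": se_b2}

# Made-up illustrative data (NOT the data of the question).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 5.9]

fit = simple_ols(x, y)
for name, value in fit.items():
    print(name, round(value, 4))
```

With the actual data, the same function would be called with the 20 observed pairs of X and Y.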
c) Establish 95% confidence interval for the parameters
n-2=20-2=18
$\alpha$ = 5%
d) On the basis of the confidence intervals established in (c), can you reject the hypothesis that $\beta_2$ is statistically significant?
Confidence interval established for $\beta_2$:
$$0.5216 \le \beta_2 \le 0.7317$$
This confidence interval [0.5216; 0.7317] does not contain zero, so we reject the null hypothesis $\beta_2 = 0$ at the 5% level: $\beta_2$ is statistically different from zero. We therefore cannot reject the claim that $\beta_2$ is statistically significant.
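e) Test whether $\beta_2$ > 1.
The appropriate type of test in this case is the left-tail test:
Null hypothesis
H0: $\beta_2 \ge 1$
Alternative hypothesis
H1: $\beta_2 < 1$
Decision rule
Reject H0 if $t = \frac{\hat{\beta}_2 - 1}{se(\hat{\beta}_2)} < -t_{0.05,18} = -1.734$.
-7.4667 < -1.734, so we reject the null hypothesis that $\beta_2 \ge 1$.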
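As a consistency check, the point estimate and standard error of the slope can be backed out of the confidence interval reported in part d) and then used to reproduce the test statistic in part e). This is only a sketch: it assumes the interval is symmetric around the estimate and was built with the two-tailed critical value $t_{0.025,18} = 2.101$, a standard table value that is not quoted in the text.

```python
# Confidence interval for beta2 reported in part d).
CI_LOW, CI_HIGH = 0.5216, 0.7317
T_CRIT_TWO_TAILED = 2.101   # assumption: t(0.025, 18) from a standard t-table
T_CRIT_ONE_TAILED = 1.734   # t(0.05, 18), as used in the decision rule

# Implied point estimate (midpoint) and standard error (half-width / critical value).
b2_hat = (CI_LOW + CI_HIGH) / 2
se_b2 = (CI_HIGH - CI_LOW) / (2 * T_CRIT_TWO_TAILED)

# Part d): is zero inside the interval?
contains_zero = CI_LOW <= 0.0 <= CI_HIGH

# Part e): left-tail test of H0: beta2 >= 1 against H1: beta2 < 1.
t_stat = (b2_hat - 1.0) / se_b2
reject_h0 = t_stat < -T_CRIT_ONE_TAILED

print("implied estimate of beta2:", round(b2_hat, 4))
print("implied standard error:", round(se_b2, 4))
print("zero inside the interval:", contains_zero)
print("t statistic:", round(t_stat, 3))
print("reject H0 (beta2 >= 1):", reject_h0)
```

The small difference between the statistic computed here and the -7.4667 reported above comes from rounding of the interval endpoints.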
Bibliography
BROOKS, Chris, 2002. Introductory econometrics for finance. New York: Cambridge University Press.
GUJARATI, Damodar N., 2003. Basic Econometrics. 4th ed. New York: McGraw-Hill.
KOOP, Gary, . Analysis of Economic Data. Chichester, England: John Wiley & Sons.
WOOLDRIDGE, Jeffrey M., 2006. Introductory Econometrics. 3rd ed. Mason, USA: Thomson South-Western.