Expenditure per Student in High Schools: Estimation Using Cross-sectional Regression Analysis

Extracts from this document...

Introduction

The purpose of this essay is to search for a model that explains the expenditure per student in high schools. The model tries to answer the question: what factors determine the expenditure per student in a district? I will use cross-sectional regression to estimate a first-order (linear) relationship between my dependent variable and the explanatory variables it depends upon.

        The structure of the essay is as follows. First I will explain the data and its summary statistics. Then I will perform the regression and test the CLRM assumptions. Then I will interpret the regression results. Next I will perform a joint hypothesis test on the regressors' coefficients. Finally I will conclude on the model.

Data and Summary Statistics

The data come from a survey of high schools in the US, covering different districts across different counties. It includes 1001 observations on expenditure per student in a district, the number of schools in the district, the student/teacher ratio and the mean score of students in tenth grade.

...read more.

Middle

...                                           28    1318.941   1.343219    7.872708
NUMBEROFSCHOOLS   2.964036   1       26      1     5.123544   3.379529    14.08522
STUDENTRATIO      13.24296   12.9    21.9    4.8   3.15131    0.205833    2.516564
MATHSCORE         45.51449   46      58      29    4.079711   -0.466202   3.70439

Table 2: Summary statistics

This table contains the descriptive statistics of the variables that will be used to construct the regression model later.
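
As a minimal sketch of how descriptive statistics of this kind can be computed, the Python snippet below returns the mean, median, maximum, minimum, standard deviation, skewness and kurtosis of one variable. The choice of these seven statistics, the moment-based formulas for skewness and kurtosis, and the `sample` values are assumptions made for illustration; they are not taken from the survey data or from the software used for the essay.

```python
import statistics

def describe(values):
    """Return common descriptive statistics for a list of numbers."""
    n = len(values)
    mean = statistics.fmean(values)
    # Central moments computed with divisor n (an assumed convention).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "max": max(values),
        "min": min(values),
        "std_dev": statistics.stdev(values),  # sample standard deviation (divisor n - 1)
        "skewness": m3 / m2 ** 1.5,           # 0 for a symmetric distribution
        "kurtosis": m4 / m2 ** 2,             # 3 for a normal distribution
    }

sample = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical values, not the survey data
stats = describe(sample)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

Read against these benchmarks, a skewness well above zero (as for NUMBEROFSCHOOLS) points to a long right tail, while a kurtosis close to 3 (as for MATHSCORE) is roughly what a normal distribution would give.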

Model Estimation

The results of the regression can be found in Tables 3 and 4 below. As the results show, two of the explanatory variables are significant at the 1% level (99% confidence) and one explanatory variable (MATHSCORE) is significant at the 10% level (90% confidence). The explanatory variables are also jointly significant, as shown by the high value of the F-statistic (for the test of the null hypothesis that all coefficients except the constant are simultaneously zero). The model equation is as follows:

EXPENPUPIL = 9341.56 + 34.71*NUMBEROFSCHOOLS - 312.213*STUDENTRATIO - 12.39529*MATHSCORE
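
To make the equation concrete, the short Python sketch below evaluates it for one district. The coefficients are those of the equation above; the function name `predict_expenditure` and the input values (3 schools, a student/teacher ratio of 13 and a maths score of 46, chosen to lie close to the sample averages) are hypothetical and serve only as an illustration.

```python
def predict_expenditure(number_of_schools, student_ratio, math_score):
    """Fitted expenditure per pupil from the estimated linear equation."""
    return (9341.56
            + 34.71 * number_of_schools
            - 312.213 * student_ratio
            - 12.39529 * math_score)

# Hypothetical district with values near the sample averages.
base = predict_expenditure(3, 13, 46)

# Same district with one more student per teacher, everything else fixed.
higher_ratio = predict_expenditure(3, 14, 46)

print(f"Fitted expenditure: {base:.2f}")
print(f"Change for +1 student per teacher: {higher_ratio - base:.3f}")
```

The difference between the two fitted values equals the STUDENTRATIO coefficient, which is the usual reading of a slope in a linear model: one additional student per teacher is associated with about 312 less expenditure per pupil, holding the other regressors constant.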

As we can see, the standard errors of the coefficients are quite high, which may be because of correlation between the explanatory variables. The model also shows heteroscedasticity, as indicated by the White test: the F-statistic has a p-value of 0.00, which means that the null hypothesis of homoscedastic errors is rejected. As a result, we run a second regression with HAC standard errors, so that the standard errors of the coefficients are robust to heteroscedasticity.
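
The idea behind the White test can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure used for the essay: it uses a single regressor instead of three, simulated data instead of the survey data, and the LM form of the test (n times the R-squared of the auxiliary regression, compared with a chi-square distribution with 2 degrees of freedom) instead of the F form reported above. The helper names `solve`, `ols` and `white_test` are hypothetical.

```python
import math
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(rows, y):
    """Return OLS coefficients and residuals; each row already contains the constant."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
    return beta, resid

def white_test(x, y):
    """LM statistic and p-value of the White test for a one-regressor model."""
    _, e = ols([[1.0, xi] for xi in x], y)
    e2 = [ei ** 2 for ei in e]
    # Auxiliary regression: squared residuals on a constant, x and x squared.
    _, u = ols([[1.0, xi, xi ** 2] for xi in x], e2)
    mean_e2 = sum(e2) / len(e2)
    r2 = 1.0 - sum(ui ** 2 for ui in u) / sum((v - mean_e2) ** 2 for v in e2)
    lm = len(x) * r2
    return lm, math.exp(-lm / 2.0)  # chi-square survival function for 2 degrees of freedom

random.seed(0)
x = [random.uniform(1.0, 10.0) for _ in range(500)]
# Simulated data whose error spread grows with x, i.e. heteroscedastic by construction.
y = [2.0 + 3.0 * xi + random.gauss(0.0, xi) for xi in x]
lm_stat, p_value = white_test(x, y)
print(f"LM = {lm_stat:.2f}, p-value = {p_value:.4g}")
```

A small p-value leads to rejecting homoscedasticity, which is the situation described above and the reason for switching to robust standard errors.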

Table 3


Below are two tables showing the parameters related to the linear regression of the data. The important values are all displayed: R-squared, the number of observations and, for each X variable, the coefficient, t-value and p-value.

Variable   Coefficients   Std. Error   t-Statistic   P-Value
C          9341.564       380.1826     24.57126      0
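
The t-statistic reported for the constant C is simply the coefficient divided by its standard error. The Python sketch below reproduces it and derives a two-sided p-value; using the standard normal distribution for the p-value is an assumption that is reasonable for a sample of about 1000 observations, and the function names are hypothetical.

```python
import math

def t_statistic(coefficient, std_error):
    """t-statistic for the null hypothesis that the coefficient is zero."""
    return coefficient / std_error

def two_sided_p_normal(t):
    """Two-sided p-value using the standard normal approximation."""
    return math.erfc(abs(t) / math.sqrt(2.0))

t_c = t_statistic(9341.564, 380.1826)  # values from the row for C
p_c = two_sided_p_normal(t_c)
print(f"t = {t_c:.5f}, p-value = {p_c:.3g}")
```

The p-value is so small that it is displayed as 0 at the precision of the table.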

...read more.

Conclusion

Covariance between error terms is zero

As our data are cross-sectional, we do not have an issue of covariance between error terms.

The error is not correlated with regressors

The error term in the regression captures the variation that is not explained by the model. I assume that this assumption is not violated, as I have no data or theory that suggests otherwise.

The disturbances are normally distributed

A violation of this assumption is not consequential for large sample sizes. As we have a large sample (1001 observations), we do not treat a departure from normality as a problem.

Joint Hypothesis Test (Wald Test)

The Wald test is used to test the joint null hypothesis that our last two coefficients are zero. The result of the test is an F-value of 389 with a probability (p-value) of 0.00. This means that, if the last two coefficients were both zero, an F-value this large would be extremely unlikely. We therefore reject the null hypothesis and conclude that at least one of the last two coefficients is different from zero.
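
The size of this p-value can be checked with a short Python sketch. It relies on a large-sample approximation that is an assumption here, not a step reported in the essay: with q = 2 restrictions, the Wald statistic W = q * F is approximately chi-square distributed with 2 degrees of freedom, and for 2 degrees of freedom the upper-tail probability has the closed form exp(-W / 2). The function name is hypothetical.

```python
import math

def wald_chi2_p_value(f_stat, restrictions=2):
    """Approximate p-value of a joint test from its F-statistic.

    Uses the large-sample result that W = q * F is roughly chi-square(q).
    The closed form below is valid only for q = 2.
    """
    if restrictions != 2:
        raise ValueError("closed-form p-value implemented for 2 restrictions only")
    wald = restrictions * f_stat
    return wald, math.exp(-wald / 2.0)  # chi-square(2) survival function

wald, p = wald_chi2_p_value(389.0)
print(f"W = {wald:.0f}, approximate p-value = {p:.3g}")
```

The resulting probability is far below any conventional significance level, which is consistent with the reported value of 0.00 and with rejecting the null hypothesis.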

...read more.

This student written piece of work is one of many that can be found in our University Degree Statistics section.
