He observes that 30 people came in on Monday, 14 on Tuesday, 34 on Wednesday, 45 on Thursday, 57 on Friday, and 20 on Saturday.
We will assume the owner's distribution is correct; this is our null hypothesis.
There were 200 customers in total over the week.
Chi-Square Statistic = $\chi^2$
To find it, take (Observed Value - Expected Number) squared and divide by the Expected Number. Do this for each of your values, then add the results together: $\chi^2 = \sum (O - E)^2 / E$
So,
$\chi^2 = (30-20)^2/20 + (14-20)^2/20 + (34-30)^2/30 + (45-40)^2/40 + (57-60)^2/60 + (20-30)^2/30$
$\chi^2 = 100/20 + 36/20 + 16/30 + 25/40 + 9/60 + 100/30$
$\chi^2 = 11.44$
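The same arithmetic can be sketched in Python as a check. The observed and expected counts are the ones used in the calculation above; the function name `chi_square_statistic` is only an illustrative choice.

```python
# Observed customers per day (Monday to Saturday), from the example above.
observed = [30, 14, 34, 45, 57, 20]
# Expected customers per day, as used in the hand calculation above.
expected = [20, 20, 30, 40, 60, 30]


def chi_square_statistic(observed, expected):
    """Sum (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


chi2 = chi_square_statistic(observed, expected)
print(round(chi2, 2))  # 11.44
```

Both lists add up to the same weekly total of 200, which is what makes the comparison meaningful.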
So what is the probability of getting a result this extreme?
To answer this, we must find the critical chi-square value: the value that would be exceeded only 5% of the time if the null hypothesis were true. If 11.44 is more extreme (greater) than the critical value, we will reject our null hypothesis.
Critical Chi-Square Value:
Determine the Degrees of Freedom:
We are summing 6 terms (6 days), so you may be tempted to say Degrees of Freedom = 6. But because the total is fixed, once you have the first 5 values you can work out the 6th, so we only have 5 Degrees of Freedom. In general, Degrees of Freedom = n-1, where n is the number of categories.
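A short Python sketch makes the point concrete, under the assumption that the weekly total of 200 customers is fixed: given the first five daily counts, the sixth is not free to vary. The variable names are illustrative only.

```python
total_customers = 200                # fixed weekly total from the example
first_five = [30, 14, 34, 45, 57]    # Monday to Friday

# Saturday's count is forced by the total, so it carries no extra freedom.
sixth = total_customers - sum(first_five)

all_days = first_five + [sixth]
degrees_of_freedom = len(all_days) - 1
print(sixth, degrees_of_freedom)  # 20 5
```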
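So,
Our degrees of freedom (D.F.) is 5 and our significance level is .05, or 5%.
We can determine (for example, from a chi-square table) that our critical chi-square value ($\chi^2_c$) is 11.07.
Since 11.44 > 11.07, our result is more extreme than the critical value, so we reject the null hypothesis that the owner's distribution is correct.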
Our D.F. is 5, so k=5 (magenta)
If we continued the line, we would see that as the D.F. increases, the probability decreases.
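To put a number on "the probability of getting a result this extreme", the upper-tail probability of the chi-square distribution can be computed directly. The sketch below uses a closed-form expression that holds only for odd degrees of freedom (it is written in terms of the normal distribution) and relies only on Python's standard `math` module. The function name `chi_square_upper_tail` is an illustrative assumption; in practice a statistics library would normally be used instead.

```python
import math


def chi_square_upper_tail(x, df):
    """P(chi-square with `df` degrees of freedom >= x), for odd df only."""
    if df < 1 or df % 2 == 0:
        raise ValueError("this closed form only covers odd degrees of freedom")
    root = math.sqrt(x)
    # Two-sided normal tail: P(|Z| >= sqrt(x)).
    tail = math.erfc(root / math.sqrt(2))
    # Standard normal density evaluated at sqrt(x).
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    series = 0.0
    term_power = root    # root^(2r - 1), starting at r = 1
    odd_product = 1.0    # 1 * 3 * 5 * ... * (2r - 1)
    for r in range(1, (df - 1) // 2 + 1):
        odd_product *= 2 * r - 1
        series += term_power / odd_product
        term_power *= x
    return tail + 2 * density * series


p_value = chi_square_upper_tail(11.44, 5)
print(p_value < 0.05)  # True
```

A probability below 0.05 is the same statement as the calculated value exceeding the 5% critical value, so both routes lead to the same decision.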
T-Tests
The t-test is a method of statistical analysis used to compare two sets of data. A t-test asks and answers the question, "Is the mean of one set of data significantly higher than the other?" The comparison must take the variability of the data into account.
Example:
Hypothesis: Fluoride water prevents tooth decay.
To test, the researcher looks at 5 communities that do have fluoride water, and 5 that do not. The dependent variable is the percent of the population with tooth decay, so a higher number is bad.
This hypothesis would translate to:
f = fluoride
nf = no fluoride
$\bar{X}_f < \bar{X}_{nf}$
Tooth decay is greater in communities without fluoride.
So let’s look at the Data Set:
n = the number of observations in one sample. For this experiment, n=5 for both communities.
N = the number of observations in all samples being compared (N = n1 + n2). For this experiment, N = 5 + 5 = 10.
D.F. (Degrees of Freedom) = N-2, so for this experiment, D.F. = 10 - 2 = 8.
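Mean of the sample = $\bar{X} = \sum x / n$
- Find the sum of the values for each sample. For the non-fluoride sample (nf), Σx = 112; for the fluoride sample (f), Σx = 58.
- Divide each sum by the number of observations in the sample: $\bar{X}_{nf}$ = 112/5 = 22.4 and $\bar{X}_f$ = 58/5 = 11.6.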
Sample Variance = $s^2 = \sum (x - \bar{X})^2 / (n-1)$
x = each value in the sample
$\bar{X}$ = the mean of the sample
n-1 = 4
So that for $s^2_{nf}$:
$s^2_{nf} = 40.3$
and for $s^2_f$:
$s^2_f = (5-11.6)^2/4 + (12-11.6)^2/4 + (9-11.6)^2/4 + (13-11.6)^2/4 + (19-11.6)^2/4 = 48.9$
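T Value = $t = \frac{|\bar{X}_1 - \bar{X}_2|}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$
or, in words: t = the absolute value of (the mean of sample 1 minus the mean of sample 2), over the square root of (the sample variance of sample 1 over the number of observations in sample 1, plus the sample variance of sample 2 over the number of observations in sample 2). Here $s^2$ is already the sample variance, so it is not squared again.
So, for this experiment, this would look like:
$t = \frac{|22.4 - 11.6|}{\sqrt{40.3/5 + 48.9/5}} = 10.8/4.22 = 2.559$, so $t \approx 2.56$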
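As a check on the arithmetic, the t-value can be sketched in Python. This takes the means, sample variances, and sample sizes exactly as stated in the text rather than recomputing them from raw data; `t_statistic` is an illustrative name, not part of any library.

```python
import math


def t_statistic(mean_1, mean_2, var_1, var_2, n_1, n_2):
    """|mean_1 - mean_2| / sqrt(var_1 / n_1 + var_2 / n_2)."""
    standard_error = math.sqrt(var_1 / n_1 + var_2 / n_2)
    return abs(mean_1 - mean_2) / standard_error


# Summary values as stated in the text: non-fluoride (nf) first, then fluoride (f).
t = t_statistic(22.4, 11.6, 40.3, 48.9, 5, 5)
degrees_of_freedom = 5 + 5 - 2   # D.F. = N - 2
print(round(t, 2), degrees_of_freedom)  # 2.56 8
```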
To use this t-value, we must compare it to the critical t-value for D.F. = 8.
If our calculated value is greater than the critical value in the table, we can reject the null hypothesis.
Assuming we use p < .05, or 5% (as in Chi-Square), the two-tailed critical value for D.F. = 8 is 2.306.
Our calculated value is 2.56, so
Critical < Calculated
We now know that the null hypothesis of no difference between the means can be rejected, but that alone does not tell us whether our hypothesis is supported: we still need to check that the difference is in the direction we predicted.
We originally hypothesized that:
Tooth decay is greater in communities without fluoride.
$\bar{X}_f < \bar{X}_{nf}$
From our data (% of population with tooth decay), we found that
$\bar{X}_{nf}$ = 22.4 and $\bar{X}_f$ = 11.6
We now know that
- The means are significantly different, so we reject the null hypothesis.
- The difference between the two means is in the direction our hypothesis predicted (11.6 < 22.4).
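The two checks above can be sketched as a small Python helper. The function name `supports_hypothesis` is hypothetical, and the critical value 2.306 is the standard two-tailed table value for D.F. = 8 at the 5% level; it is supplied here as an assumption, with D.F. = N-2 = 8 as defined earlier.

```python
def supports_hypothesis(t_value, critical_value, mean_f, mean_nf):
    """Return (reject_null, direction_matches) for the hypothesis mean_f < mean_nf."""
    # Step 1: is the difference between the means significant?
    reject_null = t_value > critical_value
    # Step 2: is the difference in the predicted direction?
    direction_matches = mean_f < mean_nf
    return reject_null, direction_matches


# 2.306: two-tailed critical t-value for D.F. = 8 at the 5% level (table value, assumed).
reject_null, direction_matches = supports_hypothesis(2.56, 2.306, 11.6, 22.4)
print(reject_null and direction_matches)  # True
```

Keeping the two results separate reflects the point made in the text: rejecting the null hypothesis and supporting the research hypothesis are different questions.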
Conclusion:
We found that communities without fluoridated water had a greater percentage of the population with tooth decay than communities with fluoridated water (t = 2.56, d.f. = 8, p < .05).