- The errors of second (one week ahead) forecast
Define the question: does a Normal distribution model fit the data?
Identify the appropriate model: it is obviously a “Goodness of Fit” question, so χ2-test can be used.
As “fit” can be “too bad” to accept H0, but it is can never be “too good” to accept H0. Therefore, it is a One-tailed test.
Formulate the Null hypothesis and the Alternative hypothesis (H0 and H1):
H0 : Normal distribution model does fit the data
H1: Normal distribution model does not fit the data
The level of significance is set at α=5%.
The data are rearranged by chi-square test of Statpro package.
Table 1. Excel StatPro package- Chi-square test for errors of one week ahead forecast
As displayed in the above table, the frequencies in categories of <=(-1769) and (-1769)-(-1469) are blow 3. Because the χ2 does not work very well with small frequencies, these two categories are combined to together into category (-1469)-(-1169), which gives the number of cell k=5.
The degrees of freedom ν equal to k-p-1, where k is the number of cells, p is the number of parameters estimated from the sample data of the proposed distribution. In the case of normal distribution, the value of the sample mean and standard deviation are required to calculate the Expected values, therefore, p=2. ν=5−2−1=2
From Statistical Table 7, with ν=2, χ²crit=5.991
From Table 1. the total distance measures equals to the Chi-square statistics 2.992, which is calculated by the Chi-square test in StatPro package in Excel. The Chi-square test :χ²calc = ∑ (O – E) ² ∕ E, which gives us χ²calc= 2.992.
Therefore, χ²calc χ²crit , H0 is accepted. As a result, the normal distribution is an adequate model for the data of errors of weekly forecast.
(iii) Does it seem that the Actual Demand data are normally distributed?
The same method as previous section can be used to test whether the actual demand data are normally distributed.
Define the question: does a Normal distribution model fit the data?
Identify the appropriate model: it is obviously a “Goodness of Fit” question, so χ2-test can be used.
As “fit” can be “too bad” to accept H0, but it is can never be “too good” to accept H0. Therefore, it is a One-tailed test.
Formulate the Null hypothesis and the Alternative hypothesis (H0 and H1):
H0 : Normal distribution model does fit the data
H1: Normal distribution model does not fit the data
The level of significance is set at α=5%.
The data of actual demand are rearranged by chi-square test of Statpro package.
Table 1. Excel StatPro package- Chi-square test for data of actual demand
As displayed in the above table, the frequency in categoriey of <=1234 is merely 1, which is blow 3. Because the χ2 does not work very well with small frequencies, this category is combined with the below one 1234- 1534, which gives the number of cell k=5.
The degrees of freedom ν equal to k-p-1, where k is the number of cells, p is the number of parameters estimated from the sample data of the proposed distribution. In the case of normal distribution, the value of the sample mean and standard deviation are required to calculate the Expected values, therefore, p=2. ν=5−2−1=2
From Statistical Table 7, with ν=2, χ²crit=5.991
From Table 1. the total distance measures equals to the Chi-square statistics 1.679, which is calculated by the Chi-square test in StatPro package in Excel. The Chi-square test :χ²calc = ∑ (O – E) ² ∕ E, which gives us χ²calc= 1.679.
Therefore, χ²calc χ²crit , H0 is accepted. As a result, the normal distribution is an adequate model for the data of actual demand.
(iv) Comment on the answer to part (iii)
The result that the actual demand is normally distributed is expected by us. A normal distribution model can be described by two parameters: the mean and the variance. The data are normally distributed with the same fundamental shape and symmetry, no matter what the actual distribution parameters.
The “Goodness of fit” is test is an effective tool to identity normality, as well as some correlation tests.
(v) If the supplier aims at 95% service level, how much safety stock would they need to hold?
As assumed, the lead time is 4 weeks,
µ=µ1 +µ2+µ3+µ4 = 2025* 4=8100 =828
(Demand -µ)/=1.65, (as service level is 95%, using Statistical Table 5, t=1.65)
Therefore, Demand = 1.65*828+8100=9466
As Demand = safety stock +
Safety stock = 9466-8100=1366
(vi) Comment on the assumptions make in the calculation of safety stock.
We assume the population is normally distributed, but the variance is unknown. And the sample size (24) is relatively small. Therefore, we use t-test, rather than z-test.
The leading time has an import impact on the safety stock level. We assume it is 4 weeks in this case.
Question 2
- A review of road transport infrastructure suggests that the level of investment is in decline. The table below shows the level of investment with the investment index in year zero taken as 100.
(i) Plot a graph to show the rate of change
Firstly, the rate of change for each is calculated. We deduct the year 0’s investment index from each sequent year’s investment index, and divide the difference by 100, in order to get the percentage of rate of change.
Using the tools in Excel, the following graph is created.
Graph 2.1, the rate of change of investment on road infrastructure
(ii) Does the data suggest that there is a significant relationship between investment and time, as suggested by the reviewer?
We assume that there is a significant relationship between investment and time, which is linear regression. As assumed, the equation is Y= a + bX, where x is time and y is investment.
Using Excel, the following data are transformed
b===-3.348
a = - b =840/10-(-3.348) =99.066
Therefore, the investment=99.066+ (-3.348)number of years.
An estimate of the error variance could be found out by the following calculations:
Therefore, the value of the error mean square = 7.375
And the standard deviation =2.716
In order to test the likelihood of any real relationship existing between the two variables, we choose F-test to be the assessment method.
H0: =0 or + =0
H1: 0 or + 0
It is a one-tailed test, because a value of 0 does not have any sense.
The F value is calculated as: F ===125.39
At the level of significance is 5%, with vG=1, vL=8 (from the Degrees of Freedom column in above table ), we find Fcrit is 5.318, from Statistical Table 6(b).
As Fcalc Fcrit, we reject H0. Therefore, the conclusion is that there is a significant relationship between investment and times.
(iii) Produce an equation to show the rate of decline over the ten year period.
As is showed clearly in the previous graph, the rate of decline might follow the rules of linear regression, therefore, we assume the equation is Y= a + bX, where x is number of years and y is the rate of decline.
Using Excel, the following data are transformed
b===-0.033
a = - b =(-1.6)/10-(-0.033) =-0.012
Therefore, the rate of decline=-0.012+ (-0.033)number of years.
(iv) Calculate the 95% confidence limits for rate of change
Furthermore, we can find out an estimate of the error variance, as shown below
Therefore, the value of the error mean square = 0.0011
And the standard deviation =0.033
The standard error of b is given by
S.E.(b)====0.0037
The 95% confidence limits gives us =5%,/2=2.5%, v=8, from Statistical Table 5,
Therefore, =b S.E.(b)
=-0.0332.3060.0037
-0.042 to -0.024
(b)
A chemical filtration process was investigated to find the effect of different filter media on the speed of processing. Samples were collected from “Fast” filters and from “Slow” filters and the amount of a certain chemical ingredient was measured. The results are shown below. Does it seem that the amount of the chemical is significantly different between the two types of filter?
Explain your assumptions and your choice of tests.
In order to find out an effective method to test whether the amount of the chemical is significantly different between the two types of filter, t-test can be adapted. However, how to choose a specific t-test depends on the result whether the variances of two types of filters are equal or not by using F-test.
F-test:
Define the question: is there a difference in variance between the outputs of the two types of filters?
Identify the appropriate model: comparison of two variances we use F-test
Because difference could be in either direction, we use two-tailed test.
Formulate the Null hypothesis and the Alternative hypothesis (H0 and H1)
H0: 12 = 22
H1: 12 ≠22
We set the level of significance =5%, therefore, /2 =2.5%. G = nG-1 =6-1=5, L=nL-1=7From Statistical Table 6(c), Fcrit =5.988
Because =2 / n
1 = ∑ / n ; 2 = ∑ / n
Then we calculate the A and B.
Because = ∑x / n,
A = ∑xA / n = 78.8% / 7= 0.1126
B = ∑xB / n = 49.4% / 7=0.0823
Therefore,
1 = ∑ / n =0.000429
2 = ∑ / n=0.000469
As the variance of slow filter is greater than that of fast filter, we make 2 be the numerator, and the 1 be the denominator.
Fcalc= 2/1=0.000469/0.000429 = 1.09
Fcalc Fcritc
Accept H0
Therefore, there is no evidence of a significant difference in variance, which leads us to use the conventional t-test for difference between mean values of the two types of filters.
The following table contains useful data from Excel Data Analysis:
T-test:
Define the question: is there a significant difference in the mean output between the two filters?
Identify the appropriate model: because the population variance is not known and the mean is required, we use t-test.
As difference in either direction is possible, we choose to use two-tailed test.
Formulate the Null hypothesis and the Alternative hypothesis (H0 and H1)
H0: μ1 = μ2
H1: μ1 μ2
We set the level of significance =5%, /2=2.5%, as it is a two-tailed test.
From Statistical Table 5, with v = 6 + 5 = 11, tcrit=2.201
As calculated previously, A=0.1126, B=0.0823, and pooled standard deviation of sample equals to ==0.0211
tcalc =
tcalc = = 2.593
Therefore, tcalc tcrit H0
There is significant difference between the outputs of the two filters.
The following table of data is abstracted from Excel Data Analysis.
Question 3
(a)
A study was conducted on the utilisation of machines in a press shop to determine the Actual Capacity. The approach used was “Activity Sampling”. At random intervals of time the press shop was visited and the number of machines working was noted. The data are shown in the table below:
If the total number of machines was thirty-four,
() What was the average utilization of the press shop machines?
We set X is the number of machines working each time of visiting, therefore the mean of the sample is / n, where n is the times of visiting.
= 714/ 30=23.8
Therefore, the utilization of the press shop machines is 23.8/34=70%
(ii) Suggest a model to represent the activity of the machines.
The Poisson distribution would be one of the suggested models, because the data representing the activity of the machines have the significant characteristics of a Poisson model. The number of times machine working is known, however, the number of visiting trials is unknown and could be large, with a constant measuring interval existing. In order to testify the suggestion, we use the “Goodness to fit” test to prove whether the Poisson distribution model fits or not.
Define the question: does a Poisson distribution model fit the data?
Identify the appropriate model: it is obviously a “Goodness of Fit” question, so χ2-test can be used.
As “fit” can be “too bad” to accept H0, but it is can never be “too good” to accept H0. Therefore, it is a One-tailed test.
Formulate the Null hypothesis and the Alternative hypothesis (H0 and H1):
H0 : Poisson distribution model does fit the data
H1: Poisson distribution model does not fit the data
The level of significance is set at α=5%.
The data are rearranged by chi-square test of Statpro package.
Table 3.1 Excel StatPro package- Chi-square test for numbers of machine working
As displayed in the above table, the frequencies of the first two categories are blow 3. Because the χ2 does not work very well with small frequencies, the categories of <=16, and 16-19 are combined to together, which gives the number of cell k=5.
The degrees of freedom ν equal to k-p-1, where k is the number of cells, p is the number of parameters estimated from the sample data of the proposed distribution. In the case of Poisson distribution, the value of the sample mean is required to calculate the Expected values, therefore, p=1. ν=5−1−1=3
From Statistical Table 7, with ν=3, χ²crit=7.815
From Table 3.1, the total distance measures equals to the Chi-square statistics 6.134, which is calculated by the Chi-square test in StatPro package in Excel. The Chi-square test :χ²calc = ∑ (O – E) ² ∕ E, which gives us χ²calc= 6.134.
Therefore, χ²calc χ²crit , H0 is accepted. As a result, the Poisson distribution is an adequate model for the data of machine working.
( From the model, what is the lowest number of machines likely to be working at any one time?
(iv) What is the maximum number likely to be working?
(b)
Complaints were made about the level of pollutants in the discharge from a certain factory. The factory refuted the complaints by showing the results of their own analysis of the discharges. However, the Environmental Health Agency claimed the method of analysis used by the firm was faulty. A comparison was made over nine days using two methods of analysis in parallel to check the pollution levels. The results (in ppm.) are shown below:
Does the data suggest the firm’s method does underestimate the level of pollution?
Solution:
In order to test whether the firm’s method underestimates the level of pollution or not, t-test could be used. If there is no difference between the variances of two populations, we will use Student’s T-test; otherwise Aspin-Welch t-test will be used. Therefore, we need to do the F-test at first.
F-test:
Define the question: is there a difference in variance between the outputs of the two methods of analyzing the discharges?
Identify the appropriate model: comparison of two variances we use F-test
Because difference could be in either direction, we use two-tailed test.
Formulate the Null hypothesis and the Alternative hypothesis (H0 and H1)
H0: 12 = 22
H1: 12 ≠22
We set the level of significance =5%, therefore, /2 =2.5%. G = nG-1 =9-1=8, L=nL-1=9From Statistical Table 6(c), Fcrit =4.433
Because =2 / n
1 = ∑ / n ; 2 = ∑ / n
Then we calculate the A and B.
Because = ∑x / n,
A = ∑xA / n = 303/9=33.67
B = ∑xB / n = 351/9=39
Therefore,
1 = ∑ / n =149
2 = ∑ / n=131.75
Fcalc= 1/2=149/131.75=1.13
Fcalc Fcritc
Accept H0
Therefore, there is no evidence of a significant difference in variance, which leads us to use the conventional t-test for difference between mean values.
The following table contains useful data from Excel Data Analysis:
T-test:
Define the question: is the mean of outputs using firm’s method lower than that of the EHA method?
Identify the appropriate model: because the population variance is not known and the mean is required, we use t-test.
As “underestimate” is a defined direction, we choose to use one-tailed test
Formulate the Null hypothesis and the Alternative hypothesis (H0 and H1)
H0: μ1 = μ2
H1: μ1 μ2
We set the level of significance =5%, as it is a one-tailed test.
From Statistical Table 5, with v = 8 + 8 = 16, tcrit=1.746
As calculated previously, A=33.67, B=39, and pooled standard deviation of sample equals to ==11.84
tcalc =
tcalc = = - 0.95
Therefore, tcalc < tcrit H0
There is no difference between the outputs of the two methods, and no evidence shows that the firm’s method underestimates the level of pollution.
The following table of data is abstracted from Excel Data Analysis.