690.42 ÷ 20 = 34.5
√34.5 = 5.9
Home Goals Mean = 29.35 Standard Deviation = 9.1
Away Goals Mean = 20.25 Standard Deviation = 5.9
From these calculations the goal distributions can be compared. The clubs tend to score more goals at home than away, but the away totals are more tightly grouped (a standard deviation of 5.9 against 9.1). This is expected, as every club shares the disadvantage of playing on an unfamiliar pitch. The home totals are more spread out because the better teams tend to score more at home than the weaker teams.
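As a check on the hand working, the standard deviation steps (square the differences from the mean, total them, divide by 20 and take the square root) can be sketched in Python. This is only a sketch: the away totals are read off the stem and leaf diagram later in this section, the name population_sd is my own, and the total of squared differences may differ slightly from the hand-worked figure because of rounding.

```python
import math

# Away goals for 2000-2001, read off the stem and leaf diagram in this section.
away_goals = [10, 11, 13, 14, 16, 16, 18, 18, 19, 19,
              21, 21, 22, 22, 24, 26, 26, 28, 30, 31]


def population_sd(values):
    """Return (mean, standard deviation), dividing by n as in the hand working."""
    n = len(values)
    mean = sum(values) / n
    # Total of the squared differences from the mean.
    squared_diffs = sum((x - mean) ** 2 for x in values)
    variance = squared_diffs / n
    return mean, math.sqrt(variance)


mean, sd = population_sd(away_goals)
print(f"mean = {mean:.2f}, standard deviation = {sd:.1f}")
```

Dividing by n (rather than n - 1) matches the "÷ 20" step used in the working above.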
Median, Mode and Range
Because the sample is fairly large and the values are in no particular order, it is best to first put them into a stem and leaf diagram. Because I am using a computer I will make two diagrams, one for home goals and one for away goals, as I cannot combine them into one.
Home Goals
Key: 1 | 4 means 14 goals
0 |
1 | 4 8
2 | 0 0 3 4 4 6 7 7 8 9
3 | 1 1 1 6
4 | 0 4 5 9
Median = 27.5
Mode = 31
Range = 35
Away Goals
Key: 1 | 0 means 10 goals
0 |
1 | 0 1 3 4 6 6 8 8 9 9
2 | 1 1 2 2 4 6 6 8
3 | 0 1
Median = 20
Mode = 21
Range = 21
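The median, mode and range can be read straight off a stem and leaf diagram, and the same steps can be sketched in Python as a check. The example below uses the home goals diagram; the dictionary layout and the helper name expand are my own choices for illustration.

```python
from statistics import median, multimode

# Home goals for 2000-2001 as stem -> leaves, copied from the diagram above.
home_stem_leaf = {
    1: [4, 8],
    2: [0, 0, 3, 4, 4, 6, 7, 7, 8, 9],
    3: [1, 1, 1, 6],
    4: [0, 4, 5, 9],
}


def expand(stem_leaf):
    """Turn a stem and leaf mapping into a sorted list of whole values."""
    return sorted(stem * 10 + leaf
                  for stem, leaves in stem_leaf.items()
                  for leaf in leaves)


home_goals = expand(home_stem_leaf)
home_median = median(home_goals)                 # middle of the 10th and 11th values
home_modes = multimode(home_goals)               # every value that occurs most often
home_range = max(home_goals) - min(home_goals)   # largest minus smallest
print(home_median, home_modes, home_range)
```

Using multimode rather than mode means that any ties for the most common value are shown instead of hidden.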
This supports what was found with the standard deviation. The clubs tend to score more goals at home than away, but the range of away goals is smaller (21 against 35). This is expected, as every club shares the disadvantage of playing on an unfamiliar pitch. The home totals are more spread out because the better teams tend to score more at home than the weaker teams.
Scatter Graph
I have used this scatter graph to test my second prediction: that the teams which tend to score more goals at home will also score more goals away than the teams which score fewer at home.
There does not seem to be much correlation, although there is a hint of a positive relationship. Just to be sure I will use Spearman’s Rank Correlation, which is used to find out whether there is a correlation between two sets of data.
Spearman’s Rank Correlation = 1 - (6 × Σd²) ÷ (n³ - n), where n = number of ranks
I have created a table below showing the ranking order of the teams with respect to the home and away goals scored.
Ranking H is the ranking in terms of goals scored at home
Ranking A is the ranking in terms of goals scored away
d² = Difference squared
Σ = Total
Σd² = 962
R = 1 - (6 × Σd²) ÷ (n³ - n)
R = 1 - (6 × 962) ÷ (20³ - 20)
R = 0.28
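The substitution into the formula can be checked with a few lines of Python. This is a minimal sketch: the function name spearman_from_d2 is my own, and it assumes the total of the squared rank differences has already been found from the ranking table.

```python
def spearman_from_d2(sum_d2, n):
    """Spearman's rank correlation from the total of squared rank differences."""
    return 1 - (6 * sum_d2) / (n ** 3 - n)


# Values from the working above: total of d squared = 962, n = 20 teams.
r = spearman_from_d2(962, 20)
print(round(r, 2))  # 0.28
```

Here 6 × 962 = 5772 and 20³ - 20 = 7980, so R = 1 - 5772 ÷ 7980.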
This shows there is only a weak positive correlation between the results. The value is not close enough to 1 to be of much use, so it cannot be followed up. This does not support my second hypothesis.
Conclusion (part 1)
My first prediction was proved to be correct. Teams do tend to score more goals at home than away. This can be easily seen in the stem and leaf diagrams. This is because the players are used to training and playing at their home stadium and are not as familiar with the opposition’s stadium. Also, there are more supporters for the home side at a home stadium, so the players’ morale is boosted. The same applies to the opposition at their own ground. This means a team should score more at home than away.
The only exceptions to this rule are Middlesbrough and Coventry City, who both scored more goals away. There are two possible reasons for this. One is that they may not train at their ground, so are not as familiar with their home pitch. The other is that their home stadiums are of low quality, so they play better away.
It can also be seen that the dispersion of goals scored away is smaller than that of goals scored at home. This is because the clubs’ chances of scoring away are more alike.
I was wrong in saying that the teams which tend to score more goals at home will also score more goals away than the teams which score fewer at home. Although there was a slight positive correlation, it was not strong enough to support the prediction.
Hypothesis (part 2)
To get a better conclusion I will repeat the investigation with information from the 2001 – 2002 season. I can then compare the results with those of the 2000 – 2001 season.
I predict that fewer goals will be scored at home matches and more goals will be scored at away matches. This is because clubs will have invested in better players and the quality of the current squad will have been improved. This means the clubs will be more similar, although there will still be some distance between the larger clubs, like Manchester United, and their smaller counterparts, like Derby County.
I predict the dispersion of home goals will be smaller. This is because the clubs have become increasingly similar in skill level. I predict the dispersion and range of away goals will be greater. This is because the clubs are playing better on other teams’ pitches as a result of this similarity.
Despite all this, the number of home goals will still be greater than the number of away goals.
Investigation
I have created a table to show the number of goals scored by each team home and away in the 2001-2002 season.
Again the number of goals scored at home is greater than the number scored away. It can also be seen that fewer goals were scored at home, and more goals were scored away, than in the previous season. Also, a lot more teams scored more away goals than home goals than in the previous season: Liverpool, Manchester United, Aston Villa, Bolton Wanderers and Ipswich Town all scored more goals away than at home.
Comparing Means
The mean of goals scored at home is 557 ÷ 20 = 27.85
The mean of goals scored away is 444 ÷ 20 = 22.20
This shows that the average team scores about 6 more goals at home than away (27.85 - 22.20 = 5.65, rounded to the nearest whole number). This is 3 fewer than the gap of about 9 in the season before (29.35 - 20.25 = 9.1), which shows that home and away scoring are becoming more similar.
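The comparison of the two seasons' means can be set out as a short Python sketch. The means are the figures quoted in this investigation; the dictionary layout and the name home_away_gap are my own.

```python
# Mean goals per team, taken from the figures quoted in this investigation.
mean_goals = {
    "2000-2001": {"home": 29.35, "away": 20.25},
    "2001-2002": {"home": 557 / 20, "away": 444 / 20},
}


def home_away_gap(season):
    """Difference between the home mean and the away mean for one season."""
    return mean_goals[season]["home"] - mean_goals[season]["away"]


gap_old = home_away_gap("2000-2001")
gap_new = home_away_gap("2001-2002")
print(f"2000-2001 gap: {gap_old:.2f}")
print(f"2001-2002 gap: {gap_new:.2f}")
# Rounding each gap to the nearest whole goal before comparing, as in the text.
print(f"change in rounded gap: {round(gap_old) - round(gap_new)}")
```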
Using this mean the standard deviation can be found.
Standard Deviation
Here is the standard deviation of the goals scored at home.
1424.1 ÷ 20 = 71.205
√71.205 = 8.44
This shows the standard deviation of home goals is 8.44.
Here is the standard deviation of the goals scored away.
1667.2 ÷ 20 = 83.36
√83.36 = 9.13
Home Goals Mean = 27.85 Standard Deviation = 8.44
Away Goals Mean = 22.20 Standard Deviation = 9.13
From these calculations the goal distributions can be compared. The clubs still tend to score more goals at home than away, but this season it is the home goals that are more tightly grouped: the standard deviation of away goals (9.13) is now larger than that of home goals (8.44). This is the reverse of the previous season, when the away goals were the more tightly grouped.
If these values are compared with the previous season’s, it can be clearly seen that the standard deviation of home goals has decreased (from 9.1 to 8.44). This confirms my prediction that the dispersion of goals scored at home would decrease, which is consistent with the clubs becoming more similar from season to season.
It can also be seen that the standard deviation of away goals has increased (from 5.9 to 9.13). This confirms my prediction that the dispersion of goals scored away would increase, which is consistent with the teams playing better on away pitches.
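The season-to-season comparison of dispersion can also be checked in one go. The sketch below uses Python's statistics.pstdev (which divides by n, like the hand working) on the four sets of goals read off the stem and leaf diagrams in this investigation; the dictionary keys are my own labels.

```python
from statistics import pstdev

# Goals per team, read off the four stem and leaf diagrams.
goals = {
    ("2000-2001", "home"): [14, 18, 20, 20, 23, 24, 24, 26, 27, 27,
                            28, 29, 31, 31, 31, 36, 40, 44, 45, 49],
    ("2000-2001", "away"): [10, 11, 13, 14, 16, 16, 18, 18, 19, 19,
                            21, 21, 22, 22, 24, 26, 26, 28, 30, 31],
    ("2001-2002", "home"): [15, 18, 20, 20, 20, 21, 22, 23, 23, 23,
                            26, 31, 32, 32, 33, 33, 40, 40, 42, 43],
    ("2001-2002", "away"): [11, 12, 13, 15, 15, 15, 16, 17, 19, 21,
                            22, 22, 23, 23, 24, 24, 34, 34, 37, 47],
}

# Population standard deviation for each season and venue.
sd = {key: pstdev(values) for key, values in goals.items()}

for venue in ("home", "away"):
    before = sd[("2000-2001", venue)]
    after = sd[("2001-2002", venue)]
    direction = "decreased" if after < before else "increased"
    print(f"{venue}: {before:.2f} -> {after:.2f} ({direction})")
```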
Median, Mode and Range
Home Goals
Key: 1 | 5 means 15 goals
0 |
1 | 5 8
2 | 0 0 0 1 2 3 3 3 6
3 | 1 2 2 3 3
4 | 0 0 2 3
Median = 24.5
Mode = 20 and 23 (each occurs three times)
Range = 28
Away Goals
Key: 1 | 1 means 11 goals
0 |
1 | 1 2 3 5 5 5 6 7 9
2 | 1 2 2 3 3 4 4
3 | 4 4 7
4 | 7
Median = 21.5
Mode = 15
Range = 36
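Because the diagrams are made on a computer, they could also be produced from the raw totals by a short program. The sketch below is one possible way, assuming whole-number totals below 100; the function name stem_and_leaf and the use of a bar to separate stem from leaves are my own choices.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the lines of a stem and leaf diagram for two-digit whole numbers."""
    leaves = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # tens digit is the stem, units the leaf
        leaves[stem].append(leaf)
    lines = []
    # Include every stem from 0 up to the largest, even if it has no leaves.
    for stem in range(0, max(leaves) + 1):
        row = " ".join(str(leaf) for leaf in leaves.get(stem, []))
        lines.append(f"{stem} | {row}".rstrip())
    return lines


# Away goals for 2001-2002, as shown in the diagram above.
away_goals = [11, 12, 13, 15, 15, 15, 16, 17, 19, 21,
              22, 22, 23, 23, 24, 24, 34, 34, 37, 47]
for line in stem_and_leaf(away_goals):
    print(line)
```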
This supports my hypothesis and agrees with what was found through standard deviation.
The averages for home goals have decreased. The range has also decreased, from 35 to 28. This confirms my prediction that the dispersion of goals scored at home would decrease, which is consistent with the clubs becoming more similar from season to season.
The mean and median for away goals have increased, although the mode has not. The range has also increased, from 21 to 36. This confirms my prediction that the dispersion of goals scored away would increase, which is consistent with the teams playing better than before on away pitches.
Scatter Graph
I have used this scatter graph to test my second prediction: that the teams which tend to score more goals at home will also score more goals away than the teams which score fewer at home.
There looks to be a stronger positive correlation on this graph, although it is not perfect. Just to be sure I will use Spearman’s Rank Correlation, which is used to find out whether there is a correlation between two sets of data.
Spearman’s Rank Correlation = 1 - (6 × Σd²) ÷ (n³ - n), where n = number of ranks
I have created a table below showing the ranking order of the teams with respect to the home and away goals scored.
Ranking H is the ranking in terms of goals scored at home
Ranking A is the ranking in terms of goals scored away
d² = Difference squared
Σ = Total
Σd² = 485
R = 1 - (6 × Σd²) ÷ (n³ - n)
R = 1 - (6 × 485) ÷ (20³ - 20)
R = 0.64
This shows there is a strong positive correlation between the results. This supports my second hypothesis for this season.
The correlation is a lot stronger than that of the previous season (0.64 against 0.28).
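The ranking table behind the total of squared differences can be built by a program as well as by hand. The sketch below shows one way to do it. The function names are my own, the five-team figures are made up purely for illustration and are not taken from either season, and tied totals are given the average of the ranks they share, which is an assumption about how ties are handled (with ties the formula is only approximate).

```python
def rank_descending(values):
    """Rank values so the largest gets rank 1; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # average of ranks pos+1 .. end+1
        for i in order[pos:end + 1]:
            ranks[i] = shared
        pos = end + 1
    return ranks


def spearman(home, away):
    """Return (total of d squared, R) for paired home and away totals."""
    rank_h = rank_descending(home)  # Ranking H
    rank_a = rank_descending(away)  # Ranking A
    sum_d2 = sum((h - a) ** 2 for h, a in zip(rank_h, rank_a))
    n = len(home)
    return sum_d2, 1 - (6 * sum_d2) / (n ** 3 - n)


# Hypothetical five-team example (made-up numbers, not from the investigation).
home = [40, 31, 27, 20, 14]
away = [30, 19, 26, 21, 10]
sum_d2, r = spearman(home, away)
print(sum_d2, round(r, 2))
```

Each position in the two lists must refer to the same team, otherwise the rank differences are meaningless.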
Final Conclusion
The table below shows some key information found in the investigation.
From this investigation it is clear that more goals are scored at home matches as opposed to away matches. This is shown well by the stem and leaf diagrams and also by the two tables at the beginning of each part of the investigation. The stem and leaf diagram puts the information into a form that is quick and easy to read.
I was correct in stating that more goals would be scored at home matches than at away matches. The teams score more goals at their home stadiums than at the opposition’s stadium. This is because the players are used to training and playing at their home stadium and are not as familiar with the opposition’s stadium. Also, there are more supporters for the home side at a home stadium, so the players’ morale is boosted. The same applies to the opposition at their own ground. This means a team should score more at home than away.
In the 2000 – 2001 season Middlesbrough scored more goals away than at home, whilst in the 2001 – 2002 season Middlesbrough scored almost twice as many goals at home as away. This suggests that the 2000 – 2001 result was simply down to chance.
If the averages (mean, mode and median) are considered, it can be seen that all the averages for home goals have decreased. The averages for away goals have increased, except for the mode. The mode is not particularly reliable in this investigation, as no single value occurred much more often than the others. Its distance from the other two averages, which are close together, is evidence of this.
I was correct in saying that fewer goals would be scored at home matches and more goals would be scored at away matches. This is probably because clubs have invested in better players and improved the quality of their squads.
If the dispersion figures (range and standard deviation) are considered, then for the home matches they have decreased. For the away matches they have increased.
I was correct in stating that the dispersion of home goals would be smaller. This is because the clubs have become increasingly similar in skill level. I was also correct in predicting that the dispersion and range of away goals would be greater. This is because the clubs are playing better on other teams’ pitches as a result of this similarity. Having said this, there are still the elite teams at the top of the table (Arsenal, Liverpool and Manchester United) who will be near the top almost all the time.
Due to this increasing similarity, a certain uniformity between the goals scored at home and those scored away has developed. This is shown by the weak positive correlation in the first season becoming a strong positive correlation in the second. A team that scores more goals at home will continue this trend away, but not on such a great scale.
If the statistics carry on changing as they have done, the number of away goals will eventually equal the number of home goals. This means the best team will come first, the second best will come second and so on.