

Grant Mackenzie 11F

Statistics Coursework

As a piece of Statistics coursework, I have decided to compare two sets of data in order to prove or disprove my theory: “A country’s position in the Commonwealth Games varies according to that country’s population size.”

My theory is that a country’s position in something such as the Olympics or Commonwealth Games is proportional to that country’s population size. I say this because I believe that if a country has a large population, there will be more potential athletes to choose from.

I am doing this because I am genuinely interested in finding out whether or not this theory is true, and I believe that many people reading this essay would be curious to find out as well. In addition, I am comparing the results from the Commonwealth Games, instead of something as renowned as the Olympic Games, because the Commonwealth Games are dominated by countries with very different traditions and cultures. By contrast, the countries that dominate the Olympic Games, such as France, England or Germany, all live a very western way of life similar to ours.

In order to make the comparisons properly, I will use three different Commonwealth Games and then calculate the average number of medals awarded to each country. I will use the three most recent games: 2002, 1998 and 1994. I will do this because the data from these three games will include every country that currently enters, whereas at the first games in 1930 only a fraction of today’s countries took part. I will also use fifty different countries in order to give me a large enough sample size to reach a reliable conclusion. Unfortunately, all of the data that I collect will be secondary data rather than primary data, as all of the data that I need is on the Internet.

First, however, I must carry out a pilot test with a sample of ten countries in order to judge whether the data is suitable to be used.

To find this data, I used an Internet search engine to find the Commonwealth Games’ official website. Fortunately, several results came up, and one website had all of the data that I required. However, this data was not usable as it stood, because I first had to sort it. Below is what the data looked like in its original form on the website www.commonwealthgames.com:

As you can see, this data is not sorted in a manner that is usable to me. Therefore, I will sort it into descending order by the total number of medals awarded, using Excel. Since I will not need the numbers of gold, silver and bronze medals awarded to each country, I will delete those columns and keep only the total number of medals.
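The same tidying step could also be done outside Excel. The following is a minimal Python sketch of it, assuming each row of the medal table holds a country name, the three medal counts and a total. The country names and numbers are placeholders for illustration, not figures from the website, and the function name is hypothetical.

```python
# Placeholder rows only: these names and medal counts are invented for
# illustration and are NOT taken from the Commonwealth Games website.
medal_table = [
    {"country": "Country A", "gold": 2, "silver": 1, "bronze": 4, "total": 7},
    {"country": "Country B", "gold": 10, "silver": 7, "bronze": 3, "total": 20},
    {"country": "Country C", "gold": 0, "silver": 2, "bronze": 1, "total": 3},
]


def totals_descending(rows):
    """Drop the gold/silver/bronze columns and sort by total, largest first."""
    totals_only = [{"country": r["country"], "total": r["total"]} for r in rows]
    return sorted(totals_only, key=lambda r: r["total"], reverse=True)


sorted_totals = totals_descending(medal_table)
for row in sorted_totals:
    print(row["country"], row["total"])
```

Python's `sorted` is stable, so countries with equal totals keep the order they had in the original table.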

Below is the sorted data:

As you can see, all of the data that I collected consists of integers rather than decimals, since you cannot have a fraction of a medal. This is ideal, because the medal totals from the three games can be added together without any rounding error; only the final average, found by dividing by three, may not be a whole number.
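To illustrate the averaging step, here is a short Python sketch for a single country. The three totals are placeholders, not real results, and the function name is hypothetical. It uses exact fractions so that the sum stays exact and any rounding happens only when the average is displayed.

```python
from fractions import Fraction

# Placeholder totals for one hypothetical country at the three games.
totals_by_games = {1994: 5, 1998: 7, 2002: 4}


def average_medals(totals):
    """Return the exact mean number of medals across the games given."""
    return Fraction(sum(totals.values()), len(totals))


mean = average_medals(totals_by_games)
# The exact value is kept as a fraction; it is rounded only for display.
print(mean, round(float(mean), 2))
```

With these placeholder totals the sum is a whole number, but the mean is not, which is why the rounding is left until the very end.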

Unfortunately, I soon realised after glancing at this data that it does not cover all of the countries that enter the Commonwealth Games. Only thirty-nine countries were mentioned, although I needed at least fifty different countries to give me a large enough sample size, and I knew that more than fifty countries enter the Commonwealth Games.

In order to get the data for all of the countries that enter, I had to go to another section of that website. However, I could not get all of the countries in a single list; the website simply listed the countries by locality. In addition, it was impossible to copy and paste the data, so I had to type each one out individually. Below is an example of how the data was originally shown on screen:

As you can see, the information was split up into six different localities: Africa, Asia, Oceania, Europe, Caribbean and Americas. Once I had typed out the name of each country, I had all of the data that I needed. Below is a copy of the completed list of countries that I found:

Inserting the missing data was easy since, presumably, all of the countries that the medal table neglected to mention obtained no medals. In addition, the list contains seventy-two countries in total, which is easily the number of samples that I require. Even though these countries won no medals, they are still valid sample units, as zero is still a number.
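The step of filling in the unlisted countries can be sketched in Python as follows. It relies on the same assumption made above, that a country missing from the medal table won no medals. The names, totals and function name are placeholders for illustration.

```python
def fill_missing_with_zero(all_countries, medal_totals):
    """Return a medal total for every country, using 0 where none was listed."""
    return {country: medal_totals.get(country, 0) for country in all_countries}


# Placeholder names and totals, invented for illustration only.
all_countries = ["Country A", "Country B", "Country C", "Country D"]
medal_totals = {"Country A": 7, "Country B": 20}

complete = fill_missing_with_zero(all_countries, medal_totals)
print(complete)
```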

But before any comparison in this exercise can be made, I must find each country’s population size. To do this, I will go to the World Factbook’s website, http://www.odci.gov/cia/publications/factbook/index.html, which I found using the Internet search engine Ask Jeeves.

After viewing the website, I discovered that it provided any data that I wanted for any country in the world. All I had to do was select the country whose data I wanted to display. Therefore, I searched for each of the countries individually and then recorded its population in Excel to be sorted later.


Below is the data that I collected from The World Factbook’s website after I found the missing countries, sorted in alphabetical order:

Now that I have both the countries’ population sizes and the numbers of medals awarded to them, I can test my theory in a pilot test.
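The essay does not say at this point which statistical measure will be used for the test. One possible choice, assumed here purely for illustration, is Spearman’s rank correlation coefficient, which measures whether medal totals tend to rise as population rises. The sketch below gives tied values (such as several countries with zero medals) their average rank. The populations and medal figures are placeholders, not collected data.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from smallest (rank 1); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every value tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; undefined (division by zero) if either list is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def spearman(xs, ys):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Placeholder pilot data (population in millions, average medals).
populations = [1, 5, 20, 60, 100]
medals = [0, 2, 3, 10, 40]
print(spearman(populations, medals))
```

A value near 1 would support the theory, a value near 0 would suggest no relationship, and a value near -1 would suggest the opposite relationship.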

I will select ten different samples from the finite population that I collected, using stratified random sampling. By random, I mean that the country selected cannot be predicted and is chosen without conscious decision.
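Stratified random sampling can be sketched in Python as follows. The essay does not state what the strata are, so this example assumes, hypothetically, that the countries are grouped by region, and it gives each group a share of the sample in proportion to its size. The region names, country labels and function name are placeholders.

```python
import random


def stratified_sample(strata, sample_size, seed=None):
    """Draw a random sample with each stratum represented in proportion to its size.

    Assumes sample_size is no larger than the total number of members.
    """
    rng = random.Random(seed)
    population = sum(len(members) for members in strata.values())

    # Proportional share of the sample for each stratum, rounded down.
    quotas = {
        name: sample_size * len(members) // population
        for name, members in strata.items()
    }

    # Give any places lost to rounding to the strata with the largest remainders.
    places_left = sample_size - sum(quotas.values())
    by_remainder = sorted(
        strata,
        key=lambda name: (sample_size * len(strata[name])) % population,
        reverse=True,
    )
    for name in by_remainder[:places_left]:
        quotas[name] += 1

    # Within each stratum, choose the members at random without replacement.
    sample = []
    for name, members in strata.items():
        sample.extend(rng.sample(members, quotas[name]))
    return sample


# Placeholder strata: hypothetical regions containing labelled countries.
strata = {
    "Region X": [f"X{i}" for i in range(6)],
    "Region Y": [f"Y{i}" for i in range(10)],
    "Region Z": [f"Z{i}" for i in range(4)],
}
sample = stratified_sample(strata, 10, seed=1)
print(sample)
```

Fixing the seed makes the draw repeatable for checking; leaving it as `None` gives a different random sample on each run.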

There are many types of sampling that ...
