Introduction
Mubeen Uppal Statistics coursework. Mr Fisher.
Statistics coursework
Hypothesis: ‘House prices for 3 bedroom detached houses in the North of England are cheaper than those in the South of England. Therefore the South is a more expensive region’.
Hypothesis and strategy
For our GCSE statistics coursework, we were given the question ‘Where are houses most expensive?’ To answer this question I have posed the hypothesis ‘House prices for 3 bedroom detached houses in the North of England are cheaper than those in the South of England. Therefore the South is a more expensive region’. I chose this hypothesis because the South is stereotypically known for having more expensive houses, and its occupants are therefore thought to enjoy a higher standard of living than those in the North. I need to gather evidence that will either support or not support my hypothesis. I have chosen 3 bedroom detached houses because this seems to be the typical kind of home that the bulk of the English population live in.
I will gather evidence to help my investigation by doing the following:
- Firstly, I will collect my data: 30 pieces from the North and 30 from the South, taken from different counties in each region.
- From the data I have collected I will produce a histogram to determine the shape of the distribution. This is important because the shape shows which measure of average should be used. If the histogram shows a normal distribution, I will use the mean and the standard deviation; if it shows a skewed distribution, I will use the median and the interquartile range (IQR).
- Then I will make a box and whisker plot. This gives a clearer indication of whether there is a positive or negative skew, and it clearly shows the median, IQR and range of the data, as well as any outliers within the data set. It is also a really good way of comparing two sets of data.
- For my calculations I will be doing outlier calculations, standard deviation and finally Pearson's measure of skewness.
- Then I will go on to conclude the investigation.
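The skewness calculation named in the plan above can be sketched in Python. This is only an illustration: it assumes the median-based form of Pearson's measure, 3 × (mean − median) ÷ standard deviation, and it assumes the population standard deviation is wanted. The prices in `example_prices` are hypothetical and are not the coursework data.

```python
import statistics


def pearson_skewness(values):
    """Pearson's median-based skewness: 3 * (mean - median) / standard deviation."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    # Assumption: population standard deviation; statistics.stdev would give the sample version.
    sd = statistics.pstdev(values)
    return 3 * (mean - median) / sd


# Hypothetical prices for illustration only (NOT the coursework data).
example_prices = [150_000, 160_000, 170_000, 180_000, 400_000]
skew = pearson_skewness(example_prices)

# A positive result suggests positive skew, so the median and IQR would be preferred.
print("positive skew" if skew > 0 else "negative or no skew")
```

A result close to zero would suggest a roughly symmetric distribution, where the mean and standard deviation are the usual choice.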
Middle

Systematic sampling
I used a systematic sample to determine which counties I would use to find the postcodes. A systematic sample works like this:
Say the population size was 200 (this is just an example) and you needed a sample size of 50. You would divide 200 ÷ 50 = 4, then start at a random number and take every fourth item from the population. In my case I numbered the counties in the North and the South of England. There were approximately 23 counties in the South, so I did the following calculation: 30 (the sample size I needed) ÷ 23 (the approximate number of counties) ≈ 1.3, which I took as roughly 1.2. This meant I chose 1 postcode from each of my listed counties, and from every 5th county I took an extra postcode. For the North I did the same calculation but with approximately 20 counties: 30 (the sample size I needed) ÷ 20 (the approximate number of counties) = 1.5. This meant I took 1 postcode from every county and an extra postcode from every second county. To choose the house prices within each postcode, I used a simple random sample, generating the numbers with the random number generator on a calculator.
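The sampling steps described above can be sketched in Python as follows. This is a minimal illustration, not the procedure actually carried out: the function names `systematic_sample` and `postcodes_per_county` are made up for this example, and the population is just the numbers 1 to 200 from the worked example.

```python
import random


def systematic_sample(population, sample_size, rng=random):
    """Take every k-th item after a random start, where k = population size // sample size."""
    step = len(population) // sample_size
    if step < 1:
        raise ValueError("sample_size must not exceed the population size")
    start = rng.randrange(step)  # random starting position within the first k items
    return [population[i] for i in range(start, len(population), step)][:sample_size]


def postcodes_per_county(n_counties, extra_every):
    """One postcode per county, plus one extra from every `extra_every`-th county."""
    return [2 if (i + 1) % extra_every == 0 else 1 for i in range(n_counties)]


# The worked example: a population of 200 and a sample of 50 gives a step of 4.
population = list(range(1, 201))
sample = systematic_sample(population, 50)

# The North: about 20 counties, with an extra postcode from every second county.
north = postcodes_per_county(20, 2)
print(len(sample), sum(north))
```

For the North, the allocation adds up to the 30 postcodes needed, because 10 of the 20 counties contribute two postcodes each.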
Conclusion
North
Again I am going to calculate whether there are any outliers, but this time within my North house prices data set.
Lower limit: Q1 − (1.5 × IQR) = 176,237.50 − (1.5 × 112,512.50) = 7,468.75. Anything below this value would be classed as an outlier in my data set, but there are no pieces of data lower than this value.
Upper limit: Q3 + (1.5 × IQR) = 288,750 + (1.5 × 112,512.50) = 457,518.75. Anything over this price would be classed as an outlier in my data set. One piece of data is above this limit: a 3 bedroom detached house in Cheshire costing 550,000. So there is one outlier within my North house prices data set.
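The outlier limits for the North can be checked with a short Python sketch. The quartile values are the ones used in the calculation above (176,237.50 and 288,750); the function names `outlier_limits` and `is_outlier` are made up for this illustration.

```python
def outlier_limits(q1, q3, k=1.5):
    """Return (lower, upper) limits: Q1 - k * IQR and Q3 + k * IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def is_outlier(value, lower, upper):
    """A value is an outlier if it lies outside the two limits."""
    return value < lower or value > upper


# Quartiles for the North house prices, taken from the calculation above.
lower, upper = outlier_limits(176_237.50, 288_750.00)
print(lower, upper)

# The Cheshire house priced at 550,000 lies above the upper limit.
print(is_outlier(550_000, lower, upper))
```

The same function could be reused for the South by passing in that data set's quartiles.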
Median and IQR
I took the median and the IQR from the box plot. I am going to use these measures because they are the ones you should use if you have a skewed distribution. My IQR indicates that the variation of house prices in the North is lower than that in the South. The median is the preferred measure of average for a skewed distribution, and the South's median is higher than the North's, which shows that house prices in the North are lower than those in the South. This supports my hypothesis.