This is secondary data, but it is very reliable and is therefore suitable to use.
I will then use an AA map to find the distances from the areas in my random sample to London.
Aims
This set of data is quite large and gives results for each season of the year from 1995 to 2003, and so I will reduce it to a smaller sample to make it easier to work with. I will choose a random sample of different regions in England and Wales and I will then compare the price of the house with the distance from London.
I intend to do this by drawing a scatter graph comparing the two variables: average house price and distance from London. The graph will enable me to spot a relationship easily if there is a correlation, especially if I draw a line of best fit. From the line of best fit I will be able to work out the equation of the line, allowing me to predict further results.
I will test whether my hypothesis is correct by seeing if house prices decrease as the distance from London increases.
Data Collection
I am going to reduce the data to a sample containing 36 different regions/areas.
I will use 36 because I feel that it is a suitable sample size which can be easily divided by 3 or 4.
To produce my random sample I will number all the regions/areas in the data table. One way would be to put all the numbers in a hat and pull out 36. However, I will use the random number key on my calculator to generate a number between 0 and 1. Multiplying this random number by the number of regions and rounding up gives a random region number. I will do this until I have my random region list. If the same number comes out more than once I will ignore the repeat and carry on until I have 36 different regions.
Sample
There are 109 different regions.
The random number generated on my calculator is 0.146.
0.146 x 109 = 15.914, which rounds up to 16, and so region 16 (Cheshire) will be included on my list.
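The sampling steps and the worked example above can be sketched in Python. This is only an illustration of the method, not the calculator procedure itself: the function names are my own, and rounding up is assumed because 15.914 gave region 16.

```python
import math
import random


def region_from_random(r, n_regions):
    """Turn a random number r (0 <= r < 1) into a region number."""
    # Assumption: round up, as in the worked example
    # (0.146 x 109 = 15.914, which gives region 16).
    return math.ceil(r * n_regions)


def draw_sample(n_regions, sample_size, rng=None):
    """Keep drawing until sample_size different regions are chosen."""
    rng = rng or random.Random()
    chosen = []
    while len(chosen) < sample_size:
        region = region_from_random(rng.random(), n_regions)
        # Ignore repeats, and the rare case r == 0 (region 0 does not exist).
        if region >= 1 and region not in chosen:
            chosen.append(region)
    return chosen


# 109 regions and a sample of 36, as in the text. The seed is only
# there to make this illustration repeatable.
sample = draw_sample(n_regions=109, sample_size=36, rng=random.Random(1))
print(sorted(sample))
```

Ignoring repeats means every region can appear at most once, which is the same as pulling numbers out of a hat without putting them back.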
When I have my random sample I will produce a table showing the average house price in each region and its distance from London. I have decided to do this for the most recent set of results, which is for September to December 2003. To find the distance from London I will use a map of the counties of England and Wales and measure the distance from the centre of each randomly chosen county to London using the scale of the map.
I will then draw a scatter graph showing the results and look to see if there is any correlation between them.
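As well as judging the correlation by eye, it can be given a number. The sketch below assumes the product-moment correlation coefficient is an acceptable measure here (the plan above only mentions looking at the graph), and the function name is my own. A value near -1 would support the hypothesis that prices fall as distance increases.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation coefficient between two lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of the deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up placeholder values, NOT figures from the data set:
# distance from London (miles) and average price (thousands of pounds).
distances = [50, 100, 150, 200]
prices = [300, 250, 220, 150]
print(pearson_r(distances, prices))
```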
Results
Table of Average House Price and Distance from London
Analysis
On the graph I have drawn the ‘line of best fit’, or regression line.
To draw this line I first worked out the two means (x̄, ȳ), as the regression line should go through this point.
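The idea that the line goes through the mean point can be checked with a short calculation. This sketch assumes a least-squares regression of price on distance, which may differ slightly from a line drawn by eye; the function name and the numbers in the example are made up for illustration.

```python
def line_of_best_fit(xs, ys):
    """Least-squares line y = gradient * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    # Choosing the intercept this way forces the line through (mean_x, mean_y).
    intercept = mean_y - gradient * mean_x
    return gradient, intercept, (mean_x, mean_y)


# Made-up placeholder values, NOT results from the investigation.
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]
gradient, intercept, (mean_x, mean_y) = line_of_best_fit(xs, ys)

# The fitted line evaluated at the mean of x gives the mean of y.
print(gradient * mean_x + intercept, mean_y)
```

Once the gradient and intercept are known, a prediction for a new distance x is gradient * x + intercept.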
Correlation
Equation of line
Prediction
Interpret and discuss
Extending My Investigation
I have also heard it said that there is a North / South divide regarding wealth and so I intend to investigate if this is true for house prices.
My second hypothesis is:
House prices in the South of the country are higher than those in the North of the country.
I could take another random sample or use the same one.
To investigate the North / South divide I will divide my sample into two tables, depending on whether the region is in the North or the South of the country. I first have to decide where the dividing line will be and then see whether each region is above or below this line.
I have cut up a map (fig. 1) showing where I have placed the north and south divide.
To test my hypothesis I would draw a cumulative frequency curve for the North, the South and all of the areas. I could then also draw three box and whisker diagrams for them, making them very easy to compare. I would also find the interquartile range for each set of results, which allows a further comparison that is not affected by extreme values (outliers).
I could also investigate whether the number of houses bought and sold in the North is lower than in the South.
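The values needed for a box and whisker diagram and the interquartile range can be worked out as below. This is a sketch: the function names are my own, the prices are made-up placeholders, and Python's `statistics.quantiles` uses one particular quartile convention, so the quartiles may differ slightly from those read off a cumulative frequency curve.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    ordered = sorted(values)
    q1, q2, q3 = statistics.quantiles(ordered, n=4)
    return ordered[0], q1, q2, q3, ordered[-1]


def interquartile_range(values):
    """Upper quartile minus lower quartile."""
    _, q1, _, q3, _ = five_number_summary(values)
    return q3 - q1


# Made-up placeholder prices (thousands of pounds), NOT the real sample.
north = [90, 100, 110, 120, 130, 140, 150]
south = [150, 170, 190, 210, 230, 250, 400]

for name, prices in (("North", north), ("South", south)):
    print(name, five_number_summary(prices), interquartile_range(prices))
```

In the placeholder South list the value 400 is an extreme result. It changes the maximum but not the interquartile range, which is why the interquartile range is a fairer comparison.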
My third hypothesis is:
The number of houses bought and sold in the North is lower than in the South.
To test this hypothesis I could use the mean and standard deviation of the number of sales to compare the two parts of the country.
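A minimal sketch of that comparison is shown here. The sales figures are made-up placeholders, the function name is my own, and I have assumed the population form of the standard deviation (`statistics.pstdev`); the sample form (`statistics.stdev`) could be used instead.

```python
import statistics


def summarise(sales):
    """Return the mean and standard deviation of a list of sales counts."""
    return statistics.mean(sales), statistics.pstdev(sales)


# Made-up placeholder numbers of sales per region, NOT the real data.
north_sales = [2, 4, 4, 4, 5, 5, 7, 9]
south_sales = [6, 8, 8, 9, 10, 11, 12, 16]

north_mean, north_sd = summarise(north_sales)
south_mean, south_sd = summarise(south_sales)
print("North:", north_mean, north_sd)
print("South:", south_mean, south_sd)
```

A lower mean in the North would support the hypothesis, and the standard deviations show how spread out the regions in each half are around that mean.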
I suspect that more people think about moving house in the summer when it is warmer and more pleasant to go out looking than in the winter. I intend to investigate whether more houses are sold in the summer than in the winter.
My fourth hypothesis is:
More houses will be sold in the summer than in any of the other seasons.
To investigate whether there are more sales in summer I shall plot a frequency polygon. I could then go on to plot moving averages on this graph to try to find a trend, and possibly plot a line of best fit, as I suspect that the mean number of sales would increase with each year that passed.
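The moving averages could be calculated as in this sketch. It assumes a four-point moving average, because the data has one figure for each of the four seasons; the function name and the sales numbers are made up for illustration.

```python
def moving_average(values, window=4):
    """Mean of each run of `window` consecutive values."""
    averages = []
    for start in range(len(values) - window + 1):
        chunk = values[start:start + window]
        averages.append(sum(chunk) / window)
    return averages


# Made-up placeholder sales for eight consecutive seasons (two years),
# NOT figures from the data set.
sales = [10, 14, 18, 12, 12, 16, 20, 14]
print(moving_average(sales))
```

Because each average covers a whole year, the seasonal ups and downs cancel out, so any rise in the averages shows the underlying trend from year to year.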
My fifth hypothesis is:
The mean number of house sales will increase with each year that passes.