AS and A Level: Probability & Statistics

Statistical diagrams

  1. With grouped data, a class labelled 9 to 12 (for values recorded to the nearest whole number) includes values from 8.5 up to 12.5, so the class width is 4, not 3.
  2. In a histogram, the area of each bar represents the frequency. The y-axis shows the frequency density.
  3. When working out lengths on scaled histograms, it always helps to draw the rectangle and label the relevant sides with the lengths given.
  4. When drawing a stem and leaf diagram, make sure to include a key. The key is worth a mark. For example, 2|1 represents 2.1 or, on a different stem and leaf diagram, 3|2 represents 32.
  5. Always draw a scale when drawing a box plot. The scale is worth a mark.
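
The class-boundary and frequency-density tips above can be checked with a short calculation. This is a minimal sketch: the frequencies and the helper name `frequency_density` are made up for illustration, and it assumes values are recorded to the nearest whole number, so each boundary sits 0.5 beyond the class label.

```python
# Hypothetical grouped data: (lower label, upper label, frequency).
# Assumption: values are recorded to the nearest whole number.
classes = [(5, 8, 12), (9, 12, 20), (13, 20, 16)]


def frequency_density(lower, upper, frequency, rounding=0.5):
    """Return (class width, frequency density) using class boundaries."""
    lower_boundary = lower - rounding   # e.g. 9 becomes 8.5
    upper_boundary = upper + rounding   # e.g. 12 becomes 12.5
    width = upper_boundary - lower_boundary
    return width, frequency / width


for lower, upper, freq in classes:
    width, density = frequency_density(lower, upper, freq)
    # Bar area (width x density) gives back the frequency.
    print(f"{lower}-{upper}: width {width}, density {density}, area {width * density}")
```

For the class 9 to 12 the width comes out as 4, matching the first tip, and multiplying the width by the frequency density recovers the frequency.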

How to tackle questions on regression and correlation

  1. When asked about the relationship in a regression model, always get the context the correct way round. For example, weight does not affect height; height affects weight.
  2. When asked whether your answer from the regression model is reliable, comment on whether the x value you used is within the range of the original data set. If the x value is within that range, the estimate is suitable. Never extrapolate when using a regression model.
  3. If you have found a regression model for the relationship between h and p, and are then told that h = x + 100 and p = y - 20 and asked to find a regression model for x and y, substitute x + 100 and y - 20 into your original equation and rearrange.
  4. When data is coded in this linear way, the correlation coefficient is not changed.
  5. A regression model created using, for example, the heights and weights of children could not be used to predict the weight of an adult. Models are very specific to the data with which they were created.
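
The coding tips above can be illustrated numerically. The sketch below uses made-up data and a hypothetical helper, `least_squares`. It fits a line to the coded values h = x + 100 and p = y - 20, substitutes back to get the line for x and y, and compares that with a direct fit. It also compares the correlation coefficient before and after coding.

```python
import math


def least_squares(xs, ys):
    """Return (intercept, gradient, r) for the regression line of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return intercept, gradient, r


# Hypothetical uncoded data (not from the original text).
x = [10, 20, 30, 40, 50]
y = [45, 52, 61, 66, 78]

# Coding from the tip: h = x + 100 and p = y - 20.
h = [xi + 100 for xi in x]
p = [yi - 20 for yi in y]

# Regression of p on h: p = a + b*h.
a, b, r_coded = least_squares(h, p)

# Substitute: y - 20 = a + b*(x + 100), so y = (a + 100*b + 20) + b*x.
intercept_xy = a + 100 * b + 20
gradient_xy = b

# Fit x and y directly for comparison.
direct_intercept, direct_gradient, r_uncoded = least_squares(x, y)

print(intercept_xy, direct_intercept)   # the two intercepts agree
print(gradient_xy, direct_gradient)     # the gradients agree
print(r_coded, r_uncoded)               # r is unchanged by this coding
```

Adding or subtracting a constant shifts the intercept but leaves the gradient and the correlation coefficient alone, which is why the substitute-and-rearrange method works.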

Normal distribution

  1. When answering normal distribution questions, always draw a sketch and shade the part of the graph that you know and/or want.
  2. In a normal distribution, the area under the curve represents probability.
  3. A normal distribution model may be appropriate if the mean and median are the same, or very close, because this suggests the data are symmetrical.
  4. The big normal distribution table gives the area to the left of the line. The small table gives areas to the right of the line.
  5. If you are unsure what the question is asking, start with the first step: rewrite the question in terms of the normal distribution.
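
As a rough illustration of the tips above, the sketch below rewrites a question as a statement about a normal variable, standardises it, and reads off the areas to the left and right of the line. The distribution X ~ N(50, 4²) and the value 56 are assumptions chosen only for this example. `NormalDist` comes from Python's standard `statistics` module.

```python
from statistics import NormalDist

# Hypothetical question: X ~ N(50, 4^2), find P(X < 56).
mean, sd = 50, 4
value = 56

# Step 1: rewrite in terms of the standard normal variable Z.
z = (value - mean) / sd            # P(X < 56) = P(Z < 1.5)

# Step 2: area to the left of the line (what the main table gives).
standard = NormalDist()            # mean 0, standard deviation 1
area_left = standard.cdf(z)

# Step 3: area to the right of the line is the complement.
area_right = 1 - area_left

print(f"z = {z}")
print(f"P(Z < {z}) = {area_left:.4f}")
print(f"P(Z > {z}) = {area_right:.4f}")
```

The two areas always sum to 1, which is a quick check that the shaded region on your sketch matches the probability you have calculated.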

  1. Males and females driving

    The raw data which I have been given is in list form and provides me with the performance of 239 students of both sexes. It includes: the number of one-hour lessons before a successful test; the total number of minor errors in the test(s) taken; the name of the instructor (4 different teachers); the day of the test; and the time of day of the test. The data is in no particular order and reflects a large range and variety of results and performances.

    • Word count: 1172
  2. India Population Pyramid for 1995

    for the year 2000: India Population Pyramid for 2003 Age and sex distribution for

    • Word count: 95
  3. Investigation into the relationship between P1 exam results and A-level results

    This would also require more time, perhaps several months, before I would be able to gather all the results. Therefore I have restricted my population to students at Trinity School who have taken A-level maths in the 5 years. These 5 years should give me a large sample population of 249 students. I will then take another sample of 50 students. To do this, I numbered the students from 1 to 249 and then generated 50 random numbers in Excel. I picked out these 50 results.

    • Word count: 1689
  4. Bivariate Data Exploration

    This investigation will examine data from a range of cars, varying in both engine size and insurance group. If a positive correlation is found between insurance group and engine size, then the concept of 'student cars' will not be such a worrying factor when a student goes to buy his first car. However, if there is no correlation, then it is entirely possible that insurance companies are charging too much for cars in the 'student car' category. Data Collection: To start the investigation, data needed to be collected before any conclusions could be made.

    • Word count: 1630
  5. GCSE Maths Coursework: proportions of different parts of the body and their relationship to height

    6 15 29 63 0.248*80 19.84 20 6 15 31 63 0.508*80 40.64 41 5.5 14 30 70 0.701*80 56.08 56 5 14 29 70 0.762*80 60.96 61 6 13 30 62 0.808*80 64.64 65 6 16 32 67 0.593*80 47.44 47 6.5 14.5 30 68 0.642*80 51.36 52 5.5 15 30.5 74 0.784*80 62.72 63 6 14 31 75 0.053*80 4.24 4 6 16 33 77 0.222*80 17.76 18 6 15 30 60 0.122*80 9.76 10 6 15.5 31.5 70 0.809*80 64.72 65 6 16 32 67 0.607*80 48.56 49 6.5 16 30.5 71 0.982*80 78.56 79 6 14

    • Word count: 4192
  6. Undertake a small-scale survey to estimate population parameters.

    10 tubes of smarties will be bought, each from a different shop, and 5 will be selected at random from each tube to be used in the survey. This should produce a random sample. The sample must be random for the Central Limit Theorem to be in effect, so that the distribution of its mean is Normal and predictions can be made about it, even though the distribution of the parent population of smarties is unknown and not necessarily Normal.

    • Word count: 1895
  7. Identifying Substances Experiment

    Also, I will know what chemical and physical properties are and how to find them out. Materials: Refer to Chemistry Lab #1 - What's the substance? I didn't change most materials when I did this experiment, but I added 4 materials, which are: 5 test tubes; 2 stoppers; 1 large piece of paper. I also removed 1 material: the spatula. Methods: Refer to Chemistry Lab #1 - What's the substance? However, I changed some of the procedures during my experiment. Here are the changes I made: I only

    • Word count: 1491
  8. Is there a Correlation between GCSE Mathematics and English Literature scores?

    If my conclusion is proved to be right then my results will have a positive correlation. This means that if a pupil achieved high scores in Mathematics I expect them to achieve high scores in English Literature. Both the conclusions made by myself and the English literature teacher expect that there will be a correlation between the GCSE scores obtained in English Literature and Mathematics. Data Collection and Sampling: I obtained my raw data from the high school which I previously attended as I thought this was only fair as both the English literature teacher and I had based our conclusion on the results from this school.

    • Word count: 2522
  9. GCSE Mathematics Coursework: Statistics Project

    The school features Years 7 - 11 and in each year, there are different numbers of girls and boys. Each group in the sample must occur in the same proportion as it does in the overall population of the school, so before I can pick specific pupils to analyse through random sampling, I must work out how many people to choose from the different years and how many should be boys or girls. To do this, I will use stratified sampling. To find the number of Yr 7s required in the sample: 275 x 100 = 24 (to the nearest whole number)

    • Word count: 3473
  10. Hewlett Packard: DeskJet Printer Supply Chain

    Vancouver manufacturer operates only as a manufacturer, and hence, optimizes its operation to hold no inventory. Meanwhile, the European distribution center operates only as a distributor, and therefore only manages inventory levels and will not perform any assembly work. If both players would broaden their viewpoints to the entire supply chain, then they could perform final, localization assembly tasks at optimal points in the supply chain to minimize stock outages or overages. Essentially, it is optimal to postpone localization assembly until the moment demand for a localized configuration is known, which is in the European distribution center, closest to the customers that drive demand.

    • Word count: 1358
  11. Linear regressions.

    appeared to be significant, then there is serial correlation in residuals. If this is the case, the estimation procedure should be modified as follows. Instead of using Y and X one should use (Y_t - ρY_{t-1}) and (X_t - ρX_{t-1}) and estimate the regression (Y_t - ρY_{t-1}) = a + b(X_t - ρX_{t-1}) + e_t. Problem 3 Part B: An investigator analysing consumers' expenditure in the UK using quarterly data over the period 1979-1997 estimated the following two models. Model A: D4Ct = 0.0083 + 0.558 D4Ct-1 + 0.241 D4ct-2 + 0.037 D4Ct-3 - 0.220 D4Ct-4 (0.0026) (0.096) (0.116) (0.125) (0.103) + 0.208 D4Yt-1 - 0.124 D4Yt-2 + 0.016 D4Yt-3 - 0.172 D4Yt-4 (0.120)

    • Word count: 1806
  12. Investigate a possible relationship between self-esteem and levels of satisfaction in the undergraduate student population.

    (Baumeister 1999). Curry and Johnson (1990) describe high self-esteem as a secure sense of identity and an ability to acknowledge and value one's own efforts and achievements. They stress a connection between high self-esteem, confidence, energy and optimism and argue that these traits have their roots in early years. Baumeister, Rice and Hutton (1989) discuss self-esteem in terms of motivational orientation, with high self-esteem giving a self-enhancing orientation. In other words a person considered to have high self esteem is more likely to seek to capitalise on their good traits and pursue successes even under risky conditions.

    • Word count: 3516
  13. The case is about the Monetta Financial Services Company, an investment house.

    Two major arguments will be used to establish that Monetta willingly and knowingly distributed "hot" IPOs to its directors. These are mathematical / statistical arguments using standard descriptive statistics and legal arguments based on the SEC Act. Both arguments will hopefully prove beyond reasonable doubt that Monetta acted with ill faith and deceitful intent. Statistical Analysis: To perform the statistical analysis we need to separate the IPOs that were allocated to Directors from the ones allocated to the Fund clients. This lets us show that IPOs allocated to Directors have higher returns with lower risk than IPOs allocated to Fund clients, and compare these figures with the overall 50 IPOs in which Monetta participated.

    • Word count: 2281
  14. "Males in the 11-18 years age range will guess the angles and lengths better than females in the 30+ years age range."

    5 32 m 13 38 4.5 34 m 13 45 6 35 m 12 45 4 36 m 12 45 3 37 m 12 40 4 38 m 12 35 4 39 m 13 45 5 40 m 12 45 3.5 41 m 12 45 4 43 m 13 30 4.5 44 m 14 39 3.5 48 m 13 48 6.5 49 m 13 45 4.2 50 m 13 30 5 51 m 13 45 4.3 52 m 14 35 6.5 53 m 13 40 3 54 m 13 45 4.5 55 m 13 40 7 56 m 13 40

    • Word count: 3335
  15. Do population factors influence crime? Does the number of young adults aged 17 to 24 affect the amount of disorder in areas of Stockton-On-Tees?

    Because I am looking at disorders per 1000 people, I have to find the number of people aged 17-24 per 1000 people so that it matches my disorder figures. I will use this calculation: (People aged 17 to 24 ÷ Total population) x 1000, to find out how many young adults aged 17-24 there are per 1000 people in each ward. For example, the Victoria ward where I live has 530 young adults out of a total population of 5770. Using the calculation (530 ÷ 5770) x 1000, I found that it had 91 young adults per 1000 people.

    • Word count: 1220
  16. Find out the factors that most affect the prices of second hand cars.

    I believe that the MPG will affect the prices of second hand cars significantly, as the miles per gallon can show the cost of running the cars, and if the MPG is high then that might persuade the buyer to buy the car. Method: During this investigation I will use the information provided to determine the main factors affecting second hand car prices. From my preliminary investigation I realised that I will compare all of these factors to the percentage decrease, as the difference in price is not accurate enough to determine the results I wanted.

    • Word count: 2504
  17. Reaction Times

    The results would be read as follows: the lower the number of cm, the quicker the reaction. This would be repeated 3 times for each student, to get a fair result, and then the mean found. This is the best method of testing, because it doesn't cost money, you don't need to travel and no errors can be found with a ruler. How is it a fair test? I will make this a fair test by: using the same 30cm ruler; making sure the students are standing; making sure the zero on the ruler is in line

    • Word count: 2559
  18. Analyse a set of results and investigate the provided hypotheses.

    During the course of my investigation I will try to eliminate any bias that might occur. This is most likely to happen when I select a range of data from the pool of results. When selecting specific data I will try to sample as many random data as I can and make sure that it hasn't all come from one person. Collection of data: As part of this coursework, a given task was to collect data from random people by asking them to estimate the length of a line in millimetres (mm) and the size of an angle in degrees (°). Once these results were taken they were then entered onto an Excel spreadsheet as raw data.

    • Word count: 6503
  19. Dehydration and Gas Chromatography of Methylcyclohexanols.

    A simple apparatus for distillation was assembled and two 10 mL graduated cylinders were used to collect the distillate. The contents of the 50 mL round bottom flask were gently brought to a boil and the temperature of the vapor was approximately 115 °C. The rate of heating/boiling was controlled so that the rate of collection in the first 10 mL graduated cylinder was approximately 1 drop per second. When the distillate in the first 10 mL graduated cylinder reached approximately 8 mL in volume, the cylinder was removed and a second clean 10 mL graduated cylinder was put in its place to collect an additional 6 mL of distillate.

    • Word count: 1048
  20. Statistical investigation into pupils at Mayfield high school

    My prediction is that the more TV watched the higher the IQ. The reason for my prediction in this hypothesis is that I believe that watching TV makes a person more knowledgeable. There are a lot of television programmes which provide a high level of information such as the News, documentaries, quiz shows, and even reality TV. People watching these programmes absorb a lot of information and are encouraged to think about what they are watching. In my conclusion on the results of this hypothesis below, I consider outside factors which I have not taken into account which may influence the results, e.g.: age, and type of programme watched.

    • Word count: 2007
  21. Statistical coursework that uses data from 'Mayfield High School.'

    However, I have decided to pick only two (at the maximum 3) pieces of data, as time is a limiting factor in this coursework. When deciding my data categories, there are a few things that I need to bear in mind. I need to use quantitative data, so I am able to apply all higher level statistical maths to my results. I also need to make sure that the data I choose are closely related, so I can analyse my results thoroughly.

    • Word count: 2689
  22. We are conducting an investigation to discover how accurate people in our school are at estimating the lengths of lines and the sizes of angles.

    A straight line measuring 13.7cm will be drawn on another piece of plain paper. This will be used to test candidates in my second hypothesis. We have chosen these measurements as they are not easily recognised, and so we will definitely be testing the candidates' estimating, not recognising, skills. Candidates will be taken to an empty room where they will be shown the angles or line, depending on which investigation they have been chosen to participate in. The candidates will then each be asked to write down what they believe the size of the line/angle to be. After collecting all of our results we will put them in graphs to analyse and compare them.

    • Word count: 4703
  23. Sampling Techniques.

    Suppose that the N units in the population are numbered 1 to N in some order. To select a systematic sample of n units, if k = N/n then every k-th unit is selected, commencing with a randomly chosen number between 1 and k. Hence the selection of the first unit determines the whole sample, e.g., N = 5,000, n = 250, therefore k = 5000/250 = 20. Therefore, select every 20th item commencing with (say) 6.

    • Word count: 589
  24. The aim of this project is to find out which factors affect the selling price of a house.

    I will then state my median and the inter-quartile range for my cost, as outliers do not affect them. Out of the fields given, some of these affect the price of the houses. The fields that will affect the house price are if the house has a garden, if the house has a garage, the number of bedrooms in a house and the area (square ft) of a house. The field that will not affect house price is house number. The fields in order of importance are, number of bedrooms, area (sq ft) of a house, if the house has a garden, if the house has a garage and finally house number which has no importance at all.

    • Word count: 1924
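
The systematic sampling procedure described in the Sampling Techniques excerpt (item 23) can be sketched in a few lines. The function name `systematic_sample` and its optional `start` argument are hypothetical, and the sketch assumes N is an exact multiple of n, as in the excerpt's example of N = 5,000 and n = 250.

```python
import random


def systematic_sample(population_size, sample_size, start=None, rng=random):
    """Return the unit numbers (1-based) chosen by systematic sampling.

    Assumes population_size is an exact multiple of sample_size.
    """
    k = population_size // sample_size       # sampling interval, k = N / n
    if start is None:
        start = rng.randint(1, k)            # random start between 1 and k
    # The first unit determines the whole sample: start, start + k, ...
    return [start + i * k for i in range(sample_size)]


# The excerpt's example: N = 5000, n = 250, so k = 20, starting at 6.
example = systematic_sample(5000, 250, start=6)
print(example[:5])        # first few selected unit numbers
print(len(example))       # number of units selected

# With a random start instead of a fixed one.
random_example = systematic_sample(5000, 250, rng=random.Random(0))
```

Only one random number is needed, which is what the excerpt means when it says the first unit determines the whole sample.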

