AS and A Level: Probability & Statistics

Statistical diagrams

  1. When working with grouped data, a class from 9 to 12 includes values from 8.5 up to 12.5, which means the class width is 4, not 3.
  2. In a histogram, the area of each bar represents the frequency. The y-axis shows the frequency density.
  3. When working out lengths in scaled histograms, it is always helpful to draw the rectangle and label the relevant sides with the lengths given.
  4. When drawing a stem and leaf diagram, make sure to include a key. The key is worth a mark. For example, 2|1 represents 2.1 or, on a different stem and leaf diagram, 3|2 represents 32.
  5. Always draw a scale when drawing a box plot. The scale is worth a mark.
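
The class-width and frequency-density points above can be checked with a short calculation. This is only a sketch: the function names and the frequency of 20 are made up for illustration, and it assumes the data was recorded to the nearest whole unit.

```python
def class_width(lower, upper, gap=1.0):
    """Width of a class given its stated (rounded) limits.

    Assumes values were recorded to the nearest `gap` units, so the true
    class boundaries lie half a gap beyond each stated limit.
    """
    return (upper + gap / 2) - (lower - gap / 2)


def frequency_density(frequency, lower, upper, gap=1.0):
    """Height of the histogram bar: frequency divided by class width."""
    return frequency / class_width(lower, upper, gap)


# The class 9 to 12 runs from 8.5 to 12.5, so its width is 4, not 3.
width = class_width(9, 12)

# Hypothetical frequency of 20 for that class.
density = frequency_density(20, 9, 12)

print(width, density)
```

Multiplying the frequency density by the class width gives back the frequency, which is the "area is the frequency" rule.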

How to tackle questions on regression and correlation

  1. When asked about the relationship in a regression model, always get the context the correct way round. For example, weight does not affect height; height affects weight.
  2. When asked if your answer is reliable for the regression model, comment on whether the x value you used to get the answer is within the range of the original data set. If the x value is within that range, the estimate is suitable. Never extrapolate when using a regression model.
  3. Suppose you have found a regression model for a relationship between h and p, and are then told h = x + 100 and p = y - 20 and asked to find a regression model for x and y. Substitute x + 100 for h and y - 20 for p in your original equation and rearrange.
  4. When data is coded linearly, the correlation coefficient is not changed (provided the scale factors are positive).
  5. A regression model created using, for example, the heights and weights of children could not be used to predict the weight of an adult. Models are very specific to the data with which they were created.
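
As a rough check of the coding tips above, the sketch below uses made-up data and assumes the model is p regressed on h, written p = a + bh. It substitutes h = x + 100 and p = y - 20 and compares the result with a line fitted directly to the coded data. The helper names `pmcc` and `regression_line` are illustrative only.

```python
from math import sqrt


def _sums(xs, ys):
    """Return the means and the summary statistics Sxx, Syy, Sxy."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return mx, my, sxx, syy, sxy


def pmcc(xs, ys):
    """Product moment correlation coefficient."""
    _, _, sxx, syy, sxy = _sums(xs, ys)
    return sxy / sqrt(sxx * syy)


def regression_line(xs, ys):
    """Least-squares line ys = a + b * xs; returns (a, b)."""
    mx, my, sxx, _, sxy = _sums(xs, ys)
    b = sxy / sxx
    return my - b * mx, b


# Hypothetical data, invented for this example.
h = [150, 160, 170, 180, 190]
p = [52, 58, 66, 71, 80]

# Coding from the tip: h = x + 100 and p = y - 20.
x = [value - 100 for value in h]
y = [value + 20 for value in p]

a, b = regression_line(h, p)  # p = a + b*h

# Substitute: y - 20 = a + b(x + 100)  ->  y = (a + 100b + 20) + b*x
a_sub, b_sub = a + 100 * b + 20, b

# Fitting the coded data directly should give the same line.
a_fit, b_fit = regression_line(x, y)

print(a_sub, b_sub, a_fit, b_fit)
print(pmcc(h, p), pmcc(x, y))  # correlation is unchanged by the coding
```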

Normal distribution

  1. When answering normal distribution questions, always draw a picture and shade in the part of the graph that you know and/or want.
  2. In a normal distribution, the area under the curve represents the probability.
  3. A normal distribution model may be appropriate if the mean and median are the same, or very close, since this suggests the data is roughly symmetric.
  4. The big normal distribution table gives the area to the left of the line. The small table gives areas to the right of the line.
  5. If unsure what the question is asking, do the first step, which is to rewrite the question in normal distribution notation, such as P(X < a).
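
The "area to the left" and "area to the right" ideas above can be sketched with Python's standard library. The mean of 170, standard deviation of 10 and cut-off of 185 are hypothetical values chosen for illustration, not figures from any exam question.

```python
from statistics import NormalDist

# Hypothetical model: X ~ N(170, 10^2)
mu, sigma = 170, 10
X = NormalDist(mu, sigma)

a = 185  # hypothetical cut-off, so the question becomes P(X < 185)

# Standardise: z = (a - mu) / sigma
z = (a - mu) / sigma

# Area to the left of the line, as in the big table.
area_left = X.cdf(a)

# Area to the right of the line is what remains.
area_right = 1 - area_left

# The standard normal Z ~ N(0, 1) gives the same left-hand area at z.
area_left_from_z = NormalDist().cdf(z)

print(z, round(area_left, 4), round(area_right, 4))
```

Here `cdf` returns the shaded area to the left of the line. Subtracting it from 1 gives the right-hand tail.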

  1. The heights of 16-18 year old young adults varies between males and females. My prediction is that the majority of males are taller than females.

    So overall the data I collected wasn't biased and was accurate to use for my investigation. I decided to investigate the difference in heights between males and females aged 16-18 by collecting the data from students attending Havering Sixth Form College and then working out the mean and variance of both populations so that I could work out confidence intervals. To work out the mean and variance I had to draw up tables for both populations. I drew up tables showing X as the height in inches, which ranged from 60 - 73 in females and 62 - 76 in males.

    • Word count: 2173
  2. Maths Statistics Investigation

    AX 1080 11
    28 Ford Fusion 6020 4
    29 Mitsubishi Colt 2665 7
    30 Mercedes E-Class 2000 24435 4
    31 Skoda Fabia 3585 6

    In my opinion not only the age of the car will affect the car price, there are some other factors, which might also affect the car price, such as the colour, mileage, engine size and number of owners. The second factor I will investigate is the relationship between the price and the mileage, but this time I will choose 50 random numbers from my data sheet.

    • Word count: 2232
  3. Data Analysis of American House Price

    Each house is described by its price, size, number of bedrooms and bathrooms, whether or not it has a pool and a garage, the distance from the nearest large town, how desirable it is (on a scale from 1 = very undesirable to 7 = most desirable), the township it belongs to and its age. The aim of this report is to assess and evaluate the distribution of house prices in America in the 5 townships used as a sample. A conclusion is provided to summarise all the findings, interpretations and explanations, followed by suitable suggestions.

    • Word count: 2371
  4. maths coursework sampling

    In statistical analysis, a hypothesis is never proven to be true or false but is only rejected or accepted on the basis of statistical tests. In my investigation, I will put forward three hypotheses, two of which will be statistically analysed by using correlation. I wish to put forward three hypotheses which will be subject to statistical analysis: * Hypothesis 1: There is positive correlation between the estimates of length A and estimates of length B for Year 7 pupils. * Hypothesis 2: The distribution of the estimates for angle c will be similar for Year 7 and Year 10 pupils.

    • Word count: 2672
  5. Mayfield High School Maths Coursework

    There are various sampling methods which can be used. However, I am going to use random sampling and stratified sampling, as this will avoid biased results. The random sampling will pick out my data in any order. The formula below is used to stratify my samples. The formula that I will use to work out my samples is: Number of students used in sample = (Total number of girls/boys in year ÷ Total number of students in the school) × Sample size. Below is a table with the data which we were provided, also showing how I worked out my samples.

    • Word count: 2013
  6. Is there a Correlation between GCSE Mathematics and English Literature scores?

    If my conclusion is proved to be right then my results will have a positive correlation. This means that if a pupil achieved high scores in Mathematics I expect them to achieve high scores in English Literature. Both the conclusions made by myself and the English literature teacher expect that there will be a correlation between the GCSE scores obtained in English Literature and Mathematics. Data Collection and Sampling: I obtained my raw data from the high school which I previously attended as I thought this was only fair as both the English literature teacher and I had based our conclusion on the results from this school.

    • Word count: 2522
  7. The case is about the Monetta Financial Services Company, an investment house.

    Two major arguments will be used to establish that Monetta willingly and knowingly distributed "hot" IPOs to its directors. These are mathematical / statistical arguments using standard descriptive statistics and legal arguments based on the SEC Act. Both arguments will hopefully prove beyond reasonable doubt that Monetta acted with ill faith and deceitful intent. Statistical Analysis: To perform the statistical analysis we need to separate the IPOs that were allocated to Directors from the ones allocated to the Fund clients, in order to show that IPOs allocated to Directors have higher returns with low risk as compared to IPOs allocated to Fund clients, in addition to comparing these figures with the overall 50 IPOs in which Monetta participated.

    • Word count: 2281
  8. Find out the factors that most affect the prices of second hand cars.

    I believe that the MPG will affect the prices of second hand cars significantly, as the miles per gallon can show the cost of running the cars, and if the MPG is high then that might persuade the buyer to buy the car. Method: During this investigation I will use the information provided to determine the main factors affecting second hand car prices. From my preliminary investigation I realised that I will compare all of these factors to the percentage decrease, as the difference in price is not accurate enough to determine the results I wanted.

    • Word count: 2504
  9. Reaction Times

    The results would be read as, the lower the number of cm's, the quicker the reaction. This would be repeated 3 times for each student, to get a fair result, and then the mean found. This is the best method of testing, because it doesn't cost money, you don't need to travel and no errors can be found with a ruler. How is it a fair test? I will make this a fair test by: - * Using the same 30cm ruler * Making sure the students are standing * Making sure the zero on the ruler is in line

    • Word count: 2559
  10. Statistical investigation into pupils at Mayfield high school

    My prediction is that the more TV watched the higher the IQ. The reason for my prediction in this hypothesis is that I believe that watching TV makes a person more knowledgeable. There are a lot of television programmes which provide a high level of information such as the News, documentaries, quiz shows, and even reality TV. People watching these programmes absorb a lot of information and are encouraged to think about what they are watching. In my conclusion on the results of this hypothesis below, I consider outside factors which I have not taken into account which may influence the results, e.g.: age, and type of programme watched.

    • Word count: 2007
  11. Statistical coursework that uses data from 'Mayfield High School.'

    However, I have decided to pick only two (at the maximum 3) pieces of data, as time is a limiting factor in this coursework. When deciding my data categories, there are a few things that I need to bear in mind. I need to use quantitative data, so I am able to apply all higher level statistical maths to my results. I also need to make sure that the data I choose are closely related, so I can analyse my results thoroughly.

    • Word count: 2689
  12. Compare the heights of girls and boys in year 8 and the sixth form.

    I will measure the selected groups independently using the measuring device illustrated below. This device includes two, one meter rulers fixed against the wall, the mark for 0 cm is in line with the floor, and the second ruler is fixed alongside the first so that its 0 cm mark is in line with the first's 100cm mark. Then two, 30cm rulers with millimetre measurements are fixed, one either side of the second meter ruler, at 140-170cm and 170-200cm, again the 0cm marks for each of these rulers were fixed in line with 140cm and 170cm respectively.

    • Word count: 2576
  13. I believe that boys in year 10 are better at estimating time than girls are.

    I have decided on using a sample size of around 50. Using a data sample consisting of only 50 people is important as an appropriate. I decided not to take a sample size of 10 as I thought I would not get an accurate range of the student's guesses. I also decided not to pick a number such as 200 as I would be looking at too much data and would just be wasting my time. Samples can be taken in three different ways.

    • Word count: 2535
  14. Forensic Examination of Drugs by Thin Layer Chromatography.

    or alumina (Al2O3) coated on an aluminium or plastic sheet. The plate constitutes the stationary phase. The sheet is then placed in a chamber containing a small amount of solvent, which is the mobile phase. The solvent gradually moves up the plate via capillary action, & it carries the deposited substances along with it at different rates due to the differential solubility of each of its components. The desired result is that each component of the deposited mixture is moved a different distance up the plate by the solvent.

    • Word count: 2424
  15. Application of number: level 3 - Is House Buying a Good Idea or Not?

    * Range: highest house price - lowest house price = £84,950 - £18,500 = £66,450 (to the nearest £) Other houses: * Mean: Sum of house prices / 30 = £3,334,600 / 30 = £111,153 (to the nearest £) * Range: highest house price - lowest house price = £249,950 - £32,500 = £217,450 (to the nearest £) All houses: * Mean: Sum of house prices / 60 = £4,919,045 / 60 = £81,984 (to the nearest £) * Range: highest house price - lowest house price = £249,950 - £18,500 = £231,450 (to the nearest £)

    • Word count: 2349
  16. Collect data with a view to estimating population parameters using estimation techniques.

    The Central Limit Theorem Because I don't know anything about how the population is distributed I have to use the Central Limit Theorem. Even if you don't know how the parent population is distributed the central limit theorem allows you to make predictions as to the distribution of the sample means. Also with a large enough sample the sample mean will be close to the population mean. The central limit theorem says that: * If you take enough samples then the means will be normally distributed. * The mean of the sample means is approximately equal to the population mean.

    • Word count: 2248
  17. Standard addition was used to accurately quantify for quinine in an unknown urine sample containing approximately 100 µg cm⁻³ of quinine.

    In dilute solutions it has an astringent taste and is added to some types of tonic water. The analysis of quinine in urine is important in forensic science as quinine is frequently used as an adulterant in illicit heroin samples. Its presence can therefore be tested for in order to determine the presence of heroin in the body. Fluorescence and Phosphorescence Fluorescence and phosphorescence are phenomena associated with transitions between more than one excited state for a species. After excitation to a higher level, an electron drops by a non-radiative process to an intermediate level and then to the ground state giving rise to emission at a longer wavelength than that of the exciting radiation.

    • Word count: 2888
  18. Is there a correlation between happiness and sociability?

    'glad, content, happy'(*4) There is lots of evidence that points towards happiness and sociability being related. These quotes show this relationship: "Social science surveys have universally concluded that people claim to be most happy with friends and family, or just in the company of others"(*5). This shows how people who socialise are likely to get a feeling of happiness; therefore, in theory the more sociable a person, the happier they should be. "Relationships make us extremely happy when they go well, and very depressed when they don't work out"(*5). This shows how the breakdown of social relationships can be the cause of unhappiness, but these relationships can also cause a person to be happy when they are going well.

    • Word count: 2971
  19. Which three factors affect the price of a second hand car.

    Age - This tells you how long the car has been running, if it's been running a long time its parts may be worn from the rust and may need a repair. 3. Engine Size - Many people want a large engine, this may be because they want to go fast or maybe they live up a hill and want to make sure that it can go up without much trouble. 4. Style - The style is very important because it affects whether you want to buy the car or if it suits you.

    • Word count: 2170
  20. House buying - a good idea or not?

    Three house prices were obtained from each of the 10 estate agents - one of which would be the highest house price whilst another would be the lowest house price, and one of the house prices was selected from the mid-range. This method of collecting information was an attempt to ensure that both samples were reliable due to consisting of a wide variety of data. A table to show the data collated for the samples for other houses and first-time buyers' houses (in ascending order): Other houses Price (£)

    • Word count: 2693
  21. My aim is that within the limits of a small-scale survey I will collect sample data of a population, and by using estimation techniques I will determine the population's parameters (such as the mean and the variance).

    For this reason, the sample size will be set at fifty, which I consider large enough for the distribution of its mean to be normal (according to the Central Limit Theorem). It should not be larger because the aim of this investigation is to carry out a "small scale survey". The sample. The sample will be of the weight of fifty smarties. To be a "good" sample I must make sure that the results are valid and not biased in any way, which means that these smarties must be collected randomly, because the sample must be random for the Central

    • Word count: 2552
  22. Throughout this experiment I have decided that I am going to investigate the tensile properties of a Copper wire.

    Force Extension Area Original Length Now that I have decided what data I need to collect I need to decide how I am going to collect it. I know what measurements I have to take and I know that I will be tensile testing. With this in mind I decided on the following experimental setup. I have set up a clamping system for the sample at one end of a desk and then used a pulley and a hang weight system on the other end of the sample at the other end of the desk.

    • Word count: 2519
  23. Mayfield High Statistics - I am going to investigate how your weight affects your lifestyle.

    Females go through puberty earlier than males and therefore stop growing earlier than males. Therefore, I think that males will have a higher BMI than females, who weigh less. * Your favourite sport has an effect on your BMI: I predict that the more active your favourite sport is, the less you weigh, because you burn more calories, and therefore the lower your BMI. The less active your favourite sport is, the more you weigh, as fewer calories are burnt, and therefore the higher your BMI. * The number of hours spent watching TV has an effect on your BMI: The higher the average number of hours spent watching TV, the more you weigh, because you don't need much energy to watch TV so fewer calories are burnt, and therefore the higher your BMI.

    • Word count: 2087
  24. Distribution of the weights of two types of sweets

    Hence, I will be using a sample size of 100 of each type of sweet, as this size will be appropriate for my requirements. To obtain a sample of 100 sweets from the whole population it is necessary to choose a practical sampling procedure. For this investigation there are two main methods that could be used: * Stratified sampling - This takes the same proportion of each population group and obtains data from that proportion. * Random sampling - This sampling method is where every member of the population has an equal chance of being selected.

    • Word count: 2775

