# AS and A Level: Probability & Statistics

## Statistical diagrams

1. When working with grouped data, a class quoted as 9 to 12 (values rounded to the nearest whole number) includes values from 8.5 up to 12.5, which means the class width is 4, not 3.
2. In a histogram, the area of each bar represents the frequency. The y-axis shows the frequency density (frequency ÷ class width).
3. When working out lengths on scaled histograms, it is always helpful to draw the rectangle and label the relevant sides with the lengths given.
4. When drawing a stem and leaf diagram, make sure to include a key; the key is worth a mark. For example, 2|1 represents 2.1 or, on a different stem and leaf diagram, 3|2 represents 32.
5. Always draw a scale when drawing a box plot; the scale is worth a mark.
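
The class-width and frequency-density tips above can be checked with a short calculation. The following is a minimal sketch in Python, assuming values are recorded to the nearest whole number; the function names and the frequency of 20 are illustrative assumptions, not taken from any exam question.

```python
def class_boundaries(lower, upper, gap=1.0):
    """Return the true boundaries of a class quoted to the nearest `gap`."""
    return lower - gap / 2, upper + gap / 2


def frequency_density(frequency, lower, upper, gap=1.0):
    """Frequency density = frequency / class width (using true boundaries)."""
    low, high = class_boundaries(lower, upper, gap)
    return frequency / (high - low)


# The class "9 to 12" from the tip above.
low, high = class_boundaries(9, 12)
width = high - low

# A frequency of 20 is a made-up value for illustration only.
density = frequency_density(20, 9, 12)

print(low, high, width, density)
```

Here the class 9 to 12 has boundaries 8.5 and 12.5, so the width is 4 and a frequency of 20 would be drawn as a bar of height 5.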

## How to tackle questions on regression and correlation

1. When asked about the relationship in a regression model, always get the context the correct way round. For example, weight does not affect height; height affects weight.
2. When asked whether your answer from the regression model is reliable, comment on whether the x value you used lies within the range of the original data set. If it does, the estimate is suitable. Avoid extrapolating beyond the data when using a regression model.
3. If you have found a regression model for a relationship between h and p, and are then told that h = x + 100 and p = y - 20 and asked to find a regression model for x and y, substitute x + 100 and y - 20 into your original equation and rearrange.
4. When data is coded, the correlation coefficient is not changed (provided any scale factors used in the coding are positive).
5. A regression model created using, for example, the heights and weights of children could not be used to predict the weight of an adult. Models are very specific to the data with which they were created.
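
The coding tips above can be illustrated numerically. This is a rough sketch in Python: the data values, the function names `pearson_r` and `recode_line`, and the example line p = 3 + 0.5h are all assumptions made up for the illustration. Only the coding h = x + 100, p = y - 20 comes from the tip itself.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def recode_line(a, b):
    """Turn p = a + b*h into y = a2 + b2*x, given h = x + 100 and p = y - 20.

    Substituting: y - 20 = a + b*(x + 100), so y = (a + 100*b + 20) + b*x.
    """
    return a + 100 * b + 20, b


# Made-up data in the original variables x and y.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# The same data after coding: h = x + 100, p = y - 20.
hs = [x + 100 for x in xs]
ps = [y - 20 for y in ys]

r_raw = pearson_r(xs, ys)
r_coded = pearson_r(hs, ps)
print(round(r_raw, 4), round(r_coded, 4))

# Hypothetical fitted line p = 3 + 0.5h, rewritten in terms of x and y.
intercept, slope = recode_line(3, 0.5)
print(intercept, slope)
```

The two correlation coefficients agree, while the regression line keeps its gradient and only the intercept changes under this particular coding.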

## Normal distribution

1. When answering normal distribution questions, always draw a sketch and shade the part of the graph that you know and/or want.
2. In a normal distribution, the area under the curve represents the probability.
3. A normal distribution model may be appropriate if the mean and median are the same, or very close, since this suggests the data are roughly symmetrical.
4. The big normal distribution table gives the area to the left of the line. The small table gives areas to the right of the line.
5. If you are unsure what the question is asking, do the first step, which is to rewrite the question in normal distribution notation.
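
The table look-up described in the tips above can be mirrored in code. The sketch below uses Python's standard-library `statistics.NormalDist`; the distribution X ~ N(50, 4²) and the value 56 are assumptions chosen purely for illustration.

```python
from statistics import NormalDist

# Hypothetical example: X ~ N(50, 4^2). These numbers are assumptions.
mu, sigma = 50, 4
x = 56

# Standardise: z = (x - mu) / sigma.
z = (x - mu) / sigma

# Area to the left of z under the standard normal curve (the "big table").
left_area = NormalDist().cdf(z)

# Area to the right of z is whatever is left over.
right_area = 1 - left_area

print(f"z = {z}")
print(f"P(X < {x}) = {left_area:.4f}")
print(f"P(X > {x}) = {right_area:.4f}")
```

With these values z = 1.5, so the left-hand area is about 0.9332 and the right-hand area about 0.0668, matching what a printed table would give.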

• Marked by Teachers essays 2
1.  ## The heights of 16-18 year old young adults varies between males and females. My prediction is that the majority of males are taller than females.

5 star(s)

So overall the data I collected wasn't biased and was accurate to use for my investigation. I decided to investigate the difference in heights between males and females aged 16-18 by collecting the data from students attending Havering Sixth Form College and then working out the Mean and Variance of both populations so that I could work out confidence intervals. To work out the mean and variance I had to draw up tables for both populations. I drew up tables showing X as the height in inches which ranged from 60 - 73 in females and 62 - 76 in males.

• Word count: 2173
2.  ## data handling

3 star(s)

If any data is missing or obviously wrong, I will use another person instead. 11/261 x 50 = 2.1. I did that for all of the amounts: 14/261 x 50, 7/261 x 50, etc.

| Month | Boys | Girls | Total amount | Stratified amount for boys | Stratified amount for girls |
|---|---|---|---|---|---|
| September | 11 | 9 | 20 | 2.11 | 1.72 |
| October | 14 | 6 | 20 | 2.68 | 1.15 |
| November | 7 | 13 | 20 | 1.34 | 2.49 |
| December | 9 | 17 | 26 | 1.72 | 3.26 |
| January | 13 | 6 | 19 | 2.49 | 1.15 |
| February | 17 | 10 | 27 | 3.26 | 1.92 |
| March | 15 | 17 | 32 | 2.87 | 3.26 |
| April | 9 | 8 | 17 | 1.72 | 1.53 |
| May | 9 | 11 | 20 | 1.72 | 2.11 |
| June | 12 | 13 | 25 | 2.3 | 2.49 |
| July | 9 | 16 | 25 | 1.72 | 3.07 |
| August | 4 | 6 | 10 | 0.77 | 1.15 |
| Total | 129 | 132 | 261 | 24.71 | 25.29 |

Next I rounded up the numbers to their nearest whole number.

• Word count: 3869
3. ## Aim: in this task, you will investigate the different functions that best model the population of China from 1950 to 1995.

Linear trend line: * The above graph now has a linear trend line (with the equation of f(x) = 15.5x - 29690.25). Although it is very close to each point, it isn't perfect (for example, the data for 1950 and 1965 hardly touch the line). Furthermore, this model predicts that China's population would simply increase at a constant rate - which is unlikely since it is expected that a population would increase exponentially, in an ideal world. Exponential trend line: * The exponential trend line (equation: f(x) = 1.02^x) just about fits the data, but there are more anomalies than the linear line.

• Word count: 985
4. ## Statistics. I have been asked to construct an assignment regarding statistics. The statistics of which I will be using will be football attendances. The football teams I have chosen to use are Birmingham City FC and Chelsea FC.

Attendance Bar Chart This bar chart appears to show Chelsea FC being slightly more consistent, but is still difficult to know. These are also just a few of randomly picked attendances from the groups, so there is no way of knowing who is more consistent as of yet. Line Graph The above results appear to show Chelsea with the straighter line, meaning they could possibly have the most consistency in attendance. This is still too early to tell without significant proof.

• Word count: 1510
5. ## Statistics coursework

x sample size I will use stratified sampling as the number of pupils in each year group is different. By using this method I will be getting a representative proportion of each year group making my data fair. Once I have calculated the sample size for each strata I will use random sampling (using the Ran# button on the scientific calculator) to select the calculated number of pieces of data from the strata. However, before that I will set my calculator to be fixed on zero decimal places in order to avoid having to round numbers which would increase the chances of repeated figures.

• Word count: 5209
6. ## Anthropometric Data

This may also mean that the wider the child's feet, the longer the feet. Positive correlation occurs when, as one of the variables increases, so does the other. This is saying that as the children grow older, the foot length tends to change and the breadth widens. To find the correlation on a scatter graph, in this case the correlation coefficient will indicate the strength; this will be written about in depth later on the graph. Correlations The main purpose of a correlation is to measure how associated or related two variables are.

• Word count: 3458
7. ## Chebyshevs Theorem and The Empirical Rule

The second shape a scatter diagram may have is anything but a normal curve as in the next drawing: We can do a lot of good statistics with the normal curve, but virtually none with any other curve. Let us assume that we have recorded the 1000 ages and computed the mean and standard deviation of these ages. Assuming the mean age came out as 40 years and the standard deviation as 6 years we can do the following predictions.

• Word count: 1174
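
The predictions mentioned in the excerpt above (1000 ages, mean 40, standard deviation 6) can be sketched as follows. This is an illustration only: it uses the standard statements of Chebyshev's theorem (at least 1 - 1/k² of values lie within k standard deviations of the mean) and the empirical rule (roughly 68%, 95% and 99.7% for a normal curve), and the function names are assumptions rather than anything from the essay.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's bound is only informative for k > 1")
    return 1 - 1 / k**2


# Approximate proportions for a normal curve (the empirical rule).
EMPIRICAL_RULE = {1: 0.68, 2: 0.95, 3: 0.997}

# Figures quoted in the excerpt.
mean_age, sd_age, n_people = 40, 6, 1000


def interval(k):
    """Ages lying within k standard deviations of the mean."""
    return mean_age - k * sd_age, mean_age + k * sd_age


for k in (2, 3):
    low, high = interval(k)
    at_least = chebyshev_lower_bound(k) * n_people
    roughly = EMPIRICAL_RULE[k] * n_people
    print(f"{low}-{high} years: at least {at_least:.0f} people (Chebyshev), "
          f"about {roughly:.0f} if the ages are normal (empirical rule)")
```

Chebyshev's bound holds for any shape of distribution, which is why it is weaker than the empirical rule: for ages between 28 and 52 it guarantees at least 750 of the 1000 people, whereas a normal curve would suggest about 950.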
8. ## Statistics. The purpose of this coursework is to investigate the comparative relationships between the depreciation of a cars price, in relation to the factors that affect it.

The following data will help me decide whether this hypothesis is true, when there is more or less years "attached" to the car: * Sale price ( no years attached) * Age of the vehicle When identifying a general trend for these data, I will discard these data that do not fit the trend: these will obscure my general trend and correlation when it comes to graph analysis. I.e. Some cars may appreciate in value in this particular group: some vintage cars will do this as they gain "collectors' item" status after a number of years.

• Word count: 3318
9. ## Probability of Poker Hands

Measuring the Probability of getting no pairs:

$$P_{\text{No Pairs}} = \frac{{}^{13}C_{5} \times 4^{5} - {}^{4}C_{1} \times {}^{13}C_{5} - (10 \times 4^{5} - 10 \times 4)}{{}^{52}C_{5}} = \frac{1317888 - 5148 - 10200}{2598960} = \frac{1302540}{2598960} = \frac{21709}{43316}$$

Explanation: Choosing five different face values: To begin with, the first given condition for this hand is that, we need 5 different cards out of total 52 cards with five different face values. There are thirteen different face values, (A, 2, 3, 4, 5, 6, 7, 8, 9, 10, J, Q, K)

• Word count: 3446
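
The arithmetic in the excerpt above can be reproduced with Python's `math.comb` and `fractions.Fraction`. This is a sketch of the same counting argument, with variable names chosen here for illustration; the comments give one reading of what each term counts.

```python
from fractions import Fraction
from math import comb

# Five different face values, with any of four suits for each card.
distinct_values = comb(13, 5) * 4**5

# Hands where all five cards share one suit (flushes, including straight flushes).
flushes = comb(4, 1) * comb(13, 5)

# Straights: 10 possible runs, any suits, minus the 10 x 4 straight
# flushes that were already removed with the flushes.
straights = 10 * 4**5 - 10 * 4

no_pair_hands = distinct_values - flushes - straights
total_hands = comb(52, 5)

p_no_pair = Fraction(no_pair_hands, total_hands)
print(no_pair_hands, total_hands, p_no_pair, float(p_no_pair))
```

`Fraction` reduces the result automatically, giving 21709/43316, which is roughly 0.501.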
10. ## Teenagers and Computers Data And Statistics Project

He wondered about how that had come about. Obviously this was because only the surface area was red.

2. Cube Tables

a. 2 x 2 x 2 Cube

| Number of red faces | Number of cubes |
|---|---|
| 0 | 0 |
| 1 | 0 |
| 2 | 0 |
| 3 | 8 |
| 4 | 0 |
| 5 | 0 |
| 6 | 0 |
| Total | 8 |

b. 3 x 3 x 3 Cube

| Number of red faces | Number of cubes |
|---|---|
| 0 | 1 |
| 1 | 6 |
| 2 | 12 |
| 3 | 8 |
| 4 | 0 |
| 5 | 0 |
| 6 | 0 |
| Total | 27 |

c.

• Word count: 1785
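
The cube tables in the excerpt above can be generated by brute force. The sketch below assumes the usual set-up for this investigation, which the excerpt does not state in full: an n x n x n cube is painted red on the outside and then cut into unit cubes. The function name is hypothetical.

```python
from collections import Counter
from itertools import product


def painted_face_counts(n):
    """Count unit cubes of a painted n x n x n cube by number of red faces."""
    counts = Counter()
    for position in product(range(n), repeat=3):
        # A unit cube shows one red face for each coordinate on a boundary.
        faces = sum((c == 0) + (c == n - 1) for c in position)
        counts[faces] += 1
    # Report every possible face count from 0 to 6, as in the tables.
    return {faces: counts.get(faces, 0) for faces in range(7)}


for size in (2, 3):
    table = painted_face_counts(size)
    print(size, table, "Total", sum(table.values()))
```

For sizes 2 and 3 this reproduces the two tables: eight corner cubes with three red faces in each case, plus the edge, face and centre cubes for the 3 x 3 x 3 cube.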
11. ## Maths Statistics Investigation

AX 1080 11 28 Ford Fusion 6020 4 29 Mitsubishi Colt 2665 7 30 Mercedes E-Class 2000 24435 4 31 Skoda Fabia 3585 6 In my opinion not only the age of the car will affect the car price, there are some other factors, which might also affect the car price, such as the colour, mileage, engine size and number of owners. The second factor I will investigate is the relationship between the price and the mileage, but this time I will choose 50 random numbers from my data sheet.

• Word count: 2232
12. ## Frequency curves and frequency tables

Positively skewed. Eg. Non-symmetrical with the longer 'tail' of the frequency curve to the right. f (1) x (2) x (3) x Question 2 There are general rules of constructing Frequency tables. A Frequency distribution is a table in which the values for a variable are grouped into classes and the number of observed values that belong in each class is recorded. Data organized in a frequency distribution are called grouped data every individual observed value of the random variable is listed. Regardless of whether or not the data are grouped, the collection of values may be for either a sample or a population.

• Word count: 1786
13. ## Investigate the relationships between height and weight

I will use secondary data, the advantages I have are that I will not waste time collecting the data myself but the disadvantage is that the data might be unreliable. To get the sample of 10% I will use random sampling and stratified sampling. Stratified Sampling When a population is made up of different groups, Bias can be reduced by representing each group in a sample. Our sample size is 10% of 1183, which is 118.

| | BOYS | GIRLS | TOTAL |
|---|---|---|---|
| YEAR 7 | 131 | 151 | 282 |
| YEAR 8 | 125 | 145 | 270 |
| YEAR 9 | 143 | 118 | 261 |
| YEAR 10 | 94 | 106 | 200 |
| YEAR 11 | 86 | 84 | 170 |
| | | | 1183 |

Stratified Sampling Multiply 118 by the fraction, each sub-group represents of the whole population.

• Word count: 1163
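
The stratified sampling step described in the excerpt above (multiply 118 by the fraction each sub-group represents of the whole population) can be sketched in Python. The counts are those quoted in the excerpt; the variable names and the choice to round each stratum to the nearest whole number are assumptions.

```python
# Pupils in each year group and gender, as quoted in the excerpt.
population = {
    ("Year 7", "boys"): 131, ("Year 7", "girls"): 151,
    ("Year 8", "boys"): 125, ("Year 8", "girls"): 145,
    ("Year 9", "boys"): 143, ("Year 9", "girls"): 118,
    ("Year 10", "boys"): 94, ("Year 10", "girls"): 106,
    ("Year 11", "boys"): 86, ("Year 11", "girls"): 84,
}

sample_size = 118
total = sum(population.values())

# Sample size for each stratum: sample_size x (stratum size / population size),
# rounded to the nearest whole pupil.
allocation = {
    group: round(sample_size * count / total)
    for group, count in population.items()
}

for (year, gender), pupils in allocation.items():
    print(f"{year} {gender}: {pupils}")
print("Total sampled:", sum(allocation.values()))
```

Because each stratum is rounded separately, the rounded figures need not add up to the target: with these counts they total 117 rather than 118, so one stratum would need adjusting by hand.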
14. ## Investigate the relationship between height and weight and how it changes between gender and year

Year 8 female Year 9 male Year 9 female All years Male and Female Correlation Coefficient My correlation coefficient is 0.525854261 The Main Study My Hypothesis is that boys are taller and heavier than girls and the difference between boys and girls will increase as the students get older. Sampling I have chosen a random sample from 7 of the groups I have picked 30 students from each randomly and my results are as below Years 7 Females Year 7 males Year 8 Females Year 8 Males Year 9 Females Year 9 Males Anomalies To make sure my data is

• Word count: 1371
15. ## AS statistics coursework - correlation coefficient between height and weight in year 11 boys and girls

I read out their names at the end and asked them to stay behind afterwards. Once I had my sample students I told them my purpose and why I was doing it (coursework) then asked them if there were any problems with me taking their weight and height, with no complaints. I asked them all to remove their shoes and any other clothing other than uniform to minimise difference in weight of clothing and height in heel of shoe. To measure weight I used bathroom scales and measured to the nearest kg (presuming that the scales were accurate and their uniform was the only clothing on not including shoes)

• Word count: 3611
16. ## DATA HANDLING COURSEWORK

and weight when considering age and when considering gender, will not be as accurate as they would be in a stratified sample. I will be taking a 5% sample for each stratum as this size is sufficient enough to allow me to make reliable conclusions, and the sample size is not too large to cause difficulty in analysis of the sample. But for Y11 I will do a 10% sample as a 5% sample is too small to provide a reliable and accurate conclusion because there are different numbers of students in each year or gender, which means that the chance of a certain year group or gender being selected will vary, i.e.

• Word count: 4739
17. ## Carrying out an investigation to research the readability of two articles

Once I have collected the set number of words from each article I will count the number of letters in the words and group them in a table. I will then take the mode, median and mean from both sets of data and use these calculations to compare the readability of the two articles. I will also plot the data on suitable graphs for ungrouped data and find further calculations to further support my investigation such as the standard deviation and interquartile range of the data.

• Word count: 1381
18. ## Statistics Coursework

Graphs like the normal distribution curves are very important in this type of investigation, especially because the graph itself summarises so much vital information such as the Thirdly, I will then analyse all of the results that I will get from the calculations and evaluate them against my hypothesis. I will analyse all of the data in more depth by calculating the standard deviation and Spearman's rank correlation coefficient, which will allow me to compare and analyse the data properties using different methods.

• Word count: 14839
19. ## data handling

261 30 = 8.7 261 32 -2 = 30 8 The collected data sample of 30 students is the raw data. This needs to be arranged into what is known as frequency distribution where like quantities are counted and displayed by writing down how many of each type there are i.e. writing down their frequencies. I will use bar charts, as these are used for discrete data, to analyse the data about KS2 maths results comparing the results for males and females.

• Word count: 1647
20. ## Intermediate Maths Driving Test Coursework

I am going to use the data given to follow my analysis and test the following hypotheses: * The more lessons you take, the fewer mistakes you will make * Males perform better than Females in the test * One gender performs better because they get better instructors Preliminary Analysis: In order to investigate my hypotheses and notice patterns within this large amount of data, it is necessary to summarise the 240 pupils' data in diagrams and charts.

• Word count: 3150
21. ## Testing Materials Coursework

For this investigation we are going to use a clamp stand, a hook, a cotton reel, a clamp and weights. We are going to put the clamp stand on the bench and place the cotton reel, with the sample tied around it, in the clamp. We will then increase the load on the material until it breaks and finally we will measure the load needed. We will make sure that our work is safe by wearing safety goggles throughout the investigation and keeping the weights away from the bench so that

• Word count: 478
22. ## Driving test

Firstly, I will tally up the amount of male and female pupils for each instructor.

| Instructor | Male | Female | Total |
|---|---|---|---|
| A | 29 | 31 | 60 |
| B | 49 | 51 | 100 |
| C | 18 | 22 | 40 |
| D | 20 | 20 | 40 |

Because of the large amount of statistics, I will take a sample of the data to make calculations easier to manage. The sample should represent the complete data set, so I will take a sample of 60 (a quarter of 240). I will ensure that the proportion from each instructor and of each gender is the same as complete data set.

• Word count: 3136
23. ## I am going to design and then carry out an experiment to test people's reaction times, and therefore test my initial hypothesis.

Therefore the test will be carried out in silence and the participant and the 'dropper' will not make contact. Whether the participant is standing or sitting The 'dropper' and 'catcher' will be sitting on chairs, knees touching, face to face. Whether the measuring side is facing or facing away from the participant The measuring side will be facing the participant, so that they can read the measurement from above the thumb with ease. I learnt from the preliminary test that it was hard to measure if the marks were not facing the participant. Amount of arm movement permitted by participant The hand must hang over the table edge so that the forearm cannot move.

• Word count: 3018
24. ## Maths Cars coursework Plan

I will also need to see the correlations between Age and Price, and the correlations between Mileage and Price. This will be done to see if they have a negative correlation or positive correlation. This will be done using scatter graphs, and how strongly each factor influences prices will be measured by Spearman's coefficient of rank correlation. My hypotheses for this investigation are: 1. There is a negative correlation between age and price. 2. There is a negative correlation between mileage and price.

• Word count: 700
25. ## Data Analysis of American House Price

Each house is described by its price, size, number of bedrooms and bathrooms, whether or not it has a pool and a garage, the distance from the nearest large town, how desirable it is (on a scale from 1 = very undesirable to 7 = most desirable), the township it belongs to and its age. The aim of this report is to assess and evaluate the distribution of house price in America in the 5 townships used as sample. A conclusion is provided to summarise all the findings, interpretations and explanations followed by suitable suggestions.

• Word count: 2371