# AS and A Level: Probability & Statistics

## Statistical diagrams

1. When working with grouped data, a class written as 9 to 12 includes values from 8.5 up to 12.5, which means the class width is 4, not 3.
2. In a histogram, the area of each bar represents the frequency. The y-axis shows the frequency density.
3. When working out lengths on scaled histograms, it is always helpful to draw the rectangle and label the relevant sides with the lengths given.
4. When drawing a stem and leaf diagram, make sure to include a key. The key is worth a mark. For example, 2|1 represents 2.1 or, on a different stem and leaf diagram, 3|2 represents 32.
5. Always draw a scale when drawing a box plot; the scale is worth a mark.
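
The class width and frequency density points can be checked with a short calculation. This is only a sketch: it assumes the data were recorded to the nearest whole number, and the function name and the frequency of 20 are made up for illustration.

```python
def frequency_density(lower, upper, frequency, precision=1.0):
    """Return (class width, frequency density) for a class written as lower-upper.

    Assumes values were rounded to the nearest `precision`, so the true
    class boundaries sit half a unit either side of the stated limits.
    """
    lower_boundary = lower - precision / 2
    upper_boundary = upper + precision / 2
    width = upper_boundary - lower_boundary
    return width, frequency / width


# Class "9 to 12" with a hypothetical frequency of 20.
width, density = frequency_density(9, 12, frequency=20)
print(width, density)  # the area of the bar, width * density, gives back the frequency
```

Multiplying the width by the frequency density recovers the frequency, which is why the area of a histogram bar, not its height, represents the frequency.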

## How to tackle questions on regression and correlation

1. When asked about the relationship in a regression model, always get the context the correct way round. For example, weight does not affect height; height affects weight.
2. When asked if your answer from the regression model is reliable, comment on whether the x value you used to get the answer is within the range of the original data set. If the x value is within that range, the model is suitable. Never extrapolate when using a regression model.
3. Suppose you have found a regression model for a relationship between h and p, and are then told h = x + 100 and p = y - 20 and asked to find a regression model for x and y. Substitute x + 100 and y - 20 into your original equation and rearrange.
4. When data is coded, the correlation coefficient is not changed.
5. If a regression model is created using, for example, the heights and weights of children, it could not be used to predict the weight of an adult. Models are very specific to the data with which they were created.
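
The points about coding can be illustrated numerically. The sketch below uses made-up x and y values and assumes the original model has the form p = a + bh. The helper names `pmcc` and `regression_line` are hypothetical, written out by hand rather than taken from any library.

```python
from math import sqrt, isclose


def pmcc(xs, ys):
    """Product moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def regression_line(xs, ys):
    """Return (a, b) for the least squares line ys = a + b * xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return my - b * mx, b


# Made-up data, coded as h = x + 100 and p = y - 20.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
hs = [x + 100 for x in xs]
ps = [y - 20 for y in ys]

# The correlation coefficient is unchanged by the coding.
same_r = isclose(pmcc(xs, ys), pmcc(hs, ps), rel_tol=1e-9)

# Model for the coded data: p = a + b*h.
a, b = regression_line(hs, ps)
# Substitute: y - 20 = a + b*(x + 100)  =>  y = (a + 100*b + 20) + b*x
a_sub, b_sub = a + 100 * b + 20, b

# Fitting x and y directly gives the same line.
a_direct, b_direct = regression_line(xs, ys)
print(same_r, (a_sub, b_sub), (a_direct, b_direct))
```

The gradient is the same in both models because adding or subtracting a constant only shifts the data; only the intercept changes.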

## Normal distribution

1. When answering normal distribution questions, always draw a picture and shade in the part of the graph that you know and/or want.
2. In a normal distribution, the area under the curve represents the probability.
3. A normal distribution model may be appropriate if the mean and median are the same, or very close.
4. The big normal distribution table gives the area to the left of the line. The small table gives areas to the right of the line.
5. If you are unsure what the question is asking, do the first step, which is to rewrite the question in terms of the normal distribution.
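
As a rough illustration of the "area to the left" and "area to the right" ideas, the sketch below uses Python's standard `statistics.NormalDist` in place of printed tables. The mean of 50, standard deviation of 10 and value of 62 are made-up numbers, not taken from any exam question.

```python
from statistics import NormalDist

# Hypothetical question: X ~ N(50, 10^2), find P(X < 62) and P(X > 62).
mean, sd = 50, 10
x = 62

# First step: rewrite the question in terms of the standard normal variable Z.
z = (x - mean) / sd

# Area to the left of z (what the main table gives).
left_area = NormalDist().cdf(z)

# Area to the right of z is the remainder of the total area of 1.
right_area = 1 - left_area

# The same left-hand area, found without standardising first.
direct = NormalDist(mean, sd).cdf(x)

print(z, round(left_area, 4), round(right_area, 4))
```

Standardising first and working directly with the mean and standard deviation give the same area, so either route can be used to check the other.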

1. ## data handling

If any data is missing or obviously wrong, I will use another person instead.

11/261 × 50 = 2.11. I did that for all of the amounts: 14/261 × 50, 7/261 × 50, etc.

| Month | Boys | Girls | Total amount | Stratified amount for boys | Stratified amount for girls |
|---|---|---|---|---|---|
| September | 11 | 9 | 20 | 2.11 | 1.72 |
| October | 14 | 6 | 20 | 2.68 | 1.15 |
| November | 7 | 13 | 20 | 1.34 | 2.49 |
| December | 9 | 17 | 26 | 1.72 | 3.26 |
| January | 13 | 6 | 19 | 2.49 | 1.15 |
| February | 17 | 10 | 27 | 3.26 | 1.92 |
| March | 15 | 17 | 32 | 2.87 | 3.26 |
| April | 9 | 8 | 17 | 1.72 | 1.53 |
| May | 9 | 11 | 20 | 1.72 | 2.11 |
| June | 12 | 13 | 25 | 2.3 | 2.49 |
| July | 9 | 16 | 25 | 1.72 | 3.07 |
| August | 4 | 6 | 10 | 0.77 | 1.15 |
| Total | 129 | 132 | 261 | 24.71 | 25.29 |

Next I rounded the numbers to their nearest whole number.
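
The stratified amounts above come from the same calculation each time: group size divided by 261, multiplied by the sample size of 50. A minimal sketch of that calculation, with a hypothetical function name, might look like this:

```python
def stratified_amount(group_size, population=261, sample_size=50):
    """Share of the sample that a group should receive, before rounding to a whole number."""
    return group_size / population * sample_size


# Boys and girls born in each month, September to August, as listed in the data.
boys = [11, 14, 7, 9, 13, 17, 15, 9, 9, 12, 9, 4]
girls = [9, 6, 13, 17, 6, 10, 17, 8, 11, 13, 16, 6]

boys_amounts = [round(stratified_amount(b), 2) for b in boys]
girls_amounts = [round(stratified_amount(g), 2) for g in girls]

# The unrounded shares always add up to the full sample size.
total = sum(stratified_amount(n) for n in boys + girls)
print(boys_amounts[0], girls_amounts[0], total)
```

Because each share is rounded to a whole number afterwards, the final sample can end up slightly above or below 50, so it is worth adding up the rounded values as a check.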

2. ## Statistics coursework

× sample size. I will use stratified sampling as the number of pupils in each year group is different. By using this method I will be getting a representative proportion of each year group, making my data fair. Once I have calculated the sample size for each stratum, I will use random sampling (using the Ran# button on the scientific calculator) to select the calculated number of pieces of data from the stratum. However, before that I will set my calculator to be fixed on zero decimal places in order to avoid having to round numbers, which would increase the chances of repeated figures.

3. ## Anthropometric Data

This may also be based on the idea that the wider the child's feet, the longer the feet. A positive correlation occurs when, as one variable increases, so does the other. This is saying that as the children grow older, the foot length tends to change and the breadth widens. To find the correlation on a scatter graph, in this case the correlation coefficient will indicate the strength; this will be written about in depth later on the graph. Correlations: The main purpose of a correlation is to measure how associated or related two variables are.

4. ## Statistics. The purpose of this coursework is to investigate the comparative relationships between the depreciation of a car's price, in relation to the factors that affect it.

The following data will help me decide whether this hypothesis is true, when there are more or fewer years "attached" to the car:

* Sale price (no years attached)
* Age of the vehicle

When identifying a general trend for these data, I will discard the data that do not fit the trend: these will obscure my general trend and correlation when it comes to graph analysis. For example, some cars may appreciate in value in this particular group: some vintage cars will do this as they gain "collectors' item" status after a number of years.

5. ## Probability of Poker Hands

Measuring the probability of getting no pairs:

$$P(\text{No Pairs}) = \frac{{}^{13}C_{5} \times 4^{5} - {}^{4}C_{1} \times {}^{13}C_{5} - (10 \times 4^{5} - 10 \times 4)}{{}^{52}C_{5}} = \frac{1317888 - 5148 - 10200}{2598960} = \frac{1302540}{2598960} = \frac{21709}{43316}$$

Explanation: Choosing five different face values: To begin with, the first given condition for this hand is that we need 5 different cards out of the total of 52 cards, with five different face values. There are thirteen different face values (A, 2, 3, 4, 5, 6, 7, 8, 9, 10, J, Q, K).
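
The arithmetic in this calculation can be checked with Python's standard `math.comb` and `fractions.Fraction`. This is only a verification sketch; the variable names are chosen for illustration.

```python
from fractions import Fraction
from math import comb

# Five different face values, each in any of the four suits.
distinct_values = comb(13, 5) * 4**5

# Hands with all five cards in one suit (flushes, including straight flushes).
flushes = comb(4, 1) * comb(13, 5)

# Straights that are not also straight flushes.
straights = 10 * 4**5 - 10 * 4

favourable = distinct_values - flushes - straights
total_hands = comb(52, 5)

probability = Fraction(favourable, total_hands)
print(favourable, total_hands, probability, float(probability))
```

`Fraction` reduces the result to lowest terms automatically, so the printed fraction may look different from 21709/43316 while being equal to it.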

6. ## AS statistics coursework - correlation coefficient between height and weight in year 11 boys and girls

I read out their names at the end and asked them to stay behind afterwards. Once I had my sample students, I told them my purpose and why I was doing it (coursework), then asked them if there were any problems with me taking their weight and height, with no complaints. I asked them all to remove their shoes and any clothing other than uniform, to minimise differences in the weight of clothing and the height of shoe heels. To measure weight I used bathroom scales and measured to the nearest kg (presuming that the scales were accurate and that their uniform, not including shoes, was the only clothing worn).

7. ## DATA HANDLING COURSEWORK

and weight when considering age and when considering gender, will not be as accurate as they would be in a stratified sample. I will be taking a 5% sample for each stratum, as this size is sufficient to allow me to make reliable conclusions, and the sample is not so large as to cause difficulty in analysis. But for Y11 I will do a 10% sample, as a 5% sample is too small to provide a reliable and accurate conclusion because there are different numbers of students in each year or gender, which means that the chance of a certain year group or gender being selected will vary, i.e.

8. ## Statistics Coursework

Graphs like normal distribution curves are very important in this type of investigation, especially because the graph itself summarises so much vital information such as the ... Thirdly, I will then analyse all of the results that I get from the calculations and evaluate them against my hypothesis. I will analyse all of the data in more depth by using standard deviation and Spearman's rank correlation coefficient, which will allow me to compare and analyse the properties of the data using different methods.

9. ## Intermediate Maths Driving Test Coursework

I am going to use the data given to carry out my analysis and test the following hypotheses:

* The more lessons you take, the fewer mistakes you will make
* Males perform better than females in the test
* One gender performs better because they get better instructors

Preliminary Analysis: In order to investigate my hypotheses and notice patterns within this large amount of data, it is necessary to summarise the 240 pupils' data in diagrams and charts.

10. ## Driving test

Firstly, I will tally up the number of male and female pupils for each instructor.

| Instructor | Male | Female | Total |
|---|---|---|---|
| A | 29 | 31 | 60 |
| B | 49 | 51 | 100 |
| C | 18 | 22 | 40 |
| D | 20 | 20 | 40 |

Because of the large amount of data, I will take a sample to make calculations easier to manage. The sample should represent the complete data set, so I will take a sample of 60 (a quarter of 240). I will ensure that the proportion from each instructor and of each gender is the same as in the complete data set.

11. ## I am going to design and then carry out an experiment to test people's reaction times, and therefore test my initial hypothesis.

Therefore the test will be carried out in silence and the participant and the 'dropper' will not make contact.

* Whether the participant is standing or sitting: The 'dropper' and 'catcher' will be sitting on chairs, knees touching, face to face.
* Whether the measuring side is facing or facing away from the participant: The measuring side will be facing the participant, so that they can read the measurement from above the thumb with ease. I learnt from the preliminary test that it was hard to measure if the marks were not facing the participant.
* Amount of arm movement permitted by participant: The hand must hang over the table edge so that the forearm cannot move.

12. ## Used Cars - What main factor that affects the price of a second hand car

then the same car which is less than a year old, as the five-year-old car has had more wear and tear so will not be as reliable as it was before, so this will lower its sale price. Make: A car of a prestigious make will cost more than a car that is not so prestigious, even if they have similar specifications, as the prestigious car is very much sought after and signifies the owner's wealth and status. Model: A newer model of car will cost more than an older model that has gone out of production as the newer

13. ## Design an investigation to see if there is a significant relationship between the number of bladders and the length of longest frond in Fucus species of seaweed at two different locations on a rocky shore

I also predict that the Fucus vesiculosus found on the lower shore will have longer fronds and a larger number of bladders than those found on the middle shore. Explanation: My first prediction is that there will be a significant relationship between the number of bladders and the length of the longest frond of the Fucus vesiculosus at the lower and middle shores. Being an alga, Fucus vesiculosus obtains metabolic energy from light through photosynthesis. It is therefore essential that the seaweed has a good supply of light energy from the sun so that it can carry out metabolic processes.

14. ## Estimating the length of a line and the size of an angle.

Others may have had a bad day and got a headache and cannot think properly, or some may have eye problems (etc.), so they may not guess the length and angle as well as they could. Aim: The thing that interests me the most about the coursework is to see which year is more accurate at estimating, as Year 11 are generally older than Year 10 and have more experience. Do these factors really affect their ability to estimate? This is interesting to me because small factors such as these can influence their estimations.

15. ## "The lengths of lines are easier to guess than angles. Also, that year 11's will be more accurate at estimating."

Then I am going to do some calculations. For the Year 11s I am going to do: (145 / 262) × 100 = 55.3, and 55.3 is about 55%. This means I need 55% of the sample of 60 to come from the Year 11 results. 55% of 60 is 33, so I need 33 samples to be Year 11 samples. For the Year 9s I am going to do: (117 / 262) × 100 = 44.7, and 44.7 is about 45%. This means that 45% of the sample of 60 need to be Year 9 results.
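
The percentage method described here can be sketched in a few lines. The function name is hypothetical, and the rounding to a whole percentage first is an assumption based on how the working is set out.

```python
def stratum_sample(group_size, population, sample_size):
    """Return (whole-number percentage, number to sample) for one year group."""
    percent = round(group_size / population * 100)
    # Multiply before dividing to keep the arithmetic exact for whole numbers.
    return percent, round(percent * sample_size / 100)


year_11 = stratum_sample(145, 262, 60)
year_9 = stratum_sample(117, 262, 60)
print(year_11, year_9)
```

Checking that the two group sizes add up to the full sample of 60 is a quick way to confirm that the rounding has not lost or gained a result.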

16. ## Differences in wealth and life expectancy of the countries of the world

My hypotheses are:

* The wealth and life expectancy of a continent are linked and are likely to have a strong positive correlation. I believe this happens worldwide.
* Females generally tend to live longer than males worldwide.

Method: I shall adopt a systematic method. This will enable my work to be organised and easy to read. First and foremost, I shall gather all the data that is presented before me. As my hypotheses are based on worldwide data, I believe it is essential for me to use all the data.

17. ## Fantasy Football - Maths Coursework - Statistics

* more goals in one match: +5 points extra
* Defender keeping a clean sheet: +5 points
* Defender conceding more than one goal: -1 point per goal
* Receiving a booking: -1 point
* Player being sent off: -3 points including bookings
* For achieving the season's top individual points: +20 points

All my data is from The Sun Newspaper. The data is secondary data, which has its advantages and disadvantages. The advantage is that it is quicker and easier than going out and collecting the data yourself.

18. ## Case study - Super Savers is wishing to move into the UK Food Retail market.

As sensory analysts, we were asked to give a sensory evaluation of the two products, Ribena and Tesco blackcurrant squashes, to the 'Super Savers' overseas retailer. "For sensory analysis to be successful, it is necessary for someone to take the responsibility to ensure the tests are carried out in the correct and appropriate manner. This is the role of the sensory analyst or the panel leader" (Lyon, Francombe, Hasdell, Lawson, 1992, p.47).

3. AIMS OF THE PROJECT

* To find out if there is a significant difference between the two products
* To determine the differences in terms

19. ## Guestimate - investigate how well people estimate the length of lines and the size of angles.

You continue counting in this number until you have your total sample. Note down the name and number of each student you pick. A rule for this could be: "Select every 5th name on the list for your sample." This is all good so far, but there is the very big problem of bias. To avoid bias I will not pick angles such as 90, 45 and 60 degrees as these are obvious angles and can be guessed easily. They are usually taught before doing anything else on angles.
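
The "every 5th name" rule is a form of systematic sampling, and it can be sketched as below. The student names are placeholders and the function name is hypothetical; choosing the starting position at random is one common way of reducing bias, not necessarily the method used in this coursework.

```python
import random


def systematic_sample(names, step, start=None):
    """Pick every `step`-th name from the list.

    `start` is the zero-based position of the first pick; if it is not
    given, a random position within the first `step` names is used.
    """
    if start is None:
        start = random.randrange(step)
    return names[start::step]


# Placeholder list of 20 students.
students = [f"Student {i}" for i in range(1, 21)]

# start=4 selects the 5th, 10th, 15th and 20th names on the list.
picked = systematic_sample(students, step=5, start=4)
print(picked)
```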

20. ## GCSE Maths Coursework: proportions of different parts of the body and their relationship to height

6 15 29 63 0.248*80 19.84 20 6 15 31 63 0.508*80 40.64 41 5.5 14 30 70 0.701*80 56.08 56 5 14 29 70 0.762*80 60.96 61 6 13 30 62 0.808*80 64.64 65 6 16 32 67 0.593*80 47.44 47 6.5 14.5 30 68 0.642*80 51.36 52 5.5 15 30.5 74 0.784*80 62.72 63 6 14 31 75 0.053*80 4.24 4 6 16 33 77 0.222*80 17.76 18 6 15 30 60 0.122*80 9.76 10 6 15.5 31.5 70 0.809*80 64.72 65 6 16 32 67 0.607*80 48.56 49 6.5 16 30.5 71 0.982*80 78.56 79 6 14

21. ## GCSE Mathematics Coursework: Statistics Project

The school features Years 7 - 11 and in each year, there are different numbers of girls and boys. Each group in the sample must occur in the same proportion as it does in the overall population of the school, so before I can pick specific pupils to analyse through random sampling, I must work out how many people to choose from the different years and how many should be boys or girls. To do this, I will use stratified sampling. To find the number of Yr 7s required in the sample: 275 x 100 = 24 (to the nearest whole number)

22. ## Investigate a possible relationship between self-esteem and levels of satisfaction in the undergraduate student population.

(Baumeister 1999). Curry and Johnson (1990) describe high self-esteem as a secure sense of identity and an ability to acknowledge and value one's own efforts and achievements. They stress a connection between high self-esteem, confidence, energy and optimism and argue that these traits have their roots in early years. Baumeister, Rice and Hutton (1989) discuss self-esteem in terms of motivational orientation, with high self-esteem giving a self-enhancing orientation. In other words a person considered to have high self esteem is more likely to seek to capitalise on their good traits and pursue successes even under risky conditions.

23. ## Investigating the different relationships between the T-total and T-number of the T-shape by translating it to other positions on the grid.

24. ## "Males in the 11-18years age range will guess the angles and lengths better than females in the 30+years age range."

5 32 m 13 38 4.5 34 m 13 45 6 35 m 12 45 4 36 m 12 45 3 37 m 12 40 4 38 m 12 35 4 39 m 13 45 5 40 m 12 45 3.5 41 m 12 45 4 43 m 13 30 4.5 44 m 14 39 3.5 48 m 13 48 6.5 49 m 13 45 4.2 50 m 13 30 5 51 m 13 45 4.3 52 m 14 35 6.5 53 m 13 40 3 54 m 13 45 4.5 55 m 13 40 7 56 m 13 40

25. ## Analyse a set of results and investigate the provided hypothesise.

During the course of my investigation I will try to eliminate any bias that might occur. This is most likely to happen when I select a range of data from the pool of results. When selecting specific data, I will try to sample as much random data as I can and make sure that it hasn't all come from one person. Collection of data: As part of this coursework, a given task was to collect data from random people by asking them to estimate the length of a line in millimetres (mm) and the size of an angle in degrees (°). Once these results were taken, they were entered onto an Excel spreadsheet as raw data.

