# AS and A Level: Probability & Statistics

## Statistical diagrams

1. When working with grouped data, a class from 9 to 12 includes values from 8.5 up to 12.5, so the class width is 4, not 3.
2. In a histogram, the area of each bar represents the frequency, and the y-axis shows the frequency density.
3. When working out lengths in scaled histograms, it helps to draw the rectangle and label the relevant sides with the lengths given.
4. When drawing a stem and leaf diagram, make sure to include a key; the key is worth a mark. For example, 2|1 represents 2.1, or on a different stem and leaf diagram, 3|2 represents 32.
5. Always draw a scale when drawing a box plot; the scale is worth a mark.
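
The first two tips can be checked with a short calculation. The sketch below assumes values were rounded to the nearest whole number (so class boundaries sit half a unit either side of the stated limits); the frequency of 20 and the function names are made up for illustration.

```python
def class_boundaries(lower, upper, gap=1.0):
    """Return the true (lower, upper) boundaries of a class of rounded values.

    `gap` is the rounding unit; 1.0 is assumed here (nearest whole number).
    """
    half = gap / 2
    return lower - half, upper + half


def frequency_density(lower, upper, frequency, gap=1.0):
    """Frequency density = frequency / class width (bar height in a histogram)."""
    lo, hi = class_boundaries(lower, upper, gap)
    return frequency / (hi - lo)


lo, hi = class_boundaries(9, 12)
width = hi - lo
density = frequency_density(9, 12, frequency=20)  # frequency 20 is hypothetical
print(lo, hi, width, density)  # 8.5 12.5 4.0 5.0
```

Multiplying the density back by the class width (5.0 × 4.0) recovers the frequency, which is why the area of the bar, not its height, represents the frequency.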

## How to tackle questions on regression and correlation

1. When asked about the relationship in a regression model, always get the context the correct way round. For example, weight does not affect height; height affects weight.
2. When asked whether an answer from the regression model is reliable, comment on whether the x value you used lies within the range of the original data. If it does, the estimate is suitable. Never extrapolate when using a regression model.
3. If you have found a regression model for the relationship between h and p, and are then told that h = x + 100 and p = y - 20 and asked to find a regression model for x and y, substitute x + 100 and y - 20 into your original equation and rearrange.
4. When data is coded, the correlation coefficient is not changed.
5. A regression model created using, for example, the heights and weights of children could not be used to predict the weight of an adult. Models are very specific to the data with which they were created.

## Normal distribution

1. When answering normal distribution questions, always draw a picture and shade in the part of the graph that you know and/or want.
2. In a normal distribution, the area under the curve represents the probability.
3. A normal distribution model may be appropriate if the mean and median are the same, or very close.
4. The big normal distribution table gives the area to the left of the line. The small table gives areas to the right of the line.
5. If you are unsure what the question is asking, do the first step, which is to rewrite the question converted to the standard normal distribution.
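
As a rough illustration of tips 2, 4 and 5, the sketch below standardises a value and reads off the areas to the left and right, using `statistics.NormalDist` from the Python standard library in place of printed tables. The mean of 50, standard deviation of 10 and value of 62 are hypothetical.

```python
from statistics import NormalDist

mu, sigma = 50.0, 10.0  # hypothetical mean and standard deviation
x = 62.0                # hypothetical value of interest

# Step 1: rewrite the question in terms of the standard normal variable Z.
z = (x - mu) / sigma

# Step 2: area to the left of the line, P(Z < z), as in the "big" table.
left_area = NormalDist().cdf(z)

# Step 3: area to the right of the line, P(Z > z), is the complement.
right_area = 1 - left_area

print(round(z, 2), round(left_area, 4), round(right_area, 4))  # 1.2 0.8849 0.1151

# The same probability without standardising by hand.
direct = NormalDist(mu, sigma).cdf(x)
```

Drawing the curve and shading the left or right region first makes it clear which of the two areas the question wants.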

1. ## Statistics. I have been asked to construct an assignment regarding statistics. The statistics I will be using will be football attendances. The football teams I have chosen to use are Birmingham City FC and Chelsea FC.

Attendance Bar Chart: This bar chart appears to show Chelsea FC being slightly more consistent, but it is still difficult to tell. These are also just a few randomly picked attendances from the groups, so there is no way of knowing who is more consistent as yet. Line Graph: The above results appear to show Chelsea with the straighter line, meaning they could possibly have the most consistency in attendance. It is still too early to tell without significant proof.

• Word count: 1510
2. ## Chebyshev's Theorem and The Empirical Rule

The second shape a scatter diagram may have is anything but a normal curve, as in the next drawing: We can do a lot of good statistics with the normal curve, but virtually none with any other curve. Let us assume that we have recorded the 1000 ages and computed the mean and standard deviation of these ages. Assuming the mean age came out as 40 years and the standard deviation as 6 years, we can make the following predictions.

• Word count: 1174
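
The excerpt above stops before listing its predictions. As a hedged sketch of what the two rules named in the title would give for a mean of 40, a standard deviation of 6 and 1000 ages, the code below applies the usual empirical-rule percentages (68%, 95%, 99.7%, valid only for roughly normal data) and Chebyshev's lower bound of 1 − 1/k² (valid for any distribution). The function names are illustrative.

```python
mean, sd, n = 40, 6, 1000  # figures taken from the excerpt

# Approximate proportions within k standard deviations for normal data.
EMPIRICAL = {1: 0.68, 2: 0.95, 3: 0.997}


def interval(k):
    """Ages within k standard deviations of the mean."""
    return mean - k * sd, mean + k * sd


def chebyshev_lower_bound(k):
    """Minimum proportion within k standard deviations, for any distribution."""
    return 1 - 1 / k**2


for k in (1, 2, 3):
    lo, hi = interval(k)
    print(
        f"{lo} to {hi} years: about {EMPIRICAL[k] * n:.0f} people if normal, "
        f"at least {chebyshev_lower_bound(k) * n:.0f} by Chebyshev"
    )
```

For k = 1 Chebyshev's bound is zero, so it only becomes informative from k = 2 onwards, where it guarantees at least 75% of the ages lie between 28 and 52.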
3. ## Teenagers and Computers Data And Statistics Project

He wondered about how that had come about. Obviously this was because only the surface area was red.

2. Cube Tables

a. 2 x 2 x 2 Cube

| Number of red faces | Number of cubes |
| --- | --- |
| 0 | 0 |
| 1 | 0 |
| 2 | 0 |
| 3 | 8 |
| 4 | 0 |
| 5 | 0 |
| 6 | 0 |
| Total | 8 |

b. 3 x 3 x 3 Cube

| Number of red faces | Number of cubes |
| --- | --- |
| 0 | 1 |
| 1 | 6 |
| 2 | 12 |
| 3 | 8 |
| 4 | 0 |
| 5 | 0 |
| 6 | 0 |
| Total | 27 |

c.

• Word count: 1785
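
The counts in the cube tables of the excerpt above can be reproduced by brute force. The sketch below assumes an n x n x n cube painted red on the outside and cut into unit cubes; `red_face_counts` is a hypothetical helper name, not something from the original project.

```python
from collections import Counter
from itertools import product


def red_face_counts(n):
    """Return a list where entry k is the number of unit cubes with k red faces."""
    counts = Counter()
    for cell in product(range(n), repeat=3):
        # A unit cube gains one red face for each coordinate on an outer layer.
        faces = sum((c == 0) + (c == n - 1) for c in cell)
        counts[faces] += 1
    return [counts.get(k, 0) for k in range(7)]


print(red_face_counts(2))  # [0, 0, 0, 8, 0, 0, 0]
print(red_face_counts(3))  # [1, 6, 12, 8, 0, 0, 0]
```

Calling `red_face_counts(4)` would extend the pattern to the next cube size, with the entries still summing to n³.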
4. ## Frequency curves and frequency tables

Positively skewed, e.g. non-symmetrical with the longer 'tail' of the frequency curve to the right. [Figure: frequency curves (1), (2) and (3), f against x] Question 2 There are general rules for constructing frequency tables. A frequency distribution is a table in which the values for a variable are grouped into classes and the number of observed values that belong in each class is recorded. Data organized in a frequency distribution are called grouped data; in ungrouped data, every individual observed value of the random variable is listed. Regardless of whether or not the data are grouped, the collection of values may be for either a sample or a population.

• Word count: 1786
5. ## Investigate the relationships between height and weight

I will use secondary data. The advantage is that I will not waste time collecting the data myself, but the disadvantage is that the data might be unreliable. To get the sample of 10% I will use random sampling and stratified sampling.

Stratified Sampling

When a population is made up of different groups, bias can be reduced by representing each group in a sample. Our sample size is 10% of 1183, which is 118.

| | BOYS | GIRLS | TOTAL |
| --- | --- | --- | --- |
| YEAR 7 | 131 | 151 | 282 |
| YEAR 8 | 125 | 145 | 270 |
| YEAR 9 | 143 | 118 | 261 |
| YEAR 10 | 94 | 106 | 200 |
| YEAR 11 | 86 | 84 | 170 |
| | | | 1183 |

Stratified Sampling

Multiply 118 by the fraction each sub-group represents of the whole population.

• Word count: 1163
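
The allocation step described in the excerpt above (multiply 118 by each sub-group's fraction of the population) can be sketched as follows. The stratum sizes come from the excerpt; the dictionary layout, the function name and the choice to round to the nearest whole student are assumptions.

```python
# Stratum sizes from the excerpt: (year, gender) -> number of pupils.
strata = {
    ("Year 7", "boys"): 131, ("Year 7", "girls"): 151,
    ("Year 8", "boys"): 125, ("Year 8", "girls"): 145,
    ("Year 9", "boys"): 143, ("Year 9", "girls"): 118,
    ("Year 10", "boys"): 94, ("Year 10", "girls"): 106,
    ("Year 11", "boys"): 86, ("Year 11", "girls"): 84,
}

population = sum(strata.values())  # 1183
sample_size = 118                  # 10% of 1183, as in the excerpt


def allocate(strata, sample_size):
    """Proportional allocation, rounded to the nearest whole student."""
    total = sum(strata.values())
    return {
        group: round(sample_size * size / total)
        for group, size in strata.items()
    }


allocation = allocate(strata, sample_size)
for group, k in allocation.items():
    print(group, k)
print("allocated in total:", sum(allocation.values()))
```

Because each stratum is rounded separately, the allocated total may differ slightly from 118, in which case one stratum can be adjusted by a student. Within each stratum the pupils would then be chosen at random, for example with `random.sample`.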
6. ## Investigate the relationship between height and weight and how it changes between gender and year

Year 8 female Year 9 male Year 9 female All years Male and Female Correlation Coefficient My correlation coefficient is 0.525854261. The Main Study My hypothesis is that boys are taller and heavier than girls and the difference between boys and girls will increase as the students get older. Sampling I have chosen a random sample from 7 of the groups. I have picked 30 students from each randomly and my results are as below. Years 7 Females Year 7 males Year 8 Females Year 8 Males Year 9 Females Year 9 Males Anomalies To make sure my data is

• Word count: 1371
7. ## Carrying out an investigation to research the readability of two articles

Once I have collected the set number of words from each article I will count the number of letters in the words and group them in a table. I will then take the mode, median and mean from both sets of data and use these calculations to compare the readability of the two articles. I will also plot the data on suitable graphs for ungrouped data and find further calculations to further support my investigation such as the standard deviation and interquartile range of the data.

• Word count: 1381
8. ## data handling

261 30 = 8.7 261 32 -2 = 30 8 The collected data sample of 30 students is the raw data. This needs to be arranged into what is known as frequency distribution where like quantities are counted and displayed by writing down how many of each type there are i.e. writing down their frequencies. I will use bar charts, as these are used for discrete data, to analyse the data about KS2 maths results comparing the results for males and females.

• Word count: 1647
9. ## Maths GCSE Statistics Coursework

With this, I expect there to be more extreme data from year 7 than year 11. Plan In order to show that year 11 are better guessers than year 7 I will have to find the average for each year group and the group closer to 1.58 metres will be the better guessers. I will calculate the mean for each year group and then I will verify this result by finding the median from a cumulative frequency graph. The mean formula I will use is Mean = $\frac{\sum fx}{\sum f}$, where f = frequency and x = midpoint of each group. Sampling In order to choose 50 pupils from each year group I am going to use a stratified random sample, using gender as strata.

• Word count: 1268
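
The plan in the excerpt above relies on the mean of grouped data, taken here to be the standard formula Σfx / Σf with x the midpoint of each group. A minimal sketch follows; the class limits and frequencies are invented for illustration, and only the 1.58 metre target comes from the excerpt.

```python
def grouped_mean(groups):
    """Estimate the mean of grouped data as sum(f * x) / sum(f).

    `groups` is a list of ((lower, upper), frequency) pairs and x is the
    midpoint of each class.
    """
    total_fx = sum(f * (lo + hi) / 2 for (lo, hi), f in groups)
    total_f = sum(f for _, f in groups)
    return total_fx / total_f


# Hypothetical estimates of a length (in metres) for one year group.
year_group = [((1.0, 1.4), 9), ((1.4, 1.6), 21), ((1.6, 1.8), 15), ((1.8, 2.2), 5)]

estimate = grouped_mean(year_group)
error = abs(estimate - 1.58)  # distance from the true length used in the excerpt
print(round(estimate, 3), round(error, 3))
```

Repeating this for each year group and comparing the two `error` values is one way to decide which group guessed closer to 1.58 metres.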
10. ## My first hypothesis is that school pupils can estimate the length of a line, in millimeters, better than the size of an angle, in degrees. Plan for collecting data: To see if my hypothesis is true I am going to have to support it with data

Before I collect a sample I will have to decide how much data I will need to collect. This is to make sure that the sample is a fair representative of the population. Taking this and time into account, I will obtain a random sample of 30. I chose a random sample as each member of the population has an equal chance of being selected. Plan for representing data To compare the estimations I will need to use representations. Depending on my results I will pick an average that represents the data most fairly.

• Word count: 1679
11. ## Do Left and right-handed people have roughly the same reaction time with their dominant hand.

Because we are dealing with people we cannot use decimals, so we have to round the decimals. The ratio comes to 3:27. This means we have to choose 3 left handed people and 27 right handed people (the ratio should add up to 30). Because we are using stratified random sampling we cannot just pick the data ourselves; we have to use a calculator. You type in Ran# (number of sample, which in this case is 30). In order not to get decimals you must put it on fix mode.

• Word count: 1104
12. ## There are many measurements available to monitor changes in breathing capacity. One of the simplest measurements used by doctors and patients would be the peak flow meter

• Word count: 1064
13. ## house prices and sales

I therefore searched the internet for data that I could use. I tried looking at Building Societies web-sites for survey results and then I found the web-site for the Land Registry, http://www.landreg.gov.uk/propertyprice. The data that I chose to use is the recorded sales of houses by type and average price for all regions of the country for three-monthly periods of each year. This is secondary data but it is very reliable and therefore ok to use. I will then use an AA map to find the distances from the areas in my random sample to London.

• Word count: 1149
14. ## Is there a relationship between the height and weight of a selected sample of Year 7 students

Also, by looking at the time allotted to me for completing the investigation I thought that the best investigation to do was to go for my second option which was to measure the height and weight of some students in different years starting with Year 7. In my chosen investigation there still were difficulties but the ones I have planned for are: Firstly, I would need a lot of time to measure the heights and weights of all the students in all the years.

• Word count: 1747
15. ## To prove my first hypothesis, (i.e. tall students are heavier than short students) I will use a sample

timetables are set out, I figured that I would not have the time needed in order to complete this task in the appropriate way. Some factors which may have caused me a problem if I did choose these hypotheses are, firstly, that some data in the booklet might be wrong or missing (e.g. someone might weigh 200 kg) and I wouldn't know how much they really weigh. Also, by looking at the time allotted to me for completing the investigation I thought that the best investigation to do was to go for my second option which was to use the Mayfield

• Word count: 1417
16. ## Comparative stem and leaf diagram of error lines

12.5% of year 11 were within 1.6 cm whilst 12.5% of year 7 were within 1.4 cm. This all sums up that year 7 are better at estimating since there were more of them closer to the actual measurement. This is evidence against my hypothesis.

Comparative stem and leaf diagram of error angles

Key: 0|2 = 02°

Year 7: 02, 03, 03, 07, 08, 08, 08, 08

Year 11: 02, 07, 08, 08, 08, 08, 08, 08

Year 7 Mode: 8 Median: 7.5° Lower quartile: 3° Upper quartile: 8° Interquartile range: 8-3=5° Semi-interquartile range: 5÷2=2.5

Year 11 Mode: 8° Median: 8° Lower quartile: 7.5° Upper quartile: 8° Interquartile range: 8-7.5=0.5° Semi-interquartile range: 0.5÷2=0.25°

Aware to be on track with my second hypothesis and to avoid endless diagrams, I've only chosen year 7 (youngest)

• Word count: 1063
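
The summary statistics quoted in the excerpt above can be reproduced from its listed error angles. The sketch assumes the quartiles are found as the medians of the lower and upper halves of the sorted data, which is one common school method; other quartile conventions can give different values. The function name is illustrative.

```python
from statistics import median, mode


def quartile_summary(values):
    """Return (lower quartile, median, upper quartile) by the median-of-halves method."""
    data = sorted(values)
    half = len(data) // 2          # the middle value is excluded when n is odd
    lower, upper = data[:half], data[-half:]
    return median(lower), median(data), median(upper)


# Error angles in degrees, as listed in the excerpt.
year_7 = [2, 3, 3, 7, 8, 8, 8, 8]
year_11 = [2, 7, 8, 8, 8, 8, 8, 8]

for name, data in (("Year 7", year_7), ("Year 11", year_11)):
    q1, q2, q3 = quartile_summary(data)
    iqr = q3 - q1
    print(name, "mode", mode(data), "median", q2, "LQ", q1, "UQ", q3,
          "IQR", iqr, "semi-IQR", iqr / 2)
```

With this convention the output agrees with the excerpt: an interquartile range of 5 for year 7 and 0.5 for year 11.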
17. ## males and females driving

The raw data which I have been given is in list form and provides me with the performance of 239 students, of both sexes and includes: * The number of one hour lessons before successful test. * The total number of minor errors in test(s) taken. * The name of the instructor (4 different teachers). * The day of the test. * The time of the day of the test. The data is in no particular order and reflects a large range and variety of results and performances.

• Word count: 1172
18. ## Investigation into the relationship between P1 exam results and A-level results

This would also require more time, perhaps several months, before I would be able to gather all the results. Therefore I have restricted my population to students at Trinity School who have taken A-level maths in the 5 years. These 5 years should give me a large sample population of 249 students. I will then take another sample of 50 students. To do this, I numbered the students from 1 to 249 and then generated 50 random numbers in Excel. I picked out these 50 results.

• Word count: 1689
19. ## Bivariate Data Exploration

This investigation will examine data from a range of cars, varying in both engine size and insurance group, and if a positive correlation is found between insurance group and engine size, then the concept of 'student cars' will not be such a worrying factor when a student goes to buy his first car, however if there is no correlation then it is entirely possible that insurance companies are charging too much for cars in the 'student car' category. Data Collection: To start the investigation, data needed to be collected before any conclusions could be made.

• Word count: 1630
20. ## Undertake a small-scale survey to estimate population parameters.

10 tubes of smarties will be bought, each from a different shop, and 5 will be selected at random from each tube to be used in the survey. This should produce a random sample. The sample must be random for the Central Limit Theorem to be in effect, so that the distribution of its mean is Normal and predictions can be made about it, even though the distribution of the parent population of smarties is unknown and not necessarily Normal.

• Word count: 1895
21. ## Identifying Substances Experiment

Also, I will know what a chemical and physical property is and I will know how to find them out. Materials Refer to, Chemistry Lab #1 - What's the substance? I didn't change most materials when I did this experiment, but I added 4 materials, which are: * 5 test tubes * 2 stoppers * 1 large piece of paper And I deleted 1 material, which is: * Spatula Methods Refer to, Chemistry Lab #1 - What's the substance? However, I changed some of the procedures during my experiment; here are the changes I made in this experiment: * I only

• Word count: 1491
22. ## Hewlett Packard: DeskJet Printer Supply Chain

Vancouver manufacturer operates only as a manufacturer, and hence, optimizes its operation to hold no inventory. Meanwhile, the European distribution center operates only as a distributor, and therefore only manages inventory levels and will not perform any assembly work. If both players would broaden their viewpoints to the entire supply chain, then they could perform final, localization assembly tasks at optimal points in the supply chain to minimize stock outages or overages. Essentially, it is optimal to postpone localization assembly until the moment demand for a localized configuration is known, which is in the European distribution center, closest to the customers that drive demand.

• Word count: 1358
23. ## Linear regressions.

appeared to be significant, then there is serial correlation in residuals. If this is the case, the estimation procedure should be modified as follows. Instead of using Y and X one should use $(Y_t - \rho Y_{t-1})$ and $(X_t - \rho X_{t-1})$ and estimate the regression $(Y_t - \rho Y_{t-1}) = a + b(X_t - \rho X_{t-1}) + e_t$. Problem 3 Part B An investigator analysing consumers' expenditure in the UK using quarterly data over the period 1979-1997 estimated the following two models Model A D4Ct = 0.0083 + 0.558 D4Ct-1 + 0.241 D4ct-2 + 0.037 D4Ct-3 - 0.220 D4Ct-4 (0.0026) (0.096) (0.116) (0.125) (0.103) + 0.208 D4Yt-1 - 0.124 D4Yt-2 + 0.016 D4Yt-3 - 0.172 D4Yt-4 (0.120)

• Word count: 1806
24. ## Do population factors influence crime? Does the number of young adults aged 17 to 24 affect the amount of disorder in areas of Stockton-On-Tees?

Because I am looking at disorders per 1000 people I have to find the number of people aged 17-24 per 1000 people so that it can match my disorders. I will have to use this calculation: (People aged 17 to 24 ÷ Total population) × 1000, to find out how many young adults aged 17-24 there are per 1000 people in each ward. For example, the Victoria ward where I live has 530 young adults out of a total population of 5770. Using the calculation (530 ÷ 5770) × 1000, I found out that it had 91 young adults per 1000 people.

• Word count: 1220
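
The per-1000 calculation in the excerpt above is a simple rate. A minimal sketch using the Victoria ward figures from the excerpt (the function name is illustrative):

```python
def per_thousand(count, population):
    """Number of people in a group per 1000 of the total population."""
    return count / population * 1000


rate = per_thousand(530, 5770)  # Victoria ward figures from the excerpt
print(round(rate, 2))  # 91.85
print(int(rate))       # 91, the whole-number figure quoted in the excerpt
```

The excerpt's figure of 91 corresponds to dropping the decimal part; rounding to the nearest whole number would give 92. Applying the same function to each ward puts the age data on the same scale as the disorders-per-1000 figures.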
25. ## Dehydration and Gas Chromatography of Methylcyclohexanols.

A simple apparatus for distillation was assembled and two 10 mL graduated cylinders were used to collect the distillate. The contents of the 50 mL round bottom flask were gently brought to a boil and the temperature of the vapor was approximately 115 °C. The rate of heating/boiling was controlled so that the rate of collection in the first 10 mL graduated cylinder was approximately 1 drop per second. When the contents of the distillate in the 10 mL graduated cylinder reached approximately 8 mL in volume the first 10 mL graduated cylinder was removed and a second clean 10 mL graduated cylinder was put in its place to collect an additional 6 mL of distillate.

• Word count: 1048