'The Average Length of a Sample of 52 Randomly Selected Films'.

Statistics Coursework 1

The title of my investigation is 'The Average Length of a Sample of 52 Randomly Selected Films'. As the title suggests, I am going to investigate the length of a set of randomly selected feature-length films. I have chosen this topic because I am a Film Studies student and am required to produce a feature-length film as part of a coursework assignment. This leaves me with the predicament of deciding how long my film should be. Is there a length below which the audience feel that they have not got their money's worth and consequently leave the cinema disappointed? Conversely, is there a length beyond which the audience become bored because the film has run on for too long? In short, my aim for this investigation is to find the average length of feature films, so that I can choose a suitable length for my Film Studies coursework film and the audience leaves happy after viewing it.

To determine a population for my coursework, I am going to use the HMV superstore Internet site to search for my sample of films. This site is split into categories corresponding to the letters of the alphabet, so there are 26 categories in total. Each category contains roughly 500 films, giving a total population of about 13,000 films (26 × 500). Obviously this population is far too large to study in full, so what I propose to do is to number each film in a category and then use the random number generator function on my calculator to give two random numbers. I then match these numbers with the corresponding films, giving two films for each category, complete with total running time and year of release. I will therefore end up with a sample of 52 films, fulfilling the criterion of a sample of at least 50 items of single-variable data. To ensure that my data is as accurate as possible, I am going to cross-reference what I collect from the HMV site with data from the Channel 4 Film Four website. Film Four is regarded in film society as the definitive movie database, and if this site does not know what you are looking for, then no one does. I would have used this site to collect my data in the first place, but it has a smaller total population of films; that is, its categories do not contain as many films as HMV's do. It is nevertheless a useful point of cross-reference.
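The sampling procedure described above can be sketched in Python. This is only an illustration under stated assumptions: it uses Python's `random` module in place of my calculator's random number function, it assumes exactly 500 numbered films under every letter (the real categories only contain roughly 500 each), and it assumes the two numbers drawn for a category must be different. The names `sample_films` and `category_sizes` are made up for this sketch.

```python
import random
import string


def sample_films(category_sizes, per_category=2, seed=None):
    """Pick `per_category` distinct film numbers from each category.

    `category_sizes` maps a category letter to how many films it holds.
    Films are assumed to be numbered 1..size within each category.
    """
    rng = random.Random(seed)
    picks = {}
    for letter, size in category_sizes.items():
        # rng.sample draws without replacement, so the numbers never repeat
        picks[letter] = sorted(rng.sample(range(1, size + 1), per_category))
    return picks


# Assumption: exactly 500 films under each of the 26 letters
category_sizes = {letter: 500 for letter in string.ascii_uppercase}

picks = sample_films(category_sizes, seed=1)
sample_size = sum(len(numbers) for numbers in picks.values())
print(sample_size)  # 26 categories x 2 films = 52
```

Each number in `picks` would then be matched by hand to the film with that position in its category listing.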

See overleaf for collected data

On this page I have included results that I calculated with the help of Microsoft Excel. I can later compare my own calculated results with those shown below.

To get a visual idea of the spread of my data, I decided to represent it in a stem and leaf diagram:

N = 52    Key: 15 | 6 represents 156 mins.

Stem and Leaf diagram showing the total duration of a sample of 52 films (unsorted)
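As a rough sketch of how a stem and leaf diagram like this can be built, the Python below splits each running time into a stem (the tens) and a leaf (the units). The function name `stem_and_leaf` and the list `hypothetical_durations` are assumptions made for illustration; the durations are invented values and are not my collected data.

```python
from collections import defaultdict


def stem_and_leaf(durations, sort_leaves=True):
    """Return the rows of a stem and leaf diagram as strings.

    Each duration in minutes is split so that 156 becomes stem 15, leaf 6.
    """
    if not durations:
        return []
    rows = defaultdict(list)
    for minutes in durations:
        stem, leaf = divmod(minutes, 10)
        rows[stem].append(leaf)
    lines = []
    # Include empty stems so gaps in the data remain visible
    for stem in range(min(rows), max(rows) + 1):
        leaves = rows.get(stem, [])
        if sort_leaves:
            leaves = sorted(leaves)
        row = f"{stem:>2} | " + " ".join(str(leaf) for leaf in leaves)
        lines.append(row.rstrip())
    return lines


# Hypothetical running times in minutes, for illustration only
hypothetical_durations = [156, 98, 91, 104, 112, 95, 120, 107]

for row in stem_and_leaf(hypothetical_durations, sort_leaves=False):
    print(row)
```

Calling the function with `sort_leaves=True` puts the leaves in each row into ascending order, which is the form that is easier to read off when building a cumulative frequency diagram.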

To help me when constructing a cumulative frequency diagram, I have sorted the above diagram:

N = 52 156 ...