# The Game Of Spell

SPELL

SPELL is a game in which players make words out of the letters they are given. As in Scrabble, players are awarded points for each letter they manage to use. The points depend on the relative frequency of the letters, and the score for a complete word is the sum of its letter points. In this piece of coursework we have been asked to investigate whether the point system is a valid, appropriate and usable system for the game of SPELL and, if it is not, to reallocate the points for the letters. The designers of SPELL appear to have made a few errors in the point system, which gave some letters inappropriate points. For example, the letter “T” is given the highest value, 10. This is odd, as T is a very commonly used letter that appears in many words; I shall justify this point through my data. The letter “V” is given 2 points, one of the lowest values. This is strange, as V is an uncommonly used letter and should have a higher score; I shall also justify this point through my data.
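As a small illustration of how a word score is built up from letter points, here is a minimal sketch in Python. Only the values for T (10) and V (2) come from the SPELL table described above; the value for E and the names `word_score` and `example_points` are made up purely for this example.

```python
def word_score(word, points):
    """Return the score of a word as the sum of its letter points."""
    return sum(points[letter] for letter in word.upper())


# T = 10 and V = 2 are the SPELL values quoted in the text.
# E = 1 is a hypothetical placeholder, not a value from the game.
example_points = {"T": 10, "V": 2, "E": 1}

print(word_score("vet", example_points))
```

With the real game, `points` would hold a value for all 26 letters.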

Aims

The aim of this piece of coursework is to determine whether the current SPELL scoring is appropriate and, if it is not, to improve it. After creating a new SPELL scoring system I shall test it to validate it.

Hypothesis

My hypothesis is that much of SPELL’s point system gives letters inappropriate numbers of points. I expect letters such as E, T and A to come out as very common and so deserve low scores, while letters such as Z, Q and X will be very uncommon and so deserve high scores.

Method

To gather data I needed to use sources from a wide range of types of writing. To do this, we picked 10 different categories, each completely different. We first listed 150 genres or formats, wrote them on pieces of paper and put them in a hat. We then randomly picked 10 genres from the hat; these were to be our sources and our supply of data. In a group of 5 people, each of us chose 2 sources to work on and gathered roughly 1500 letters from each. The sources we selected are:

• Fiction Books
• Teen magazines
• Essays
• Tabloid Newspapers
• Internet Articles
• Play scripts
• Travel Writing
• Encyclopaedia
• Poems (from anthologies)

To make sure that the data from these sources was valid, we needed a method that would not be affected by any form of bias, so I used a random number generator for each choice. For books, it was used to generate a Dewey Decimal number for each category (where the category could be found in a library), a page number and a starting letter. For example, my calculator generated the Dewey Decimal number 822; when we entered this into the school library system it came out as play scripts, so I generated another number to pick which book to use.

However, the 2 sources given to me were internet articles and teen magazines, and Dewey Decimal numbers could not be used for these categories, so I had to find an alternative. For magazines I took the last 10 issues and generated a random issue number (1 being the oldest, 10 being the newest), as well as a page number and a starting letter; this method was also used for newspapers. For the internet article I went to Wikipedia, which has a random article generator. I picked 10 articles with it, used the random number generator to pick one from the ten, and then used it again to find a starting letter. From the starting letter I then sampled the next 10 letters. I did this so that the data would not be biased in any way.

The same rules, which I shall describe later, were applied to all of the data collected. I sampled 1500 letters from each of my sources and so did the other 4 in my group, so that we had a wide range of data from which we could draw accurate and reliable conclusions. Altogether we had ten sources and about 15000 letters; the total is approximate because for some of the sources we couldn’t stop halfway through a word.
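The random choices for a magazine or newspaper sample can be sketched in Python as follows. This is only an illustration of the idea: I actually used a calculator's random number generator, and the function name `pick_sample_start` and the default values for `pages_per_issue` and `letters_on_page` are assumptions made up for the example.

```python
import random


def pick_sample_start(num_issues=10, pages_per_issue=40,
                      letters_on_page=2000, rng=None):
    """Pick a random issue, page and starting letter position.

    Issues are numbered 1 (oldest) to num_issues (newest).
    pages_per_issue and letters_on_page are hypothetical sizes.
    """
    rng = rng or random.Random()
    issue = rng.randint(1, num_issues)          # inclusive at both ends
    page = rng.randint(1, pages_per_issue)
    start_letter = rng.randint(1, letters_on_page)
    return issue, page, start_letter


# A fixed seed makes the example repeatable; a real sample would not use one.
issue, page, start_letter = pick_sample_start(rng=random.Random(42))
print(issue, page, start_letter)
```

Because every issue, page and letter position has the same chance of being picked, the person collecting the data cannot favour any particular passage.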

This is the data that I have gathered.

These graphs show the difference in data between each letter.

Rules

As I have said before, I gathered approximately 15000 letters, and the rules I applied to them were:

• No proper nouns – place names, names of people, etc.

• No words with fewer than 4 letters
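The two rules can be sketched as a letter tally in Python. This is a minimal illustration and not the way the data was actually collected (that was done by hand): the function name `count_letters` is made up, and the proper nouns have to be supplied as a list because the code cannot recognise them by itself.

```python
from collections import Counter


def count_letters(text, proper_nouns=frozenset(), min_length=4):
    """Tally the letters of every word that passes both rules."""
    excluded = {name.lower() for name in proper_nouns}
    counts = Counter()
    for raw in text.split():
        # Keep only alphabetic characters so punctuation is ignored.
        word = "".join(ch for ch in raw if ch.isalpha()).lower()
        if len(word) < min_length:
            continue  # rule: no words with fewer than 4 letters
        if word in excluded:
            continue  # rule: no proper nouns (taken from the supplied list)
        counts.update(word)  # adds one to the tally of each letter
    return counts


sample = "A man from Southampton went to the market"
counts = count_letters(sample, proper_nouns={"Southampton"})
print(counts.most_common(3))
```

In this sample only "from", "went" and "market" are counted; "Southampton" is dropped as a proper noun and the remaining words are too short.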

The first rule is easy to explain; in an article about a man from Southampton, for example, it is likely that the word 'Southampton' will come up far ...