Pascal and Fermat’s work eventually provided the foundations for modern probability theory, which the Russian mathematician Andrey Kolmogorov advanced in the 1930s into the theory we know today. Probability theory can be helpful in court cases because it gives both the prosecution and the defence a way to assess, from each side of the argument, how likely it is that the accused is innocent or guilty.
Another area of particular interest and controversy has been Bayes’ Theorem. Bayes’ Theorem is an elementary proposition of probability theory: it provides a way of updating, in light of new information, one’s prior probability that a proposition is true. For example, suppose that the proposition to be proven is that the defendant was the source of blood found at the crime scene. Before learning that the blood was a genetic match for the defendant’s blood, the police believe that the odds are 2 to 1 that the defendant was the source of the blood. If they used Bayes’ Theorem, they could multiply those prior odds by a “likelihood ratio” in order to update their odds after learning that the evidence matched the defendant’s blood. The likelihood ratio is a statistic derived by comparing the probability that the evidence (expert testimony of a match) would be found if the defendant was the source of the blood with the probability that it would be found if the defendant was not the source of the blood. If it is ten times more likely that the testimony of a match would occur if the defendant was the source than if not, then the police need to multiply their odds by ten, giving posterior odds of 20 to 1.
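The blood-match update described above can be sketched in a few lines of Python. This is only an illustration of the odds form of Bayes’ Theorem using the figures from the example; the function names `posterior_odds` and `odds_to_probability` are made up for this sketch and do not come from any particular source.

```python
from fractions import Fraction


def posterior_odds(prior_odds: Fraction, likelihood_ratio: Fraction) -> Fraction:
    """Odds form of Bayes' Theorem: posterior odds = prior odds x likelihood ratio."""
    return prior_odds * likelihood_ratio


def odds_to_probability(odds: Fraction) -> Fraction:
    """Convert odds in favour (e.g. 20 to 1) into a probability (20/21)."""
    return odds / (1 + odds)


# Figures taken from the example in the text.
prior = Fraction(2, 1)              # 2 to 1 that the defendant was the source
likelihood_ratio = Fraction(10, 1)  # match is ten times more likely if they were

posterior = posterior_odds(prior, likelihood_ratio)
print(posterior)                       # 20, i.e. odds of 20 to 1
print(odds_to_probability(prior))      # 2/3
print(odds_to_probability(posterior))  # 20/21
```

Expressed as probabilities, the match evidence moves the police from roughly 67% to roughly 95% confidence that the defendant was the source, which is still short of certainty.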
Sceptics of Bayes’ Theorem have objected to its use in a legal setting on a variety of grounds; these range from jury confusion and computational complexity to the assertion that standard probability theory is not a satisfactory basis for the adjudication of rights.
Supporters of Bayes’ Theorem have two counter-arguments. The first is that, whatever its value in litigation, Bayes’ Theorem is valuable in studying evidence rules and can be used to model relevance. The second is that it is practical to use Bayes’ Theorem in a limited set of circumstances in legal proceedings (such as integrating genetic match evidence with other evidence), and that assertions that probability theory is inappropriate for judicial determinations are “nonsensical or inconsistent” (Stanford, 2003).
One case where it was helpful occurred in Japan. A lady was admitted to a welfare institution for handicapped persons, and a doctor and a lawyer of the institution asked for a paternity test. They tested the foetus with her consent and also took samples from the mother, the younger brother, the father, the grandfather and four staff members from the institution. The only person not excluded by this testing was the younger brother, and the paternity probability was estimated to be 99.857% on the basis of newly formulated expressions for multiallelic loci in cases of sibling incest. (A multiallelic locus is a position in the genome where more than two variants, or alleles, exist in the population; a diploid organism carries only two of them, one copy from the mother and one from the father.) It was therefore concluded that the father of the foetus was the brother (Tamura, A et al).
Whilst the theory of probability is very interesting and can be very helpful, it can equally be unhelpful and confusing. Koehler (2000) states that, “Research suggests statistics often cause confusion in court”. One case where exactly that happened is that of Sally Clark. Sally had a son, Christopher, with her husband Steve. When Christopher was 11 weeks old, he was found dead in his Moses basket. Sally and Steve had a second son, Harry, the next year. Harry collapsed in his bouncy chair when he was 8 weeks old and later died. Sally and Steve were both arrested, and eventually Sally was convicted of the murder of both of her babies.
Roy Meadow, a paediatrician, stood up in court and stated that “the chances of two cot deaths occurring in the same family is 73 million to 1”. His rule of thumb was “there is no evidence that cot death runs in families, but there is plenty of evidence that child abuse does”. It was encapsulated in what became known as Meadow’s law: “one cot death is tragic, two is suspicious, and three is murder.” Meadow arrived at his damning figure by squaring the roughly 8,500 to 1 odds against a cot death in a normal family, and claimed that, “It was as likely as an 80-1 horse winning four consecutive Grand Nationals”. In the first appeal, key statistics were described as being “misleading”, given that a family that suffers one cot death is more likely to suffer a second (Times Online). After seven years Sally Clark was eventually vindicated, but sadly died in 2007.
The way in which Roy Meadow presented this evidence is an example of a phenomenon known as the “Prosecutor’s Fallacy”. This occurs when someone (usually an expert witness) bases their argument on a misunderstanding of probability and statistics, often while giving evidence outside their field of expertise; let us not forget that Meadow was a paediatrician, not a statistician. In this case, the confusion was between the probability of a cot death occurring in a family and the probability of cot death occurring more than once in the same family. Meadow’s error was to treat the two unexplained deaths as if they were statistically independent of each other, when in fact there is good reason to suppose that the likelihood of a death from SIDS (sudden infant death syndrome, or cot death) in a family is significantly greater if a previous child has already died in these circumstances. The likelihood of two SIDS deaths in the same family therefore cannot be computed by simply squaring the likelihood of a single such death in all families.
The Prosecutor’s Fallacy can also occur through misunderstanding conditional probability, or through neglecting the prior odds of a defendant being guilty, i.e. the chance that an individual is guilty before any evidence directly implicating them is considered. When a prosecutor has collected some evidence (for instance a DNA match) and has an expert testify that the probability of finding this evidence if the accused were innocent is minuscule, the fallacy occurs if it is concluded that the probability of the accused being innocent must be comparably tiny. The probability of innocence would be about the same small value only if the prior odds of guilt were exactly 1:1. In reality the probability of guilt depends on other circumstances. If the person is already suspected for other reasons, then the probability of guilt would be very high, whereas if they were otherwise totally unconnected to the case, then there is a much lower prior probability of guilt.
Another source of the fallacy is a misunderstanding of multiple testing, i.e. what happens when evidence is compared against a large database. The size of the database elevates the likelihood of finding a match by pure chance alone.
For example, DNA evidence is soundest when a match is found after a single directed comparison. When a poor-quality test sample (common for recovered evidence) is compared against a large database, a match arising by mere chance becomes much more likely.
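The three numerical points made above (squaring under independence, the role of prior odds, and chance matches in a large database) can be checked with a short Python sketch. Apart from the 8,500 to 1 figure, every number below is a hypothetical value chosen for illustration: the `dependence_factor` of 10, the one-in-a-million match probability, the prior of 1 in 1,000 and the database size of 5 million are assumptions, not figures from the Clark case or from any real database.

```python
def two_deaths_one_in(single_one_in: float, dependence_factor: float = 1.0) -> float:
    """Return N such that two cot deaths in one family have a chance of 1 in N.

    dependence_factor says how many times more likely a second death is
    once a first has occurred; 1.0 is the independence assumption
    (plain squaring).
    """
    return single_one_in ** 2 / dependence_factor


def probability_innocent_given_match(prior_guilt: float,
                                     p_match_if_innocent: float,
                                     p_match_if_guilty: float = 1.0) -> float:
    """Bayes' Theorem: P(innocent | match) from the prior and both match probabilities."""
    innocent_and_match = (1.0 - prior_guilt) * p_match_if_innocent
    guilty_and_match = prior_guilt * p_match_if_guilty
    return innocent_and_match / (innocent_and_match + guilty_and_match)


def chance_of_any_false_match(random_match_prob: float, database_size: int) -> float:
    """Probability that at least one innocent database entry matches by chance."""
    return 1.0 - (1.0 - random_match_prob) ** database_size


# 1. Squaring 8,500 assumes the two deaths are independent.
print(two_deaths_one_in(8500))        # 72250000.0
# Hypothetical: a second death is 10 times more likely after a first.
print(two_deaths_one_in(8500, 10.0))  # 7225000.0

# 2. The same one-in-a-million match gives very different answers
#    depending on the prior probability of guilt (both priors hypothetical).
for prior_guilt in (0.5, 0.001):
    p = probability_innocent_given_match(prior_guilt, 1e-6)
    print(f"prior guilt {prior_guilt}: P(innocent | match) = {p:.3g}")

# 3. Searching a hypothetical database of 5 million profiles.
print(f"{chance_of_any_false_match(1e-6, 5_000_000):.3f}")
```

Only with even prior odds does the probability of innocence come out close to the one-in-a-million match probability; with a prior of 1 in 1,000 it is about a thousand times larger.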
The same thing can happen in reverse, with a phenomenon called the “Defendant’s Fallacy”. Here, instead of confusing one probability with another, the defence ignores (or forgets to include) other evidence that makes the probability of innocence much smaller. For example, let us say that some hair has been found at a crime scene and that it matches the suspect’s, and let us say that there is a one in a million chance that an innocent person’s hair would match. The prosecutor states that this means there is only a one in a million chance of innocence. However, if everyone in the community where the crime occurred (let’s say London, with a population of 10 million) were tested, then about 10 matches would be expected even if everyone tested were innocent. The fallacy would occur if the defence said, “There are 10 matches in London, and 10 million people, so this particular piece of evidence (the hair) means that there is a 90% chance that the accused is innocent.” The problem with this is that the defence fails to take into account other evidence which, although inconclusive on its own, becomes much more conclusive when combined with the hair found at the scene. For example, suppose CCTV cameras at the scene show one hundred people there at the time of the crime, and the accused is one of them. The defence could now claim that the video suggests a 99% chance of innocence, and the match of the hair suggests a 90% chance of innocence, so the verdict should be not guilty (Lanctot, 1991-2). Taken together, however, the two pieces of evidence point strongly towards guilt, because it is very unlikely that any of the other 99 people at the scene would also match the hair.
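The hair and CCTV figures can be combined in a small Python sketch to see why treating each piece of evidence separately is misleading. It rests on several simplifying assumptions that are not stated in the text: the culprit is exactly one person in the relevant pool, everyone in the pool is equally likely to be the culprit beforehand, the culprit’s hair always matches, and the accused is one of the 100 people on the CCTV footage. The function name `probability_guilty` is hypothetical.

```python
POPULATION = 10_000_000   # London population used in the text
RANDOM_MATCH_PROB = 1e-6  # chance that an innocent person's hair matches
PEOPLE_ON_CCTV = 100      # people at the scene (assumed to include the accused)


def probability_guilty(pool_size: int, random_match_prob: float) -> float:
    """P(accused is the culprit | accused's hair matches).

    Assumes the culprit is one of pool_size equally likely people and
    always matches. Bayes' Theorem then reduces to 1 / (1 + (N - 1) * p).
    """
    expected_innocent_matches = (pool_size - 1) * random_match_prob
    return 1.0 / (1.0 + expected_innocent_matches)


# Hair alone: anyone in London could be the culprit.
hair_only = probability_guilty(POPULATION, RANDOM_MATCH_PROB)

# CCTV alone: the accused is just one of 100 people at the scene.
cctv_only = 1.0 / PEOPLE_ON_CCTV

# Both together: the pool shrinks to the 100 people on the video,
# and the accused's hair also matches.
combined = probability_guilty(PEOPLE_ON_CCTV, RANDOM_MATCH_PROB)

print(f"hair only: P(guilty) = {hair_only:.3f}")
print(f"CCTV only: P(guilty) = {cctv_only:.3f}")
print(f"combined:  P(guilty) = {combined:.6f}")
```

Under these assumptions each piece of evidence on its own leaves guilt unlikely (roughly 9% and 1%), yet together they make it almost certain, which is the step the defence’s argument leaves out.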
It becomes clear from these fallacies, and from cases such as that of Sally Clark, that the use of probability and statistics in court can confuse jurors and can lead to guilty verdicts where the accused is innocent, and vice versa. Personally, I think that statistics and probability can be extremely useful in court, but only if used correctly and with caution. When “experts” begin throwing around numbers that they do not fully understand, the jury can be swayed towards a decision that it would not otherwise have reached, because jurors are confused by the complexity of the testimony they are hearing, and this leads to miscarriages of justice. Used properly, however, probability can help the jury to see how likely it is that someone is innocent or guilty.
References:
- Oystein, O (May, 1960) “Pascal and the Invention of Probability Theory”. USA: The American Mathematical Monthly, Vol. 67, No. 5, pp. 409-419.
- David, F.N. (1962) “Pascal, letter to Fermat”, quoted in Games, Gods, and Gambling, Griffin Press, p. 239.
- Kolmogorov, A.N. (Jan 1999) Foundations of the Theory of Probability, 2nd edn, Rhode Island: American Mathematical Society, pp. 25-38.
- Stanford Encyclopedia of Philosophy (2003) “Bayes’ Theorem”, Accessed at: [] on 20th March 2010.
- Tamura, A (2000) “Sibling incest and formulation of paternity probability: case report”. USA: Legal Medicine, Volume 2, Issue 4, pp. 189-196.
- Koehler, J. (2002) “When do Courts think Base Rate Statistics Are Relevant?”, Accessed at: [] on 21st March 2010.
- Times Online (2006) “The Mistake That Cost Roy Meadow His Reputation”, Accessed at: [] on 21st March 2010.
- Lanctot, C.J. (1995) “Defendant Lies and the Plaintiff Loses: The Fallacy of the Pretext-Plus Rule in Employment Discrimination Cases”, Accessed at: [] on 22nd March 2010.