General Match Probability

As noted throughout this section, profile probabilities need to be calculated for a variety of scenarios. Balding (1999) points out that there are five sets of people, defined by their possible relationship to a suspect: (1) the suspect's siblings, (2) his other relatives, (3) other members of his sub-population, (4) other members of his racial group, and (5) anyone else outside of his population (e.g., racial) group (see also Foreman and Evett 2001, Weir 2003).

Table 21.5

Example calculations with NRCII recommendations for population substructure adjustments (see Appendix VI). Scenarios with theta equal to 0.01 and 0.03 are examined.


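Allele frequencies from U.S. Caucasians (N = 302); Appendix II - sample in database.

| Locus | A1 | A2 | Allele 1 freq (p) | Allele 2 freq (q) | Under HWE: formula | Under HWE: calc freq | NRCII Rec. 4.1: formula | NRCII Rec. 4.1: $\theta = 0.01$ | NRCII Rec. 4.1: $\theta = 0.03$ | NRCII Rec. 4.10: formula | NRCII Rec. 4.10: $\theta = 0.01$ | NRCII Rec. 4.10: $\theta = 0.03$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| D13S317 | 11 | 14 | 0.33940 | 0.04801 | $2pq$ | 0.0326 | $2pq$ | 0.0326 | 0.0326 | eq. 4.10b | 0.0386 | 0.0504 |
| TH01 | 6 | 6 | 0.23179 | | $p^2$ | 0.0537 | $p^2 + p(1-p)\theta$ | 0.0555 | 0.0591 | eq. 4.10a | 0.0628 | 0.0821 |
| D18S51 | 14 | 16 | 0.13742 | 0.13907 | $2pq$ | 0.0382 | $2pq$ | 0.0382 | 0.0382 | eq. 4.10b | 0.0419 | 0.0493 |
| D21S11 | 28 | 30 | 0.15894 | 0.27815 | $2pq$ | 0.0884 | $2pq$ | 0.0884 | 0.0884 | eq. 4.10b | 0.0927 | 0.1011 |
| D3S1358 | 16 | 17 | 0.25331 | 0.21523 | $2pq$ | 0.1090 | $2pq$ | 0.1090 | 0.1090 | eq. 4.10b | 0.1129 | 0.1206 |
| D5S818 | 12 | 13 | 0.38411 | 0.14073 | $2pq$ | 0.1081 | $2pq$ | 0.1081 | 0.1081 | eq. 4.10b | 0.1131 | 0.1228 |
| D7S820 | 9 | 9 | 0.17715 | | $p^2$ | 0.0314 | $p^2 + p(1-p)\theta$ | 0.0328 | 0.0358 | eq. 4.10a | 0.0390 | 0.0556 |
| D8S1179 | 12 | 14 | 0.18543 | 0.16556 | $2pq$ | 0.0614 | $2pq$ | 0.0614 | 0.0614 | eq. 4.10b | 0.0654 | 0.0733 |
| CSF1PO | 10 | 10 | 0.21689 | | $p^2$ | 0.0470 | $p^2 + p(1-p)\theta$ | 0.0487 | 0.0521 | eq. 4.10a | 0.0558 | 0.0744 |
| FGA | 21 | 22 | 0.18543 | 0.21854 | $2pq$ | 0.0810 | $2pq$ | 0.0810 | 0.0810 | eq. 4.10b | 0.0851 | 0.0930 |
| D16S539 | 9 | 11 | 0.11258 | 0.32119 | $2pq$ | 0.0723 | $2pq$ | 0.0723 | 0.0723 | eq. 4.10b | 0.0773 | 0.0871 |
| TPOX | 8 | 8 | 0.53477 | | $p^2$ | 0.2860 | $p^2 + p(1-p)\theta$ | 0.2885 | 0.2934 | eq. 4.10a | 0.2983 | 0.3227 |
| VWA | 17 | 18 | 0.28146 | 0.20033 | $2pq$ | 0.1128 | $2pq$ | 0.1128 | 0.1128 | eq. 4.10b | 0.1167 | 0.1245 |
| AMEL | X | Y | | | | | | | | | | |
| Profile frequency (product) | | | | | | 1.20E-15 | | 1.35E-15 | 1.70E-15 | | 3.92E-15 | |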
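The per-locus entries in Table 21.5 can be reproduced with a few lines of Python. This is a minimal sketch, not the book's own procedure: the function names are hypothetical, and the forms of eqs. 4.10a and 4.10b are taken from the "Unrelated" rows of Table 21.7, which the caption there identifies with NRC II recommendations 4.10a and 4.10b.

```python
def hwe(p, q=None):
    """Genotype frequency under Hardy-Weinberg equilibrium.

    Pass only p for a homozygote (p^2); pass p and q for a heterozygote (2pq).
    """
    return p * p if q is None else 2 * p * q


def nrc_4_1(p, q=None, theta=0.01):
    """NRC II Recommendation 4.1: only homozygotes receive the theta adjustment."""
    if q is None:
        return p * p + p * (1 - p) * theta
    return 2 * p * q


def nrc_4_10(p, q=None, theta=0.01):
    """NRC II eq. 4.10a (homozygote) or eq. 4.10b (heterozygote)."""
    denom = (1 + theta) * (1 + 2 * theta)
    if q is None:
        return (2 * theta + (1 - theta) * p) * (3 * theta + (1 - theta) * p) / denom
    return 2 * (theta + (1 - theta) * p) * (theta + (1 - theta) * q) / denom


# Allele frequencies from Table 21.5 (U.S. Caucasians, Appendix II)
th01 = (0.23179,)           # homozygous 6,6
d13 = (0.33940, 0.04801)    # heterozygous 11,14

for name, alleles in (("TH01", th01), ("D13S317", d13)):
    print(name,
          round(hwe(*alleles), 4),
          round(nrc_4_1(*alleles, theta=0.01), 4),
          round(nrc_4_1(*alleles, theta=0.03), 4),
          round(nrc_4_10(*alleles, theta=0.01), 4),
          round(nrc_4_10(*alleles, theta=0.03), 4))
```

Note that Recommendation 4.1 leaves heterozygote frequencies unchanged, which is why the D13S317 values are identical across the HWE and 4.1 columns.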


Table 21.6

Example calculations with corrections for relatives using the NRCII recommended formula.

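Allele frequencies from U.S. Caucasians (N = 302); Appendix II - sample in database. The relative corrections (columns 8 to 13) follow NRCII Recommendation 4.4.

| Locus | A1 | A2 | Allele 1 freq (p) | Allele 2 freq (q) | Under HWE: formula | Under HWE: calc freq | Formula | F = 1/4 (parent) | F = 1/8 (half sib) | F = 1/16 (1st cousin) | Formula | Full sib |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| D13S317 | 11 | 14 | 0.33940 | 0.04801 | $2pq$ | 0.0326 | eq. 4.8b | 0.1937 | 0.1131 | 0.0729 | eq. 4.9b | 0.3550 |
| TH01 | 6 | 6 | 0.23179 | | $p^2$ | 0.0537 | eq. 4.8a | 0.2318 | 0.1428 | 0.0982 | eq. 4.9a | 0.3793 |
| D16S539 | 9 | 11 | 0.11258 | 0.32119 | $2pq$ | 0.0723 | eq. 4.8b | 0.2169 | 0.1446 | 0.1085 | eq. 4.9b | 0.3765 |
| D18S51 | 14 | 16 | 0.13742 | 0.13907 | $2pq$ | 0.0382 | eq. 4.8b | 0.1382 | 0.0882 | 0.0632 | eq. 4.9b | 0.3287 |
| D21S11 | 28 | 30 | 0.15894 | 0.27815 | $2pq$ | 0.0884 | eq. 4.8b | 0.2185 | 0.1535 | 0.1209 | eq. 4.9b | 0.3814 |
| D3S1358 | 16 | 17 | 0.25331 | 0.21523 | $2pq$ | 0.1090 | eq. 4.8b | 0.2343 | 0.1717 | 0.1403 | eq. 4.9b | 0.3944 |
| D5S818 | 12 | 13 | 0.38411 | 0.14073 | $2pq$ | 0.1081 | eq. 4.8b | 0.2624 | 0.1853 | 0.1467 | eq. 4.9b | 0.4082 |
| D7S820 | 9 | 9 | 0.17715 | | $p^2$ | 0.0314 | eq. 4.8a | 0.1772 | 0.1043 | 0.0678 | eq. 4.9a | 0.3464 |
| D8S1179 | 12 | 14 | 0.18543 | 0.16556 | $2pq$ | 0.0614 | eq. 4.8b | 0.1755 | 0.1184 | 0.0899 | eq. 4.9b | 0.3531 |
| CSF1PO | 10 | 10 | 0.21689 | | $p^2$ | 0.0470 | eq. 4.8a | 0.2169 | 0.1320 | 0.0895 | eq. 4.9a | 0.3702 |
| FGA | 21 | 22 | 0.18543 | 0.21854 | $2pq$ | 0.0810 | eq. 4.8b | 0.2020 | 0.1415 | 0.1113 | eq. 4.9b | 0.3713 |
| TPOX | 8 | 8 | 0.53477 | | $p^2$ | 0.2860 | eq. 4.8a | 0.5348 | 0.4104 | 0.3482 | eq. 4.9a | 0.5889 |
| VWA | 17 | 18 | 0.28146 | 0.20033 | $2pq$ | 0.1128 | eq. 4.8b | 0.2409 | 0.1768 | 0.1448 | eq. 4.9b | 0.3986 |
| AMEL | X | Y | | | | | | | | | | |
| Profile frequency (product) | | | | | | 1.20E-15 | | 3.17E-09 | 1.68E-11 | 3.74E-13 | | 1 in 247616 |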
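The relative-corrected values in Table 21.6 can be checked with the short sketch below. The equations themselves are not printed in this section, so the forms used for eqs. 4.8a/b and 4.9a/b are an assumption: they are the standard NRC II expressions as best recalled, adopted here because they reproduce the tabulated entries. The function names are hypothetical.

```python
def relative_match(p, q=None, F=0.25):
    """Assumed form of NRC II eq. 4.8a (homozygote) / 4.8b (heterozygote).

    F is the coefficient listed in Table 21.6:
    1/4 parent, 1/8 half sib, 1/16 first cousin.
    """
    if q is None:
        return p * p + 4 * p * (1 - p) * F
    return 2 * p * q + 2 * (p + q - 4 * p * q) * F


def full_sib_match(p, q=None):
    """Assumed form of NRC II eq. 4.9a (homozygote) / 4.9b (heterozygote)."""
    if q is None:
        return (1 + 2 * p + p * p) / 4
    return (1 + p + q + 2 * p * q) / 4


# D13S317 11,14 and TH01 6,6 with the Appendix II Caucasian frequencies
for F in (1 / 4, 1 / 8, 1 / 16):
    print(F,
          round(relative_match(0.33940, 0.04801, F=F), 4),
          round(relative_match(0.23179, F=F), 4))
print("full sib",
      round(full_sib_match(0.33940, 0.04801), 4),
      round(full_sib_match(0.23179), 4))
```

Setting F = 0 recovers the HWE values $2pq$ and $p^2$, which is a quick sanity check on the assumed forms.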

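Table 21.7

Effects of family relatedness on match probabilities (adapted from Weir 2003, p. 839). Notice that the unrelated formulas are the same as those for NRC II recommendations 4.10a and 4.10b (see Appendix VI). Worked examples use $\theta = 0.01$ and the allele frequencies found in Appendix II for Caucasians: p(TH01 allele 6) = 0.23179; p(D13 allele 11) = 0.33940; p(D13 allele 14) = 0.04801.

Homozygotes ($A_iA_i$)

| Relationship | Match probability formula | Result with TH01 6,6 |
|---|---|---|
| Full siblings | $\dfrac{(1+p_i)^2 + (7 + 7p_i - 2p_i^2)\theta + (16 - 9p_i + p_i^2)\theta^2}{4(1+\theta)(1+2\theta)}$ | 0.38921 |
| Parent and child | $\dfrac{2\theta + (1-\theta)p_i}{1+\theta}$ | 0.24700 |
| Half siblings | $\dfrac{[2\theta + (1-\theta)p_i][1 + 5\theta + (1-\theta)p_i]}{2(1+\theta)(1+2\theta)}$ | 0.15492 |
| First cousins | $\dfrac{[2\theta + (1-\theta)p_i][1 + 11\theta + 3(1-\theta)p_i]}{4(1+\theta)(1+2\theta)}$ | 0.10888 |
| Unrelated | $\dfrac{[2\theta + (1-\theta)p_i][3\theta + (1-\theta)p_i]}{(1+\theta)(1+2\theta)}$ | 0.06283 |

Heterozygotes ($A_iA_j$)

| Relationship | Match probability formula | Result with D13 11,14 |
|---|---|---|
| Full siblings | $\dfrac{(1 + p_i + p_j + 2p_ip_j) + (5 + 3p_i + 3p_j - 4p_ip_j)\theta + 2(4 - 2p_i - 2p_j + p_ip_j)\theta^2}{4(1+\theta)(1+2\theta)}$ | 0.35955 |
| Parent and child | $\dfrac{2\theta + (1-\theta)(p_i + p_j)}{2(1+\theta)}$ | 0.19977 |
| Half siblings | $\dfrac{(p_i + p_j + 4p_ip_j) + (2 + 5p_i + 5p_j - 8p_ip_j)\theta + (8 - 6p_i - 6p_j + 4p_ip_j)\theta^2}{4(1+\theta)(1+2\theta)}$ | 0.11921 |
| First cousins | $\dfrac{(p_i + p_j + 12p_ip_j) + (2 + 13p_i + 13p_j - 24p_ip_j)\theta + 2(8 - 7p_i - 7p_j + 6p_ip_j)\theta^2}{8(1+\theta)(1+2\theta)}$ | 0.07893 |
| Unrelated | $\dfrac{2[\theta + (1-\theta)p_i][\theta + (1-\theta)p_j]}{(1+\theta)(1+2\theta)}$ | 0.03864 |

For comparison, the unrelated D13 11,14 frequency without the $\theta$ adjustment is $2pq = 0.03259$.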
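One way to see where the relationship formulas in Table 21.7 come from is to treat each relationship as a weighted mixture of three cases: the relative shares both, one, or neither of the suspect's alleles identical by descent (IBD). The sketch below is an illustration under that assumption, not Weir's own derivation; the IBD weights are the standard values for each relationship, and all names are hypothetical.

```python
# Probabilities (k0, k1, k2) that two relatives share 0, 1 or 2 alleles
# identical by descent at a locus (standard values, assumed here).
IBD = {
    "full siblings": (0.25, 0.50, 0.25),
    "parent and child": (0.0, 1.0, 0.0),
    "half siblings": (0.5, 0.5, 0.0),
    "first cousins": (0.75, 0.25, 0.0),
    "unrelated": (1.0, 0.0, 0.0),
}


def one_shared(p, q, theta):
    """Match probability when exactly one allele is IBD (the parent-child row)."""
    if q is None:
        return (2 * theta + (1 - theta) * p) / (1 + theta)
    return (2 * theta + (1 - theta) * (p + q)) / (2 * (1 + theta))


def none_shared(p, q, theta):
    """Match probability when no allele is IBD (the unrelated row, eqs. 4.10a/b)."""
    denom = (1 + theta) * (1 + 2 * theta)
    if q is None:
        return (2 * theta + (1 - theta) * p) * (3 * theta + (1 - theta) * p) / denom
    return 2 * (theta + (1 - theta) * p) * (theta + (1 - theta) * q) / denom


def match_probability(relationship, p, q=None, theta=0.01):
    """Mix the three IBD cases; sharing both alleles IBD guarantees a match."""
    k0, k1, k2 = IBD[relationship]
    return k2 + k1 * one_shared(p, q, theta) + k0 * none_shared(p, q, theta)


# Worked example: D13S317 11,14 with theta = 0.01
for rel in IBD:
    print(rel, round(match_probability(rel, 0.33940, 0.04801), 5))
```

Expanding the mixture algebraically and collecting powers of $\theta$ gives polynomial numerators of the kind shown in the table.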

One way to avoid calculating a separate probability for each of these scenarios is to use general match probabilities, calculated by the theoretically most conservative method using the two most common alleles at each locus (D.N.A. Box 21.2) (Foreman and Evett 2001). The primary advantage of this approach is that repeated calculations are not required for each profile observed. Another reason that Foreman and Evett (2001) advocate a general match probability that avoids case-specific calculations is that it is difficult to provide any sound statistical support for probabilities of such a small magnitude (e.g., $10^{-21}$).

LIKELIHOOD RATIO

As noted previously, when matching STR profiles are obtained between a suspect (who then becomes the defendant in a court case; in other words the known reference sample, K) and the crime scene evidence (questioned sample, Q), it is necessary to quantify the evidentiary value of this match.

D.N.A. Box 21.2 General match probability values

In a paper presenting statistical analyses to support forensic interpretation of the 10 loci in the SGM Plus kit, Foreman and Evett (2001) advocate the use of general probability values when reporting full matching STR profiles. With the 10 STR loci present in the SGM Plus kit used in the UK and Europe, the probabilities are as follows (see Foreman and Evett 2001, Table 4):

| Relationship with suspect | Match probability |
|---|---|
| Sibling | 1 in 10 000 |
| Parent/child | 1 in 1 million |
| Half-sibling or uncle/nephew | 1 in 10 million |
| First cousin | 1 in 100 million |
| Unrelated | 1 in 1 billion |

They argue that adopting such figures would eliminate the need for case-specific match probability calculations, making it much easier to present information to the court. The match probabilities for specific STR profiles are typically several orders of magnitude smaller than those given above, which were calculated from the theoretically most common SGM Plus profile. Thus, these probabilities should provide a fair and reasonable assessment of the weight of DNA evidence for each category and in the end would probably be favorable to the suspect (defendant).

A similar calculation for a full match with the 13 CODIS loci, using the most common alleles observed in U.S. population databases such as Appendix II, would yield even smaller general match probabilities (that is, larger "1 in N" figures), since more STR loci are being examined.

Sources:

Foreman, L.A. and Evett, I.W. (2001) Statistical analyses to support forensic interpretation for a new 10-locus STR profiling system. International Journal of Legal Medicine, 114, 147-155.

Balding, D.J. (1999) When can a DNA profile be regarded as unique? Science & Justice, 39, 257-260.

Another approach, besides the match probability (profile frequency estimate) just described, is the use of likelihood ratios (LRs). LRs involve a comparison of the probabilities of the evidence under two alternative propositions. These mutually exclusive hypotheses represent the position of the prosecution, namely that the DNA from the crime scene originated from the suspect, and the position of the defense, that the DNA just happens to coincidentally match the defendant and is instead from an unknown person in the population at large.

A likelihood ratio is a ratio of two probabilities of the same evidence under different hypotheses. For example, if a DNA profile generated from a crime scene evidence sample matches a suspect's DNA profile, then there are generally two possible hypotheses for why the profiles match each other: (1) the suspect matches because he left his biological sample at the crime scene or (2) the true perpetrator is still at large and just happens to match the suspect at the DNA markers examined.

Typically the first hypothesis (the one championed by the prosecution) is placed in the numerator of the likelihood ratio, while the second hypothesis, that someone other than the defendant committed the crime (which is of course the defense's position), is placed in the denominator.

Thus, in mathematical terms:

$$LR = \frac{H_p}{H_d}$$

or, verbally, the likelihood ratio equals the probability of the evidence under the hypothesis of the prosecution divided by its probability under the hypothesis of the defense. Since the hypothesis of the prosecution is that the defendant committed the crime, then $H_p = 1$ (assumes 100% probability). On the other hand, the hypothesis of the defense, that the profile originated from someone else, can be calculated from the genotype frequency of the particular STR profile. If the STR typing result is heterozygous, then this probability would be $2pq$, where $p$ is the frequency of allele 1 and $q$ is the frequency of allele 2 in the relevant population for the locus in question. Alternatively, for a homozygous STR type, $H_d$ would be $p^2$.

Therefore,

$$LR = \frac{1}{2pq} \ \text{(heterozygote)} \qquad \text{or} \qquad LR = \frac{1}{p^2} \ \text{(homozygote)}$$

If the STR type in question was D13S317 alleles 11 and 14, then $p$ is 0.33940 and $q$ is 0.04801 for the Caucasian population (Appendix II). The likelihood ratio for the D13S317 genotype match then becomes

$$LR = \frac{H_p}{H_d} = \frac{1}{2pq} = \frac{1}{2(0.33940)(0.04801)} = \frac{1}{0.03259} = 30.7$$

Note that the rarer the particular STR genotype is, the higher the likelihood ratio will be, since the two are reciprocally related. In its simplest form, when discrete alleles and independent marker systems are used, the LR for each locus is simply the inverse of the relative frequency of the observed genotype in the relevant population. Of course, LRs can become much more complicated if mixtures or alternative scenarios for the evidence are possible (see Chapter 22, Table 22.1). The product of all locus-specific LRs gives the full-profile LR, which for the Caucasian data shown in Table 21.2 comes to $8.37 \times 10^{14}$ (the inverse of $1.20 \times 10^{-15}$).

If the value of a likelihood ratio is greater than one, it provides support for the prosecution's case. If, on the other hand, the LR is less than one, then the defense's case is supported. In the example shown here, where a crime stain with D13S317 alleles 11 and 14 matches a suspect who also has the D13S317 genotype 11,14, the match is 30.7 times more likely if the suspect left the evidence than if it came from some unknown person in the general Caucasian population.
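The single-locus and full-profile likelihood ratios can be reproduced with a short Python sketch. It assumes the simple case described in the text (single-source profile, $H_p = 1$, independent loci, no $\theta$ adjustment); the function names are hypothetical, and the genotypes and allele frequencies are those listed in Table 21.5.

```python
from math import prod


def genotype_frequency(p, q=None):
    """p^2 for a homozygote (q omitted), 2pq for a heterozygote."""
    return p * p if q is None else 2 * p * q


def likelihood_ratio(p, q=None):
    """LR = Hp / Hd with Hp = 1 and Hd equal to the genotype frequency."""
    return 1.0 / genotype_frequency(p, q)


# Allele frequencies for the 13-locus example profile (Table 21.5)
profile = {
    "D13S317": (0.33940, 0.04801),
    "TH01": (0.23179,),
    "D18S51": (0.13742, 0.13907),
    "D21S11": (0.15894, 0.27815),
    "D3S1358": (0.25331, 0.21523),
    "D5S818": (0.38411, 0.14073),
    "D7S820": (0.17715,),
    "D8S1179": (0.18543, 0.16556),
    "CSF1PO": (0.21689,),
    "FGA": (0.18543, 0.21854),
    "D16S539": (0.11258, 0.32119),
    "TPOX": (0.53477,),
    "VWA": (0.28146, 0.20033),
}

lr_d13 = likelihood_ratio(*profile["D13S317"])
# Independence across loci lets the locus LRs be multiplied together.
lr_profile = prod(likelihood_ratio(*alleles) for alleles in profile.values())

print(f"D13S317 LR: {lr_d13:.1f}")
print(f"Full-profile LR: {lr_profile:.2e}")
```

Because the loci are treated as independent, the full-profile LR is just the reciprocal of the product of the locus genotype frequencies.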

When considering the strength of a likelihood ratio in terms of supporting the prosecution's position, the following guidelines have been suggested (Evett and Weir 1998, p. 226):

| If the likelihood ratio is... | Then the evidence provides... |
|---|---|
| 1 to 10 | limited support |
| 10 to 100 | moderate support |
| 100 to 1000 | strong support |
| 1000 and greater | very strong support |

With a 13-locus STR match likelihood ratio of $8.37 \times 10^{14}$ based on a full profile with unambiguous results (e.g., no mixture present), the evidence provides extremely strong support for the proposition that the suspect supplied the evidentiary sample.

SOURCE ATTRIBUTION

Given that DNA evidence can provide strong likelihood ratios and random match probabilities from forensic samples that exceed the world population many fold, the Federal Bureau of Investigation (FBI) Laboratory has adopted a source attribution policy (Budowle et al. 2000, DAB 2000). With average random match probabilities of less than one in a trillion using the 13 core STR loci (Chakraborty et al. 1999), an examiner can, within the context of a particular case, state with a high degree of confidence and reasonable scientific certainty that an individual is the source of an evidentiary DNA sample. The logic behind this source attribution policy is provided below.

If $p_x$ is the random match probability for a given evidentiary profile X, then $(1 - p_x)^N$ is the probability of not observing the particular profile in a sample of N unrelated individuals.

When this probability is greater than or equal to a $1 - \alpha$ confidence level (with $\alpha$ being 0.01 for 99%), then $(1 - p_x)^N \ge 1 - \alpha$, or equivalently $p_x \le 1 - (1 - \alpha)^{1/N}$. It follows that if N is approximately the size of the U.S. population (N = 300 000 000), then a random match probability of less than $3.35 \times 10^{-11}$ will confer at least 99% confidence that the evidentiary profile is unique in the population (Budowle et al. 2000). Table 21.8 lists the random match probability thresholds for various population sizes and confidence levels.
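The source attribution threshold $p_x \le 1 - (1 - \alpha)^{1/N}$ is easy to evaluate directly. The sketch below is illustrative only: the function names are hypothetical, and `log1p`/`expm1` are used simply to keep precision when the threshold is many orders of magnitude smaller than one.

```python
from math import expm1, log1p


def rmp_threshold(n, confidence=0.99):
    """Largest random match probability p_x satisfying (1 - p_x)**n >= confidence.

    Computes 1 - confidence**(1/n) in a numerically stable way.
    """
    alpha = 1.0 - confidence
    return -expm1(log1p(-alpha) / n)


def supports_source_attribution(rmp, n, confidence=0.99):
    """True if the profile's random match probability is at or below the threshold."""
    return rmp <= rmp_threshold(n, confidence)


print(f"{rmp_threshold(300_000_000, 0.99):.2e}")    # N near the U.S. population
print(f"{rmp_threshold(6_000_000_000, 0.999):.2e}")  # N near the world population
print(supports_source_attribution(1.20e-15, 6_000_000_000, 0.999))
```

For very small $\alpha/N$ the threshold is approximately $-\ln(1-\alpha)/N$, so it shrinks in inverse proportion to the population size considered.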

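Table 21.8 Random match probability thresholds for source attribution at various population sizes and confidence levels (adapted from Budowle et al. 2000). With a random match probability of $1.20 \times 10^{-15}$ in U.S. Caucasians (see Tables 21.2 and 21.5), the example STR profile would be considered 'unique.'

| Sample size (N) | Confidence level $(1 - \alpha)$ = 0.90 | 0.95 | 0.99 | 0.999 |
|---|---|---|---|---|
| 2 | | | | $5.00 \times 10^{-4}$ |
| 3 | | | | $3.33 \times 10^{-4}$ |
| 4 | | | | $2.50 \times 10^{-4}$ |
| 5 | $2.09 \times 10^{-2}$ | $1.02 \times 10^{-2}$ | $2.01 \times 10^{-3}$ | $2.00 \times 10^{-4}$ |
| 6 | $1.74 \times 10^{-2}$ | $8.51 \times 10^{-3}$ | $1.67 \times 10^{-3}$ | $1.67 \times 10^{-4}$ |
| 7 | $1.49 \times 10^{-2}$ | $7.30 \times 10^{-3}$ | $1.43 \times 10^{-3}$ | $1.43 \times 10^{-4}$ |
| 8 | $1.31 \times 10^{-2}$ | $6.39 \times 10^{-3}$ | $1.26 \times 10^{-3}$ | $1.25 \times 10^{-4}$ |
| 9 | $1.16 \times 10^{-2}$ | $5.68 \times 10^{-3}$ | $1.12 \times 10^{-3}$ | $1.11 \times 10^{-4}$ |
| 10 | $1.05 \times 10^{-2}$ | | | $1.00 \times 10^{-4}$ |
| 25 | $4.21 \times 10^{-3}$ | | | $4.00 \times 10^{-5}$ |
| 50 | | | | $2.00 \times 10^{-5}$ |
| 100 | | | | $1.00 \times 10^{-5}$ |
| 1000 | | | | $1.00 \times 10^{-6}$ |
| 100 000 | | | | |
| 1 000 000 | $1.05 \times 10^{-7}$ | $5.13 \times 10^{-8}$ | $1.01 \times 10^{-8}$ | $1.00 \times 10^{-9}$ |
| 10 000 000 | $1.05 \times 10^{-8}$ | $5.13 \times 10^{-9}$ | $1.01 \times 10^{-9}$ | $1.00 \times 10^{-10}$ |
| 50 000 000 | $2.11 \times 10^{-9}$ | $1.03 \times 10^{-9}$ | $2.01 \times 10^{-10}$ | $2.00 \times 10^{-11}$ |
| 260 000 000 | $4.05 \times 10^{-10}$ | $1.97 \times 10^{-10}$ | $3.87 \times 10^{-11}$ | $3.85 \times 10^{-12}$ |
| 300 000 000 | $3.51 \times 10^{-10}$ | $1.71 \times 10^{-10}$ | $3.35 \times 10^{-11}$ | $3.33 \times 10^{-12}$ |
| 1 000 000 000 | $1.05 \times 10^{-10}$ | $5.13 \times 10^{-11}$ | $1.01 \times 10^{-11}$ | $1.00 \times 10^{-12}$ |
| 6 000 000 000 (world pop.) | $1.76 \times 10^{-11}$ | $8.55 \times 10^{-12}$ | $1.68 \times 10^{-12}$ | $1.67 \times 10^{-13}$ |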

A statement provided with a report involving a source attribution might include the following words: 'In the absence of identical twins or close relatives, it can be concluded to a reasonable scientific certainty that the DNA from (x) and from (y) came from the same individual' or 'Reasonable scientific certainty means that you are (x%) certain that you would not see this profile in a sample of (N) unrelated individuals.'

It should be pointed out that if the possibility exists that a close relative of the accused had access to the crime scene and may have been a contributor of the evidence, then the best action is to obtain a reference sample from the relative (DAB 2000). This scenario should be sufficient probable cause for obtaining a reference sample, typing it with the same STR markers as the evidence, and using this information to resolve the question of whether or not the relative carries the same DNA profile as the accused.

OTHER TOPICS OF INTEREST
