Comparison To Other Population Data Sets

Once allele frequencies have been evaluated within a locus and between loci, it is also informative to test whether two or more populations differ in their allele frequencies at a given locus. If two sets of population data are similar, then perhaps they can be combined to yield a larger data set. Comparisons between two or more sets of alleles may be performed with R x C (row-by-column) contingency tables, which tabulate the counts of one variable (for example, the allele observed) against another variable (for example, the population sampled).
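An R x C comparison of this kind can be sketched in a few lines of Python. The example below is only a minimal illustration: it computes the Pearson chi-square statistic and its degrees of freedom for a table whose rows are populations and whose columns are alleles. The function name `chi_square_rxc` and the allele counts are hypothetical and are not taken from any published data set.

```python
def chi_square_rxc(table):
    """Return (statistic, degrees_of_freedom) for an R x C table of counts.

    Rows are populations, columns are alleles. Assumes every row total and
    every column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if allele frequencies were the same in every population
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical allele counts at one locus (made-up numbers for illustration)
counts = [
    [30, 50, 20],  # population A: counts of alleles 1, 2, 3
    [25, 55, 20],  # population B: counts of alleles 1, 2, 3
]
stat, dof = chi_square_rxc(counts)
print(f"chi-square = {stat:.3f} with {dof} degrees of freedom")
```

The statistic is then compared with the chi-square distribution for the computed degrees of freedom. A large statistic (small p-value) would suggest that the populations differ at that locus. When many alleles are rare, expected counts become small and the chi-square approximation is less reliable, so an exact or permutation-based test may be preferable.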

When examining 24 populations from European databases typed with the SGM Plus loci, Gill et al. (2003) used the following function to determine whether a genotype match probability calculated from the combined allele frequencies of all population data (Pcombined) is conservative relative to a genotype match probability calculated from the allele frequencies of a single population data set (Porigin):

If the d value is negative, then Porigin is less than Pcombined, suggesting that Pcombined is conservative (Gill et al. 2003).
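As a rough illustration of this sign convention, the sketch below assumes that d is the base-10 logarithm of the ratio Porigin/Pcombined. That form is an assumption made here only because it agrees with the interpretation stated above; it is not confirmed by this text. The function name `d_value` and the match probabilities are hypothetical.

```python
import math


def d_value(p_origin, p_combined):
    """Assumed form: d = log10(Porigin / Pcombined).

    Both arguments are genotype match probabilities and must be positive.
    """
    return math.log10(p_origin / p_combined)


# Hypothetical match probabilities for one profile (made-up numbers)
p_origin = 1e-12    # from a single population's allele frequencies
p_combined = 1e-11  # from the combined allele frequencies

d = d_value(p_origin, p_combined)
if d < 0:
    print("d is negative: Porigin < Pcombined, so Pcombined is conservative")
else:
    print("d is not negative: Pcombined is not conservative for this profile")
```

Under this assumed form, a d value of -1 would mean that the combined-database estimate is ten times more common (and therefore more favorable to a defendant) than the single-population estimate.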

To assess whether a particular sample set is similar to another population data set, its allele frequencies may be compared with those from the same or different population groups (see Table 20.4). A visual comparison of allele frequencies between different population groups may be made with histogram plots (Figure 20.3).
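A side-by-side comparison of this kind can be approximated with a short script. The following sketch assumes each sample set is available as a flat list of observed alleles at one locus. It computes the allele frequencies for two groups and prints a simple text histogram. The allele lists and the function name `allele_frequencies` are hypothetical and do not correspond to Table 20.4 or Figure 20.3.

```python
from collections import Counter


def allele_frequencies(alleles):
    """Return {allele: frequency} from a list of observed alleles."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


# Hypothetical observed alleles at a single STR locus (made-up data)
group_a = [12, 12, 13, 14, 14, 14, 15, 13, 12, 14]
group_b = [12, 13, 13, 13, 14, 15, 15, 13, 14, 13]

freq_a = allele_frequencies(group_a)
freq_b = allele_frequencies(group_b)

# One row per allele seen in either group; each '#' represents 5% frequency
for allele in sorted(set(freq_a) | set(freq_b)):
    fa = freq_a.get(allele, 0.0)
    fb = freq_b.get(allele, 0.0)
    bar_a = "#" * round(fa * 20)
    bar_b = "#" * round(fb * 20)
    print(f"{allele:>3}  A {fa:.2f} {bar_a:<20}  B {fb:.2f} {bar_b}")
```

Real population data sets contain far more alleles than this toy example, and a visual impression of similarity should still be checked with a formal test before data sets are combined.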

In the end, many laboratories, particularly forensic DNA typing laboratories in the United States that use the FBI's Combined DNA Index System (CODIS; see Chapter 18), will utilize the PopStats database that is part of the CODIS software. PopStats allele frequencies for the 13 CODIS core loci were determined from multiple populations around the U.S. (Budowle et al. 2001). These population data sets were extensively examined before their routine use in determining frequency estimates for DNA profiles.

