Once STR genotypes have been generated from population samples, the data are typically evaluated through statistical tests to ensure that the database will be useful when applied to human identity testing. Statistical tests on genetic data have been greatly aided by the availability of computer processing power, and a number of computer programs are now available to perform the various tests for independence described below.
Ranajit Chakraborty's group, formerly at the University of Texas and now at the University of Cincinnati, has prepared a software program named DNATYPE that has been used to evaluate a wide set of population data with a number of statistical tests (Figure 20.2). These tests are primarily designed to evaluate independence of alleles within a locus (program 'H') and between multiple loci (program 'D'). Pair-wise comparisons of loci for independence can be done between any two loci in any database (program 'K') and two databases can be compared with one another for genetic similarity (program 'N').
These programs are written in BASIC and run in DOS windows that pop up when the programs are started. Each program requests specific input within the DOS window (such as the name of the database file and the names of the loci to be examined) before it runs. The most challenging part of using the programs is getting the genotype data into a uniform format that they can recognize. DNATYPE checks a database file for entry errors and format accuracy and can also search for duplicates.
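The duplicate search mentioned above can be illustrated with a minimal sketch. It does not reproduce DNATYPE's file format or code: the in-memory layout (a dictionary mapping a sample ID to its genotypes per locus) and the function name `find_duplicate_profiles` are assumptions made for illustration only.

```python
from collections import defaultdict


def find_duplicate_profiles(database):
    """Group sample IDs that share an identical multi-locus profile.

    `database` is a hypothetical layout: {sample_id: {locus: (allele1, allele2)}}.
    Allele order within a genotype is ignored, so (6, 9.3) matches (9.3, 6).
    """
    groups = defaultdict(list)
    for sample_id, profile in database.items():
        for locus, genotype in profile.items():
            # Basic entry check: every genotype must list exactly two alleles.
            if len(genotype) != 2:
                raise ValueError(f"{sample_id} {locus}: expected two alleles")
        # Build an order-independent key from the whole profile.
        key = tuple(sorted((locus, tuple(sorted(genotype)))
                           for locus, genotype in profile.items()))
        groups[key].append(sample_id)
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]


example_db = {
    "S1": {"TH01": (6, 9.3), "TPOX": (8, 8)},
    "S2": {"TH01": (9.3, 6), "TPOX": (8, 8)},
    "S3": {"TH01": (7, 9), "TPOX": (8, 11)},
}
duplicates = find_duplicate_profiles(example_db)
```

In this example `S1` and `S2` carry the same profile and are reported together, while `S3` is not reported.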
Figure 20.2 Screenshot of the DNATYPE computer program developed by Ranajit Chakraborty, David Stivers and Yixi Zhong in the 1990s and widely used for the original analysis of much of the restriction fragment length polymorphism (RFLP) and STR data generated by the North American forensic community. It is a DOS-based program with a Microsoft Windows user interface added by Snehit Cherian and Robert Gaensslen through funding from the National Institute of Justice. Tests performed with this program include checking for duplicate sample types in databases, single-locus tests for Hardy-Weinberg equilibrium, multiple-locus tests for linkage equilibrium, and genetic distances between pairs of population databases.
Program 'H' within DNATYPE examines the genotypes entered from a data set and outputs the distribution of genotypes and allele frequencies, along with several tests designed to check whether the genotype frequencies are in Hardy-Weinberg equilibrium (HWE) proportions based on the observed allele frequencies. Four HWE tests are performed: the exact test (Guo and Thompson 1992), heterozygosity-biased, heterozygosity-unbiased, and likelihood ratio. If all empirical p-values are above the significance level of 5% (i.e., p > 0.05) (see Chapter 19), then the observed genotypes suggest no significant departure from HWE.
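To make the idea of a single-locus HWE check concrete, the following is a minimal sketch and not DNATYPE's implementation. It computes allele frequencies, the observed heterozygosity, and the unbiased expected heterozygosity 2n/(2n-1) × (1 - Σp²), then obtains an empirical p-value by randomly re-pairing alleles. The permutation test on the heterozygote count is a simple stand-in for the exact test cited above, and all function names are assumptions made for illustration.

```python
import random
from collections import Counter


def allele_frequencies(genotypes):
    """Return {allele: frequency} from a list of (allele1, allele2) tuples."""
    counts = Counter(a for g in genotypes for a in g)
    total = 2 * len(genotypes)
    return {allele: c / total for allele, c in counts.items()}


def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def expected_heterozygosity(genotypes):
    """Unbiased expected heterozygosity under HWE: 2n/(2n-1) * (1 - sum p_i^2)."""
    n = len(genotypes)
    freqs = allele_frequencies(genotypes)
    return (2 * n / (2 * n - 1)) * (1 - sum(p * p for p in freqs.values()))


def hwe_permutation_p_value(genotypes, n_perm=2000, seed=0):
    """Empirical two-sided p-value for the heterozygote count.

    Alleles are shuffled and re-paired at random, which mimics random mating.
    This is a simple permutation test, NOT the Guo and Thompson exact test.
    """
    rng = random.Random(seed)
    alleles = [a for g in genotypes for a in g]
    observed = sum(a != b for a, b in genotypes)
    het_counts = []
    for _ in range(n_perm):
        rng.shuffle(alleles)
        het_counts.append(sum(alleles[i] != alleles[i + 1]
                              for i in range(0, len(alleles), 2)))
    mean = sum(het_counts) / n_perm
    observed_dev = abs(observed - mean)
    extreme = sum(abs(h - mean) >= observed_dev for h in het_counts)
    return (extreme + 1) / (n_perm + 1)


# Hypothetical data with no heterozygotes at all, a clear departure from HWE.
sample = [(10, 10)] * 20 + [(12, 12)] * 20
p_value = hwe_permutation_p_value(sample)
```

With the hypothetical `sample` above the observed heterozygosity is zero while roughly half of the individuals would be expected to be heterozygous, so the empirical p-value falls well below 0.05.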
Program 'D' evaluates independence across all loci, using data from individuals who were typed at every locus. This test is based on the distribution of the number of heterozygous loci observed across individuals (Chakraborty 1984). Locus-specific heterozygosities can be used to compute the expected variance of the number of heterozygous loci if linkage equilibrium exists (i.e., independence exists across all loci examined). Global independence of alleles across loci is inferred if the observed variance in the number of heterozygous loci falls within the 95% confidence interval of the variance expected under the assumption of independence. When this occurs, use of the product rule is deemed appropriate for multi-locus genotype probability calculations involving the tested loci.
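The logic of this multi-locus test can be sketched as follows. If the loci are independent, the number of heterozygous loci in an individual is a sum of independent yes/no outcomes, so its expected variance is Σ h(1 - h) over the locus-specific heterozygosities h. The sketch below is an illustration under stated assumptions rather than DNATYPE's method: the 95% interval is obtained by simulation instead of an analytic formula, and the data layout and function names are hypothetical.

```python
import random
from statistics import variance  # sample variance (n - 1 denominator)


def heterozygous_locus_counts(profiles):
    """Number of heterozygous loci per individual.

    `profiles` is a list of individuals, each a list of (allele1, allele2)
    genotypes given in the same locus order for everyone.
    """
    return [sum(a != b for a, b in person) for person in profiles]


def locus_heterozygosities(profiles):
    """Observed heterozygosity at each locus."""
    n = len(profiles)
    n_loci = len(profiles[0])
    return [sum(person[j][0] != person[j][1] for person in profiles) / n
            for j in range(n_loci)]


def expected_variance(h):
    """Variance of the heterozygous-locus count if loci are independent."""
    return sum(x * (1 - x) for x in h)


def simulated_variance_interval(h, n_individuals, n_sim=2000, seed=0):
    """Approximate 95% interval for the sample variance under independence.

    Each simulated individual is heterozygous at locus j with probability h[j],
    independently of the other loci.
    """
    rng = random.Random(seed)
    sims = []
    for _ in range(n_sim):
        counts = [sum(rng.random() < x for x in h) for _ in range(n_individuals)]
        sims.append(variance(counts))
    sims.sort()
    return sims[int(0.025 * n_sim)], sims[int(0.975 * n_sim) - 1]


def loci_appear_independent(profiles, n_sim=2000, seed=0):
    """True if the observed variance lies inside the simulated 95% interval."""
    h = locus_heterozygosities(profiles)
    observed = variance(heterozygous_locus_counts(profiles))
    low, high = simulated_variance_interval(h, len(profiles), n_sim, seed)
    return low <= observed <= high
```

A call such as `loci_appear_independent(profiles)` then plays the role of the decision described above: a result of `True` is consistent with using the product rule across the tested loci, within the limits of this simplified simulation.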
A number of other computer programs are also available to perform statistical tests on DNA markers and population databases (Table 20.5). Most of these programs are available as free downloads or can be run over the Internet to conduct the statistical tests for Hardy-Weinberg and linkage equilibrium. Note that the output from these programs (e.g., p-values) may differ from one program to another because they use different algorithms to analyze the data.