The most commonly used measure of the strength of an association between a disease and a marker allele is the relative risk (RR), or relative incidence ratio of Woolf (1955). The relative risk is estimated simply by the cross-product ratio of the entries in the 2 × 2 table of association, i.e. $RR = ad/bc$, where a and b are the numbers of patients with and without the marker, and c and d are the corresponding numbers of controls.

The number of patients studied is $N_1$; the number of control individuals is $N_2$.

(Strictly, this measure in a retrospective study as described here is the odds ratio (OR), but when the incidence of the disease in the population is low, this is approximately equivalent to the relative risk as defined in epidemiology for prospective studies.) A relative risk higher than 1 (a positive association) is seen when an antigen is more frequent in the patients than in the controls, whereas a risk below unity (a negative association) reflects a decreased frequency in the patients.

Under the null hypothesis of no association of the genetic marker with disease, the expected RR is 1, and the test statistic $\chi^2 = (ad - bc)^2 N / (M_1 M_2 N_1 N_2)$ follows approximately a chi-squared distribution with 1 degree of freedom, where $M_1$ and $M_2$ are the total numbers of individuals with and without the marker and $N = N_1 + N_2$.

A value of $\chi^2$ larger than the critical value of the chi-squared distribution with 1 degree of freedom (3.84 at the 5% level) indicates a statistically significant association. For more details on statistical issues, including combining samples, see Svejgaard et al (1974).

In cases where an antigen, or marker, is absent or present in all patients, or when sample sizes are small, the formula given by Haldane in 1955 should be used: $RR = (2a + 1)(2d + 1) / [(2b + 1)(2c + 1)]$.

Large relative risks are observed for a number of HLA-associated diseases. For example, in ankylosing spondylitis the relative risk for B27 in Caucasians is 69.1, while in insulin-dependent diabetes mellitus (IDDM) the relative risks for DR3 and DR4 are 3.8 and 9.0, respectively (see Thomson, 1988).
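The relative risk and its significance test can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the cell labels (a, b = patients with and without the marker; c, d = controls with and without the marker) are an assumption about the layout of the 2 × 2 table, the function names are hypothetical, and the counts in the example are invented purely for demonstration. The Haldane variant uses the commonly cited form (2a + 1)(2d + 1) / [(2b + 1)(2c + 1)].

```python
import math


def woolf_relative_risk(a, b, c, d):
    """Cross-product ratio ad/bc (undefined when b or c is zero)."""
    return (a * d) / (b * c)


def haldane_relative_risk(a, b, c, d):
    """Haldane's modification, usable when a cell is zero or counts are small."""
    return ((2 * a + 1) * (2 * d + 1)) / ((2 * b + 1) * (2 * c + 1))


def chi_squared_2x2(a, b, c, d):
    """Chi-squared statistic (1 degree of freedom) for the 2 x 2 table."""
    n1 = a + b          # number of patients
    n2 = c + d          # number of controls
    m1 = a + c          # individuals with the marker
    m2 = b + d          # individuals without the marker
    n = n1 + n2         # total sample size
    return (a * d - b * c) ** 2 * n / (m1 * m2 * n1 * n2)


def p_value_1df(chi2):
    """Upper-tail probability of a chi-squared variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


if __name__ == "__main__":
    # Invented counts: 30 of 100 patients and 10 of 100 controls carry the marker.
    a, b, c, d = 30, 70, 10, 90
    print(f"RR (Woolf)   = {woolf_relative_risk(a, b, c, d):.2f}")    # 3.86
    print(f"RR (Haldane) = {haldane_relative_risk(a, b, c, d):.2f}")  # 3.73
    chi2 = chi_squared_2x2(a, b, c, d)
    print(f"chi-squared  = {chi2:.2f}")                               # 12.50
    print(f"p-value      = {p_value_1df(chi2):.5f}")
```

With these invented counts the statistic of 12.5 exceeds the 5% critical value of 3.84, so the association would be declared significant. The Haldane form remains finite when, for instance, every patient carries the marker (b = 0), where the plain cross-product ratio cannot be computed.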

Two possibilities exist to explain a marker-disease association. The first is that disease susceptibility is directly influenced by the presence of the marker allele, e.g. class II HLA-DRB1, DQA1 and DQB1 associations with IDDM. The second is linkage disequilibrium (nonrandom association) between the marker allele and a disease-predisposing locus, e.g. the association of HLA class I A3 with hemochromatosis in Caucasians. In either case, the 'disease' locus may have alleles necessary for disease, or alternatively all genotypes at the locus may be disease susceptible, with different degrees of penetrance. Significant linkage disequilibrium values are usually not expected for loci separated by recombination distances greater than 2%, and are sometimes absent at shorter distances as well.

All subjects, patients and controls alike, must be of the same homogeneous ethnic origin. Otherwise, spurious associations may result from a stratification effect if one of the subgroups has a higher frequency of both the disease and the marker allele than the rest of the population. Family-based association studies, in which the parental marker alleles not transmitted to an affected child in simplex families form the 'control' population, ensure that only associations of marker genes linked to a disease gene will be detected (for details, including statistical tests, see Thomson, 1995).
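The construction of a family-based 'control' sample can be illustrated with a short Python sketch. It assumes, for simplicity, that each simplex family is given as a trio of fully typed genotypes (father, mother, affected child), each a pair of allele names; the function names and the example genotypes are hypothetical. The child's two alleles are counted as transmitted, and the two remaining parental alleles as non-transmitted. The statistical tests themselves are described in Thomson (1995) and are not reproduced here.

```python
from collections import Counter


def split_parental_alleles(father, mother, child):
    """Return (transmitted, non_transmitted) allele counts for one trio.

    Each argument is a pair of allele names, e.g. ("DR3", "DR4").
    """
    c1, c2 = child
    # The child must carry one allele from each parent.
    consistent = (c1 in father and c2 in mother) or (c2 in father and c1 in mother)
    if not consistent:
        raise ValueError("child genotype is not compatible with the parents")
    transmitted = Counter(child)
    # The four parental alleles minus the two passed on to the child.
    non_transmitted = Counter(father) + Counter(mother) - transmitted
    return transmitted, non_transmitted


def tally_families(trios):
    """Pool transmitted ('patient') and non-transmitted ('control') alleles."""
    patients, controls = Counter(), Counter()
    for father, mother, child in trios:
        transmitted, non_transmitted = split_parental_alleles(father, mother, child)
        patients += transmitted
        controls += non_transmitted
    return patients, controls


if __name__ == "__main__":
    # Invented genotypes for two simplex families.
    trios = [
        (("DR3", "DR1"), ("DR4", "DR2"), ("DR3", "DR4")),
        (("DR3", "DR3"), ("DR1", "DR4"), ("DR3", "DR4")),
    ]
    patients, controls = tally_families(trios)
    print("transmitted:    ", dict(patients))
    print("non-transmitted:", dict(controls))
```

Because the two allele pools come from the same parents, they are matched for ethnic background, which is what protects this design against stratification.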

Multiple alleles at a genetic marker locus may show associations with a disease. To test this, it is preferable to test all k alleles simultaneously in a k × 2 contingency table of the numbers of alleles observed in patients and controls, rather than to perform multiple tests on each allele separately. Genotype-specific effects can be determined similarly.
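A simultaneous test of all k alleles can be sketched as a standard Pearson chi-squared test on the k × 2 table of allele counts. The sketch below is one plain-Python way to compute it, assuming the usual expected counts (row total × column total / grand total) and k − 1 degrees of freedom; the function name and the allele counts in the example are invented for illustration.

```python
def chi_squared_kx2(patient_counts, control_counts):
    """Pearson chi-squared statistic and degrees of freedom for a k x 2 table.

    patient_counts[i] and control_counts[i] are the numbers of copies of
    allele i observed in patients and in controls, respectively.
    """
    if len(patient_counts) != len(control_counts):
        raise ValueError("both samples must list the same alleles")
    n_patients = sum(patient_counts)
    n_controls = sum(control_counts)
    n_total = n_patients + n_controls

    chi2 = 0.0
    alleles_observed = 0
    for observed_p, observed_c in zip(patient_counts, control_counts):
        row_total = observed_p + observed_c
        if row_total == 0:
            continue  # allele absent from both samples: contributes nothing
        alleles_observed += 1
        expected_p = row_total * n_patients / n_total
        expected_c = row_total * n_controls / n_total
        chi2 += (observed_p - expected_p) ** 2 / expected_p
        chi2 += (observed_c - expected_c) ** 2 / expected_c
    return chi2, alleles_observed - 1


if __name__ == "__main__":
    # Invented allele counts for a locus with three alleles.
    patients = [40, 30, 30]
    controls = [20, 40, 40]
    chi2, df = chi_squared_kx2(patients, controls)
    print(f"chi-squared = {chi2:.2f} with {df} degrees of freedom")  # 9.52 with 2
```

With 2 degrees of freedom the 5% critical value is 5.99, so these invented counts would indicate overall heterogeneity between patients and controls. Alleles with very small expected counts are usually pooled before such a test; that step is omitted here.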

