Like spotted cDNA microarrays (1), array comparative genomic hybridization (CGH) uses two differentially labeled DNAs, a test DNA (the unknown sample to be analyzed) and a reference DNA (known to be genomically normal), which are co-hybridized, under in situ suppression hybridization conditions, to cloned genomic fragments with known physical locations that are spotted and immobilized on glass slides. The hybridized DNAs are then detected through their different incorporated fluorophores, and the ratios of the digitized intensity values of the two DNAs at each cloned fragment indicate copy-number differences between the test and the reference genomes.

The detection of genomic alterations using array CGH requires careful statistical analysis of the intensity data from the two fluorochromes. Besides genuine differences between the two genomes, the ratio can deviate from unity because of stochastic fluctuations, measurement errors or other errors of unknown origin, and consistent, region-specific variations caused by differences in the hybridization characteristics of the incorporated fluorochromes and by local variation in chromosomal structure (2).

For conventional CGH, a calibration process is usually invoked, in which reference versus reference hybridizations are performed to gauge the normal range of ratio variations (3). The ratios of the test-reference hybridizations, at each chromosomal segment where the ratio is calculated, are then compared with bounds set at, say, two standard deviations (SDs) on either side of the mean obtained from the calibration, and a gain or loss is declared if the ratio lies above the upper bound or below the lower bound (presumably the nominal 95% confidence bounds without multiple comparison adjustment) (4). Sometimes a pair of fixed, global thresholds, say, 1.15 and 0.85 (5, 6), is used in lieu of the two-SD bounds.
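
The thresholding rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `calibrate_bounds` and `call_segments` are hypothetical, the reference:reference ratios are pooled into a single mean and SD (a per-segment calibration would work analogously), and the numbers in the usage lines are made up for demonstration rather than taken from any experiment.

```python
import statistics


def calibrate_bounds(reference_ratios, n_sd=2.0):
    """Return (lower, upper) bounds from reference:reference ratios.

    Assumption: all calibration ratios are pooled, and the bounds are
    mean -/+ n_sd sample standard deviations.
    """
    mean = statistics.fmean(reference_ratios)
    sd = statistics.stdev(reference_ratios)
    return mean - n_sd * sd, mean + n_sd * sd


def call_segments(test_ratios, lower, upper):
    """Label each test:reference ratio as 'gain', 'loss', or 'normal'."""
    calls = []
    for ratio in test_ratios:
        if ratio > upper:
            calls.append("gain")
        elif ratio < lower:
            calls.append("loss")
        else:
            calls.append("normal")
    return calls


# Illustrative (made-up) calibration ratios and test ratios.
reference = [0.98, 1.02, 1.00, 0.97, 1.03, 1.01, 0.99, 1.00]
lower, upper = calibrate_bounds(reference)
calibrated_calls = call_segments([1.00, 1.20, 0.80, 1.03], lower, upper)

# Fixed global thresholds (0.85 and 1.15) used in place of the two-SD bounds.
fixed_calls = call_segments([1.00, 1.20, 0.80, 1.10], 0.85, 1.15)
```

Note that each segment is called independently against the same bounds, with no adjustment for multiple comparisons and no use of information from neighboring segments.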

Recognizing the variable nature of the variance of the mean ratio within and between reference:reference hybridizations, and the possible inequality of variances of the mean ratio between the test:reference and reference:reference experiments, a t-like statistic incorporating reference:reference and test:reference variations was proposed to detect genomic alterations segment by segment (7). This method, however, assumes that the ratio of the variances of test:reference ratio means and of reference:reference ratio means is constant across the whole genome, which may not be true. In addition, the correlation in the estimated variances and the spatial correlation of ratios in neighboring segments are completely ignored. Spatial correlations between neighboring clones can be prominent in array CGH data, since, once a clone exhibits an alteration, its neighboring clones also tend to have alterations (8). The spatial correlation among neighboring clones is expected to be high when the regions with genomic alterations are large, or when the density of the CGH array becomes high. With high-density CGH arrays containing 30 000 (9) or even 85 000 (10) oligonucleotides on a single chip, with an average resolution of 30 kb or even higher (10), on the horizon, proper handling of spatial correlations becomes a pressing issue. Proper handling of spatial correlation may also increase statistical efficiency and improve precision in estimation, which, in turn, may translate into a requirement for fewer calibration samples.
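
As a rough illustration of what spatial correlation among neighboring clones means in practice, the following Python sketch computes a lag-1 autocorrelation of ratios ordered by genomic position. It is a generic diagnostic, not the method of any cited work; the function name `lag1_autocorrelation` is hypothetical, and the input is assumed to be the ratios (or log ratios) of clones on one chromosome, sorted by physical location.

```python
def lag1_autocorrelation(values):
    """Lag-1 autocorrelation of a sequence ordered by genomic position.

    Values near +1 indicate that neighboring clones tend to deviate from
    the mean in the same direction; values near 0 indicate little
    dependence between neighbors.
    """
    n = len(values)
    if n < 3:
        raise ValueError("need at least three ordered values")
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0:
        raise ValueError("values are constant; autocorrelation is undefined")
    num = sum((values[i] - mean) * (values[i + 1] - mean) for i in range(n - 1))
    return num / denom


# Made-up log ratios with a contiguous altered region (clones 5 to 8).
block_alteration = [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]
r_block = lag1_autocorrelation(block_alteration)
```

A contiguous altered region, as in the made-up example, yields a clearly positive value, which is the pattern expected when alterations span several adjacent clones.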

Besides the issue of spatial correlation in the analysis of array CGH data, several additional considerations are in order. First, less restrictive assumptions on the variance are preferable, since the variance may depend on the chromosomal structure and thus on the locations of the clones. Second, the nature of the variance may vary from laboratory to laboratory because of considerable differences in the execution of array CGH experiments, so fewer distributional assumptions on the ratio would be preferable. Lastly, robustness to outliers and minimization of the dominating effect of clones with very small variance would be desirable. Our recently proposed methods (11) are well adapted to spatial inhomogeneity of the kind found in array CGH data, and have been applied successfully to the identification of genomic alterations in the endometrium of patients with endometriosis (12).
