Normalization is the process of removing bias from a measurement. Data on a microarray may be biased for several reasons including differences in dye properties, probe labeling, and hybridization efficiencies, as well as inappropriate detector settings on the scanner (see also Chapter 17). We address only the scanner issues in this chapter.

A common misconception with microarrays is that a 1:1 ratio of signals on the array corresponds to a 1:1 ratio of gene expression. As discussed above, equimolar amounts of different dyes may not produce equally bright signals. Equal signal is guaranteed to represent equal gene expression only if the dye is the same for all probes, the probes are labeled to the same density, and the probes have equal hybridization efficiency to their respective targets. However, quantifying the contribution of each of these factors for each microarray experiment would be extremely cumbersome. The dyes and filters commonly used for microarrays have been optimized to minimize differences caused by inherent dye properties, and including dye swap replicates is a useful way to control for dye batch, labeling and hybridization differences. With carefully prepared and properly controlled experiments, multi-channel microarrays can provide reliable estimates of actual biological ratios.
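The effect of a dye swap can be illustrated with a small calculation. This is a minimal sketch, assuming the dye bias is multiplicative and identical in both hybridizations; the function name `dye_swap_estimate` and the example numbers are hypothetical and do not come from any particular analysis package.

```python
import math

def dye_swap_estimate(forward_ratio, swapped_ratio):
    """Separate expression ratio from dye bias using a dye-swap pair.

    forward_ratio: channel 1 / channel 2 with sample A in channel 1.
    swapped_ratio: channel 1 / channel 2 with sample B in channel 1.
    Assumes (hypothetically) the same multiplicative bias in both arrays.
    """
    m_fwd = math.log2(forward_ratio)
    m_rev = math.log2(swapped_ratio)
    log_expression = (m_fwd - m_rev) / 2  # bias cancels in the difference
    log_bias = (m_fwd + m_rev) / 2        # expression cancels in the sum
    return 2 ** log_expression, 2 ** log_bias

# Illustrative numbers: a true A/B ratio of 2.0 with a 1.5-fold bias
# toward channel 1 gives 3.0 on the forward array and 0.75 on the swap.
expression, bias = dye_swap_estimate(3.0, 0.75)
print(round(expression, 3), round(bias, 3))
```

Working in log space makes the bias additive, which is why a simple half-difference recovers the expression ratio.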

The purpose of adjusting the PMT is to maximize the range of numerical values that are available to represent the fluorescent signal from the sample. All channels should be set to the highest gain possible without causing saturated pixels. In addition, it is easier to visually evaluate microarray images if the channels are equally bright, as indicated by a ratio of channel intensities of approximately 1.0. Note that although the human eye is very sensitive to color, it is not a reliable judge of signal intensity, so channel balance should be confirmed from the measured ratio rather than by visual inspection alone.
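A numerical check of saturation and channel balance might look like the following. This is only a sketch: it assumes a 16-bit detector whose ceiling is 65535, and the pixel values, variable names, and helper functions are made up for illustration rather than taken from any scanner software.

```python
SATURATION = 65535  # assumption: 16-bit detector ceiling

def saturated_fraction(pixels, ceiling=SATURATION):
    """Fraction of pixels at or above the detector ceiling."""
    return sum(1 for p in pixels if p >= ceiling) / len(pixels)

def channel_balance(ch1, ch2):
    """Ratio of total signal in channel 1 to total signal in channel 2."""
    return sum(ch1) / sum(ch2)

# Hypothetical pixel intensities from two channels of the same scan.
red = [1200, 40000, 65535, 300, 65535, 8000, 150, 22000]
green = [1100, 35000, 61000, 280, 64000, 7600, 160, 20500]

print(saturated_fraction(red))    # 0.25 -> gain on this channel is too high
print(saturated_fraction(green))  # 0.0
print(round(channel_balance(red, green), 3))
```

In this toy data the first channel has saturated pixels, so its gain would be lowered before the balance between channels is judged.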

In many whole-genome gene expression experiments, the mean of all ratio values should be close to 1.0 because, for any given experimental system, relatively few genes are differentially expressed, and approximately as many are over-expressed as are under-expressed. The PMT gain can therefore be set to balance the signal from both channels by calculating the mean ratio of all features on the array and adjusting the gain until this mean is approximately 1.0.
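The mean-ratio calculation described above can be sketched as follows. The feature intensities are invented, and the function names are hypothetical; the same factor that indicates how far the channels are from balance can also be applied in software to rescale the ratios, which is what this sketch does.

```python
from statistics import mean

def global_scale_factor(ch1, ch2):
    """Mean of the per-feature ratios ch1 / ch2."""
    return mean(a / b for a, b in zip(ch1, ch2))

def normalize_ratios(ch1, ch2):
    """Rescale per-feature ratios so that their mean is 1.0."""
    k = global_scale_factor(ch1, ch2)
    return [(a / b) / k for a, b in zip(ch1, ch2)]

# Hypothetical background-subtracted feature intensities.
ch1 = [1500.0, 900.0, 3000.0, 12000.0, 450.0]
ch2 = [1000.0, 750.0, 2000.0, 10000.0, 300.0]

print(round(global_scale_factor(ch1, ch2), 3))
print([round(r, 3) for r in normalize_ratios(ch1, ch2)])
```

A scale factor well above or below 1.0 suggests that one channel's gain should be adjusted before the final scan.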

However, if the microarray contains a small or functionally specific set of genes, or if the experiment examines organisms under extreme conditions such as heat shock, starvation, or stationary phase, many of the genes may be differentially expressed. Normalizing the data to force a global mean ratio of 1.0 may then mask important differential expression, and one may prefer to use housekeeping genes or spiked-in controls (13). Spiked-in controls of known ratios across a range of expression values provide an external standard by which one can normalize all genes on a microarray, regardless of the distribution of the genes being probed. When using external controls, one must ensure that the controls are measured at the expected ratios. External controls should be calibrated against an independent technique such as quantitative PCR. Properly calibrated external controls provide a robust method of normalization.
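One simple way to use spiked-in controls is to estimate a single correction factor from them and apply it to every feature. The sketch below assumes a constant multiplicative bias and averages in log space; the function names and all numbers are hypothetical, and real analyses may fit an intensity-dependent correction instead.

```python
import math

def spike_in_factor(observed, expected):
    """Geometric mean of observed / expected ratios over the controls."""
    logs = [math.log2(o / e) for o, e in zip(observed, expected)]
    return 2 ** (sum(logs) / len(logs))

def normalize_with_controls(ratios, observed_ctrl, expected_ctrl):
    """Divide every feature ratio by the factor estimated from controls."""
    k = spike_in_factor(observed_ctrl, expected_ctrl)
    return [r / k for r in ratios]

# Hypothetical controls spiked in at known ratios of 1:4, 1:1 and 4:1.
expected_ctrl = [0.25, 1.0, 4.0]
observed_ctrl = [0.30, 1.2, 4.8]

gene_ratios = [2.4, 0.6, 1.2]
print(round(spike_in_factor(observed_ctrl, expected_ctrl), 3))
print([round(r, 3) for r in
       normalize_with_controls(gene_ratios, observed_ctrl, expected_ctrl)])
```

Because the factor comes only from the controls, it does not depend on how many of the probed genes are differentially expressed.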

Using so-called 'housekeeping' genes as controls has fallen out of favor because in some organisms, such as yeast, no gene exists that is unchanged under all conditions. However, if you are studying a subset of the genes in a genome, specific experimental conditions, or specific tissues, it may be possible to select a set of housekeeping genes that reliably yields the same signal across all arrays in the experiment (14).

Once you have balanced the PMT settings of the scanner, you can use computational normalization methods to adjust for many other types of non-uniformity in the data, both physical and statistical:

• Spatial non-uniformity

• Print-tip non-uniformity (a special case of spatial non-uniformity)

• Intensity dependence of ratio values

• Intensity dependence of variance.

Locally-weighted scatterplot smoothing (LOWESS) normalization corrects for the first three non-uniformities (unless the spatial non-uniformity is seen within print-tip groups). A variance stabilization method such as that proposed by Durbin et al. (15) can correct the last problem. Further discussion of these methods is beyond the scope of this chapter and is addressed in Chapter 17.
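To make the intensity-dependent correction concrete, the following is a simplified sketch of LOWESS-style normalization on log ratio (M) versus mean log intensity (A). It is an assumption-laden illustration, not a reference implementation: it uses a local linear fit with tricube weights, omits the robustness iterations of full LOWESS, and ignores spatial and print-tip effects (a print-tip version would apply the same fit separately within each print-tip group). The names `lowess_fit`, `ma_normalize`, and `frac` are hypothetical.

```python
import math

def lowess_fit(x, y, frac=0.5):
    """Locally weighted linear fit of y on x (no robustness iterations)."""
    n = len(x)
    k = max(2, math.ceil(frac * n))  # neighbours in each local window
    fitted = []
    for x0 in x:
        # Distance to the k-th nearest point sets the local bandwidth.
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12
        # Tricube weights: 1 at x0, falling to 0 at distance h.
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted

def ma_normalize(ch1, ch2, frac=0.5):
    """Subtract the fitted intensity-dependent trend from the log ratios."""
    m = [math.log2(a / b) for a, b in zip(ch1, ch2)]
    a = [0.5 * math.log2(p * q) for p, q in zip(ch1, ch2)]
    trend = lowess_fit(a, m, frac)
    return [mi - ti for mi, ti in zip(m, trend)]

# Hypothetical intensities with a ratio that drifts upward with intensity.
ch2 = [100.0, 200.0, 400.0, 800.0, 1600.0, 3200.0, 6400.0, 12800.0]
ch1 = [v ** 1.05 for v in ch2]
print([round(v, 6) for v in ma_normalize(ch1, ch2)])
```

In this toy data the log ratio is an exact linear function of intensity, so the corrected values are all close to zero; real data would leave residual scatter that reflects biological and technical variation.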
