Transformation methods

In this section we review some of the most frequently used transformation-based normalization methods: regression methods (global-, local- and qspline-based), ANOVA, variance stabilization and quantile normalization. Overall tendencies in the data can be corrected by choosing an appropriate regression model, and several regression-based models have been proposed in recent years.
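
As an illustration of one of these methods, the sketch below implements plain quantile normalization on a genes-by-samples intensity matrix. The function name, example values and the simple rank-based tie handling are our own simplifications for illustration and are not taken from any of the cited methods.

```python
import numpy as np

def quantile_normalize(X):
    """Quantile-normalize a genes-by-samples intensity matrix.

    Each sample (column) is mapped onto a common reference
    distribution: the mean of the column-wise sorted values.
    """
    ranks = np.argsort(np.argsort(X, axis=0), axis=0)   # rank of each value within its column
    reference = np.sort(X, axis=0).mean(axis=1)         # mean intensity at each rank across samples
    return reference[ranks]                              # replace each value by the reference at its rank

# Example: four genes measured in three samples with different intensity scales
X = np.array([[5.0, 4.0, 3.0],
              [2.0, 1.0, 4.0],
              [3.0, 4.0, 6.0],
              [4.0, 2.0, 8.0]])
X_norm = quantile_normalize(X)
```

After this step, all columns of X_norm share the same empirical distribution, which removes sample-wide intensity shifts but not gene-specific effects.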

The basic idea of introducing an error model is to describe how the measured signal intensity relates to the true abundance of RNA molecules. Assuming that the true intensity level x_kg of the kth sample and gth gene is disturbed by a random multiplicative factor b_kg and an additive factor a_kg, the actual measurement y_kg of the gth gene in the kth sample can be written as y_kg = a_kg + b_kg x_kg. Proposing models and approaches that determine and optimally describe the factors a_kg and b_kg in stochastic terms has been the focus of many publications over the last few years.
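
A minimal simulation of this additive-multiplicative error model may make the roles of a_kg and b_kg more concrete. The distributions and parameter values below are illustrative assumptions only, not estimates from real data or from the cited error models.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes = 1000

# "True" abundances x_kg for one sample (assumed log-normal for illustration)
x = rng.lognormal(mean=6.0, sigma=1.0, size=n_genes)

# Random multiplicative factor b_kg and additive (background) factor a_kg;
# the scales chosen here are placeholders.
b = np.exp(rng.normal(0.0, 0.2, size=n_genes))
a = rng.normal(50.0, 20.0, size=n_genes)

# Observed intensities according to y_kg = a_kg + b_kg * x_kg
y = a + b * x
```

In such a simulation the additive term dominates at low intensities while the multiplicative term dominates at high intensities, which is the behaviour the error models cited below aim to capture.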

Introducing an error model that describes the nature of intensity measurements, including both systematic and random effects, and estimating true gene expression under that model should improve the analysis. While normalization without an error model may correct for systematic effects that frequently appear in the data, noise effects that arise stochastically can be captured by an appropriate error model.

One of the first approaches to determine a multiplicative term was proposed in 1997 by Chen et al. (13). An integrative description of multiplicative and additive factors in an extensive error model was introduced by Rocke and Durbin (14) and led to the normalization model of variance stabilization (8, 9). A good overview of the development of recent error models for describing microarray datasets is given in Huber et al. (4).
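
For illustration only: variance-stabilizing approaches of this kind are commonly expressed through a generalized-log (arcsinh-type) transform. The sketch below uses an assumed parameterization with placeholder offset and scale values; it does not reproduce the parameter estimation procedure of the cited work.

```python
import numpy as np

def glog(y, a=0.0, b=1.0):
    """Generalized-log (arcsinh-type) transform.

    a and b stand in for a per-array additive offset and scale;
    here they are placeholders rather than fitted parameters.
    The transform behaves like a shifted logarithm at high
    intensities while remaining defined near and below zero.
    """
    return np.arcsinh((y - a) / b)
```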
