Because scaling methods can correct only for globally multiplicative effects, they appear insufficient for normalizing raw microarray datasets in such a way that the normalized data approximately reflect the number of mRNA molecules in the sample under consideration. Random effects cannot be captured at all. However, a number of transformation methods proposed in the last few years seem more appropriate for the analysis of microarray data. While ANOVA normalization assumes a specific experimental set-up that is not always given, local regression methods and variance stabilization seem appropriate for many experimental settings. Variance stabilization in particular is superior to all other methods when the dataset contains a large proportion of low-intensity values. Local regression-based methods are especially well suited to dealing with nonlinearities.
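The limitation of scaling can be illustrated with a small sketch. It assumes simulated two-color intensities with an invented intensity-dependent dye bias, and it uses a binned median correction as a crude stand-in for loess (it is not loess itself). All function names and parameters below are hypothetical and chosen for illustration only.

```python
import math
import random
from statistics import median


def ma_values(red, green):
    """Return (A, M): mean log2 intensity and log2 ratio for each spot."""
    a = [0.5 * (math.log2(r) + math.log2(g)) for r, g in zip(red, green)]
    m = [math.log2(r) - math.log2(g) for r, g in zip(red, green)]
    return a, m


def global_median_scaling(m):
    """Subtract one constant from every log ratio (a global multiplicative fix)."""
    shift = median(m)
    return [v - shift for v in m]


def binned_median_correction(a, m, n_bins=20):
    """Crude stand-in for loess: subtract the median M within equal-count bins of A."""
    order = sorted(range(len(a)), key=lambda i: a[i])
    size = math.ceil(len(order) / n_bins)
    corrected = [0.0] * len(m)
    for start in range(0, len(order), size):
        idx = order[start:start + size]
        local = median(m[i] for i in idx)
        for i in idx:
            corrected[i] = m[i] - local
    return corrected


def trend_spread(a, m, n_bins=5):
    """Range of the bin medians of M along A; close to zero if M does not depend on A."""
    order = sorted(range(len(a)), key=lambda i: a[i])
    size = math.ceil(len(order) / n_bins)
    meds = [median(m[i] for i in order[s:s + size])
            for s in range(0, len(order), size)]
    return max(meds) - min(meds)


# Simulated data (an assumption, not taken from the swirl dataset):
# the dye bias grows with intensity, so it is not a single global factor.
random.seed(0)
log_green = [random.uniform(6.0, 14.0) for _ in range(2000)]
log_red = [lg + 0.5 + 0.15 * (lg - 10.0) + random.gauss(0.0, 0.2)
           for lg in log_green]
green = [2.0 ** v for v in log_green]
red = [2.0 ** v for v in log_red]

a, m = ma_values(red, green)
raw_spread = trend_spread(a, m)
global_spread = trend_spread(a, global_median_scaling(m))
binned_spread = trend_spread(a, binned_median_correction(a, m))

print("trend in M along A, raw:            ", round(raw_spread, 3))
print("trend in M along A, global scaling: ", round(global_spread, 3))
print("trend in M along A, binned medians: ", round(binned_spread, 3))
```

Global scaling shifts every log ratio by the same amount, so the dependence of the log ratio on intensity is left unchanged, whereas an intensity-dependent correction removes most of it. A real analysis would replace the binned medians with a proper local regression fit.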

Appropriate normalization methods should be able to identify and correct for systematic and random effects in the data. Although such effects can be detected, it is impossible to correct for them in every single intensity measurement within a single experiment. Effects that are due to one specific plate, pin, enzymatic reaction, and so on can be detected during data preprocessing, and a separate normalization is then possible. In the example proposed by Smyth and Speed (3), separate normalization for a specific outlier pin improves the normalization. Thus, in some cases a separate normalization for each pin might be advisable. Problems arise if the same kind of effect is also present for plates and other technical factors: it is impossible to adjust for all of them at the same time because the data subsets become too small. If one wants to use composite normalization, one has to decide which technical factor is the most likely influence. Overall, local regression-based methods such as loess, together with variance stabilization, emerge as the most appropriate ones. In particular, when values are strongly scattered in the low-intensity range, variance stabilization outperforms all other methods.
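As a minimal sketch of a separate normalization per pin, the following code median-centers the log ratios within each pin group. This is a deliberate simplification: it swaps in median centering for the per-pin local regression that would normally be used, and the function name, the `min_spots` threshold, and the toy data are all assumptions made for illustration. Groups that are too small fall back to the global median, which mirrors the problem of data subsets becoming too small.

```python
from collections import defaultdict
from statistics import median


def normalize_by_pin(log_ratios, pin_ids, min_spots=5):
    """Median-center log ratios separately for each pin group.

    Groups with fewer than min_spots spots are centered with the
    global median instead, because their own median is unreliable.
    """
    global_shift = median(log_ratios)
    groups = defaultdict(list)
    for value, pin in zip(log_ratios, pin_ids):
        groups[pin].append(value)
    shifts = {
        pin: (median(values) if len(values) >= min_spots else global_shift)
        for pin, values in groups.items()
    }
    return [value - shifts[pin] for value, pin in zip(log_ratios, pin_ids)]


# Toy data (invented): four pins, pin 3 carries a strong systematic offset.
spot_effects = [-0.3, -0.1, 0.0, 0.1, 0.3]
pin_offsets = {1: 0.0, 2: 0.05, 3: 1.0, 4: -0.05}

pin_ids = [pin for pin in pin_offsets for _ in spot_effects]
log_ratios = [pin_offsets[pin] + e for pin in pin_offsets for e in spot_effects]

# Global centering: one shift for the whole array.
global_centered = [v - median(log_ratios) for v in log_ratios]
# Separate centering for each pin.
pin_centered = normalize_by_pin(log_ratios, pin_ids)

pin3_global = median(v for v, p in zip(global_centered, pin_ids) if p == 3)
pin3_separate = median(v for v, p in zip(pin_centered, pin_ids) if p == 3)

print("median log ratio of pin 3 after global centering: ", pin3_global)
print("median log ratio of pin 3 after per-pin centering:", pin3_separate)
```

With global centering the outlier pin keeps most of its offset, while the per-pin version removes it. Extending the grouping key to a (pin, plate) pair would shrink each group further, which is why adjusting for all technical factors at once is not feasible.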

Figure 17.3.

Normalization of one experiment of the swirl dataset. Each subplot displays a scatterplot of the log product (x-axis) versus the log ratio (y-axis) of the red and green intensity values. The subplot in the upper left displays the background-corrected raw intensities without any normalization. The other subplots show scatterplots after application of global mean scaling, global linear regression, loess regression, quantile normalization, and qspline normalization. The gray line shows the local regression line (applying the loess function for each 1% quantile) for each resulting raw or normalized dataset.
