## Quality Assessment

A. Quality Assessment for Affymetrix GeneChip Expression Data

Producing gene expression data using microarray technology is an elaborate process with many potential sources of variability. To maximize the scientific value of gene expression information derived from microarrays, we must make rigorous quality assessments throughout the process. Standard sample preparation protocols include a number of qualitative assessments meant to ensure that good-quality RNA is used in the hybridization experiments. After hybridization and image processing, each microarray provides a wealth of information that can be used to assess the quality of the data. Recommended post-hybridization quality assessments include general image quality assessment and analysis of intensity measures of specialized probes (Affymetrix, 2001).

In this section, we suggest some methods to assess data quality based on the analysis of residuals from the models fitted to estimate gene expression. Departures from quality standards may be attributable to various sources: RNA preparation, hybridization, chip scan, wash, image processing, or faulty chips. The effects of departures from quality may be localized to a small area on a chip or may be uniformly distributed over an entire array, possibly affecting numerous arrays. In most cases, departures from quality standards attributable to processing failures will be reflected by inflated residuals from fits to models such as Eq. 1. Residuals are, therefore, expected to provide useful information for data quality assessment.

Quality assessment can be focused at different levels: at the level of individual probes, of probe set summaries, of probe sets, or of chips. Fitting the probe level models robustly will automatically reduce the effect of malfunctioning probes (cross-hybridizing or non-responding probes) on the estimated expression values, so diagnosis of dysfunctional probes is not required to obtain good expression summaries in this context. It may still be useful to identify dysfunctional probes (by means of residual analysis) for other purposes, for example, when seeking cross-hybridizing probes or genes with alternative splicing.

At the probe set summary level, residuals can be combined to produce estimated standard errors of probe set summaries. These can be used to derive weights for individual probe set summaries for downstream analysis. Careful analysis is required to ensure that these weights are beneficial to the downstream analysis.
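One common way to turn standard errors into weights is inverse-variance weighting. The sketch below is only an illustration of that idea, not a scheme prescribed in this section: the function name `inverse_variance_weights` and the numbers are hypothetical, and whether such weights actually help a given downstream analysis still has to be checked.

```python
import numpy as np

def inverse_variance_weights(se):
    """Return weights proportional to 1 / se**2, normalized to sum to 1.

    Assumption: inverse-variance weighting is one possible choice of
    weight; the text does not commit to a particular formula.
    """
    se = np.asarray(se, dtype=float)
    w = 1.0 / se**2
    return w / w.sum(axis=-1, keepdims=True)

# Hypothetical expression summaries for one probe set on four replicate
# chips, with their estimated standard errors (illustrative values only).
expr = np.array([7.9, 8.1, 8.0, 9.5])
se = np.array([0.10, 0.10, 0.10, 0.50])

w = inverse_variance_weights(se)
weighted_mean = float(np.sum(w * expr))   # dominated by the precise chips
plain_mean = float(expr.mean())           # pulled toward the noisy chip
```

In this toy case the summary with the large standard error receives a much smaller weight, so the weighted mean stays close to the three precise values.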

At the probe set level, residuals can be used to estimate the scale of the residual variance for each probe set or to produce a goodness-of-fit measure for the models fitted to each probe set. These goodness-of-fit measures can be used to derive appropriate weights for combining expression measures for different probe sets.

Our focus in this section is on obtaining an overall chip data quality index, which can be used to distinguish among chips of varying quality. We also suggest a way to visualize the distribution of residuals on a chip to help diagnose the source of departures from quality. Finally, we suggest some chip data quality assessments based on analysis of relative log expression.

To illustrate the methodology, we use a set of 19 cel files from the Affymetrix HG-U95A Spike-In Experiment, the 2353 series. The cel files and corresponding chips are identified by the letters A through T (note that the C experiment is missing from this series). Differential concentrations of 14 human transcripts were spiked into a common pool of pancreas mRNA. The behavior of the 14 spike-in probe sets does not play a role in overall chip data quality assessment. For the remainder of the probe sets, the arrays in this experiment constitute a set of technical replicates. The data are available from www.affymetrix.com/analysis/download_center2.affx.

### 1. Summarizing Residuals from Fits

A simple way to summarize residuals for an entire chip is by means of their empirical distribution. Box plots provide a useful way to compare distributions for a large data set. The top panel of Fig. 8 shows box plots of residuals for each chip. In these, we note a slightly inflated variability in residuals for experiments A and P of the series. Note that the box plots of residuals will be centered close to zero (exactly zero for a least-squares fit), and that their distribution is approximately symmetrical about zero, so the differences between chips could effectively be summarized by the 75th percentile of the chip residual distribution.
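The 75th-percentile summary described above can be sketched as follows. This is a minimal illustration on simulated residuals, not the spike-in data: the function name `chip_residual_summary`, the array layout (rows are probes, columns are chips), and the simulated scales are all assumptions made for the example.

```python
import numpy as np

def chip_residual_summary(residuals, q=75):
    """Return the q-th percentile of the residuals for each chip.

    residuals: 2-D array, one row per probe and one column per chip
    (an assumed layout for this sketch).
    """
    residuals = np.asarray(residuals, dtype=float)
    return np.percentile(residuals, q, axis=0)

# Simulated residuals for four chips; the chip at index 2 is given an
# inflated residual scale to mimic a lower-quality hybridization.
rng = np.random.default_rng(0)
n_probes, n_chips = 5000, 4
scales = np.array([1.0, 1.0, 1.5, 1.0])
resid = rng.normal(size=(n_probes, n_chips)) * scales

summary = chip_residual_summary(resid)
flagged = int(np.argmax(summary))   # chip with the widest residual spread
```

Because the residual distributions are roughly symmetric about zero, a single upper percentile per chip carries most of the information that the box plots show about spread.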

Because our biggest concern is the effect of low-quality probe data on expression summaries, it makes sense to combine residuals into estimated standard errors of expression estimates and summarize these at the chip level. To derive the standard errors, we assume that the models were fitted robustly by iteratively re-weighted least squares (IRLS). This fitting procedure can be used to obtain the various M estimators (Holland and Welsch, 1977) as well as the maximum likelihood fit assuming t error distributions (Lange et al., 1989). IRLS estimates of parameters are obtained as weighted least-squares estimates. The weights are updated at each step by applying a transformation to the residuals from the previous fit. The choice of weight function depends on the particular M estimator desired (Huber, 1972, 1981).
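The IRLS iteration can be illustrated with a small generic example. The sketch below fits an ordinary linear model, not the probe level model of Eq. 1, and makes several assumptions that the text does not specify: the Huber weight function with tuning constant 1.345, a scale estimate based on the median absolute residual, and the hypothetical function names `huber_weights` and `irls`.

```python
import numpy as np

def huber_weights(u, k=1.345):
    """Huber weight function: 1 for |u| <= k, k / |u| otherwise."""
    au = np.abs(u)
    return np.where(au <= k, 1.0, k / np.maximum(au, 1e-12))

def irls(X, y, k=1.345, max_iter=50, tol=1e-8):
    """Fit y ~ X @ beta by iteratively re-weighted least squares."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.linalg.lstsq(X, y, rcond=None)[0]   # start from the OLS fit
    w = np.ones_like(y)
    for _ in range(max_iter):
        r = y - X @ beta                          # residuals of previous fit
        scale = np.median(np.abs(r)) / 0.6745     # robust scale (assumed choice)
        if scale == 0:
            break
        w = huber_weights(r / scale, k)           # transform residuals to weights
        sw = np.sqrt(w)
        # Weighted least squares via rescaling rows by sqrt(weight).
        beta_new = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta, w

# Toy data: a straight line with a few gross outliers.
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 50)
X = np.column_stack([np.ones_like(x), x])
y = 2.0 + 0.5 * x + rng.normal(scale=0.1, size=x.size)
y[::10] += 5.0                                    # contaminate every 10th point

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_rob, w = irls(X, y)
```

In this toy case the contaminated points end up with small weights, so the robust estimates stay close to the true intercept and slope while the ordinary least-squares fit is pulled toward the outliers. The final weights `w` are also the natural ingredient for weighted standard errors of the estimates.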

Applying the M-estimation techniques to the model specified in Eq. 1 we get
