## Regression methods

Regression methods correct a dataset of signal intensities by estimating, either globally (15, 16) or locally (17), an optimal polynomial function (linear or of higher degree) that describes the trend of the entire dataset or of a local subset.

### 1. Linear regression

Here, a function f(x) = ax + b is fitted by the least-squares method to the log-log plot of two sets of signal intensities obtained from two target samples labeled with different fluorescent dyes. The dataset is then normalized according to f(x), that is, (f(x) - b)/a.
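The linear case can be illustrated with a minimal sketch. It assumes base-2 logarithms, made-up intensity values, and one possible reading of the normalization step, in which the log intensities of the second channel are mapped back through (y - b)/a; the function names `fit_line` and `normalize_linear` are hypothetical.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

def normalize_linear(red, green):
    """Fit log2(green) = a*log2(red) + b and rescale log2(green) by (y - b)/a."""
    log_r = [math.log2(v) for v in red]
    log_g = [math.log2(v) for v in green]
    a, b = fit_line(log_r, log_g)
    normalized = [(y - b) / a for y in log_g]  # assumed reading of (f(x) - b)/a
    return normalized, a, b

# Made-up intensities: the green channel is a distorted copy of the red one.
red = [100.0, 200.0, 400.0, 800.0, 1600.0]
green = [2.0 * r ** 1.1 for r in red]
normalized, a, b = normalize_linear(red, green)
```

After this rescaling, the normalized green log intensities lie on the diagonal of the log-log plot against the red channel, because the fitted slope and offset have been removed.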

### 2. Polynomial regression of degree >1

A function f is fitted to the dataset (log-log plot of both sets of signal intensities) as in the linear case, but f is a polynomial function of degree $n > 1$, that is, $f(x) = a_n x^n + \dots + a_1 x + a_0$.
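A least-squares polynomial fit can be sketched without external libraries by solving the normal equations directly. The helper names `fit_polynomial` and `evaluate`, the quadratic degree, and the data are assumptions made for illustration; in practice a numerically more stable routine from a numerical library would usually be preferred.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit; returns coefficients, lowest power first."""
    m = degree + 1
    # Normal equations (X^T X) c = X^T y for the monomial design matrix X.
    lhs = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    rhs = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(lhs[r][col]))
        lhs[col], lhs[pivot] = lhs[pivot], lhs[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for row in range(col + 1, m):
            factor = lhs[row][col] / lhs[col][col]
            for k in range(col, m):
                lhs[row][k] -= factor * lhs[col][k]
            rhs[row] -= factor * rhs[col]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(lhs[i][k] * coeffs[k] for k in range(i + 1, m))
        coeffs[i] = (rhs[i] - tail) / lhs[i][i]
    return coeffs

def evaluate(coeffs, x):
    """Evaluate the polynomial at x using Horner's scheme."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

# Made-up log intensities with a curved (quadratic) relationship.
log_r = [6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0]
log_g = [0.05 * x * x + 0.8 * x + 0.3 for x in log_r]
coeffs = fit_polynomial(log_r, log_g, degree=2)
residuals = [y - evaluate(coeffs, x) for x, y in zip(log_r, log_g)]
```

The residuals are what remains of the second channel once the fitted intensity-dependent trend has been subtracted; using them as corrected values is one possible choice, not something the text prescribes.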

### 3. Local regression via loess/lowess (locally weighted scatterplot smoothing)

For each point x, a linear (lowess) or quadratic (loess) weighted regression function is estimated locally within a sliding window. Here, a redescending M-estimator with Tukey's biweight function is used. For normalization of gene expression microarray datasets, as proposed by Yang et al. in 2001 and 2002 (2, 18), a lowess curve is fitted to the A versus M scatterplot, where A denotes the log product intensity of the two channels r and g ($A = \log\sqrt{rg}$) and M the log ratio ($M = \log(r/g)$). Loess/lowess normalization is widely used and has been adapted to many applications, specific problems, and trends such as print-tip effects (1, 3, etc.).
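The A versus M transformation and a lowess-style fit can be sketched as follows. This is a simplified illustration, not the implementation used in the cited work: it assumes base-2 logarithms, a local linear fit with tricube neighbourhood weights, a nearest-neighbour span `frac`, and bisquare robustness weights scaled by six times the median absolute residual. All function names and intensity values are hypothetical.

```python
import math
from statistics import median

def ma_transform(red, green):
    """Return (A, M) with A = log2(sqrt(r*g)) and M = log2(r/g)."""
    a_vals = [0.5 * math.log2(r * g) for r, g in zip(red, green)]
    m_vals = [math.log2(r / g) for r, g in zip(red, green)]
    return a_vals, m_vals

def tricube(u):
    return (1.0 - abs(u) ** 3) ** 3 if abs(u) < 1.0 else 0.0

def bisquare(u):
    """Tukey's biweight: residuals beyond the cutoff receive zero weight."""
    return (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0

def lowess(x, y, frac=0.5, robust_iters=2):
    """Locally weighted linear regression evaluated at every x[i]."""
    n = len(x)
    k = max(2, min(n, math.ceil(frac * n)))  # points per local window
    robust = [1.0] * n
    fitted = list(y)
    for _ in range(robust_iters + 1):
        for i in range(n):
            # Window half-width: distance to the k-th nearest neighbour.
            h = max(sorted(abs(xj - x[i]) for xj in x)[k - 1], 1e-12)
            w = [robust[j] * tricube((x[j] - x[i]) / h) for j in range(n)]
            sw = sum(w)
            if sw == 0.0:
                continue  # no usable neighbours; keep the previous estimate
            mx = sum(wj * xj for wj, xj in zip(w, x)) / sw
            my = sum(wj * yj for wj, yj in zip(w, y)) / sw
            sxx = sum(wj * (xj - mx) ** 2 for wj, xj in zip(w, x))
            sxy = sum(wj * (xj - mx) * (yj - my) for wj, xj, yj in zip(w, x, y))
            slope = sxy / sxx if sxx > 1e-12 else 0.0
            fitted[i] = my + slope * (x[i] - mx)
        resid = [yi - fi for yi, fi in zip(y, fitted)]
        s = median(abs(r) for r in resid)
        if s < 1e-12:
            break  # residuals are numerically zero; nothing to down-weight
        robust = [bisquare(r / (6.0 * s)) for r in resid]
    return fitted

# Made-up two-channel intensities for eight spots.
red = [120.0, 340.0, 560.0, 900.0, 1500.0, 2600.0, 4100.0, 8000.0]
green = [100.0, 300.0, 520.0, 880.0, 1600.0, 2900.0, 4800.0, 9900.0]
a_vals, m_vals = ma_transform(red, green)
trend = lowess(a_vals, m_vals, frac=0.75)
normalized_m = [m - t for m, t in zip(m_vals, trend)]
```

Subtracting the fitted curve from M removes the intensity-dependent trend, so the normalized log ratios scatter around zero across the range of A.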

### 4. Local regression via Locfit

The underlying model is $Y_i = m(x_i) + e_i$, where m(x) is assumed to be smooth and is estimated by fitting a polynomial model within a sliding window. For each point x, consider a locally weighted least-squares criterion

$$\sum_{i} w\!\left(\frac{x_i - x}{h}\right)\bigl(Y_i - p(x_i - x)\bigr)^2,$$

where p is the local polynomial and the weight function is $w(v) = (1 - |v|^3)^3$ for $|v| < 1$ and $w(v) = 0$ otherwise; h denotes the bandwidth. As in the case of loess/lowess normalization, the locally estimated curve is fitted to the A versus M scatterplot.
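The weighted least-squares idea can be sketched for a local linear polynomial and a fixed bandwidth h. This is an illustrative sketch only and does not reproduce the Locfit software: the function names, the fixed bandwidth, and the A and M values are assumptions.

```python
def tricube(v):
    """Tricube weight: (1 - |v|^3)^3 inside the window, 0 outside."""
    return (1.0 - abs(v) ** 3) ** 3 if abs(v) < 1.0 else 0.0

def local_linear(x0, xs, ys, h):
    """Estimate m(x0) by minimizing sum_i w((x_i - x0)/h) * (y_i - a0 - a1*(x_i - x0))^2."""
    s0 = s1 = s2 = t0 = t1 = 0.0
    for xi, yi in zip(xs, ys):
        d = xi - x0
        w = tricube(d / h)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * yi
        t1 += w * d * yi
    det = s0 * s2 - s1 * s1
    if det <= 1e-12:
        raise ValueError("too few points inside the window; increase h")
    # a0 from the 2x2 weighted normal equations is the fitted value at x0.
    return (s2 * t0 - s1 * t1) / det

# Made-up A and M values; the fitted curve is evaluated at each A.
a_vals = [7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0]
m_vals = [0.40, 0.31, 0.22, 0.10, 0.05, -0.02, -0.10]
trend = [local_linear(a, a_vals, m_vals, h=2.5) for a in a_vals]
normalized_m = [m - t for m, t in zip(m_vals, trend)]
```

A larger h averages over more points and gives a smoother curve, while a smaller h follows local structure more closely but needs enough points inside each window for the fit to be defined.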

### 5. QSpline normalization

Workman et al. (19) proposed a normalization method in which intensity pairs of two arrays are interpolated according to a cubic spline function. Here, smoothing B-splines are fitted to the quantiles of the raw array signals of both channels. The splines are then used as signal-dependent normalization functions. This method is implemented in the affy package in Bioconductor (http://www.bioconductor.org/; library: affy, function: normalize.qspline).
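The quantile-based idea can be illustrated with a deliberately simplified sketch. It maps the quantiles of one channel onto those of a target channel, but uses piecewise-linear interpolation as a stand-in for the smoothing B-splines, so it is not equivalent to normalize.qspline. The function names, the number of quantiles, and the choice of target are hypothetical.

```python
def quantiles(values, probs):
    """Empirical quantiles with linear interpolation between order statistics."""
    s = sorted(values)
    n = len(s)
    out = []
    for p in probs:
        pos = p * (n - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        out.append(s[lo] + (pos - lo) * (s[hi] - s[lo]))
    return out

def interpolate(x, knots_x, knots_y):
    """Piecewise-linear interpolation, clamped at the first and last knot."""
    if x <= knots_x[0]:
        return knots_y[0]
    if x >= knots_x[-1]:
        return knots_y[-1]
    for i in range(1, len(knots_x)):
        if x <= knots_x[i]:
            x0, x1 = knots_x[i - 1], knots_x[i]
            if x1 == x0:
                return knots_y[i]
            t = (x - x0) / (x1 - x0)
            return knots_y[i - 1] + t * (knots_y[i] - knots_y[i - 1])
    return knots_y[-1]

def quantile_map_normalize(array, target, n_quantiles=5):
    """Map each signal in `array` through the array-to-target quantile curve."""
    probs = [i / (n_quantiles - 1) for i in range(n_quantiles)]
    q_array = quantiles(array, probs)
    q_target = quantiles(target, probs)
    return [interpolate(v, q_array, q_target) for v in array]

# Made-up signals: the array channel is twice as bright as the target.
target = [10.0, 20.0, 30.0, 40.0, 50.0]
array = [2.0 * v for v in target]
normalized = quantile_map_normalize(array, target)
```

Because the mapping is built from quantiles rather than from matched pairs, the correction depends on signal intensity and does not require the two channels to measure the same spots in the same order.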
