Smoothing the variances

From a modeling perspective, a chromosome under analysis can be treated as continuous, with the clones, whose physical locations are known, serving as observation points interspersed along it. Within each chromosome, it is reasonable to assume that the variance is a smooth function of clone location. Specifically, let $s_i = \sqrt{s_{i1}^2/n_1 + s_{i2}^2/n_2}$ be the standard error and $x_i$ the genome position of clone $i$. We assume that

$$\log(s_i) = h(x_i) + \epsilon_i, \quad i = 1, \ldots, I, \tag{21.3}$$

where h is a smooth function. We fit Equation 21.3 using the robust lowess method, with 30% of the data used for smoothing at each position. The logarithms of the standard errors and the lowess fit to Equation 21.3 are shown in Figure 21.1.
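The smoothing step can be sketched in a few lines of Python. This is a minimal, simplified illustration of robust lowess (tricube-weighted local linear fits followed by bisquare reweighting of residuals), not the implementation used for the analysis in this chapter. The function names `robust_lowess` and `smooth_standard_errors`, the choice of three robustness iterations, and the cutoff of six median absolute residuals are assumptions made for the example; only the 30% span comes from the text.

```python
import math
import statistics


def _local_linear(x, y, w, x0):
    """Weighted least-squares line through (x, y); returns its value at x0."""
    sw = sum(w)
    if sw <= 0.0:
        return sum(y) / len(y)  # no usable neighbours: fall back to the mean
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    if sxx <= 0.0:
        return my  # degenerate neighbourhood: use the weighted mean
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    return my + sxy / sxx * (x0 - mx)


def robust_lowess(x, y, frac=0.3, robust_iters=3):
    """Simplified robust lowess; returns fitted values at each x."""
    n = len(x)
    k = min(n, max(2, math.ceil(frac * n)))  # points in each local window
    robust_w = [1.0] * n
    fitted = list(y)
    for _ in range(robust_iters + 1):
        for i in range(n):
            dist = [abs(xj - x[i]) for xj in x]
            h = sorted(dist)[k - 1] or 1.0  # bandwidth: k-th nearest distance
            # Tricube distance weights, multiplied by the robustness weights.
            w = [rw * (1.0 - (d / h) ** 3) ** 3 if d < h else 0.0
                 for d, rw in zip(dist, robust_w)]
            fitted[i] = _local_linear(x, y, w, x[i])
        resid = [yi - fi for yi, fi in zip(y, fitted)]
        scale = statistics.median([abs(r) for r in resid])
        if scale == 0.0:
            break
        # Bisquare weights: residuals beyond 6 * scale are ignored next pass.
        robust_w = [(1.0 - (r / (6.0 * scale)) ** 2) ** 2
                    if abs(r) < 6.0 * scale else 0.0 for r in resid]
    return fitted


def smooth_standard_errors(positions, std_errors, frac=0.3):
    """Fit log(s_i) = h(x_i) + e_i and return the smoothed standard errors."""
    log_s = [math.log(s) for s in std_errors]
    return [math.exp(f) for f in robust_lowess(positions, log_s, frac=frac)]
```

Calling `smooth_standard_errors(positions, std_errors)` on the clones of one chromosome returns one smoothed standard error per clone. The fit is done on the log scale, so the smoothed values are always positive after exponentiation. The sketch refits every point on every pass, which is quadratic in the number of clones; a library implementation would be preferable for long chromosomes.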

We then define a modified Mike statistic as

Figure 21.1. Logarithm of standard errors (circles) plotted against genome position (kb), with the lowess fit shown as the solid line.

Replacing standard errors by their smoothed estimates also reduces the effect of outliers and prevents clones with very small variances from dominating the result.
