Reference Sequences

Human mtDNA was first sequenced in 1981 in the laboratory of Frederick Sanger in Cambridge, England (Anderson et al. 1981). For many years, the original 'Anderson' sequence (GenBank accession: M63933), also referred to as the Cambridge Reference Sequence (CRS), was the reference to which new sequences were compared. Laboratories typically report results as differences from the L-strand of the CRS. Thus, the observation of a C nucleotide at position 16126, which contains a T in the Anderson sequence, would be reported as 16126C. If no other nucleotide variants are reported, the remaining sequence is assumed to be identical to the CRS.
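The reporting convention described above can be illustrated with a short Python sketch. It is a simplified illustration, not a validated typing procedure: the function name `report_variants`, its parameters, and the toy sequences are all hypothetical, and the two sequences are assumed to be already aligned and of equal length, so only substitutions are handled (insertions and deletions are not).

```python
from typing import List


def report_variants(reference: str, sample: str, first_position: int) -> List[str]:
    """List substitutions in `sample` relative to `reference`.

    Both strings are assumed to be pre-aligned and of equal length.
    `first_position` is the reference position of the first base in the strings.
    Each difference is written as position followed by the observed base.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")

    variants = []
    for offset, (ref_base, sample_base) in enumerate(zip(reference, sample)):
        if ref_base != sample_base:
            variants.append(f"{first_position + offset}{sample_base}")
    return variants


# Toy fragments for illustration only; these are NOT real CRS bases.
toy_reference = "ACGTTACG"
toy_sample = "ACGCTACG"

# The fourth base differs (T in the reference, C in the sample).
print(report_variants(toy_reference, toy_sample, first_position=16123))  # ['16126C']
```

An empty list corresponds to the case in the text where no variants are reported and the sample is taken to match the reference over the region examined.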

In 1999, the original placenta material used by Anderson and co-workers to generate the CRS was re-sequenced (Andrews et al. 1999). The 1981 sequence was derived primarily from a single individual of European descent; however, it also contained some HeLa and bovine sequences to fill in gaps resulting from early rudimentary DNA sequencing procedures (Anderson et al. 1981). With improvements in DNA sequencing technology over the intervening two decades, it was felt that any original errors should be rectified to enable robust use of this reference sequence in the future.

The re-analysis effort confirmed all but 11 of the nucleotides in the originally published sequence (Table 10.3). One of these differences was the loss of a single cytosine residue at position 3106. An additional seven nucleotide positions were shown to be accurate but to represent rare polymorphisms: 263A, 311-315CCCCC, 750A, 1438A, 4769A, 8860A, and 15326A. Fortunately, no errors were observed in the widely used control region. Thus, the original Anderson sequence (Anderson et al. 1981) is identical to the revised Cambridge reference sequence (Andrews et al. 1999) across the HV1 and HV2 regions that are widely used in forensic applications.

The revised Cambridge reference sequence (rCRS) is now the accepted standard for comparison (Andrews et al. 1999). However, the loss of a single C nucleotide at position 3106 means that the reference mtGenome is 16 568 bp rather than the traditionally accepted value of 16 569 bp. More critically, the original nucleotide numbering would have to be updated for all previously identified sequence changes beyond nucleotide position 3106. Since this approach would have created an unacceptable amount of confusion and made it difficult to correlate previous work, Andrews and co-workers (1999) recommended that the original numbering be retained in the rCRS, with a deletion in the sequence at position 3107 to serve as a placeholder. The 16 568 bp rCRS is available at the MITOMAP web site: http://www.mitomap.org/mitoseq.htm.

As a side note, the official reference sequence used in the assembly of the human genome is not the rCRS. Rather, the RefSeq mtGenome used by the National Center for Biotechnology Information in its official assembly of the human genome is contained in GenBank as accession NC_001807. This sequence, originally GenBank accession AF347015 sequenced by Ingman et al. (2000), is 16 571 bp and derived from an African (Yoruba) individual. Thus, different reference sequences can be and have been used for various purposes with mtDNA, so it is important to note which one is in use for a particular study.

Table 10.3

Comparison of nucleotide differences observed between the original Cambridge Reference Sequence (Anderson et al. 1981) and the revised Cambridge Reference Sequence (Andrews et al. 1999) based on re-sequencing of the original placenta material. The true sequence at position 3106-3107 is only a single C, making the entire mtGenome 16 568 bp rather than the originally reported 16 569 bp. However, to maintain the historical numbering, a deletion at position 3107 is used to serve as a placeholder (Andrews et al. 1999). Note that no differences exist between these sequences for the two hypervariable regions most commonly used in forensic applications, which span positions 16024-16365 and 73-340.

Nucleotide    Region of    Original    Revised    Remarks
Position      mtGenome     CRS         CRS
3106-3107     16S rRNA     CC          C          Error
3423          ND1          G           T          Error
4985          ND2          G           A          Error
9559          COIII        G           C          Error
11335         ND4          T           C          Error
13702         ND5          G           C          Error
14199         ND6          G           T          Error
14272         ND6          G           C          Error (bovine sequence inserted)
14365         ND6          G           C          Error (bovine sequence inserted)
14368         ND6          G           C          Error
14766         cyt b        T           C          Error (HeLa sequence inserted)
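The placeholder convention at position 3107 has a practical consequence when a historical position number must be looked up in a stored sequence. The following Python sketch shows one way to do the arithmetic. It assumes the rCRS is held as a plain string of 16 568 bases with no gap character; the constant and function names are hypothetical. If the sequence is instead stored with a placeholder character at position 3107, no shift is needed and the index is simply the position minus one.

```python
from typing import Optional

RCRS_NUMBERED_LENGTH = 16569  # historical numbering retained in the rCRS
RCRS_PLACEHOLDER = 3107       # deletion kept as a placeholder (Andrews et al. 1999)


def rcrs_position_to_index(position: int) -> Optional[int]:
    """Map a 1-based rCRS position to a 0-based index in a 16 568-base string.

    Returns None for the placeholder position, which has no base.
    Assumes the stored sequence contains no gap character at 3107.
    """
    if not 1 <= position <= RCRS_NUMBERED_LENGTH:
        raise ValueError(f"position {position} is outside 1..{RCRS_NUMBERED_LENGTH}")
    if position == RCRS_PLACEHOLDER:
        return None
    if position < RCRS_PLACEHOLDER:
        return position - 1
    # Positions past the placeholder shift down by one extra base.
    return position - 2


for pos in (3106, 3107, 3108, 16569):
    print(pos, rcrs_position_to_index(pos))
```

Keeping the historical numbering means that a previously published variant such as 16126C still refers to the same base; only software that indexes into an ungapped sequence has to account for the shift.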

