Expressed sequence tag (EST) sequencing and annotation are highly useful for identifying the repertoire of genes transcribed in tissues involved in human diseases. Even for the well-studied human genome, the approach remains a valuable tool for identifying novel genes and alternatively spliced mRNAs. ESTAnnotator facilitates the processing and annotation of medium- to large-scale EST datasets. Its successive steps are automated to avoid manual intervention: initial quality control of the EST reads, identification of ESTs that correspond to already known genes and mRNAs, and clustering of the remaining reads followed by further annotation through database searching. ESTAnnotator led to the immediate bioinformatic annotation of about 75% of 5000 EST sequences originating from a human fetal cartilage cDNA library.[5] The tool could be further improved by producing a functional classification of the identified cDNAs (e.g., according to Gene Ontology criteria; http://) together with known splice variants and single nucleotide polymorphisms.
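
The three-stage flow described above (quality control, matching against known genes, clustering of the remaining reads) can be illustrated with a minimal Python sketch. This is not ESTAnnotator's implementation: every name here (`passes_qc`, `match_known`, `cluster_reads`, `annotate`, `Report`) is hypothetical, the thresholds are arbitrary, and the exact-substring match and shared k-mer grouping are toy stand-ins for the database searches and sequence clustering a real pipeline would use.

```python
from dataclasses import dataclass, field


@dataclass
class Report:
    rejected: list = field(default_factory=list)  # reads that failed quality control
    known: dict = field(default_factory=dict)     # read id -> name of matched known gene
    clusters: list = field(default_factory=list)  # groups of unmatched read ids


def passes_qc(seq, min_len=20, max_n_fraction=0.1):
    """Toy quality check (assumed thresholds): long enough, few ambiguous 'N' bases."""
    if len(seq) < max(min_len, 1):
        return False
    return seq.count("N") / len(seq) <= max_n_fraction


def match_known(seq, known_genes):
    """Toy stand-in for a database search: exact substring match against references."""
    for name, ref in known_genes.items():
        if seq in ref:
            return name
    return None


def cluster_reads(reads, k=12):
    """Toy clustering: a read joins (and merges) every cluster it shares a k-mer with."""
    clusters = []  # list of (set of k-mers, list of read ids)
    for rid, seq in reads.items():
        kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        merged_kmers, merged_ids, untouched = set(kmers), [], []
        for c_kmers, c_ids in clusters:
            if c_kmers & kmers:
                merged_kmers |= c_kmers
                merged_ids.extend(c_ids)
            else:
                untouched.append((c_kmers, c_ids))
        merged_ids.append(rid)
        clusters = untouched + [(merged_kmers, merged_ids)]
    return [ids for _, ids in clusters]


def annotate(reads, known_genes):
    """Run the three stages in order; only unmatched reads reach the clustering step."""
    report = Report()
    unmatched = {}
    for rid, seq in reads.items():
        seq = seq.upper()
        if not passes_qc(seq):
            report.rejected.append(rid)
            continue
        gene = match_known(seq, known_genes)
        if gene is not None:
            report.known[rid] = gene
        else:
            unmatched[rid] = seq
    report.clusters = cluster_reads(unmatched)
    return report


# Made-up example data, for illustration only.
known_genes = {"GENE_A": "ATG" + "GCGTACGTTAGCCTAGGATCCGAT" + "TACGGCATTAGC"}
reads = {
    "r1": "GCGTACGTTAGCCTAGGATCCGAT",      # contained in GENE_A
    "r2": "ACGT",                          # too short
    "r3": "TTTTCCCCAAAAGGGGTTTTCCCCAAAA",  # novel, overlaps r4
    "r4": "CCCCAAAAGGGGTTTTCCCCAAAAGGGG",  # novel, overlaps r3
    "r5": "GATTACAGATTACAGATTACAGATTACA",  # novel, unrelated
    "r6": "NNNNNNNNNNACGTACGTACGTACGT",    # too many ambiguous bases
}
report = annotate(reads, known_genes)
```

Running the stages in this order means the comparatively expensive clustering step only sees reads that passed quality control and did not match a known gene, which mirrors the ordering described in the text.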

