Retrieval and matching of IPI numbers for the analytes

International Protein Index (IPI) accession numbers were obtained for each analyte in the quantitative assays of this study using two search methods. In the first search, the analyte names were subjected to an internet search to retrieve the proper protein names. The analyte names were then used to generate Sequence Retrieval System (SRS; LION Bioscience, Heidelberg, Germany) queries of the IPI database [20] using the SRS server at EBI (http://srs.ebi.ac.uk). The search parameters were as follows: protein name is in IPI AllText and OrganismName is Human. The fields returned were Accession Number(s) and EntryName. The name returned from the IPI database was compared with the input analyte name, and records whose names did not match were discarded. IPI numbers corresponding to precursor forms of proteins were retained.
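The name comparison in the first search can be illustrated with a minimal sketch. The source does not state the matching rule, so the normalization here (case-insensitive, punctuation-insensitive, ignoring the word "precursor" so that precursor entries are retained) is an assumption. The function names and the accession numbers in the example are hypothetical placeholders, not real IPI entries.

```python
import re


def normalize_name(name: str) -> str:
    """Lowercase a protein name, drop the word 'precursor', and collapse punctuation.

    Assumption: the source does not specify how names were compared;
    this normalization is only one plausible rule.
    """
    text = name.lower()
    text = re.sub(r"\bprecursor\b", " ", text)
    text = re.sub(r"[^a-z0-9]+", " ", text)
    return text.strip()


def filter_matching_records(analyte_name, records):
    """Return accession numbers whose EntryName matches the analyte name.

    `records` is a list of (accession, entry_name) pairs, standing in for
    the Accession Number(s) and EntryName fields returned by a query.
    """
    target = normalize_name(analyte_name)
    return [
        accession
        for accession, entry_name in records
        if normalize_name(entry_name) == target
    ]


# Placeholder records; the accession numbers are invented for illustration.
records = [
    ("IPI00000001", "Interleukin-6 precursor"),
    ("IPI00000002", "Interleukin-6 receptor alpha chain"),
    ("IPI00000003", "INTERLEUKIN 6"),
]
kept = filter_matching_records("Interleukin-6", records)
print(kept)
```

With this rule the precursor entry and the differently capitalized entry are kept, while the receptor entry is discarded because its name does not match the analyte name.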

In the second search, the list of protein names against which the antibodies were raised was searched against the Human Protein Reference Database to identify all possible alternate names. These alternate names were further verified using the OMIM and Swiss-Prot databases. The IPI database was then searched using these names, and all IPI IDs that corresponded to the protein name in question were assigned to it. The sequence corresponding to each IPI ID was further verified by a BLASTP search against the nr data set. The outputs were analyzed manually, and LocusLink identifiers were assigned to each sequence and cross-checked against those assigned in the IPI database. Alternate IPI IDs, as specified in the IPI data set, were also assigned so as to give all possible identifiers for each protein. The protein name and all alternate names were used to query the HUGO Gene Nomenclature Committee's database, and the results were verified using LocusLink identifiers. This allowed annotation of all entries with their gene name and gene symbol.
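The bookkeeping in the second search (pooling alternate IPI IDs for each protein and cross-checking LocusLink identifiers) can be sketched as follows. This is an illustration only: the source describes a manual analysis, the data structures and function names are hypothetical, and all identifiers in the example are invented placeholders.

```python
def expand_with_alternates(hits_by_protein, alternates_by_id):
    """Add the alternate IPI IDs listed for each hit to the protein's ID set.

    hits_by_protein: protein name -> list of IPI IDs found via any of its names.
    alternates_by_id: IPI ID -> list of alternate IPI IDs from the IPI data set.
    """
    expanded = {}
    for protein, ipi_ids in hits_by_protein.items():
        all_ids = set(ipi_ids)
        for ipi_id in ipi_ids:
            all_ids.update(alternates_by_id.get(ipi_id, []))
        expanded[protein] = sorted(all_ids)
    return expanded


def locus_mismatches(manual_locus, ipi_locus):
    """List IPI IDs whose manually assigned LocusLink ID differs from the IPI one."""
    return sorted(
        ipi_id
        for ipi_id, locus in manual_locus.items()
        if ipi_locus.get(ipi_id) != locus
    )


# Placeholder data; none of these identifiers are real.
hits = {"Interleukin-6": ["IPI00000001"]}
alternates = {"IPI00000001": ["IPI00000009"]}
expanded = expand_with_alternates(hits, alternates)

manual = {"IPI00000001": "1111", "IPI00000009": "1111"}
from_ipi = {"IPI00000001": "1111", "IPI00000009": "2222"}
to_review = locus_mismatches(manual, from_ipi)
print(expanded, to_review)
```

Entries flagged by the cross-check would be the ones to inspect by hand before assigning gene names and symbols.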
