Comparisons with published reports

Tab. 4 shows the numbers of proteins reported in human plasma or serum in the literature, the number of those proteins in the IPI database, and the congruence with our PPP 9504 and PPP 3020 protein lists. Our lists are integrated (see Section 3.1), while the others generally are not, and do not use the same methods. It is clear that the number and nature of proteins identified in serum and plasma depend greatly on the sample preparation and fractionation and on MS methods and analytical tools.

Tab. 4 Comparison of PPP integrated protein identification lists with published datasets for human plasma or serum

Published data          Total IDs   # IPI proteins   PPP_9504 dataset   PPP_3020 dataset
Anderson et al. [50]    1175        990              471                316
Shen et al. [38]        1682        1842             526                213
Chan et al. [54]        1444        1019             402                257
Zhou et al. [35]        210         107              68                 51
Rose et al. [55]        405         287              159                142

Anderson et al. [50] published a compilation of 1175 non-redundant proteins reported in at least one of four sources (a literature review plus three recent experimental datasets [51, 41, 52]); only 46 proteins were reported in all four sources, suggesting high false-positive rates from reliance on single-peptide hits [49]. The experimental papers used multidimensional chromatography, 2-DE, and MS; MudPIT analysis of a tryptic digest; or MudPIT of a tryptic digest of low-Mr plasma fractions. Of the 990 of these proteins that have IPI (version 2.21) identifiers, 316 are found in our 3020-protein Core Dataset. When we relaxed the integration requirement (5102 IPI IDs), as was the case for [50], this figure rose only to 356 matches. Using the full 9504 dataset, the corresponding matches were 471 with integration and 539 without integration (15 710 protein IPI IDs).

Shen et al. [38] used high-efficiency nanoscale RP LC and strong cation exchange LC in conjunction with ion-trap MS/MS and then applied conservative SEQUEST peptide identification criteria (with or without considering chymotryptic or elastic peptides) and peptide LC normalized elution time constraints. Between 800 and 1682 human proteins were identified, depending on the criteria used for identification, from a total of 365 mg of human plasma. With their cooperation, we re-ran their raw spectra using HUPO PPP SEQUEST parameters (high confidence: Xcorr > 1.9/2.2/3.75 (for charges +1/+2/+3), deltaCn > 0.1, and Rsp ≤ 4; lower confidence: Xcorr > 1.5/2.0/2.5 (for charges +1/+2/+3), deltaCn > 0.1) and obtained 1842 IPI protein matches. Of these, 526 and 213 were found in the PPP 9504 and 3020 datasets, respectively.
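The two-tier SEQUEST criteria quoted above amount to a charge-dependent threshold filter. The following is a minimal sketch of that logic, not the PPP pipeline itself. The `PeptideHit` record and the `confidence` function are hypothetical names introduced for illustration. The sketch assumes that Rsp is a rank in which smaller values are better, so the cutoff is applied as an upper bound of 4, and that hits with charge states other than +1/+2/+3 are rejected because no cutoff is given for them.

```python
from dataclasses import dataclass

# Charge-dependent Xcorr cutoffs quoted in the text (charges +1/+2/+3).
HIGH_XCORR = {1: 1.9, 2: 2.2, 3: 3.75}
LOW_XCORR = {1: 1.5, 2: 2.0, 3: 2.5}
MIN_DELTA_CN = 0.1
MAX_RSP = 4  # assumption: Rsp is a rank, so smaller is better


@dataclass
class PeptideHit:
    """Hypothetical record for one SEQUEST peptide-spectrum match."""
    charge: int
    xcorr: float
    delta_cn: float
    rsp: int


def confidence(hit):
    """Return 'high', 'low', or None for a single peptide hit."""
    # Assumption: charge states without a quoted cutoff are rejected.
    if hit.charge not in HIGH_XCORR:
        return None
    # Both tiers require deltaCn above 0.1.
    if hit.delta_cn <= MIN_DELTA_CN:
        return None
    if hit.xcorr > HIGH_XCORR[hit.charge] and hit.rsp <= MAX_RSP:
        return "high"
    if hit.xcorr > LOW_XCORR[hit.charge]:
        return "low"
    return None
```

For example, a +2 hit with Xcorr 2.5, deltaCn 0.2, and Rsp 1 passes the high-confidence tier, while the same hit with Xcorr 2.1 passes only the lower-confidence tier.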

Chan et al. [53] resolved trypsin-digested serum proteins into 20 fractions by ampholyte-free liquid phase IEF. These 20 peptide fractions were submitted to strong cation-exchange chromatography, then microcapillary RP-LC-MS/MS. They identified 1444 unique proteins in serum. When we mapped these proteins against the IPI v2.21 database, there were 1019 distinct proteins. From this set, 402 and 257 proteins matched with the 9504 and 3020 datasets, respectively.
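The comparison described for each published dataset has two steps: map the reported proteins to IPI identifiers, then count how many of the distinct mapped identifiers appear in each PPP list. A minimal sketch of that bookkeeping follows. The function name `congruence`, the input layout, and the placeholder identifiers in the usage example are all hypothetical, and how one-to-many mappings were actually resolved is not stated here, so the sketch simply pools every mapped identifier.

```python
def congruence(source_to_ipi, reference_lists):
    """Count distinct mapped IPI IDs and their overlap with reference lists.

    source_to_ipi: dict mapping a published protein accession to a list of
        IPI identifiers (empty list if the protein could not be mapped).
    reference_lists: dict mapping a list name to a collection of IPI IDs.
    """
    # Pool all mapped identifiers; the set removes duplicates.
    mapped = set()
    for ipi_ids in source_to_ipi.values():
        mapped.update(ipi_ids)

    result = {"mapped": len(mapped)}
    for name, reference in reference_lists.items():
        # Overlap is the size of the set intersection.
        result[name] = len(mapped & set(reference))
    return result


# Usage with made-up placeholder identifiers (not real IPI accessions).
example = congruence(
    {"P1": ["IPI_A"], "P2": ["IPI_A", "IPI_B"], "P3": [], "P4": ["IPI_C"]},
    {"PPP_9504": {"IPI_A", "IPI_C", "IPI_X"}, "PPP_3020": {"IPI_A"}},
)
```

Unmapped proteins drop out at the first step, which is why the number of IPI proteins can be smaller than the total reported by a study.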

Zhou et al. [35] identified an aggregate of 210 low-Mr proteins or peptides after multiple immunoprecipitation steps with antibodies against albumin, IgA, IgG, IgM, transferrin, and apolipoprotein, followed by RP-LC-MS/MS. Only 107 proteins were mapped with IPI identifiers, of which 68 and 51 were found in the 9504 and 3020 PPP protein lists, respectively.

Finally, Rose et al. [54] reported fractionation in an industrial-scale approach, starting with 2.5 liters of plasma from healthy males, depleted of albumin and IgG, with the smaller proteins and polypeptides then separated into 12 960 fractions by chromatographic techniques. From thousands of peptide identifications, 502 different proteins and polypeptides were matched, 405 of which were included in the publication. Of the 287 that mapped to IPI identifiers, 159 and 142 are included in our 9504 and 3020 protein datasets, respectively.
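The matches reported for the five studies are easier to compare as fractions of each study's IPI-mapped proteins. The sketch below does only that arithmetic on the counts given in Tab. 4. The dictionary layout and the function name `overlap_percentages` are illustrative choices, not part of the original analysis.

```python
# Counts from Tab. 4: (# IPI proteins, matches in PPP_9504, matches in PPP_3020).
TAB4 = {
    "Anderson et al.": (990, 471, 316),
    "Shen et al.": (1842, 526, 213),
    "Chan et al.": (1019, 402, 257),
    "Zhou et al.": (107, 68, 51),
    "Rose et al.": (287, 159, 142),
}


def overlap_percentages(table):
    """Return, per study, the percentage of IPI-mapped proteins found in
    the 9504 and 3020 lists, rounded to one decimal place."""
    return {
        study: (
            round(100 * in_9504 / n_ipi, 1),
            round(100 * in_3020 / n_ipi, 1),
        )
        for study, (n_ipi, in_9504, in_3020) in table.items()
    }


percentages = overlap_percentages(TAB4)
for study, (pct_9504, pct_3020) in percentages.items():
    print(f"{study}: {pct_9504}% in PPP_9504, {pct_3020}% in PPP_3020")
```

On these counts, fewer than half of the IPI-mapped proteins from most of the studies appear even in the larger 9504 list.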

Thus, across studies, as well as across the PPP participating laboratories, incomplete sampling of proteins is a dominant feature. A substantial depth of analysis is achieved with depletion of highly abundant proteins, fractionation of intact proteins followed by digestion, and two or more MS/MS runs for each fraction. Standardized, statistically sound criteria for peptide identification and protein matching, together with estimation of error rates, are necessary features of comprehensive profiling studies.
