Phase with 35 collaborating laboratories and multiple analytical groups generating a core dataset of 3020 proteins and a publicly available database

Gilbert S. Omenn, David J. States, Marcin Adamski, Thomas W. Blackwell, Rajasree Menon, Henning Hermjakob, Rolf Apweiler, Brian B. Haab, Richard J. Simpson, James S. Eddes, Eugene A. Kapp, Robert L. Moritz, Daniel W. Chan, Alex J. Rai, Arie Admon, Ruedi Aebersold, Jimmy Eng, William S. Hancock, Stanley A. Hefta, Helmut Meyer, Young-Ki Paik, Jong-Shin Yoo, Peipei Ping, Joel Pounds, Joshua Adkins, Xiaohong Qian, Rong Wang, Valerie Wasinger, Chi Yue Wu, Xiaohang Zhao, Rong Zeng, Alexander Archakov, Akira Tsugita, Ilan Beer, Akhilesh Pandey, Michael Pisano, Philip Andrews, Harald Tammen, David W. Speicher and Samir M. Hanash

1.1 Introduction

1.2 PPP reference specimens

1.3 Bioinformatics and technology platforms

1.3.1 Constructing a PPP database for human plasma and serum proteins

1.3.2 Analysis of confidence of protein identifications

1.3.3 Quantitation of protein concentrations

1.4 Comparing the specimens

1.4.1 Choice of specimen and collection and handling variables

1.4.2 Depletion of abundant proteins followed by fractionation of intact proteins

1.4.3 Comparing technology platforms

1.4.4 Alternative search algorithms for peptide and protein identification

1.4.5 Independent analyses of raw spectra or peaklists

1.4.6 Comparisons with published reports

1.4.7 Direct MS (SELDI) analyses

1.4.8 Annotation of the HUPO PPP core dataset(s)

1.4.9 Identification of novel peptides using whole genome ORF search

1.4.10 Identification of microbial proteins in the circulation

1.5 Discussion

1.6 References

