First Excursion: Protein Surrogate Biomarker Signatures

Innovative alternative methods, such as diagnostic human ESC (hESC) models, are embedded in a context of subsequent analytical procedures. Here, we focus, as an example, on current developments concerning developmental toxicity. The precise functional read-outs of hESC- and murine ESC (mESC)-based tests will be explained in greater detail below. At the molecular level, a quantitative differential analysis of protein expression is a mandatory requirement. This is by no means a trivial task, and this section is meant to provide a basic understanding of the underlying problems and of adequate methods and technologies for addressing them. Related efforts are directed by international and national regulatory agencies such as ECVAM (European Centre for the Validation of Alternative Methods) and the BfR (Federal Institute for Risk Assessment, Germany), sponsored by the European Commission with the participation of major pharmaceutical companies and numerous academic partners [24]. ESC models play a crucial role in these projects [5].

Pharmaco-, toxico-, and diagnostic genomics use surrogate biomarkers on the level of nucleic acids. Such methods have been vigorously pursued and developed, with huge investments, over the past decade or more, for all types of purposes and disease areas, with certain achievements on the diagnostic level but otherwise limited directly applicable success [22, 23, 25]. However, the knowledge - and in particular the methodology - developed during the course of the Human Genome Project and subsequent whole-genome sequencing projects is a prerequisite for the field of proteomics. It is the availability of annotated genomic information for important organisms in well-organized and accessible databases, together with the statistical and bioinformatics tools developed for array-based nucleic acid quantification and cluster analysis, which now helps us to understand the unfolding wealth of information emerging essentially from mass spectrometry-based protein identification.

The reason for the relative disappointment of genomics is that information is incredibly condensed at the level of genes, and unfolds into an enormous complexity of protein expression through post-transcriptional and post-translational modifications.

Fig. 8.1 Numerical relationship of molecular species at the levels of DNA, RNA, and proteins: the enormous complexity at the level of proteins provides the basis for the equally enormous flexibility of biological systems. Single genes often lead to hundreds, or even thousands, of functionally modified protein molecules. Thus, the main task of a comprehensive analysis of proteins ("proteomics") is the establishment of a reliable methodology for complexity reduction. Dynamic cellular processes with compensation and crosstalk consequently require the differential and quantitative identification of correlated protein signatures, rather than of single all-or-none targets.

For humans, these mechanisms help to translate roughly 20 000 genes [23, 27] into millions of molecular species at the level of proteins. This functional inflation of complexity is largely due to chemical modifications of the amino acid residues of proteins, for example by phosphorylation, glycosylation, methylation, acetylation, oxidation, or proteolytic processing. These mechanisms generate very distinct, organ- and process-specific protein embodiments from the very same reading frames provided by the uniform genomic information in each nucleus. One of the major difficulties of related proteomic research, next to the complexity of sheer numbers, is the dynamic range of possible concentrations (8 to 15 orders of magnitude); another is the time frame of molecular changes (ranging from fractions of a second to years) [22, 23]. In addition, proteins display a maximum of chemical diversity (acidic, alkaline, hydrophobic, hydrophilic, etc.), and no amplifying method (such as PCR for nucleic acids) exists or is even conceivable. The relationships of numbers of molecular species at the levels of DNA, RNA, and proteins are illustrated in Figure 8.1.
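To make the scale of this functional inflation concrete, the following back-of-the-envelope sketch (in Python, with entirely illustrative numbers that are not taken from the text) shows how quickly combinatorial post-translational modification and splicing multiply the molecular species derived from a single gene.

```python
# Illustrative back-of-the-envelope arithmetic (not from the source text):
# if each modifiable residue can independently carry or lack a modification,
# n sites already allow up to 2**n distinct proteoforms, and splice variants
# or proteolytic products multiply that number further.

def proteoform_upper_bound(n_sites: int, states_per_site: int = 2,
                           n_splice_variants: int = 1) -> int:
    """Crude upper bound on proteoforms arising from one gene.

    n_sites           -- number of independently modifiable residues
    states_per_site   -- e.g. 2 for modified/unmodified, more for mixed PTM types
    n_splice_variants -- transcript variants feeding into the count
    """
    return n_splice_variants * states_per_site ** n_sites

if __name__ == "__main__":
    # A hypothetical protein with 3 splice variants and 10 phosphorylation
    # sites already yields thousands of potential molecular species.
    print(proteoform_upper_bound(n_sites=10, n_splice_variants=3))  # 3072
```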


Consequently, the robust and reliable quantification and differential display of protein expression in complex samples is at the core of biomarker definition by proteomics technologies. Unfortunately, at all levels the related methods are laborious and cumbersome. Protein quantification requires the labeling of protein mixtures with radioactive or stable isotopes or with fluorescent dyes. Separation techniques are defined by the complexity of the samples; highly complex mixtures require two-dimensional polyacrylamide gel electrophoresis (2D-PAGE), multidimensional liquid chromatography (LC), or capillary electrophoresis (CE). Complexity reduction of samples by appropriate fractionation (e.g., albumin depletion, enrichment of phosphoproteins) must always be considered before embarking upon a search for protein biomarkers. However, regardless of the fractionation strategy, the best way to reduce complexity is a reliable differential display, which depends directly on the linear dynamic range of concentrations expected in a given sample. As shown in Figure 8.2, for rather complex protein mixtures from neural derivatives of mESCs, compared here to derive a molecular signature of ischemic events in neurons, high-resolution 2D-PAGE and radioactive labeling are the best choice [28]. For less complex samples, for example after extensive fractionation or from bacterial samples (with only few post-translational modifications!), LC- or CE-based methods and stable isotope labeling represent alternatives [29, 30].
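As an illustration of what such a differential display amounts to computationally, the minimal Python sketch below compares hypothetical, normalized intensities of the same proteins in two conditions and flags those exceeding a fold-change cutoff. The protein names, intensity values, and threshold are assumptions for illustration only, not part of the workflow described above.

```python
import math

# Minimal sketch of differential display on paired spot/peptide intensities
# (e.g. from labeled 2D-PAGE images or LC-MS runs). All names, values, and
# thresholds are illustrative assumptions.

def differential_candidates(control: dict, treated: dict,
                            min_fold_change: float = 2.0) -> dict:
    """Return proteins whose abundance ratio exceeds the fold-change cutoff.

    control, treated -- protein id -> normalized intensity
    """
    hits = {}
    for protein, c_int in control.items():
        t_int = treated.get(protein)
        if t_int is None or c_int <= 0:
            continue  # detected in only one condition, or unusable intensity
        log2_ratio = math.log2(t_int / c_int)
        if abs(log2_ratio) >= math.log2(min_fold_change):
            hits[protein] = log2_ratio
    return hits

if __name__ == "__main__":
    control = {"HSP70": 1200.0, "GFAP": 300.0, "ACTB": 5000.0}
    treated = {"HSP70": 5100.0, "GFAP": 310.0, "ACTB": 4900.0}
    print(differential_candidates(control, treated))  # {'HSP70': ~2.09}
```

In practice, the usable fold-change cutoff is dictated by the linear dynamic range of the labeling and detection method chosen, which is exactly why the choice of separation and labeling strategy discussed above matters so much.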

A number of recent reviews have described the state of the art in proteomics technologies [22, 23, 25]. Analytical strategies for the various levels of protein biomarker discovery must be considered carefully at every stage; they should be directly correlated to functional parameters and embedded in an iterative bioinformatics process that relates incoming results from mass spectrometry directly back to the functional biological models used to generate the respective samples. Against this background, in-vitro models have real advantages, because the approach enables an integrated (at least partial) functional validation of the molecular data. The application of genomics tools, such as cluster analysis, for the reliable quantitation of protein expression data (from still rather low-throughput proteomic technologies) justifies the expectation that the novel content for future high-throughput devices can be generated here.
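The short sketch below, built on an invented expression matrix, illustrates how such a genomics-style cluster analysis can be borrowed for protein expression profiles using standard Python tooling (numpy/scipy); it is a minimal illustration under these assumptions, not the pipeline used in the cited projects.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Minimal sketch of a genomics-style cluster analysis applied to protein
# expression data: rows are proteins, columns are conditions or time points.
# The matrix below is invented purely for illustration.
expression = np.array([
    [1.0, 2.1, 4.0, 7.9],   # protein A: up-regulated over time
    [0.9, 2.0, 4.2, 8.1],   # protein B: co-regulated with A
    [5.0, 4.8, 5.1, 4.9],   # protein C: essentially flat
])

# Correlation distance groups proteins with similar expression trajectories,
# which is how candidates for co-regulated signatures can be pre-selected.
Z = linkage(expression, method="average", metric="correlation")
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)  # proteins A and B fall into one cluster, C into another
```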

Given the enormously complex and dynamic nature of protein modifications, and the redundant and pleiotropic organization of almost all major signal transduction pathways, we envisage the emerging importance of biomarker signatures consisting of functionally related sets of post-translational protein isoforms, rather than of single targets or surrogate biomarkers. Thus, one thing is certain: future devices will have to provide information beyond the amino acid backbone of proteins, for instance on post-translational isoforms that are functionally defined and precisely characterized on the molecular level.
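One way to picture such a signature in data terms is sketched below: a signature is modeled as a set of precisely defined post-translational isoforms that must be observed jointly, rather than as a single marker. All class names, proteins, and modification sites are hypothetical and serve only to illustrate the concept.

```python
from dataclasses import dataclass

# Hypothetical data model for a signature built from post-translational
# isoforms rather than single targets; names and fields are assumptions.

@dataclass(frozen=True)
class ProteinIsoform:
    protein: str        # gene/protein symbol
    modification: str   # e.g. "phospho", "acetyl"
    site: str           # modified residue, e.g. "S473"

@dataclass
class BiomarkerSignature:
    name: str
    isoforms: frozenset

    def matches(self, observed: set) -> bool:
        """A signature calls only when the whole correlated set is present."""
        return self.isoforms <= observed

if __name__ == "__main__":
    sig = BiomarkerSignature(
        name="hypothetical ischemia signature",
        isoforms=frozenset({ProteinIsoform("MAPK1", "phospho", "Y187"),
                            ProteinIsoform("HSPB1", "phospho", "S82")}),
    )
    observed = {ProteinIsoform("MAPK1", "phospho", "Y187"),
                ProteinIsoform("HSPB1", "phospho", "S82"),
                ProteinIsoform("ACTB", "acetyl", "K61")}
    print(sig.matches(observed))  # True: the whole correlated set is present
```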
