Proteins

Of the macromolecules commonly found in living systems, proteins are the most versatile, with a wide range of biological functions; this versatility is reflected in their structural diversity.

The five elements found in most naturally occurring proteins are carbon, hydrogen, oxygen, nitrogen and sulphur. In addition, other elements may be essential components of certain specialised proteins such as haemoglobin (iron) and casein (phosphorus).

Figure 2.12 Amino acid structure. (a) The basic structure of an amino acid. (b) In solution, the amino and carboxyl groups become ionised, giving rise to a zwitterion (a molecule with spatially separated positive and negative charges). All the 20 amino acids commonly found in proteins are based on a common structure, differing only in the nature of their 'R' group (see Figure 2.13)

Proteins can be very large molecules, with molecular weights of tens or hundreds of thousands. Whatever their size, and in spite of the diversity referred to above, all proteins are made up of a collection of 'building bricks' called amino acids joined together. Amino acids are thought to have been among the first organic molecules formed in the early history of the Earth, and many different types exist in nature. All these, including the 20 commonly occurring in proteins, are based on a common structure, shown in Figure 2.12. It comprises a central carbon atom (known as the α-carbon) covalently bonded to an amino (NH2) group, a carboxyl (COOH) group and a hydrogen atom. It is the group attached to the final valency bond of the α-carbon which varies from one amino acid to another; this is known as the 'R'-group.

The 20 amino acids found in proteins can be conveniently divided into five groups, on the basis of the chemical nature of their 'R'-group. These range from a single hydrogen atom to a variety of quite complex side chains (Figure 2.13). It is unlikely nowadays that you would need to memorise the precise structure of all 20, as the author was asked to do in days gone by, but it would be advisable to familiarise yourself with the groupings and examples from each of them. The groups are distinguished by whether the 'R'-group is polar or non-polar and by whether or not it is ionisable. Box 2.4 shows how we normally refer to amino acids in shorthand.

Note that one amino acid, proline, falls outside the main groups. This differs from the others in that it has one of its N—H linkages replaced by an N—C, which forms part of a cyclic structure (Figure 2.13). This puts certain conformational constraints upon proteins containing proline residues.

As can be seen from Figure 2.13, the simplest amino acid is glycine, whose R-group is simply a hydrogen atom. This means that the glycine molecule is symmetrical, with hydrogen atoms on two of the valency bonds of the α-carbon. All the other amino acids, however, are asymmetrical. The α-carbon acts as what is known as a chiral centre, giving the molecule right or left 'handedness'. Thus two stereoisomers, known as the D- and L-forms, are possible for each of the amino acids except glycine. All the amino acids found in naturally occurring proteins have the L-form; the D-form also occurs in nature but only in certain specific, non-protein contexts.

Proteins, as we've seen, are polymers of amino acids. Amino acids are joined together by means of a peptide bond. This involves the -NH2 group of one amino acid and the -COOH group of another. The formation of a peptide bond is a form of condensation reaction in which water is lost (Figure 2.14). The resulting structure of two linked amino

(a) Simple, aliphatic: glycine, alanine, valine, leucine, isoleucine
(b) Polar, uncharged: serine, threonine, asparagine, glutamine
(c) Charged: lysine, arginine, histidine (positive); aspartate, glutamate (negative)
(d) Aromatic: phenylalanine, tyrosine, tryptophan
(e) Sulphur-containing: cysteine, methionine
(f) Cyclic: proline

Figure 2.13 The 20 amino acids found in proteins. The 'R' group of each amino acid is shown. These range from the simplest, glycine, to more complex representatives such as tryptophan

Box 2.4 Amino acid shorthand

It is sometimes necessary to express in print the sequence of amino acids which make up the primary structure of a particular protein; clearly it would be desperately tedious to express a sequence of hundreds of amino acids in the form 'glycine, phenylalanine, tryptophan, methionine ... etc', so a system of abbreviations for each amino acid has been agreed. Each amino acid can be reduced to a three-letter code, thus you might see something like:

 1   2   3   4   5   6   7   8   9   10  11
Gly Phe Trp Met His Lys Gly Ala His Val Glu ... and so on.

Note that each residue has a number; this numbering always begins at the N-terminus.

Each amino acid can also be represented by a single letter. The abbreviations using the two systems are shown below.

A   Ala   Alanine                      M   Met   Methionine
B   Asx   Asparagine/aspartic acid     N   Asn   Asparagine
C   Cys   Cysteine                     P   Pro   Proline
D   Asp   Aspartic acid                Q   Gln   Glutamine
E   Glu   Glutamic acid                R   Arg   Arginine
F   Phe   Phenylalanine                S   Ser   Serine
G   Gly   Glycine                      T   Thr   Threonine
H   His   Histidine                    V   Val   Valine
I   Ile   Isoleucine                   W   Trp   Tryptophan
K   Lys   Lysine                       Y   Tyr   Tyrosine
L   Leu   Leucine                      Z   Glx   Glutamine/glutamic acid
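
The two abbreviation systems in Box 2.4 map onto each other one to one, so converting between them is a simple lookup. The following is a minimal sketch in Python; the function names are made up for illustration, and the example sequence is the one from the box, written with Trp for tryptophan as in the table.

```python
# Three-letter to one-letter codes, taken from the table in Box 2.4
# (the ambiguity codes Asx/B and Glx/Z are included as listed there).
THREE_TO_ONE = {
    "Ala": "A", "Asx": "B", "Cys": "C", "Asp": "D", "Glu": "E",
    "Phe": "F", "Gly": "G", "His": "H", "Ile": "I", "Lys": "K",
    "Leu": "L", "Met": "M", "Asn": "N", "Pro": "P", "Gln": "Q",
    "Arg": "R", "Ser": "S", "Thr": "T", "Val": "V", "Trp": "W",
    "Tyr": "Y", "Glx": "Z",
}


def to_one_letter(three_letter_sequence):
    """Convert a space-separated three-letter sequence to one-letter codes."""
    return "".join(THREE_TO_ONE[code] for code in three_letter_sequence.split())


def numbered_residues(three_letter_sequence):
    """Pair each residue with its position, counting from 1 at the N-terminus."""
    return list(enumerate(three_letter_sequence.split(), start=1))


example = "Gly Phe Trp Met His Lys Gly Ala His Val Glu"
print(to_one_letter(example))          # GFWMHKGAHVE
print(numbered_residues(example)[:3])  # [(1, 'Gly'), (2, 'Phe'), (3, 'Trp')]
```

A dictionary is used because every code is unique; an unknown three-letter code raises a KeyError rather than being silently skipped.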

[Figure 2.14 diagram labels: amino acid 1 (carboxyl group), amino acid 2 (amino group), peptide bond, a dipeptide]

Figure 2.14 The carboxyl group of one amino acid is joined to the amino group of another. This is another example of a condensation reaction (c.f. Figure 2.11). No matter how many amino acids are added, the resulting structure always has a free carboxyl group at one end and a free amino group at the other

acids is called a dipeptide; note that this structure still retains an -NH2 at one end and a -COOH at the other. If we were to add on another amino acid to form a tripeptide, this would still be so, and if we kept on adding them until we had a polypeptide, we would still have the same two groupings at the extremities of the molecule. These are referred to as the N-terminus and the C-terminus of the polypeptide. Since a water molecule has been removed at the formation of each peptide bond, we refer to the chain so formed as being composed of amino acid residues, rather than amino acids. The actual distinction between a protein and a polypeptide based on the number of amino acid residues is not clear-cut; generally, with over 100, we refer to proteins, but some naturally occurring proteins are a lot smaller than this.
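
The bookkeeping behind the term 'residue' can be sketched numerically: a chain of n residues contains n - 1 peptide bonds, and one water molecule is lost for each. The Python sketch below is illustrative only; the molar masses are approximate values that do not come from the text, and the function names are hypothetical.

```python
WATER_MASS = 18.015  # approximate molar mass of water in g/mol (assumed value)

# Approximate molar masses (g/mol) of three free amino acids.
# These are illustrative values, not taken from the chapter.
FREE_AMINO_ACID_MASS = {"Gly": 75.07, "Ala": 89.09, "Ser": 105.09}


def peptide_bonds(n_residues):
    """A linear chain of n residues is held together by n - 1 peptide bonds."""
    return max(n_residues - 1, 0)


def peptide_mass(residues):
    """Sum the free amino acid masses, then subtract one water per peptide bond."""
    total = sum(FREE_AMINO_ACID_MASS[name] for name in residues)
    return total - peptide_bonds(len(residues)) * WATER_MASS


print(peptide_bonds(2))               # a dipeptide has 1 peptide bond
print(peptide_bonds(3))               # a tripeptide has 2
print(peptide_mass(["Gly", "Ala"]))   # roughly 146 g/mol for the dipeptide
```

However long the chain becomes, only the internal amino and carboxyl groups are consumed, which is why one free -NH2 and one free -COOH always remain at the ends.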

So far, we can think of proteins as long chains of many amino acid residues, rather like a string of beads. This is called the primary structure of the protein; it is determined by the relative proportions of each of the 20 amino acids, and the order in which they are joined together. It is the basis of all the remaining levels of structural complexity, and it ultimately determines the properties of a particular protein. It is also what makes one protein different from another. Since the 20 types of amino acid can be linked together in any order, the number of possible sequences is astronomical, and it is this great variety of structural possibilities that gives proteins such diverse structures and functions.
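
To get a feel for how astronomical the number of possible sequences is, it can be computed directly: with 20 choices at each position, a chain of n residues can be arranged in 20 to the power n ways. A minimal Python sketch, with a hypothetical function name:

```python
import math


def possible_sequences(length, alphabet_size=20):
    """Number of distinct linear sequences of the given length."""
    return alphabet_size ** length


n = possible_sequences(100)
print(len(str(n)))    # 131 digits
print(math.log10(n))  # just over 130, so n is a little more than 10**130
```

Python integers have arbitrary precision, so the exact value can be held without overflow even though it is far too large for ordinary floating-point counting.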

Some parts of the primary sequence are more important than others. If we took a protein of, say, 200 amino acid residues in length, took it apart and reassembled the amino acids in a different order, we would almost certainly alter (and probably lose completely) the properties of that protein. If we look at the primary sequence of a protein molecule which serves essentially the same function in several species, we find that nature has allowed slight alterations to occur during evolution, but these are often conservative substitutions, where an amino acid has been replaced by a similar one (one from the same group in Figure 2.13), and thus have little effect on the protein's properties. In certain parts of the primary sequence, such substitutions are less well tolerated, for example the few residues that make up the active site of an enzyme (see Chapter 6). In cases such as the one above, alterations have not been allowed at these points in the primary sequence, and the sequence is the same, or almost so, in all species possessing that protein. The sequence in question is said to have been conserved.
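
The idea of a conservative substitution can be expressed as a simple check: two residues count as similar if they belong to the same group in Figure 2.13. The sketch below assumes that grouping, with the charged group split into positive and negative members as the figure indicates; the function names and the two short sequences are invented purely for illustration.

```python
# Groups as labelled in Figure 2.13; splitting the charged group by sign
# is an assumption made for this sketch.
GROUPS = {
    "simple aliphatic": {"Gly", "Ala", "Val", "Leu", "Ile"},
    "polar uncharged": {"Ser", "Thr", "Asn", "Gln"},
    "positively charged": {"Lys", "Arg", "His"},
    "negatively charged": {"Asp", "Glu"},
    "aromatic": {"Phe", "Tyr", "Trp"},
    "sulphur-containing": {"Cys", "Met"},
    "cyclic": {"Pro"},
}


def group_of(residue):
    """Return the name of the group a residue belongs to."""
    for name, members in GROUPS.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue}")


def classify_substitutions(seq_a, seq_b):
    """Compare two equal-length residue lists position by position."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    changes = []
    for position, (a, b) in enumerate(zip(seq_a, seq_b), start=1):
        if a != b:
            same_group = group_of(a) == group_of(b)
            kind = "conservative" if same_group else "non-conservative"
            changes.append((position, a, b, kind))
    return changes


# Two made-up sequences standing in for the same protein in two species.
species_1 = ["Gly", "Val", "Lys", "Ser", "Phe"]
species_2 = ["Gly", "Leu", "Lys", "Asp", "Phe"]
print(classify_substitutions(species_1, species_2))
```

Here the Val to Leu change at position 2 stays within the aliphatic group and is reported as conservative, whereas Ser to Asp at position 4 crosses groups. Real comparisons use more refined similarity measures; this only mirrors the same-group rule given in the text.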

Higher levels of protein structure

The structure of proteins is a good deal more complicated than just a linear chain of amino acids. A long thin chain is unlikely to be very stable; proteins therefore undergo a process of folding which makes the molecule more stable and compact. The results of this folding are the secondary and tertiary structures of a protein.

The secondary structure is due to hydrogen bonding between a carbonyl (-CO) group and an amido (-NH) group of amino acid residues on the peptide backbone (Figure 2.15). The 'R' group plays no part in secondary protein structure. Two regular patterns of folding result from this: the α-helix and the β-pleated sheet.

In theory, there are 20¹⁰⁰ or some 10¹³⁰ different ways in which 20 different amino acids could combine to give a protein 100 amino acid residues in length!


Figure 2.15 Secondary structure in proteins. Hydrogen bonding occurs between the -CO and -NH groups of amino acids on the backbone of a polypeptide chain. The two amino acids may be on the same or different chains

The α-helix occurs when hydrogen bonding takes place between amino acids close together in the primary structure. A stable helix is formed by the -NH group of an amino acid bonding to the -CO group of the amino acid four residues further along the chain (Figure 2.16a). This causes the chain to twist into the characteristic helical shape. One turn of the helix occurs every 3.6 amino acid residues, and results in a rise of 5.4 Å (0.54 nm); this is called the pitch height of the helix. The ability to form a helix like this is dependent on the component amino acids; if there are too many with large R-groups, or R-groups carrying the same charge, a stable helix will not be formed. Because of its rigid structure, proline (Figure 2.13) cannot be accommodated in an α-helix. Naturally occurring α-helices are always right-handed, that is, the chain of amino acids coils round the central axis in a clockwise direction. This is a much more stable configuration than a left-handed helix, due to the fact that there is less steric hindrance (overlapping of electron clouds) between the R-groups and the C=O group on the peptide backbone. Note that if proteins were made up of the D-form of amino acids, we would have the reverse situation, with a left-handed form favoured.

In the β-pleated sheet, the hydrogen bonding occurs between amino acids either on separate polypeptide chains or on residues far apart in the primary structure (Figure 2.16b). The chains in a β-pleated sheet are fully extended, with 3.5 Å (0.35 nm) between adjacent amino acid residues (cf. α-helix, 1.5 Å). When two or more of these chains lie next to each other, extensive hydrogen bonding occurs between the chains. Adjacent strands in a β-pleated sheet can either run in the same direction (e.g. N→C), giving rise to a parallel β-pleated sheet, or in opposite directions (antiparallel β-pleated sheet, as shown in Figure 2.16b).
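
The figures quoted above fit together arithmetically: a pitch of 5.4 Å spread over 3.6 residues per turn gives a rise of 1.5 Å per residue in the helix, compared with 3.5 Å per residue in an extended strand. The following Python sketch works through that comparison; the function names and the choice of a 36-residue stretch are assumptions made for illustration.

```python
RESIDUES_PER_TURN = 3.6   # residues per turn of the helix (from the text)
PITCH_ANGSTROM = 5.4      # rise per full turn, in ångströms (from the text)
HELIX_RISE = PITCH_ANGSTROM / RESIDUES_PER_TURN  # about 1.5 Å per residue
STRAND_RISE = 3.5         # Å per residue in a fully extended strand (from the text)


def helix_turns(n_residues):
    """Number of helical turns spanned by n residues."""
    return n_residues / RESIDUES_PER_TURN


def helix_length(n_residues):
    """Approximate length along the helix axis, in ångströms."""
    return n_residues * HELIX_RISE


def strand_length(n_residues):
    """Approximate length of a fully extended strand, in ångströms."""
    return n_residues * STRAND_RISE


n = 36  # hypothetical stretch of residues
print(round(helix_turns(n), 1))    # about 10 turns
print(round(helix_length(n), 1))   # about 54 Å as a helix
print(round(strand_length(n), 1))  # about 126 Å as an extended strand
```

The same stretch of chain is therefore more than twice as long when extended as when coiled, which is the sense in which the strands of a sheet are described as fully extended.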

A common structural element in the secondary structure of proteins is the β-turn. This occurs when a chain doubles back on itself, such as in an antiparallel β-pleated sheet. The -CO group of one amino acid is hydrogen bonded to the -NH group of the

Very small distances within molecules are measured in Ångström units (Å). One Ångström unit is equal to one ten-billionth (10⁻¹⁰) of a metre.

Figure 2.16 Secondary structure in proteins: the α-helix and β-pleated sheet. (a) Hydrogen bonding between amino acids four residues apart in the primary sequence results in the formation of an α-helix. (b) In the β-pleated sheet hydrogen bonding joins adjacent chains. Note how each chain is more fully extended than in the α-helix. In the example shown, the chains run in the same direction (parallel)


residue three further along the chain. Frequently, it is called a hairpin turn, for obvious reasons (Figure 2.17). Numerous changes in direction of the polypeptide chains result in a compact, globular shape to the molecule.

Typically about 50 per cent of a protein's secondary structure will have an irregular form. Although this is often referred to as random coiling, it is only random in the sense that there is no regular pattern; it still contributes towards the stability of the molecule. The proportions and combinations in which α-helix, β-pleated sheet and random coiling occur vary from one protein to another. Keratin, a structural protein found in skin, horn and feathers, is an example of a protein entirely made up of α-helix, whilst the lectin (sugar-binding protein) concanavalin A is mostly made up of β-pleated sheets.

The tertiary structure of a protein is due to interactions between side chains, that is, R-groups of amino acid residues, resulting in the folding of the molecule to produce a thermodynamically more favourable structure. The structure is formed by a variety of weak, non-covalent forces; these include hydrogen bonding, ionic bonds, hydrophobic interactions, and van der Waals forces. The strength of these forces diminishes with distance, therefore the formation of a compact structure is encouraged. In addition, the -SH groups on separate cysteine residues can form a covalent -S-S- linkage. This is known as a disulphide bridge and may have the effect of bringing together two cysteine residues that were far apart in the primary sequence (Figure 2.18).

Figure 2.17 The β-turn. The compact folding of many globular proteins is achieved by the polypeptide chain reversing its direction in one or more places. A common way of doing this is with the β-turn. Hydrogen bonding between amino acid residues on the same polypeptide stabilizes the structure

In globular proteins, the R-groups are distributed according to their polarities; nonpolar residues such as valine and leucine nearly always occur on the inside, away from the aqueous phase, while charged, polar residues including glutamic acid and histidine generally occur at the surface, in contact with the water.

A protein can be denatured by heating or treatment with certain chemicals; this causes the tertiary structure to break down and the molecule to unfold, resulting in a loss of the protein's biological properties. In many cases, cooling, or removal of the chemical agents, leads to a restoration of both the tertiary structure and biological activity, showing that both are dependent on the primary sequence of amino acids.

Even the tertiary structure is not always the last level of organisation of a protein, because some are made up of two or more polypeptide chains, each with its own secondary and tertiary structure, combined together to give the quaternary structure (Figure 2.19). These chains may be identical or different, depending on the protein. As with the tertiary structure, non-covalent forces between R-groups are responsible, the difference being that this time they link amino acid residues on separate chains rather than on the same one.

Such proteins lose their functional properties if dissociated into their constituent units; the quaternary joining is essential for their activity. Phosphorylase A, an enzyme involved in carbohydrate metabolism, is an example of a protein with a quaternary structure. It has four subunits, which have no catalytic activity unless joined together as a tetramer.

Complex molecules such as globular proteins become denatured when their three-dimensional structure is disrupted, leading to a loss of biological function.


Figure 2.18 Disulphide bond formation. (a) Disulphide bonds formed by the oxidation of cysteine residues result in cross-linking of a polypeptide chain. (b) This can have the effect of bringing together residues that lie far apart in the primary amino acid sequence. Disulphide bonds are often found in proteins that are exported from the cell, but rarely in intracellular proteins


Although all proteins are polymers of amino acids existing in various levels of structural complexity as we have seen above, some have additional, non-amino acid components. They may be organic, such as sugars (glycoproteins) or lipids (lipoproteins), or inorganic, including metals (metalloproteins) or phosphate groups (phosphoproteins). These components, which form an integral part of the protein's structure, are called prosthetic groups.

A prosthetic group is a non-polypeptide component of a protein, such as a metal ion or a carbohydrate

Figure 2.19 Polypeptide chains may join to form quaternary structure. The example shown comprises two identical polypeptide subunits. Coils indicate α-helical sequences, arrows are β-pleated sheets. From Bolsover, SR, Hyams, JS, Jones, S, Shepherd, EA & White, HA: From Genes to Cells, John Wiley & Sons, 1997. Reproduced by permission of the publishers
