The sequence of a coding strand of DNA, read in the direction from 5′ to 3′, consists of nucleotide triplets (codons) corresponding to the amino acid sequence of a polypeptide read from N-terminus to C-terminus. Sequencing of DNA and proteins makes it possible to compare corresponding nucleotide and amino acid sequences directly. There are 64 codons; each of four possible nucleotides can occupy each of the three positions of the codon, making 4³ = 64 possible trinucleotide sequences. In the (nearly) universal genetic code, used in the translation of prokaryotic genes and of nuclear genes of eukaryotes, each of these codons has a specific meaning in translation: 61 codons represent amino acids and 3 codons cause the termination of translation.
The breaking of the genetic code originally showed that genetic information is stored in the form of nucleotide triplets, but it did not reveal which amino acid is specified by each triplet codon. Before the advent of DNA sequencing, codon assignments were deduced on the basis of two types of in vitro studies. A system involving the translation of synthetic polynucleotides was introduced in 1961, when Nirenberg showed that polyuridylic acid (poly[U]) directs the assembly of phenylalanine into polyphenylalanine. This result means that UUU must be a codon for phenylalanine. In a later, second system, a trinucleotide was used to mimic a codon, thus causing the corresponding aminoacyl-tRNA to bind to a ribosome. By identifying the amino acid component of the aminoacyl-tRNA, the meaning of the codon could be found. The two techniques together assigned meaning to all of the codons that represent amino acids.
The assignment of amino acids to codons is not random but shows relationships in which the third (3′) base has less effect on codon meaning. In addition, chemically similar amino acids are often represented by related codons. The meaning of a codon that encodes an amino acid is determined by the tRNA that corresponds to it; the meaning of the termination codons is determined directly by protein factors (see the Translation chapter).
The code is summarized in FIGURE 23.1. Because there are more codons than amino acids, almost all amino acids are represented by more than one codon. The only exceptions are methionine and tryptophan. Codons that encode the same amino acid are said to be synonymous. A polypeptide is actually translated from the mRNA, so the genetic code is usually described in terms of the four bases present in RNA: U, C, A, and G.
FIGURE 23.1 All the triplet codons have meaning: 61 represent amino acids and 3 cause termination (stop codons).
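The counts quoted above can be checked with a short sketch. It builds the standard code from a compact one-letter string in U, C, A, G order; the names `BASES`, `AMINO_ACIDS`, and `CODON_TABLE` are illustrative choices, not notation from this chapter, and `*` is used here to mark a stop codon.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for the 64 codons in UCAG order
# (UUU, UUC, UUA, UUG, UCU, ...); '*' marks a termination codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# product() varies the last position fastest, matching the string above.
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

sense = {codon: aa for codon, aa in CODON_TABLE.items() if aa != "*"}
stops = sorted(codon for codon, aa in CODON_TABLE.items() if aa == "*")

# Number of synonymous codons per amino acid.
codons_per_aa = Counter(sense.values())
single_codon_aas = sorted(aa for aa, n in codons_per_aa.items() if n == 1)

print(len(CODON_TABLE), len(sense), stops, single_codon_aas)
```

Under these assumptions the table holds 64 codons, 61 of them sense codons, and the only amino acids with a single codon are M (methionine) and W (tryptophan).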
Codons representing the same or chemically similar amino acids tend to be similar in sequence. Often the base in the third position of a codon (its 3′ end) is not significant because the four codons differing only in the third base represent the same amino acid. Sometimes a distinction is made only between a purine versus a pyrimidine in this position. The reduced specificity at the last position is known as third-base degeneracy.
To be interpreted, a codon in mRNA must first base pair with the anticodon of the corresponding aminoacyl-tRNA. This pairing occurs at the ribosome, where the interaction between complementary trinucleotides is stabilized by highly conserved 16S rRNA nucleotides in the A site. Strict monitoring of the overall base-pair shape by rRNA permits only conventional A-U and G-C pairing to occur at the first two positions of the codon, but additional pairings are permitted at the third codon base, where rRNA contacts can follow different rules. As a result, a single aminoacyl-tRNA may recognize more than one codon, by means of the additional, noncanonical pairs permitted at the third position. Furthermore, pairing interactions may also be influenced by the posttranscriptional modification of tRNA, especially within or directly adjacent to the anticodon.
The tendency for identical or chemically similar amino acids to be represented by related codons minimizes the effects of mutations. It increases the probability that a single random base change will result in no amino acid substitution or in one involving amino acids of similar character. For example, a mutation of CUC to CUG does not change the resulting polypeptide because both codons represent leucine. Mutation of CUU to AUU results in replacement of leucine with isoleucine; both of these amino acids are hydrophobic and are likely to play similar roles in the encoded protein.
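The two mutations mentioned above can be reproduced in a small sketch that also estimates how often a single-base change at each codon position leaves the meaning unchanged. The helper names `classify` and `synonymous_fraction` are hypothetical, and the calculation treats all base changes as equally likely, which is a simplifying assumption rather than a claim about real mutation spectra.

```python
from itertools import product

BASES = "UCAG"
# Standard code in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(t): aa for t, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify(before, after):
    """Describe the effect of replacing one codon with another."""
    a, b = CODON_TABLE[before], CODON_TABLE[after]
    if a == b:
        return "synonymous"
    if "*" in (a, b):
        return "stop gained or lost"
    return f"{a}->{b} substitution"


def synonymous_fraction(position):
    """Fraction of single-base changes at one codon position (0, 1, or 2)
    that leave the amino acid unchanged, taken over all sense codons."""
    same = total = 0
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue
        for base in BASES:
            if base == codon[position]:
                continue
            mutant = codon[:position] + base + codon[position + 1:]
            total += 1
            same += CODON_TABLE[mutant] == aa
    return same / total


print(classify("CUC", "CUG"))   # both codons are leucine
print(classify("CUU", "AUU"))   # leucine to isoleucine
print([round(synonymous_fraction(p), 2) for p in range(3)])
```

In this simplified model the third position tolerates far more silent changes than the first, and no second-position change among sense codons is silent.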
FIGURE 23.2 plots the number of codons representing each amino acid against the frequency with which the amino acid is used in proteins (in Escherichia coli). In general, amino acids that are more common are represented by more codons. This suggests that there has been some optimization of the genetic code with regard to the utilization of amino acids.
FIGURE 23.2 Some correlation of the frequency of amino acid use in proteins with the number of codons specifying the amino acid is observed. An exception is found for amino acids specified by two codons, which occur with a wide variety of frequencies.
The three codons (UAA, UAG, and UGA) that do not encode amino acids are used specifically to terminate translation. One of these stop codons marks the end of every open reading frame.
Comparisons of DNA sequences with the corresponding polypeptide sequences reveal that an identical set of codon assignments is used in bacteria and in eukaryotes (except for some variations in mitochondria). As a result, mRNA from one species usually can be translated correctly in vitro or in vivo by the translation apparatus of another species. Thus, the codons used in the mRNA of one species have the same meaning for the ribosomes and tRNAs of other species.
The universality (with minor exceptions) of the genetic code suggests that it was established very early in evolution. Perhaps the code started in a primitive form in which a small number of codons were used to represent comparatively few amino acids, possibly even with one codon corresponding to any member of a group of amino acids. More precise codon meanings and additional amino acids could have been introduced later. One possibility is that at first only two of the three bases in each codon were used; discrimination at the third position could have evolved later.
Evolution of the code could have become “frozen” at a point at which the system had become so complex that any changes in codon meaning would disrupt functional proteins by substituting unacceptable amino acids. Its near universality implies that this must have happened at such an early stage that all living organisms are descended from a Last Universal Common Ancestor (LUCA) that used the current near-universal genetic code.
Exceptions to the universal genetic code are rare. Changes in meaning in the principal genome of a species usually concern the termination codons. For example, in a Mycoplasma, UGA encodes tryptophan; in certain species of the ciliates Tetrahymena and Paramecium, UAA and UAG encode glutamine. Systematic alterations of the code have occurred only in mitochondrial DNA (see the section later in this chapter titled The Universal Code Has Experienced Sporadic Alterations).
The function of tRNA in translation is fulfilled when it recognizes the codon in the ribosomal A site. The interaction between anticodon and codon takes place by base pairing, but under rules that extend pairing beyond the usual G-C and A-U partnerships.
The genetic code itself yields some important clues about the process of codon recognition. The pattern of third-base degeneracy is clear in FIGURE 23.3, which shows that in almost all cases either the third base is irrelevant or a distinction is made only between purines and pyrimidines.
FIGURE 23.3 Third bases have the least influence on codon meanings. Boxes indicate groups of codons within which third-base degeneracy ensures that the meaning is the same.
There are eight codon families in which all four codons sharing the same first two bases have the same meaning, so that the third base has no role at all in specifying the amino acid. There are seven codon pairs in which the meaning is the same regardless of which pyrimidine is present at the third position, and there are five codon pairs in which either purine may be present without changing the amino acid that is encoded.
In only three cases is a unique meaning conferred by the presence of a particular base at the third position: AUG (for methionine), UGG (for tryptophan), and UGA (termination). This means that C and U never have a unique meaning in the third position, and A never signifies a unique amino acid.
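The counts of eight families, seven pyrimidine pairs, and five purine pairs can be recovered from the standard code with the sketch below. One assumption is needed to match those numbers: a pair is counted only when its amino acid is not shared with the other two codons of the same box, so the three-codon isoleucine group (AUU, AUC, AUA) is not counted as a pyrimidine pair. Pairs of stop codons are also excluded. The function name `third_base_patterns` is illustrative.

```python
from itertools import product

BASES = "UCAG"
# Standard code in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(t): aa for t, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def third_base_patterns():
    """Group the 16 first-two-base boxes by their third-base degeneracy."""
    families, pyrimidine_pairs, purine_pairs = [], [], []
    for b1, b2 in product(BASES, repeat=2):
        prefix = b1 + b2
        box = {b3: CODON_TABLE[prefix + b3] for b3 in BASES}
        if len(set(box.values())) == 1:
            families.append(prefix + "N")      # third base irrelevant
            continue
        y = {box["U"], box["C"]}               # pyrimidine-ending codons
        r = {box["A"], box["G"]}               # purine-ending codons
        # Assumption: count a pair only if its meaning is an amino acid
        # that is not shared with the other half of the box.
        if len(y) == 1 and "*" not in y and not (y & r):
            pyrimidine_pairs.append(prefix + "Y")
        if len(r) == 1 and "*" not in r and not (r & y):
            purine_pairs.append(prefix + "R")
    return families, pyrimidine_pairs, purine_pairs


families, pyrimidine_pairs, purine_pairs = third_base_patterns()

# Codons whose meaning differs from all three other codons in their box.
unique_third_base = sorted(
    codon for codon, aa in CODON_TABLE.items()
    if all(CODON_TABLE[codon[:2] + b] != aa for b in BASES if b != codon[2])
)

print(len(families), len(pyrimidine_pairs), len(purine_pairs))
print(unique_third_base)
```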
The anticodon is complementary to the codon; thus it is the first base in the anticodon sequence written conventionally in the direction from 5′ to 3′ that pairs with the third base in the codon sequence written by the same convention. So the combination
Codon | 5′ A C G 3′ |
Anticodon | 3′ U G C 5′ |
is usually written as codon ACG/anticodon CGU, where the anticodon sequence must be read backward for complementarity with the codon.
To avoid confusion, we shall retain the usual convention in which all sequences are written 5′ to 3′ but indicate anticodon sequences with a backward superscript arrow as a reminder of the relationship with the codon. Thus the codon/anticodon pair shown in the previous paragraph will be written as ACG and CGU←, respectively.
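The writing convention amounts to taking the reverse complement of the codon. A minimal sketch, assuming strict Watson-Crick pairing at all three positions (wobble is ignored here) and using the hypothetical helper name `anticodon_for`:

```python
# Watson-Crick partners for the four RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def anticodon_for(codon):
    """Return the fully complementary anticodon, written 5' to 3'."""
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return codon.translate(COMPLEMENT)[::-1]


print(anticodon_for("ACG"))
```

For the codon ACG this yields CGU, the anticodon sequence that must be read backward to line up with the codon.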
Does each triplet codon require its own tRNA with a complementary anticodon, or can a single tRNA respond to both members of a codon pair and to all (or at least some) of the four members of a codon family? The answer is that often one tRNA can recognize more than one codon. All codons that a particular tRNA recognizes must be identical at their first two base positions. By contrast, the base in the first position of the tRNA anticodon is able to pair with alternative bases in the corresponding third position of the codon; base pairing at this position is not limited to the usual G-C and A-U partnerships.
The rules governing the recognition patterns are summarized in the wobble hypothesis, which states that the pairing between codon and anticodon at the first two codon positions always follows the usual rules, but that exceptional “wobbles” occur at the third position. Wobbling occurs because the structure of the ribosomal A site, in which the codon–anticodon pairing occurs, permits increased flexibility at the first base of the anticodon. The most common nonconventional pair that is found at this position is G-U (FIGURE 23.4). For example, the anticodon UUG in tRNAGln recognizes both the CAA and CAG glutamine codons, and the anticodon GUG in tRNAHis recognizes both the CAU and CAC histidine codons. Other nonconventional pairs that are tolerated at the third codon position involve modified bases (see the section later in this chapter titled Modified Bases Affect Anticodon–Codon Pairing).
FIGURE 23.4 Wobble in base pairing allows G-U pairs to form between the third base of the codon and the first base of the anticodon.
This capacity of the third codon position to tolerate G-U pairs creates a pattern of base pairing in which A can no longer have a unique meaning in the codon (because the U that recognizes it must also recognize G). Similarly, C also no longer has a unique meaning (because the G that recognizes it must also recognize U). TABLE 23.1 summarizes the pattern of recognition. It is therefore possible to recognize unique codons only when the third bases are G or U. However, only UGG and AUG provide examples of such unique recognition.
TABLE 23.1 Codon–anticodon pairing involves wobbling at the third position.
Base in First Position of Anticodon | Base(s) Recognized in Third Position of Codon |
---|---|
U | A or G |
C | G only |
A | U only |
G | C or U |
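The pairing rules in the table can be applied mechanically to list the codons a given anticodon reads. The sketch below covers only unmodified A, C, G, and U; the names `WOBBLE`, `WATSON_CRICK`, and `codons_read` are illustrative.

```python
# Standard pairing at the first two codon positions.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

# First anticodon base -> bases it can read at the third codon position.
WOBBLE = {"U": "AG", "C": "G", "A": "U", "G": "CU"}


def codons_read(anticodon):
    """List the codons (5' to 3') read by an anticodon written 5' to 3'."""
    first, second, third = anticodon
    # The third anticodon base pairs with the first codon base, and the
    # second with the second; only the first anticodon base may wobble.
    prefix = WATSON_CRICK[third] + WATSON_CRICK[second]
    return sorted(prefix + base for base in WOBBLE[first])


print(codons_read("UUG"))   # glutamine codons
print(codons_read("GUG"))   # histidine codons
print(codons_read("CCA"))   # tryptophan codon only
```

With these rules the anticodon UUG reads CAA and CAG, GUG reads CAC and CAU, and CCA reads UGG alone.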
tRNAs are commonly synthesized as precursor chains with additional sequences at one or both ends. FIGURE 23.5 shows that the extra sequences are removed by combinations of endonucleolytic and exonucleolytic activities. The three nucleotides at the 3′ terminus, which are always present as the triplet sequence CCA, are sometimes not encoded in the genome. In such cases, they are added as part of the tRNA processing.
FIGURE 23.5 The tRNA 3′ end is generated by cutting (endonucleolytic) and trimming (exonucleolytic) reactions, followed by addition of CCA when this sequence is not encoded; the 5′ end is generated by a precise endonucleolytic cleavage.
The 5′ end of tRNA is generated by a cleavage action catalyzed by the ribonucleoprotein enzyme ribonuclease P. This enzyme recognizes the global L-shaped tRNA structure and specifically hydrolyzes the phosphodiester linkage that forms the mature 5′ end of the molecule, leaving a 5′-phosphate group. In E. coli, RNase P consists of a 377-nucleotide RNA and a 17.5-kD protein, and its active site is composed of RNA. In vitro the RNA component alone is able to catalyze the tRNA-processing reaction. (This is an example of a ribozyme; see the Catalytic RNA chapter.) The function of the protein subunit is to stabilize a conformation of the RNA active site that is complementary to the tRNA precursor.
In the case of histidine-specific tRNAs in some organisms, after RNase P cleavage an additional guanosine residue is added at the 5′ terminus, thus forming a unique G−1 nucleotide. The enzyme that accomplishes this addition, Thg1, has the remarkable property of catalyzing the equivalent of a reverse polymerization reaction. The new guanosine is added by nucleotide addition in the 3′ to 5′ direction, opposite to that of all other known DNA and RNA polymerases.
The enzymes that process the 3′ end are best characterized in E. coli, where an endonuclease triggers the reaction by cleaving the precursor downstream, and several exonucleases then trim the end by degradation in the 3′ to 5′ direction. tRNA 3′-end processing also involves several enzymes in eukaryotes. The addition of the 3′-CCA is catalyzed by the enzyme tRNA nucleotidyltransferase, which functions as a non-template-directed RNA polymerase; that is, the enzyme specifically adds C, C, and A in sequence, without pairing the cytosine and adenine to complementary guanine and uracil bases on a template. Instead, the enzyme structure itself is sufficient to form sequential complementary binding sites for C, C, and A. As the nucleotides are added, the enzyme–tRNA complex changes conformation to become complementary to each successive nucleotide.
All three nucleotides are added by tRNA nucleotidyltransferase when they are not encoded in the tRNA gene sequence. Interestingly, the enzyme also plays an essential role in repairing damaged tRNA 3′ ends in organisms such as E. coli that do encode CCA. In these organisms, three different tRNA substrates are recognized: those lacking CCA, those possessing a 3′-C, and those possessing a 3′-CC.
tRNA nucleotidyltransferase enzymes are divided into two classes that retain significant amino acid similarity only in their active site regions. Class I enzymes are found in archaea; bacterial and eukaryotic enzymes together make up a second class. In some very ancient bacterial lineages, CCA addition is catalyzed by two closely related class II enzymes: one of these enzymes adds –CC, and the other adds the 3′-terminal A.
Transfer RNA is unique among nucleic acids in its content of modified bases. A modified base is any purine or pyrimidine ring except the usual A, G, C, and U from which all RNAs are synthesized. All other bases are produced by posttranscriptional modification of one of the four bases after it has been incorporated into the polyribonucleotide chain. The ribose sugar of some tRNA nucleotides is also methylated on the 2′–OH to produce the 2′-O-methyl modification.
Although all classes of RNA display some degree of modification, the range of chemical alterations to the bases is much greater in tRNA. The modifications range from simple methylation to wholesale restructuring of the base. Modifications occur in all parts of the tRNA molecule. They vary considerably in their extent of conservation among tRNA types and in the location of the molecule at which they are found. Modifications specific for particular tRNAs or small subgroups of tRNAs are generally less common than those present more broadly. Some species-specific patterns have also been identified. In all, there are 81 reported different types of modified bases in tRNA. On average, each tRNA is modified at about 15% to 20% of its bases.
The modified nucleosides are synthesized by specific tRNA-modifying enzymes. The original nucleoside present at each position can be determined either by comparing the sequence of a mature tRNA with that of its gene or by isolating precursor molecules that lack some or all of the modifications. The sequences of precursors show that different modifications are introduced at different stages during the maturation of tRNA.
The many tRNA-modifying enzymes vary greatly in specificity. In some cases, a single enzyme acts to make a particular modification at a single position. In other cases, an enzyme can modify bases at several different target positions. Some enzymes undertake single reactions with individual tRNAs; others have a range of substrate molecules. Some modifications require the successive actions of more than one enzyme.
Some details of the structural basis for tRNA modification by enzymes have emerged. One striking example is the mechanism by which archaeosine, a modified G, is introduced into the D-loop of certain archaeal tRNAs. To access the base to be modified, which is normally buried within the tRNA tertiary core, the tRNA guanine transglycosylase enzyme facilitates a dramatic induced-fit rearrangement of the tRNA to produce an alternative tertiary structure termed the lambda form. Induced-fit rearrangements of the tRNA structure have also been observed for other modifying enzymes and constitute a common theme in recognition.
Known functions of modified bases are to confer increased stability to tRNAs and to modulate their recognition by proteins and other RNAs in the translational apparatus. Roles for modified bases in recognition by aminoacyl-tRNA synthetases, for example, have been clearly defined in a number of cases (as discussed later in this chapter). However, in many cases the biological role of the tRNA modification remains unknown.
FIGURE 23.6 shows some of the more common modified bases. Modifications of pyrimidines (C and U) are generally less complex than those of purines (A and G).
FIGURE 23.6 All four bases in tRNA can be modified.
The most common modification made to uridine and cytosine is methylation, which may occur at several different positions on the ring. Methylation at position 5 of uracil creates ribothymidine (T). The base, thymine, is identical to that found in DNA, but in tRNA it is attached to ribose rather than deoxyribose. Ribothymidine is found in nearly all tRNA molecules at position 54 in the TψC-loop. Pseudouridine is a striking uridine modification that is generated by cleavage of the glycosidic bond, followed by constrained rotation of the liberated ring and rejoining of the C5 carbon to the C1′ carbon of the ribose. Thus, pseudouridine lacks an N-glycosidic linkage. Nearly all tRNAs possess pseudouridine at position 55 of the TψC-loop. Position 56 is also very highly conserved as cytosine; together, the TψC sequence at positions 54 through 56 provides the basis for naming this portion of the tRNA molecule.
The dihydrouridine (D) modification, which is generated by saturation of the double bond joining C5 and C6 of uracil, is nearly universally found in the D-loop of tRNAs. As for the TψC sequence, this D modification provides the basis for naming the D stem-loop of the tRNA. The removal of the double bond in D destroys the aromaticity and planarity of the uracil ring, generating an unusual structure that subtly modifies the shape of the globular core of the tRNA.
The nucleoside inosine (I) is normally found in the cell as an intermediate in the purine biosynthetic pathway. However, it is not directly incorporated into RNA. Instead, its presence depends on modification of A to form I. The incorporation of I at the 5′-anticodon position contributes importantly to wobble base pairing at the third codon position of mRNA (see the next section, Modified Bases Affect Anticodon–Codon Pairing).
Modifications of A and G often generate dramatic new structures (see Figure 23.6). For example, two complex series of nucleotides depend on modification of G. The Q bases, such as queuosine, have an additional pentenyl ring added via an –NH linkage to the methyl group of 7-methylguanosine. The pentenyl ring may carry a number of additional groups. The Y bases, such as wyosine, have an additional ring fused with the purine ring itself. This extra ring carries a long carbon chain, to which further groups are added in different cases.
tRNA modifications in and adjacent to the anticodon influence its ability to pair with the mRNA codon. Most such modifications are present at positions 34 and 37 of the anticodon loop, and they generally function by constraining the range of available motion in the anticodon. In turn, this facilitates docking of the tRNA into the A site of the ribosome. Because these modifications influence codon pairing, they help determine how the cell assigns the meaning of each tRNA. Modified bases permit further pairing patterns in addition to those involving regular and wobble pairing of A, C, U, and G.
Inosine is particularly important when present at the first anticodon position (nucleotide 34 in the sequence) because it is able to pair with any one of the three bases U, C, or A (FIGURE 23.7). The role of inosine is well illustrated in the decoding of isoleucine codons. Here AUA encodes isoleucine, whereas AUG encodes methionine. To read the A at the third codon position, a tRNA would require U at the first anticodon position—but this U in the wobble position would necessarily also pair with G. Thus any tRNA with a 5′ U in its anticodon would recognize both AUG and AUA. This problem is resolved by synthesis of an isoleucine tRNA possessing A34, followed by modification of A34 to I34 by the enzyme tRNA adenosine deaminase. I34 then is able to recognize all three codons of the isoleucine set: AUU, AUC, and AUA.
FIGURE 23.7 Inosine can pair with U, C, or A.
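The isoleucine example can be worked through by extending the wobble rules with inosine (written `I` here). This is a sketch under the simple pairing rules stated in this chapter; `WOBBLE_WITH_INOSINE`, `codons_read`, and the small `MEANING` lookup are illustrative names, and the anticodon IAU is used for the modified isoleucine tRNA on the assumption that its other two bases pair conventionally with the AU prefix.

```python
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

# First anticodon base -> bases read at the third codon position.
# 'I' (inosine) is added to the four unmodified bases.
WOBBLE_WITH_INOSINE = {"U": "AG", "C": "G", "A": "U", "G": "CU", "I": "UCA"}

# Meanings of the four codons in the AU box of the standard code.
MEANING = {"AUU": "Ile", "AUC": "Ile", "AUA": "Ile", "AUG": "Met"}


def codons_read(anticodon):
    """List the codons (5' to 3') read by an anticodon written 5' to 3'."""
    first, second, third = anticodon
    prefix = WATSON_CRICK[third] + WATSON_CRICK[second]
    return sorted(prefix + base for base in WOBBLE_WITH_INOSINE[first])


for anticodon in ("UAU", "IAU", "CAU"):
    reads = codons_read(anticodon)
    print(anticodon, reads, sorted({MEANING[c] for c in reads}))
```

An anticodon beginning with unmodified U would read both AUA (Ile) and AUG (Met), whereas I34 reads AUU, AUC, and AUA without touching AUG, which is left to the anticodon CAU.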
In most cases, U at the first position of the anticodon is also converted to a modified form that has altered pairing properties. Derivatives of U possessing the 2-thio group in place of oxygen show improved selectivity in pairing to A as compared with G (FIGURE 23.8). Anticodons with uridine-5-oxyacetic acid and related modifications in the first position have the remarkable property of permitting the single tRNA to read three and sometimes all four of the synonymous codons NNA, NNC, NNU, and NNG.
FIGURE 23.8 Modification to 2-thiouridine restricts pairing to A alone because only one H-bond can form with G.
These and other pairing relationships show that there are multiple ways to construct a set of tRNAs able to recognize all the 61 codons representing amino acids. No particular pattern predominates in any particular organism, although the absence of a certain pathway for modification can prevent the use of some recognition patterns. Thus, a particular codon family is read by tRNAs with different anticodons in different organisms.
Often the tRNAs will have overlapping capacities to read certain codons, so that a particular codon is read by more than one tRNA. In such cases there may be differences in the efficiencies of the alternative recognition reactions. (As a general rule, codons that are commonly used tend to be more efficiently read.)
The predictions of wobble pairing accord very well with experimental evidence for almost all tRNAs. However, exceptions exist in which the codons recognized by a tRNA differ from those predicted by the wobble rules. Such effects probably result from the influence of neighboring bases and/or the conformation of the anticodon loop in the overall tertiary structure of the tRNA. Further support for the influence of the surrounding structure is provided by the isolation of occasional mutants in which a change in a base in some other region of the molecule alters the ability of the anticodon to recognize codons.
The universality of the genetic code is striking, but some exceptions exist. They tend to affect the codons involved in initiation or termination. The changes found in principal (bacterial or eukaryotic nuclear) genomes are summarized in FIGURE 23.9.
FIGURE 23.9 Changes in the genetic code in bacterial or eukaryotic nuclear genomes usually assign amino acids to stop codons or change a codon so that it no longer specifies an amino acid. A change in meaning from one amino acid to another is unusual.
Almost all of the changes in bacterial or eukaryotic nuclear genomes that allow a codon to represent an amino acid affect termination codons:
In the prokaryote Mycoplasma capricolum, UGA is not used for termination but instead encodes tryptophan (Trp). In fact, it is the predominant Trp codon, and UGG is used only rarely. Two tRNATrp types exist, which have the anticodons UCA← (which reads UGA and UGG) and CCA← (which reads only UGG).
Some ciliates (unicellular protozoa) read UAA and UAG as glutamine instead of as termination signals. Tetrahymena thermophila, a ciliate, contains three tRNAGln types: One tRNAGln with a UUG anticodon recognizes the usual codons CAA and CAG for glutamine, a second type with the anticodon UUA recognizes both UAA and UAG (in accordance with the wobble hypothesis), and a third type with the anticodon CUA recognizes only UAG. Restriction of the specificity of the release factor eRF so that it recognizes only the UGA stop codon is also necessary to prevent premature termination at the newly reassigned glutamine codons.
In the ciliate Euplotes octacarinatus, the UGA stop codon is reassigned to cysteine. Only UAA is used as a termination codon, and UAG is not found. The change in meaning of UGA might be accomplished by modifying the anticodon of tRNACys with I34 so that it is able to read UGA together with the usual codons UGU and UGC. UGA has dual meaning in E. crassus (see the next section, Novel Amino Acids Can Be Inserted at Certain Stop Codons).
In a yeast (Candida), CUG is reassigned to serine instead of leucine. This is a rare example of reassignment from one sense codon to another.
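The reassignments listed above can be expressed as small overrides applied to the standard table. This is an illustrative sketch only: the `VARIANTS` dictionary and the `translate` helper are hypothetical names, the override for each organism contains just the changes named in the text, and the test messages are invented sequences, not real genes.

```python
from itertools import product

BASES = "UCAG"
# Standard code in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    "".join(t): aa for t, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Codon reassignments described in the text, as overrides of STANDARD.
VARIANTS = {
    "Mycoplasma capricolum": {"UGA": "W"},
    "Tetrahymena thermophila": {"UAA": "Q", "UAG": "Q"},
    "Euplotes octacarinatus": {"UGA": "C"},
    "Candida": {"CUG": "S"},
}


def translate(mrna, organism=None):
    """Translate from the first base until a stop codon or the end."""
    table = {**STANDARD, **VARIANTS.get(organism, {})}
    peptide = []
    for start in range(0, len(mrna) - 2, 3):
        aa = table[mrna[start:start + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


message = "AUGUGAUGGUAA"   # invented example message
print(translate(message))
print(translate(message, "Mycoplasma capricolum"))
```

The same message terminates after methionine under the standard code but is read through the UGA codon as tryptophan under the Mycoplasma override.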
In general, acquisition of a coding function by a termination codon requires two types of change: A tRNA must be mutated so as to recognize the codon, and the class I release factor must be altered so that it does not terminate at this codon. The other common type of change is loss of the tRNA that recognizes a particular codon so that that codon no longer specifies any amino acid.
All of these changes are sporadic, meaning that they appear to have occurred independently in specific evolutionary lineages. They may be concentrated in termination codons because at these positions there is no substitution of one amino acid for another. Once the genetic code was established, early in evolution, any general change in the meaning of a codon would cause a substitution in all the proteins that contain that amino acid. It seems likely that the change would be deleterious in at least some of these proteins, with the result that it would be strongly selected against. The divergent uses of the termination codons could represent their “capture” for normal coding purposes. If some termination codons were used only rarely, their recruitment to coding purposes, by way of changes in tRNAs that permit reassignment, would have been more likely.
Exceptions to the universal genetic code also occur in the mitochondria of several species. FIGURE 23.10 shows a phylogeny for the changes. The ability to construct such a phylogeny suggests that there was a universal code that was changed at various points in mitochondrial evolution. The earliest change was the employment of UGA to encode tryptophan, which is common to mitochondria in all eukaryotes except plants.
FIGURE 23.10 Changes in the genetic code in mitochondria can be traced in phylogeny. The minimum number of independent changes is generated by supposing that the AUA = Met and the AAA = Asn changes each occurred independently twice and that the early AUA = Met change was reversed in echinoderms.
Some of the mitochondrial changes make the code simpler by replacing two codons that had different meanings with a pair that has a single meaning. Examples of this include UGG and UGA (both Trp instead of one Trp and one termination) and AUG and AUA (both Met instead of one Met and the other Ile).
Why have changes been able to evolve more readily in the mitochondrial code as compared to that of the nucleus? The mitochondrion synthesizes only a small number of proteins (about 10), and, as a result, the problem of disruption by changes in meaning is much less severe. It is likely that the codons that are altered were not used extensively in locations where amino acid substitutions would have been deleterious.
According to the wobble hypothesis, a minimum of 31 tRNAs (excluding the initiator) are required to recognize all 61 codons (at least 2 tRNAs are required for each 4-codon family and 1 tRNA is needed per codon pair or single codon). However, the streamlined mammalian mitochondrial genome encodes only 22 tRNAs. In mammals, tRNAs encoded in the nuclear genome are not imported into the mitochondrion, apart from a few that duplicate tRNAs already encoded in the mitochondrial genome, so the wobble rules must be modified for translation on the mitochondrial ribosome. Interestingly, in mitochondria an unmodified uridine at the first position of the anticodon is able to pair with all four bases at the third codon position. Such an unmodified uridine exists for the tRNAs representing all eight four-codon families: Pro, Thr, Ala, Ser, Leu, Val, Gly, and Arg. This reduces the total number of tRNAs required in mitochondria by eight. The conversion of AGA and AGG to stop codons in mammalian mitochondria eliminates the need for one additional tRNA, bringing the total required number of tRNAs to just 22. The conversion of AUA to methionine further eliminates the need for inosine modification at position 34 of tRNAIle (see the previous section, Modified Bases Affect Anticodon–Codon Pairing).
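The arithmetic behind the figures 31 and 22 can be laid out explicitly. The grouping used below is one reading that reproduces those totals, not a breakdown given in the text: it assumes the 12 two-codon pairs counted earlier in the chapter (seven pyrimidine pairs and five purine pairs) and treats isoleucine, methionine, and tryptophan as three further groups needing one tRNA each, with the three isoleucine codons read by a single I34-containing tRNA in the universal case. All constant names are illustrative.

```python
FOUR_CODON_FAMILIES = 8   # Pro, Thr, Ala, Ser, Leu, Val, Gly, Arg
TWO_CODON_PAIRS = 12      # 7 pyrimidine-ending + 5 purine-ending pairs
OTHER_GROUPS = 3          # assumed: Ile, Met, and Trp, one tRNA each

# Universal wobble rules: two tRNAs per four-codon family.
universal_minimum = 2 * FOUR_CODON_FAMILIES + TWO_CODON_PAIRS + OTHER_GROUPS

# Mammalian mitochondria: an unmodified U34 reads all four codons of a
# family, so each family needs only one tRNA (a saving of eight).
mito_family_trnas = 1 * FOUR_CODON_FAMILIES
# AGA and AGG become stop codons, removing one two-codon pair.
mito_pair_trnas = TWO_CODON_PAIRS - 1
# Ile (AUU/AUC), Met (AUA/AUG), and Trp (UGA/UGG) still need one each.
mito_minimum = mito_family_trnas + mito_pair_trnas + OTHER_GROUPS

print(universal_minimum, mito_minimum)
```

Under these assumptions the totals come to 31 and 22, consistent with a saving of eight tRNAs from the four-codon families and one from the loss of the AGA/AGG pair.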
The different wobble rules for mitochondrial and nuclear translation very likely arise from differences in the detailed structures of the respective ribosomes that translate the two genomes. In cytoplasmic ribosomes, modifications to U34 are used to expand the decoding capacities of certain tRNAs (see the previous section, Modified Bases Affect Anticodon–Codon Pairing). On mitochondrial ribosomes, modifications to U34 are instead used to restrict pairing to codons containing A or G at the third position, according to the usual wobble rules. Modifications to U34 are indeed found in mitochondrial tRNAs representing amino acids for two-codon sets, thus avoiding the misreading that would otherwise occur.
At least two known instances have been identified in which a stop codon is used to specify an unusual amino acid other than the standard 20. Only particular stop codons are reinterpreted in this way by the translational apparatus. This demonstrates that the meaning of the codon triplet is influenced by the identity of other bases in the mRNA. Such a dual meaning for a particular codon in a genome should be distinguished from the context-independent complete reassignment of codons in some organisms or in mitochondria, as described in the previous section, The Universal Code Has Experienced Sporadic Alterations.
Selenocysteine, in which the sulfur of cysteine is replaced by selenium, is incorporated at certain UGA codons within genes coding for selenoproteins in all three domains of life. Usually, these proteins catalyze oxidation-reduction reactions. The selenocysteine residue is typically located in the active site, where it directly facilitates the reaction chemistry. For example, the UGA codon specifies selenocysteine in three E. coli genes encoding formate dehydrogenase isozymes; the incorporated selenium directly ligates a catalytic molybdenum ion in the active site.
Organisms capable of encoding selenocysteine possess an unusual tRNA, tRNASec, which is more than 90 nucleotides long and contains acceptor and T stems of nonstandard length. Instead of seven base pairs in the acceptor stem and five in the T stem (a 7/5 structure), bacterial tRNASec possesses an 8/5 structure, and archaeal and eukaryotic tRNASec likely possess a 9/4 structure. These tRNAs also possess the 5′-UCA anticodon, allowing them to read UGA. In all organisms, tRNASec is first aminoacylated with serine by seryl-tRNA synthetase (SerRS) to produce seryl-tRNASec. In bacteria, the enzyme selenocysteine synthase next converts Ser-tRNASec directly to selenocysteinyl (Sec)-tRNASec using selenophosphate as the selenium donor. In archaea and eukaryotes, Ser-tRNASec is first phosphorylated by the kinase PSTK to produce phosphoseryl (Sep)-tRNASec. In a second step, Sep-tRNASec is converted to Sec-tRNASec by the enzyme SepSecS. The exquisite specificity of PSTK is notable: It is capable of efficiently phosphorylating Ser-tRNASec while excluding the standard Ser-tRNASer. Improper phosphorylation of Ser-tRNASer by PSTK could result in the incorporation of selenocysteine in response to serine codons.
The choice of which UGA codons are to be interpreted as selenocysteine is determined by the local secondary structure of the mRNA. A hairpin loop downstream of the UGA codon, termed the SECIS element, is required for incorporation of selenocysteine and exclusion of release-factor binding. The SECIS element is directly adjacent to the UGA codon in bacteria but is located in the 3′ untranslated region (UTR) of the mRNA in archaea and eukaryotes. In E. coli, a specialized translation elongation factor, SelB, interacts solely with Sec-tRNASec and not with any other aminoacylated tRNA, including the precursor Ser-tRNASec. SelB also binds directly to the SECIS element. The consequence of the action of SelB is that only those UGA codons that also possess a properly juxtaposed SECIS site will be able to productively bind Sec-tRNASec in the ribosomal A site (FIGURE 23.11). Archaea and eukaryotes possess a homolog to SelB but also require the presence of an additional protein, SBP2, to permit the ribosome to insert selenocysteine.
FIGURE 23.11 SelB is an elongation factor that specifically binds tRNASec to a UGA codon that is followed by a stem-loop structure in mRNA.
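The context dependence of UGA decoding can be illustrated with a toy translation routine. This is a minimal sketch, not a model of the actual mechanism: it does not detect SECIS hairpins in the sequence, and instead assumes the caller supplies the positions of UGA codons that have a properly juxtaposed SECIS element. The names `translate` and `secis_uga_codons` and the example mRNA are hypothetical; "U" is the one-letter symbol for selenocysteine.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ("*" = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna, secis_uga_codons=frozenset()):
    """Translate from the first codon; 'U' marks selenocysteine.

    secis_uga_codons holds zero-based codon indices of UGA codons assumed
    to be paired with a SECIS element (supplied by the caller, not detected).
    """
    peptide = []
    for index in range(len(mrna) // 3):
        codon = mrna[3 * index : 3 * index + 3]
        if codon == "UGA" and index in secis_uga_codons:
            peptide.append("U")  # Sec-tRNASec is delivered instead of release
            continue
        amino_acid = CODE[codon]
        if amino_acid == "*":
            break  # ordinary termination
        peptide.append(amino_acid)
    return "".join(peptide)


mrna = "AUGGCUUGAGGCUAA"  # AUG GCU UGA GGC UAA (made-up sequence)
without_secis = translate(mrna)                      # "MA"
with_secis = translate(mrna, secis_uga_codons={2})   # "MAUG"
print(without_secis, with_secis)
```

The same UGA triplet terminates translation in one call and is read as selenocysteine in the other; only the supplied context differs.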
Another example of the insertion of a special amino acid is the placement of pyrrolysine at certain UAG codons in the archaeal genus Methanosarcina as well as in a few bacteria. In Methanosarcina, pyrrolysine is found in the active site of methylamine methyltransferases, where it plays an important role in the reaction chemistry. The incorporation of pyrrolysine requires a specialized aminoacyl-tRNA synthetase, pyrrolysyl-tRNA synthetase (PylRS), which aminoacylates a specialized tRNAPyl with pyrrolysine. tRNAPyl possesses the 5′-CUA anticodon, enabling it to read UAG. As with tRNASec, tRNAPyl also possesses unusual structural features not found in other tRNAs; for example, it lacks the otherwise invariant U8 nucleotide and features atypically short D-loops and variable loops. The mechanism by which particular UAG codons are read as pyrrolysine has not yet been resolved, because it has not been possible to unambiguously identify a secondary structure element in all mRNAs that incorporate the amino acid. Further, no specific elongation factor targeting Pyl-tRNAPyl to the ribosome has been identified.
Recently, it was found that the UGA codon specifies insertion of either cysteine or selenocysteine in the ciliate Euplotes crassus. Dual use of UGA was found to occur even within the same gene, and the choice of which amino acid is inserted depends on the structure of the 3′ untranslated region of the mRNA. In Euplotes, UGA generally specifies Cys and does not function as a stop codon. This work therefore shows that position-specific dual use can occur for a codon that is not otherwise used for termination in that organism.
tRNAs must have certain characteristics in common yet be distinguished by others. The crucial feature that confers this capacity is the ability of tRNA to fold into a specific tertiary structure. Changes in the details of this structure, such as the angle of the two arms of the “L” or the protrusion of individual bases, may distinguish the individual tRNAs.
All tRNAs can fit in the P and A sites of the ribosome. At one end they are associated with mRNA via codon–anticodon pairing, and at the other end the polypeptide is being synthesized and transferred. Similarly, all tRNAs (except the initiator) share the ability to be recognized by elongation factors (EF-Tu or eEF1) for binding to the ribosome. The initiator tRNA is recognized instead by IF-2 or eIF2. Thus, the tRNA set must possess common features for interaction with elongation factors, along with features that identify the initiator tRNA.
Amino acids enter the translation pathway through the action of aminoacyl-tRNA synthetases, which provide the essential decoding step converting the information in nucleic acids into the polypeptide sequence. All synthetases function by the mechanism depicted in FIGURE 23.12:
The amino acid first reacts with ATP to form an aminoacyl-adenylate intermediate, releasing pyrophosphate. Part of the energy released in ATP hydrolysis is trapped as a high-energy mixed anhydride linkage in the adenylate.
Next, either the 2′–OH or 3′–OH group located on the 3′-A76 nucleotide of tRNA attacks the carbonyl carbon atom of the mixed anhydride, generating aminoacyl-tRNA with concomitant release of AMP. (Note that key conserved nucleotides of tRNAs are always given the same name for consistency. Thus, the 3′-terminal nucleotide of every tRNA is called A76, even when the actual length of a given tRNA differs from the typical length.)
FIGURE 23.12 An aminoacyl-tRNA synthetase charges tRNA with an amino acid.
A subset of four tRNA synthetases—those specific to glutamine, glutamate, arginine, and lysine—require the presence of tRNA to synthesize the aminoacyl-adenylate intermediate. For these enzymes, the tRNA synthetase is properly considered as a ribonucleoprotein particle (RNP), in which the RNA subunit functions to assist the protein in attaining a catalytically competent conformation. In the second step of aminoacylation, the amino acid portion of the aminoacyl adenylate is then transferred to the RNA component of the RNP (i.e., the tRNA).
Each tRNA synthetase is selective for a single amino acid among all the amino acids in the cellular pool. It also discriminates among all tRNAs in the cell. Usually, each amino acid is represented by more than one tRNA. Several tRNAs may be needed to recognize synonymous codons, and sometimes multiple types of tRNA base pair with the same codon. Multiple tRNAs representing the same amino acid are called isoaccepting tRNAs; because they are all recognized by the same synthetase, they are also described as its cognate tRNAs.
All tRNAs possess the canonical L-shaped tertiary structure (see the Translation chapter). The tRNA folds such that the acceptor and T stems form one coaxial stack, while the D and anticodon stems together form the perpendicular arm of the L-shape. The anticodon loop and CCA acceptor end are located at opposite ends of the molecule and are separated by approximately 40 Å. The globular hinge region of the tRNA, which connects the two perpendicular stacks, is composed of the D-loop, T-loop, variable arm, and two-nucleotide spacer between the acceptor and D stems. Most tRNAs possess small variable regions consisting of a four- to five-nucleotide loop, whereas a few isoaccepting groups feature a larger variable arm including a base-paired stem, which protrudes from the globular core. The common tRNA L-shape is essential for the interaction of all tRNAs with elongation factors and with the ribosome.
Within the context of this common L-shaped structure, enforced by the presence of conserved tertiary interactions within the globular core, tRNA sequences are found to diverge at a majority of positions in all four arms of the molecule. This sequence diversity can generate subtle differences in the angle between the two arms of the L-shape and, more important, leads to variations in the detailed path of the polynucleotide backbone throughout the molecule. It is this structural diversity that forms the basis for discrimination by the tRNA synthetases.
tRNA synthetases discriminate among tRNAs by means of two general mechanisms: direct readout and indirect readout. In direct readout, the enzyme recognizes base-specific functional groups directly; for example, a surface amino acid of a tRNA synthetase may accept a hydrogen bond from the exocyclic amine group of guanine (the N2 of G), a minor-groove group not found on the other three bases. By contrast, in indirect readout, the enzyme directly binds nonspecific portions of the tRNA: the sugar–phosphate backbone and nonspecific portions of the nucleotide bases. For example, sequences in the variable and D arms of a tRNA may produce a distinctively shaped surface that is complementary to the cognate tRNA synthetase, but not to other tRNA synthetases. In this way nucleotides distant from the enzyme–tRNA interface create an interface structure that is, in turn, directly bound. Both direct and indirect readout usually function within the context of mutual induced fit: Conformational changes in both the tRNA and enzyme occur after initial binding to form a productive catalytic complex. Both these mechanisms also often involve the participation of bound water molecules at the interface between the tRNA and enzyme. For example, when glutaminyl-tRNA synthetase (GlnRS) binds tRNAGln, two domains of the enzyme rotate with respect to each other; simultaneously, the 3′–single-stranded end and the anticodon loop of the tRNA undergo substantial conformational changes as compared with their presumed structures in the unliganded state.
In many cases the determinants in tRNA that are needed for specific recognition are located at the extremities of the molecule, in the acceptor stem and the anticodon loop. However, examples exist where nucleotides in the tertiary core provide the identity signals. Another commonly used identity nucleotide is the “discriminator base” at homologous position 73 in the tRNA, which is located directly 5′ to the 3′-terminal CCA sequence. Interestingly, the anticodon sequence of the tRNA is not necessarily required for specific tRNA synthetase recognition. In general, the tRNA identity set is idiosyncratic to each tRNA synthetase.
The identity determinants vary in their importance and are sometimes conserved in evolution. The conservation in tRNA identity elements is demonstrated by the capacities of many tRNA synthetases to aminoacylate tRNAs that are derived from different organisms. Hypotheses regarding the set of tRNA identity elements necessary for selection by a tRNA synthetase are derived from X-ray cocrystal structures of tRNA synthetase complexes, from classical genetics, and from in vitro mutagenesis. Final proof that a tRNA identity set has been well defined is obtained from transplantation experiments, in which the hypothesized set of nucleotides is incorporated into a tRNA from a different isoaccepting group. For example, replacement of 15 nucleotides in the acceptor stem and anticodon loop of tRNAAsp, with the corresponding nucleotides in tRNAGln, allowed glutaminyl-tRNA synthetase (GlnRS) to aminoacylate the modified tRNAAsp with glutamine, with an efficiency and selectivity comparable to that of the cognate GlnRS reaction.
Many tRNA synthetases can specifically aminoacylate a tRNA “minihelix,” which consists only of the acceptor and TψC arms of the molecule. In some cases, a tRNA microhelix, consisting of the acceptor stem alone closed at its distal end by a stable tetraloop, can serve as a substrate. For both minihelices and microhelices, the efficiency of aminoacylation is substantially lower than for the intact tRNA. However, these experiments bear on the evolutionary development of tRNA synthetase complexes. At an early evolutionary stage, tRNAs may have consisted solely of the acceptor arm of the contemporary molecule.
In spite of their common function, synthetases are a very diverse group of enzymes. They are divisible into two classes. Class I tRNA synthetases are primarily monomeric and feature structurally similar active-site Rossmann-fold domains at or near their N-termini. The Rossmann fold consists of a five- or six-stranded parallel β-sheet with connecting helices. This domain is homologous to the active site domain of dehydrogenases and is responsible for binding the ATP, the amino acid, and the 3′ terminus of tRNA. All class I tRNA synthetases contain an “acceptor-binding” domain that is inserted into the Rossmann fold at a common location, which also binds the single-stranded acceptor end of the tRNA, and which contains an editing active site in some of the enzymes (see the next section, Synthetases Use Proofreading to Improve Accuracy). The C-terminal domains of class I synthetases bind the inner corner of the L-shaped tRNA and the anticodon arm and also function to discriminate among tRNAs. Two short common sequence motifs involved in ATP binding are found in the active-site Rossmann fold. Aside from some limited homology among a few of the enzymes, there are no significant structural or sequence similarities among class I enzymes outside of the Rossmann fold.
Class II tRNA synthetases are similarly diverse. Their quaternary structures are generally dimeric but in some cases form homotetramers or α2β2 heterotetramers. Like class I enzymes, class II tRNA synthetases also possess a structurally conserved active site domain—in this case a mixed α/β domain dissimilar to the Rossmann fold. The active sites of class II tRNA synthetases are located toward the C-terminal end of the polypeptides. Three short sequence motifs in the active site domain are conserved in this class; one of these motifs functions in multimerization, whereas the other two have catalytic roles.
The tRNA synthetases are grouped into 23 phylogenetically distinct families. Eleven of these families fall into class I; the remaining 12 are class II enzymes (TABLE 23.2). Interestingly, two distinct types of LysRS enzymes fall into separate classes. Two noncanonical tRNA synthetase families with limited phylogenetic scope have also recently been discovered. These enzymes are the class II pyrrolysyl-tRNA synthetase (PylRS) (discussed in the section earlier in this chapter titled Novel Amino Acids Can Be Inserted at Certain Stop Codons) and the class II phosphoseryl-tRNA synthetase (SepRS). SepRS is restricted to methanogens (a subclass of archaea) and the closely related Archaeoglobus fulgidus. It attaches phosphoserine (Sep) onto tRNACys acceptors to produce a misacylated Sep-tRNACys type. All organisms possessing SepRS also possess a pyridoxal phosphate-dependent companion enzyme, SepCysS, which converts Sep-tRNACys to Cys-tRNACys. The sulfur donor used by SepCysS in vivo is unknown. Interestingly, some methanogens possess both the SepRS/SepCysS two-step pathway and, in parallel, the canonical CysRS enzyme. Recently, phosphoserine was cotranslationally inserted (in response to the UAG stop codon) into several recombinant proteins made in E. coli by introducing the SepRS enzyme together with an engineered version of elongation factor Tu. This new system holds enormous promise for the study of selectively phosphoserylated proteins such as those involved in signal transduction in mammalian cells.
TABLE 23.2 Separation of tRNA synthetases into two classes possessing mutually exclusive sets of sequence motifs and active-site structural domains. The quaternary structure of the enzyme is noted. Multiple designations indicate that the quaternary structure differs in different organisms. The quaternary structure of PylRS has not been clearly established.
Aminoacyl-tRNA Synthetases | |
---|---|
Class I | Class II |
Gln (α) | Asn (α2) |
Glu (α) | Asp (α2) |
Arg (α) | Ser (α2) |
Lys (α) | His (α2) |
Val (α) | Lys (α2) |
Ile (α) | Thr (α2) |
Leu (α) | Pro (α2) |
Met (α, α2) | Phe (α, α2β2) |
Cys (α, α2) | Ala (α, α4) |
Tyr (α2) | Gly (α, α2β2) |
Trp (α2) | Sep (α4) |
 | Pyl (?) |
Although there are 23 phylogenetically distinct tRNA synthetase families, most organisms possess only 18 of the enzymes. Typically missing from the repertoire are GlnRS and asparaginyl-tRNA synthetase (AsnRS). To synthesize Gln-tRNAGln and Asn-tRNAAsn, these organisms possess distinct glutamyl-tRNA synthetase (GluRS) and aspartyl-tRNA synthetase (AspRS) enzymes that are nondiscriminating (ND). GluRSND synthesizes both Glu-tRNAGlu and misacylated Glu-tRNAGln; AspRSND synthesizes both Asp-tRNAAsp and misacylated Asp-tRNAAsn. The misacylated tRNAs are then converted to Gln-tRNAGln and Asn-tRNAAsn by the action of a tRNA-dependent amidotransferase (AdT). AdTs are remarkable multimeric enzymes possessing three distinct activities (FIGURE 23.13). They first generate ammonia in one active site by deamidation of a nitrogen donor such as glutamine or asparagine. The ammonia is then shuttled through an intramolecular tunnel in the enzyme to emerge in a second site that binds the 3′ end of the misacylated tRNA. In the second active site, a kinase activity γ-phosphorylates the side-chain carboxylate of the amino acid in Glu-tRNAGln or Asp-tRNAAsn. Finally, the ammonia reacts to displace phosphate, forming Gln-tRNAGln or Asn-tRNAAsn. Distinct AdT families also exist: some act on both misacylated tRNAs, whereas others are restricted to Gln-tRNAGln formation.
FIGURE 23.13 Mechanisms for the synthesis of Gln-tRNAGln and Asn-tRNAAsn. The top route in each case indicates the one-step pathway catalyzed by the conventional tRNA synthetase. The bottom, two-step pathways are found in most organisms. They consist of a nondiscriminating tRNA synthetase followed by the action of a tRNA-dependent amidotransferase (AdT).
Class I and class II synthetases are functionally differentiated in a number of ways. First, class I enzymes aminoacylate tRNA at the 2′–OH position of A76, whereas class II enzymes generally aminoacylate tRNA on the 3′–OH. The position of initial aminoacylation is related to the binding orientation of the tRNA on the enzyme. Class I synthetases bind tRNA on the minor groove side of the acceptor stem and require that the single-stranded 3′ terminus form a hairpin structure for proper juxtaposition with the amino acid and ATP in the active site (FIGURE 23.14). Class II synthetases instead bind the major groove side of the tRNA acceptor stem and do not require hairpinning of the tRNA 3′ end into the active site. A mechanistic distinction also exists: The reaction rates of class I synthetases are limited by release of aminoacylated tRNA product, whereas class II synthetases are limited by earlier chemical steps and/or physical rearrangements in the active sites.
FIGURE 23.14 Crystal structures show that class I and class II aminoacyl-tRNA synthetases bind the opposite faces of their tRNA substrates. The tRNA is shown in red and the protein in blue.
Photo courtesy of Dino Moras, Institute of Genetics and Molecular and Cellular Biology.
Aminoacyl-tRNA synthetases must distinguish one specific amino acid from the cellular pool of amino acids and related molecules and must also differentiate cognate tRNAs in a particular isoaccepting group (typically one to three) from the total set of tRNAs. tRNA discrimination can be successfully accomplished based on detailed differences in the L-shaped structures (see the section earlier in this chapter titled tRNAs Are Charged with Amino Acids by Aminoacyl-tRNA Synthetases). This occurs at both the initial binding step and at the level of induced fit; noncognate tRNAs derived from other isoaccepting groups lack the full identity set of nucleotides and are consequently unable to rearrange their structure to adopt an enzyme-bound conformation in which the reactive CCA terminus is properly aligned with the amino acid carboxylate group and the ATP α-phosphate. This rejection of noncognate tRNAs at a stage of the reaction that precedes the synthesis of misacylated tRNA is sometimes referred to as kinetic proofreading. The inability of noncognate tRNAs to proceed through the chemical steps of aminoacylation arises because the tRNA dissociates from the enzyme much faster than it can react (FIGURE 23.15).
FIGURE 23.15 Aminoacylation of cognate tRNAs by synthetase is based, in part, on greater affinities for these types, coupled with weak affinities for noncognate types. In addition, noncognate tRNAs are unable to fully undergo the induced-fit conformational changes required for the later catalytic steps.
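The statement that a noncognate tRNA dissociates much faster than it can react can be made quantitative with a simple partitioning calculation. The sketch below assumes a two-way branch in which a bound tRNA either reacts (rate constant `k_chem`) or dissociates (rate constant `k_off`); this simplified model and the rate constants are hypothetical, chosen only to illustrate the principle.

```python
def fraction_aminoacylated(k_chem, k_off):
    """Probability that a bound tRNA reacts before it dissociates.

    Assumes two competing first-order steps from the enzyme-tRNA complex:
    chemistry (k_chem) and dissociation (k_off).
    """
    return k_chem / (k_chem + k_off)


# Hypothetical rate constants (per second), for illustration only.
cognate = fraction_aminoacylated(k_chem=10.0, k_off=1.0)
noncognate = fraction_aminoacylated(k_chem=0.01, k_off=100.0)

print(f"cognate tRNA:    {cognate:.3f}")
print(f"noncognate tRNA: {noncognate:.6f}")
```

In this picture, discrimination improves both when the noncognate complex falls apart faster (larger `k_off`) and when failed induced fit slows the chemical step (smaller `k_chem`).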
In contrast, tRNA synthetases are unable to distinguish between some structurally similar amino acids in the course of the two-step aminoacyl-tRNA synthesis reaction alone. It is especially difficult for the enzymes to distinguish between two amino acids that differ only in the length of the carbon backbone (i.e., by one –CH2 group), or between amino acids of the same size that differ at only one atomic position. For example, the amino acid–binding pocket of isoleucyl-tRNA synthetase (IleRS) cannot distinguish isoleucine from valine well enough to prevent synthesis of a significant amount of Val-tRNAIle. Similarly, valyl-tRNA synthetase (ValRS) synthesizes Thr-tRNAVal to a significant extent.
IleRS, ValRS, and at least seven additional tRNA synthetases (those specific to leucine, methionine, alanine, proline, phenylalanine, threonine, and lysine) are able to correct, or proofread, the aminoacyl adenylates and aminoacyl-tRNA formed in their active sites by means of additional activities that either hydrolyze the aminoacyl-AMP to yield free amino acid and AMP or that hydrolyze the misacylated tRNA to yield free amino acid and deacylated tRNA. The hydrolysis of aminoacyl-AMP is referred to as pretransfer editing, whereas the hydrolysis of aminoacyl-tRNA is referred to as posttransfer editing (FIGURE 23.16). In the case of pretransfer editing, it is also possible that some of the incorrectly formed aminoacyl-AMP dissociates from the active site, after which it is hydrolyzed nonenzymatically in solution (the mixed anhydride bond of the aminoacyl adenylate is relatively unstable). This type of editing reaction can also be considered as a form of kinetic proofreading. In contrast, hydrolysis of the noncognate aminoacyl adenylate while it is bound by the enzyme and enzyme-catalyzed posttransfer editing are both known as chemical proofreading. Although pretransfer editing reactions may sometimes occur in the absence of tRNA (i.e., before tRNA binding), the presence of tRNA generally improves the efficiency of the hydrolytic reaction substantially. The extent to which pretransfer versus posttransfer editing predominates varies with the individual synthetase.
FIGURE 23.16 Proofreading by aminoacyl-tRNA synthetases may take place at the stage prior to aminoacylation (pretransfer editing), in which the noncognate aminoacyl adenylate is hydrolyzed. Alternatively or additionally, hydrolysis of incorrectly formed aminoacyl-tRNA may occur after its synthesis (posttransfer editing).
A general way to think of the editing reaction is in terms of the classic double-sieve mechanism, illustrated for IleRS in FIGURE 23.17, in which the size of the amino acid is used as the basis for discrimination. IleRS possesses two active sites: the synthetic (or activation) site located in the common class I Rossmann-fold domain and the editing (or hydrolytic) site located in the acceptor-binding domain (see the earlier section, Aminoacyl-tRNA Synthetases Fall into Two Classes). The crystal structure of IleRS shows that the synthetic site is too small to allow leucine to enter (the leucine side-chain is branched at a different position as compared with isoleucine). Indeed, all amino acids larger than isoleucine are excluded from activation because they cannot enter the synthetic site. However, some smaller amino acids that retain sufficient capacity to bind—such as valine—can enter the synthetic site and become attached to tRNA. The synthetic site functions as the first sieve. The editing site is smaller than the synthetic site and cannot accommodate the cognate isoleucine, but it does bind valine. Thus, Val-tRNAIle can be hydrolyzed in the editing site, functioning as the second sieve, while Ile-tRNAIle is not hydrolyzed.
FIGURE 23.17 Isoleucyl-tRNA synthetase has two active sites. Amino acids larger than Ile cannot be activated because they do not fit in the synthetic site. Amino acids smaller than Ile are removed because they are able to enter the editing site.
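The logic of the double sieve can be written as two successive membership tests. The sketch below is a deliberately minimal illustration: it encodes only the site specificities stated in the text for IleRS (the synthetic site admits isoleucine and valine; the editing site admits valine but not isoleucine) and does not model binding energies or rates. The set and function names are hypothetical.

```python
# Amino acids admitted by each IleRS site, as described in the text.
SYNTHETIC_SITE = {"Ile", "Val"}  # first sieve: Leu and larger are excluded
EDITING_SITE = {"Val"}           # second sieve: too small to admit Ile


def ilers_outcome(amino_acid):
    """Return the fate of an amino acid presented to IleRS."""
    if amino_acid not in SYNTHETIC_SITE:
        return "not activated"             # rejected by the first sieve
    if amino_acid in EDITING_SITE:
        return "activated, then edited"    # hydrolyzed by the second sieve
    return "charged tRNA released"         # passes both sieves


outcomes = {aa: ilers_outcome(aa) for aa in ("Ile", "Val", "Leu")}
for aa, fate in outcomes.items():
    print(aa, "->", fate)
```

Only the cognate amino acid passes the first sieve and escapes the second, which is why the two sites together achieve a selectivity that neither achieves alone.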
The double-sieve model functions as a convenient and generally accurate way to think of posttransfer editing. In IleRS, as well as in other editing tRNA synthetases from both class I and class II, the synthetic and editing sites are located a considerable distance apart, on the order of 10 to 40 Å. For posttransfer hydrolysis (editing) to occur, the misacylated aminoacyl-tRNA acceptor end is translocated across the surface of the enzyme, moving from the synthetic site to the editing site. This involves a change in the conformation of the acceptor end of the tRNA. In class I tRNA synthetases, the acceptor end adopts a hairpinned conformation when bound in the synthetic site (see the earlier section, Aminoacyl-tRNA Synthetases Fall into Two Classes) and an extended structure when bound in the editing site.
Translocation of the incorrect amino acid across the tRNA synthetase surface in posttransfer editing is possible because it is covalently bound to the 3′ end of the tRNA. In contrast, pretransfer editing occurs before formation of the aminoacyl-tRNA bond, and this reaction is instead localized within the confines of the synthetic active site. Kinetic partitioning of the aminoacyl-adenylate intermediate between hydrolysis and aminoacyl transfer may control the extent to which an editing tRNA synthetase relies on pretransfer versus posttransfer editing.
Isolation of mutant tRNAs has been one of the most potent tools for analyzing the ability of a tRNA to recognize its codon(s) in mRNA and for determining the effects that changes in different parts of the tRNA molecule have on codon–anticodon recognition.
Mutant tRNAs are isolated by virtue of their ability to overcome the effects of mutations in genes encoding polypeptides. In genetic terminology, a mutation that is able to overcome the effects of another mutation is called a suppressor.
In tRNA suppressor systems, the primary mutation changes a codon in an mRNA so that the polypeptide product is no longer functional. The secondary suppressor mutation changes the anticodon of a tRNA so that it recognizes the mutant codon instead of (or as well as) its original target codon. The amino acid that is now inserted restores polypeptide function. The suppressors are described as nonsense suppressors or missense suppressors, depending on the nature of the original mutation.
A nonsense mutation converts a codon that specifies an amino acid to one of the three stop codons. In a wild-type cell, such a nonsense mutation is recognized only by a release factor, which terminates translation. However, the second suppressor mutation in the tRNA anticodon creates an aminoacyl-tRNA that can recognize the termination codon. By inserting an amino acid, the second-site suppressor allows translation to continue beyond the site of nonsense mutation. This new capacity of the translation system allows a full-length polypeptide to be synthesized, as illustrated in FIGURE 23.18. If the amino acid inserted by suppression is different from the amino acid that was originally present at this site in the wild-type polypeptide, the activity of the polypeptide may be altered.
FIGURE 23.18 Nonsense mutations can be suppressed by a tRNA with a mutant anticodon, which inserts an amino acid at the mutant codon, producing a full-length polypeptide in which the original Leu residue has been replaced by Tyr.
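Nonsense suppression can be illustrated with a toy translation routine in which a suppressor tRNA is represented as a mapping from a stop codon to the amino acid it inserts. This is a sketch under simplifying assumptions: suppression is treated as fully efficient (no competition from release factors), and the short mRNA sequences are made up for illustration. As in the figure, a Leu codon (UUG) is mutated to UAG and is read as Tyr by the suppressor.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ("*" = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna, suppressor=None):
    """Translate from the first codon until a stop codon is read.

    suppressor: optional dict mapping a stop codon to the amino acid
    inserted by a suppressor tRNA (assumed here to always win).
    """
    suppressor = suppressor or {}
    peptide = []
    for start in range(0, len(mrna) - 2, 3):
        codon = mrna[start : start + 3]
        amino_acid = suppressor.get(codon, CODE[codon])
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


wild_type = "AUGGCUUUGGGCAAAUAA"  # AUG GCU UUG GGC AAA UAA
mutant = "AUGGCUUAGGGCAAAUAA"     # UUG (Leu) -> UAG (amber)
amber_suppressor = {"UAG": "Y"}   # tRNATyr with anticodon CUA

full_length = translate(wild_type)                        # "MALGK"
truncated = translate(mutant)                             # "MA"
suppressed = translate(mutant, amber_suppressor)          # "MAYGK"
print(full_length, truncated, suppressed)
```

The suppressed product is full length but carries Tyr in place of the original Leu, so its activity may differ from that of the wild-type polypeptide.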
Missense mutations change a codon representing one amino acid into a codon representing another amino acid—one that cannot function in the polypeptide in place of the original residue. (Formally, any substitution of amino acids constitutes a missense mutation, but in practice it is detected only if it changes the activity of the polypeptide.) The mutation can be suppressed by the insertion either of the original amino acid or of some other amino acid that restores the function of the polypeptide.
FIGURE 23.19 demonstrates that missense suppression can be accomplished in the same way as nonsense suppression, by mutating the anticodon of a tRNA carrying an acceptable amino acid so that it recognizes the mutant codon. Thus, missense suppression involves a change in the meaning of the codon from one amino acid to another.
FIGURE 23.19 Missense suppression occurs when the anticodon of tRNA is mutated so that it responds to the wrong codon. The suppression is only partial because both the wild-type tRNA and the suppressor tRNA can recognize AGA.
Nonsense suppressors fall into three classes, one for each type of termination codon. TABLE 23.3 describes the properties of some of the best characterized suppressors.
TABLE 23.3 Nonsense suppressor tRNAs are generated by mutations in the anticodon.
Locus | tRNA | Wild Type (Codon/Anti) | Suppressor (Anti/Codon) |
---|---|---|---|
supD (su1) | Ser | UCG CGA | CUA UAG |
supE (su2) | Gln | CAG CUG | CUA UAG |
supF (su3) | Tyr | UAC/U GUA | CUA UAG |
supC (su4) | Tyr | UAC/U GUA | UUA UAA/G |
supG (su5) | Lys | AAA/G UUU | UUA UAA/G |
supU (su7) | Trp | UGG CCA | UCA UGA/G |
The easiest to characterize have been the so-called amber suppressors. In E. coli, at least six tRNAs have been mutated to recognize UAG codons. All of the amber suppressor tRNAs have the anticodon CUA←, in each case derived from wild type by a single base change. The site of mutation can be any one of the three bases of the anticodon, as seen in the mutants supD, supE, and supF. Each suppressor tRNA recognizes only the UAG codon instead of its former codon(s). The amino acids inserted are serine, glutamine, or tyrosine—the same as those carried by the corresponding wild-type tRNAs.
Ochre suppressors also arise by mutations in the anticodon. The best known are supC and supG, which insert tyrosine or lysine in response to both ochre (UAA) and amber (UAG) codons. This is consistent with the prediction of the wobble hypothesis that UAA cannot be recognized alone.
A UGA suppressor has an unexpected property. It is derived from tRNATrp, but its only mutation is the substitution of A in place of G at position 24. This change replaces a G-U pair in the D stem with an A-U pair, increasing the stability of the helix. The sequence of the anticodon remains the same as the wild-type CCA←, so the mutation in the D stem must in some way alter the conformation of the anticodon loop, allowing CCA← to pair with UGA in an unusual wobble pairing of C with A. The suppressor tRNA continues to recognize its usual codon UGG.
A related situation is seen in the case of a particular eukaryotic tRNA. Bovine liver contains a tRNASer with the anticodon mCCA←. The wobble rules predict that this tRNA should recognize the tryptophan codon UGG, but in fact it recognizes the termination codon UGA. It is possible that UGA is suppressed naturally in this situation.
The general importance of these observations lies in the demonstration that codon–anticodon recognition of either wild-type or mutant tRNA cannot be predicted entirely from the relevant triplet sequences but may in some cases be influenced by other features of the molecule.
An interesting difference exists between the usual recognition of a codon by its proper aminoacyl-tRNA and the situation in which mutation allows a suppressor tRNA to recognize a new codon. In the wild-type cell, only one meaning can be attributed to a particular codon, which represents either a particular amino acid or a signal for termination. However, in a cell carrying a suppressor mutation the mutant codon may either be recognized by the suppressor tRNA or be read with its usual meaning.
A nonsense suppressor tRNA must compete with the release factors that recognize the termination codon(s). A missense suppressor tRNA must compete with the tRNAs that respond properly to its new codon. In each case, the extent of competition influences the efficiency of suppression, so the effectiveness of a particular suppressor depends not only on the affinity between its anticodon and the target codon but also on its concentration in the cell and on the parameters governing the competing termination or insertion reactions.
The efficiency with which any particular codon is read is influenced by its location. Thus, the extent of nonsense suppression by a particular tRNA can vary quite widely, depending on the context of the codon. The effect that neighboring bases in mRNA have on codon–anticodon recognition is poorly understood, but the context can change the frequency with which a codon is recognized by a particular tRNA by more than an order of magnitude.
A nonsense suppressor is isolated by its ability to respond to a mutant nonsense codon. However, the same triplet sequence constitutes one of the normal termination signals of the cell. The mutant tRNA that suppresses the nonsense mutation must, in principle, be able to suppress natural termination at the end of any gene that uses this codon. FIGURE 23.20 shows that this readthrough results in the synthesis of a longer polypeptide, with additional C-terminal sequence. The extended polypeptide will end at the next termination triplet sequence found in the reading frame. Any extensive suppression of termination is likely to be deleterious to the cell by producing extended polypeptides whose functions are thereby altered.
FIGURE 23.20 Nonsense suppressors also read through natural termination codons, synthesizing polypeptides that are longer than the wild type.
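The readthrough effect can be illustrated with a small translation sketch. The mRNA below is hypothetical, and the sketch assumes, unrealistically, that the suppressor wins every competition with the release factor (100% efficiency); the function name `translate` and its `suppressed` parameter are illustrative only.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G ("*" marks a stop).
BASES = "UCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna, suppressed=None):
    """Translate from the first base; `suppressed` maps stop codons to the
    amino acid inserted by a suppressor tRNA (assumed 100% efficient)."""
    suppressed = suppressed or {}
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        aa = CODE[codon]
        if aa == "*":
            if codon not in suppressed:
                break                      # normal termination
            aa = suppressed[codon]         # readthrough: insert an amino acid
        peptide.append(aa)
    return "".join(peptide)


# Hypothetical gene that ends at UAG, with an in-frame UAA farther downstream.
mrna = "AUG" "GCU" "AAA" "UAG" "GGU" "CUU" "UAA"
print(translate(mrna))                      # wild-type cell
print(translate(mrna, {"UAG": "Y"}))        # amber suppressor inserting Tyr
```

In the suppressor cell the product carries tyrosine at the former stop position and continues until the next in-frame termination codon that the suppressor does not read.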
Amber suppressors tend to be relatively efficient, usually in the range of 10% to 50%, depending on the system. This efficiency is possible because amber codons are used relatively infrequently to terminate translation in E. coli. In contrast, ochre suppressors are difficult to isolate. They are always much less efficient, usually with activities below 10%. All ochre suppressors grow rather poorly, which indicates that suppression of both UAA and UAG is damaging to E. coli, probably because the UAA ochre codon is used most frequently as a natural termination signal. Finally, UGA is the least efficient of the termination codons in its natural function; it is misread by tRNATrp as frequently as 1% to 3% in wild-type cells. However, in spite of this deficiency, UGA is used more commonly than the amber triplet UAG to terminate bacterial translation.
A missense suppressor tRNA that compensates for a mutated codon at one position may have the effect of introducing an unwanted mutation in another gene. A suppressor corrects a mutation by substituting one amino acid for another at the mutant site. However, in other locations, the same substitution will replace the wild-type amino acid with a new amino acid. The change may inhibit normal polypeptide function. This poses a dilemma for the cell: It must suppress what is a mutant codon at one location but not change too extensively its normal meaning at other locations. The absence of any strong missense suppressors is most likely explained by the damaging effects that would be caused by a general and efficient substitution of amino acids.
A mutation that creates a suppressor tRNA can have two consequences. First, it allows the tRNA to recognize a new codon. Second, it sometimes prevents the tRNA from recognizing the codons to which it previously responded. It is significant that all the high-efficiency amber suppressors are derived by mutation of one copy of a redundant tRNA set. In these cases, the cell has several tRNAs able to respond to the codon originally recognized by the wild-type tRNA. Thus, the mutation does not abolish recognition of the old codons, which continue to be served adequately by the tRNAs of the set. In the unusual situation in which there is only a single tRNA that responds to a particular codon, any mutation that prevents the response would be lethal.
Suppression is most often considered in the context of a mutation that changes the reading of a codon. However, in some situations a stop codon is read as an amino acid at a low frequency in wild-type cells. The first example discovered was the coat protein gene of the RNA phage Qβ. The formation of infective Qβ particles requires that the stop codon at the end of this gene be suppressed at a low frequency to generate a small proportion of coat proteins with a C-terminal extension. In effect, this stop codon is leaky. The reason is that tRNATrp recognizes the codon at a low frequency.
Readthrough past stop codons also occurs in eukaryotes, where it is employed most often by RNA viruses. This may involve the suppression of UAG/UAA by tRNATyr, tRNAGln, or tRNALeu or the suppression of UGA by tRNATrp or tRNAArg. The extent of partial suppression is dictated by the context surrounding the codon.
The error rate for incorporation of amino acids into polypeptides must be kept low, in the range of one misincorporation per 10,000 amino acids, to ensure that the functional properties of the encoded polypeptides are not altered in such a way as to be deleterious to the cell. Errors may be made in the following general stages of translation (see the Translation chapter):
Charging a tRNA only with its correct amino acid is clearly critical. This is a function of the aminoacyl-tRNA synthetase. The error rate varies with the particular enzyme, in the range of one misincorporation per 10⁵ to 10⁷ aminoacylations (as discussed earlier in this chapter).
Transporting only correctly aminoacylated tRNA to the ribosome, the function of initiation or elongation factors, can provide a mechanism for enhancing overall selectivity. In addition, these factors assist in the process of docking aminoacyl-tRNA to the ribosomal P and A sites.
The specificity of codon–anticodon recognition is also crucial. Although binding constants vary with the individual codon–anticodon pairing, the intrinsic specificity associated with formation of a cognate versus noncognate 3-bp sequence (about 10⁻¹ to 10⁻²) is far too low to provide an error rate of 10⁻⁵.
It had long been assumed that the bacterial elongation factor EF-Tu is a sequence-nonspecific RNA-binding protein, given that it must transport all aminoacyl-tRNAs (except for the initiator tRNA) to the ribosome. However, EF-Tu recognizes both the amino acid portion of the aminoacyl-tRNA bond and the tRNA body, where it primarily binds to the sugar–phosphate backbone in the acceptor and T stems. Studies in which EF-Tu binding affinity to correctly and incorrectly aminoacylated tRNA was measured have shown that the strength of binding to the amino acid is inversely correlated with the strength of binding to the tRNA body; that is, weakly bound amino acids are correctly esterified to tightly bound tRNA bodies, and tightly bound amino acids are correctly esterified to weakly bound tRNA bodies. As a result, correctly acylated aminoacyl-tRNAs bind EF-Tu with quite similar affinities. Selectivity in overall translation can then result because misacylation of a weakly bound amino acid to a weakly bound tRNA body produces a noncognate aminoacyl-tRNA that interacts very poorly with EF-Tu. It is also possible that a misacylated aminoacyl-tRNA that binds more tightly to EF-Tu may be discriminated against because it is more difficult to properly release this type upon docking to the ribosome.
It has been found that mutations in EF-Tu are able to suppress frameshifting errors (see the next section, Frameshifting Occurs at Slippery Sequences, for a discussion of frameshifting). This implies that EF-Tu does not merely bring aminoacyl-tRNA to the A site, but it also is involved in positioning the incoming aminoacyl-tRNA relative to the peptidyl-tRNA in the P site. Similarly, mutations in the yeast initiation factor eIF2 allow the initiation of translation at a start codon that is mutated from AUG to UUG. This implies a role for eIF2 in assisting the docking of tRNAiMet to the P site.
Proofreading on the ribosome, to enhance the intrinsically low level of specificity achievable from codon–anticodon base pairing alone, requires additional interactions provided by the local environment in the 30S subunit. In its function as a proofreader the ribosome amplifies the modest intrinsic selectivity of trinucleotide pairing by as much as 1,000-fold (FIGURE 23.21).
FIGURE 23.21 Any aminoacyl-tRNA can be placed in the A site (by EF-Tu), but only one whose anticodon pairs with the codon can make stabilizing contacts with rRNA. In the absence of these contacts, the aminoacyl-tRNA diffuses out of the A site.
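As a rough check on these figures, the intrinsic selectivity of triplet pairing can be combined with the maximal proofreading gain quoted above. This is back-of-the-envelope arithmetic only, assuming the two factors simply multiply; the function name `amplified_error_rate` is hypothetical.

```python
def amplified_error_rate(intrinsic_error, proofreading_gain):
    """Overall error rate if proofreading divides the intrinsic error
    rate by `proofreading_gain` (assumes the factors are independent)."""
    return intrinsic_error / proofreading_gain


# Intrinsic codon-anticodon selectivity of about 1e-1 to 1e-2,
# amplified by as much as 1,000-fold on the ribosome.
for intrinsic in (1e-1, 1e-2):
    overall = amplified_error_rate(intrinsic, 1_000)
    print(f"intrinsic {intrinsic:.0e} -> overall {overall:.0e}")
```

With the full 1,000-fold gain, the overall error rate falls to between 10⁻⁴ and 10⁻⁵, which is the range required for accurate translation.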
Aminoacyl-tRNA selection by the ribosome occurs at several stages along the pathway by which the EF-Tu–GTP–aminoacyl-tRNA ternary complex, formed after aminoacylation, delivers aminoacyl-tRNA to the ribosomal A site. First, a rather unstable initial binding complex forms with the ribosome. Next, there is a codon-recognition step in which the initial complex is rearranged to permit codon–anticodon pairing in the A site. Recall that the adjacent P site accommodates peptidyl-tRNA (see the Translation chapter). Both the initial binding step and the subsequent codon-recognition step are reversible. Mispaired aminoacyl-tRNAs can be rejected at these stages through increased dissociation rates, lowered association rates, or both.
After codon–anticodon recognition, a further conformational change triggers hydrolysis of GTP. Release of phosphate from the GDP-bound EF-Tu then occurs; this release triggers another extensive conformational rearrangement, whereby EF-Tu–GDP dissociates from the aminoacyl-tRNA–ribosome complex. Only after EF-Tu dissociates do final conformational rearrangements associated with docking of the aminoacyl moiety into the 50S peptidyl transfer site, and the subsequent peptidyl transfer reaction, occur. In addition to selection at the early binding stage, rejection of mispaired aminoacyl-tRNA can also take place after the GTP hydrolysis step. Here the rejection occurs because the rate of the final conformational transition is very slow in the case of a mispaired complex. Thus, the overall specificity is enhanced because the tRNA must pass through two selection steps before peptide bond formation can occur.
The precision of codon–anticodon pairing in the A site is maintained by close monitoring of the steric and electrostatic properties of the trinucleotide. Three conserved bases in the 16S ribosomal RNA (A1492, A1493, and G530) interact closely with the minor groove of the codon–anticodon helix at the first two base pairs and are able to accurately assess the presence of canonical Watson–Crick base pairs at these positions. At the third (wobble) position, some noncanonical pairs can be accommodated because the ribosomal RNA does not monitor the pairing as closely. Ultimately, it is the failure of a mispaired tRNA to fully meet the scrutiny of the ribosome at the codon–anticodon helix, and perhaps other positions, that leads to its rejection either before or after the GTP hydrolysis step.
Recently, an additional mechanism that contributes to the specificity of translation has been discovered: The ribosome is able to exert quality control after the formation of the peptide bond. In this mechanism, the formation of a peptide bond that arises from a mismatched aminoacyl-tRNA in the A site leads to a more general loss in specificity in the A site. In turn, this results in the early termination of translation.
The mechanism by which the ribosome recognizes errors after peptide bond synthesis is by monitoring the precise complementarity of the codon–anticodon helix in the peptidyl (P) site. The consequence of the misincorporation is the increased capacity of release factors to bind in the A site to cause premature termination, even when a stop codon is not present. Additionally, the rate of improper coding in the adjacent A site is increased. The resulting propagation of errors ultimately leads to premature termination.
The cost of translation, as calculated by the number of high-energy bonds that must be hydrolyzed, is clearly increased by proofreading processes. The extent of the increased energetic cost depends on the stage at which the mispaired aminoacyl-tRNA is rejected. When rejection occurs before GTP hydrolysis, the only cost is that of producing the aminoacyl-tRNA by the aminoacyl-tRNA synthetase. However, if GTP is hydrolyzed before the mismatched aminoacyl-tRNA dissociates, the energetic cost will be greater. Of course, the greatest cost is associated with the premature termination of translation to give a nonfunctional product, in post-peptidyl-transfer quality control. In that case, the full energetic payment associated with synthesis of the polypeptide to the point of premature release must be paid.
Recoding events usually involve changes to the meaning of a single codon. Examples include the phenomenon of tRNA suppression (see the section earlier in this chapter titled Suppressor tRNAs Have Mutated Anticodons That Read New Codons) and the covalent modification of an aminoacyl-tRNA (see the section earlier in this chapter titled Novel Amino Acids Can Be Inserted at Certain Stop Codons). However, three other types of recoding cause more global changes in the resulting polypeptide product. These are frameshifting (considered in this section), bypassing, and the use of two mRNAs to synthesize one polypeptide (both are discussed in the next section, Other Recoding Events: Translational Bypassing and the tmRNA Mechanism to Free Stalled Ribosomes).
Frameshifting is associated with specific tRNAs in two circumstances:
Some mutant tRNA suppressors recognize a “codon” of four bases instead of the usual three bases.
Certain “slippery” sequences allow a tRNA to move along the mRNA in the A site by one base in either the 5′ or 3′ direction.
Frameshift mutants in a polypeptide result from an aberrant reading of the mRNA codon. Instead of reading a codon triplet, the ribosome reads either a doublet or a quadruplet set of nucleotides. In either case, resumption of triplet reading following this event results in a polypeptide that is out of frame. A frameshift can be suppressed by means of a tRNA that is capable of reading a two- or four-base codon. In the case of four-base codons, the tRNA possesses an expanded anticodon loop consisting of eight nucleotides instead of the normal seven. For example, a G may be inserted in a run of several contiguous G bases. The frameshift suppressor is a tRNAGly that has an extra base inserted in its anticodon loop, converting the anticodon from the usual triplet sequence CCC← to the quadruplet sequence CCCC←. The suppressor tRNA recognizes a four-base “codon.”
Some frameshift suppressors can recognize more than one four-base codon. For example, a bacterial tRNALys suppressor can respond to either AAAA or AAAU instead of the usual codon AAA. Another suppressor can read any four-base codon with ACC in the first three positions; the next base is irrelevant. In these cases, the alternative bases that are acceptable in the fourth position of the longer codon are not related by the usual wobble rules. The suppressor tRNA probably recognizes a three-base codon, but for some other reason—most likely steric hindrance—the adjacent base is blocked. This forces one base to be skipped before the next tRNA can find a codon.
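The net effect of a four-base "codon" can be sketched as follows. The short mRNAs are hypothetical, the suppressor is assumed to act every time it meets its quadruplet, and the model does not distinguish true four-base pairing from three-base pairing with a blocked adjacent base; both simply consume four nucleotides. The function name `translate` and its `quadruplets` parameter are illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G ("*" marks a stop).
BASES = "UCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna, quadruplets=None):
    """Translate from the first base. `quadruplets` maps four-base "codons"
    to the amino acid carried by a frameshift suppressor tRNA."""
    quadruplets = quadruplets or {}
    peptide, i = [], 0
    while i + 3 <= len(mrna):
        four = mrna[i:i + 4]
        if four in quadruplets:            # suppressor consumes four bases
            peptide.append(quadruplets[four])
            i += 4
            continue
        aa = CODE[mrna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
        i += 3
    return "".join(peptide)


wild_type = "AUG" "GGG" "AAA" "CUU" "UAA"
mutant = "AUG" "GGGG" "AAACUUUAA"          # one extra G in the run of G bases

print(translate(wild_type))                 # normal product
print(translate(mutant))                    # out of frame after the insertion
print(translate(mutant, {"GGGG": "G"}))     # suppressor restores the frame
```

Reading GGGG as a single glycine "codon" absorbs the inserted base, so every downstream triplet is again read in the original frame.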
Situations in which frameshifting is a normal event are found in phages and other viruses. Such events may affect the continuation or termination of translation and result from the intrinsic properties of the mRNA.
In retroviruses, translation of the first gene is terminated by a nonsense codon in phase with the reading frame. The second gene lies in a different reading frame and (in some viruses) is translated by a frameshift that changes to the second reading frame and therefore bypasses the termination codon (see FIGURE 23.22 and also the Transposable Elements and Retroviruses chapter). The efficiency of the frameshift is low, typically around 5%. The low efficiency is important in the replicative cycle of the virus; an increase in efficiency can be damaging. FIGURE 23.23 illustrates the similar situation of the yeast Ty element, in which the termination codon of tya must be bypassed by a frameshift in order to read the subsequent tyb gene.
FIGURE 23.22 A tRNA that slips one base in pairing with codon causes a frameshift that can suppress termination. The efficiency is usually about 5%.
FIGURE 23.23 A +1 frameshift is required for expression of the tyb gene of the yeast Ty element. The shift occurs at a seven-base sequence at which two overlapping Leu codons are followed by a scarce Arg codon.
Such situations make the important point that the rare (but predictable) occurrence of “misreading” events can be relied on as a necessary step in natural translation. This is called programmed frameshifting. It occurs at particular sites at frequencies that are 100 to 1,000 times greater than the rate at which errors are made at nonprogrammed sites (about 3 × 10⁻⁵ per codon).
This type of frameshifting has two common features:
A “slippery” sequence allows an aminoacyl-tRNA to pair with its codon and then to move +1 or −1 base to pair with an overlapping triplet sequence that can also pair with its anticodon.
The ribosome is delayed at the frameshifting site to allow time for the aminoacyl-tRNA to rearrange its pairing. The cause of the delay can be an adjacent codon that requires a scarce aminoacyl-tRNA, a termination codon that is recognized slowly by its release factor, or a structural impediment in mRNA (e.g., a “pseudoknot,” a particular conformation of RNA) that impedes the ribosome.
Slippery events can involve movement in either direction: A −1 frameshift is caused when the tRNA moves backward, and a +1 frameshift is caused when it moves forward. In either case, the result is to expose an out-of-phase triplet in the A site for the next aminoacyl-tRNA. The frameshifting event occurs before peptide bond formation. In the most common type of case, when it is triggered by a slippery sequence in conjunction with a downstream hairpin in mRNA, the surrounding sequences influence its efficiency.
The frameshifting in Figure 23.23 shows the behavior of a typical slippery sequence. The seven-nucleotide sequence CUUAGGC is usually recognized by tRNALeu at CUU, followed by tRNAArg at AGG. However, tRNAArg is scarce and when its scarcity results in a delay, tRNALeu slips from the CUU codon to the overlapping UUA triplet. This causes a frameshift because the next triplet in phase with the new pairing (GGC) is read by tRNAGly. Slippage usually occurs in the P site (when tRNALeu actually has become peptidyl-tRNA, carrying the nascent chain).
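The consequence of this +1 slip can be traced in a short sketch. Only the seven-nucleotide sequence CUUAGGC comes from the text; the flanking start codon and downstream bases are hypothetical, chosen so that the original frame ends at a stop codon soon after the slippery site. The function name `translate` and its `slip_after` and `shift` parameters are illustrative, and the slip is assumed to happen every time.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G ("*" marks a stop).
BASES = "UCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna, slip_after=None, shift=0):
    """Translate from the first base. After decoding the codon that starts
    at index `slip_after`, the reading frame moves by `shift` bases."""
    peptide, i = [], 0
    while i + 3 <= len(mrna):
        aa = CODE[mrna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
        # Advance one codon, plus the slip if this is the slippery codon.
        i += 3 + (shift if i == slip_after else 0)
    return "".join(peptide)


# Hypothetical context around the slippery sequence CUUAGGC.
mrna = "AUG" + "CUUAGGC" + "AAUGAGCUUUAA"

print(translate(mrna))                          # in frame: Leu (CUU), Arg (AGG), ...
print(translate(mrna, slip_after=3, shift=1))   # +1 slip: Leu, then Gly (GGC), ...
```

Without the slip, CUU is followed by AGG and the original frame soon reaches its stop codon. With the +1 slip, the base A is skipped, GGC is read as glycine, and translation continues in the new frame past that stop codon.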
Frameshifting at a stop codon allows the ribosome to read past it, extending the polypeptide. The base on the 3′ side of the stop codon influences the relative frequencies of termination and frameshifting and thus affects the efficiency of the termination signal. This helps to explain the significance of context on termination.
Bypassing involves a movement of the ribosome to change the codon that is paired with the peptidyl-tRNA in the P site. The sequence between the two codons is skipped over and is not represented in the polypeptide product. As shown in FIGURE 23.24, this allows translation to continue past any termination codons in the intervening region. This is a very rare phenomenon; one of the few authenticated examples is that of gene 60 of phage T4, where the ribosome moves 60 nucleotides along the mRNA. Bypassing in individual cells has also been documented to be a result of nutrient starvation.
FIGURE 23.24 Bypassing occurs when the ribosome moves along mRNA so that the peptidyl-tRNA in the P site is released from pairing with its codon and then re-pairs with another codon farther along.
The key to the bypass system is that there are identical (or synonymous) codons at either end of the skipped sequence. These are sometimes referred to as the “takeoff” and “landing” sites. Before bypass, the ribosome is positioned with a peptidyl-tRNA paired with the takeoff codon in the P site, with an empty A site waiting for an aminoacyl-tRNA to enter. FIGURE 23.25 shows that the ribosome slides along mRNA in this condition until the peptidyl-tRNA can become paired with the codon in the landing site.
FIGURE 23.25 In bypass mode, a ribosome with its P site occupied can stop translation. It slides along mRNA to a site where peptidyl-tRNA pairs with a new codon in the P site. Then translation is resumed.
The sequence of the mRNA triggers the bypass. The important features are the two GGA codons for takeoff and landing, the spacing between them, a stem-loop structure that includes the takeoff codon, and a stop codon positioned adjacent to the takeoff codon.
The takeoff stage requires the peptidyl-tRNA to unpair from its codon. This is followed by a movement of the mRNA that prevents it from re-pairing. Then the ribosome scans the mRNA until the peptidyl-tRNA can re-pair with the codon in the landing reaction. This is followed by the resumption of translation when aminoacyl-tRNA enters the A site in the usual way.
Like frameshifting, the bypass reaction depends on a pause by the ribosome. The probability that peptidyl-tRNA will dissociate from its codon in the P site is increased by delays in the entry of aminoacyl-tRNA into the A site. Starvation for an amino acid can trigger bypassing in bacterial genes because of the delay that occurs when there is no aminoacyl-tRNA available to enter the A site. In phage T4 gene 60, one role of mRNA structure may be to reduce the efficiency of termination, thus creating the delay that is needed for the takeoff reaction.
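The takeoff, scanning, and landing steps can be sketched as a toy model. The mRNA is hypothetical and much shorter than the real gap in phage T4 gene 60; the model triggers a bypass whenever the takeoff codon is immediately followed by a stop codon, and it ignores the stem-loop, the spacing requirement, and the fact that real bypassing is inefficient. The function name `translate` and its `takeoff` parameter are illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G ("*" marks a stop).
BASES = "UCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna, takeoff=None):
    """Translate from the first base. If `takeoff` is given and that codon is
    followed by a stop codon, slide to the next copy of the same codon
    (the landing site) and resume translation after it."""
    peptide, i = [], 0
    while i + 3 <= len(mrna):
        codon = mrna[i:i + 3]
        aa = CODE[codon]
        if aa == "*":
            break
        peptide.append(aa)
        i += 3
        if codon == takeoff and CODE.get(mrna[i:i + 3]) == "*":
            # Scan base by base (any frame) for a matching landing codon.
            landing = mrna.find(takeoff, i)
            if landing != -1:
                i = landing + 3      # peptidyl-tRNA re-pairs; the next codon enters the A site
                takeoff = None       # model a single bypass event
    return "".join(peptide)


# Hypothetical message: takeoff GGA, adjacent stop, 8-base gap, landing GGA.
mrna = "AUG" "AAA" "GGA" "UAG" "CUUACGUC" "GGA" "CUU" "UAA"

print(translate(mrna))                  # no bypass: terminates at UAG
print(translate(mrna, takeoff="GGA"))   # bypass: gap is skipped, Leu is added
```

The landing codon contributes no additional amino acid, because the peptidyl-tRNA that pairs with it already carries the glycine added at the takeoff site. The gap here is deliberately not a multiple of three, since scanning proceeds base by base rather than codon by codon.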
The rescue of stalled ribosomes in bacteria and some mitochondria is accomplished by means of a unique mRNA–tRNA hybrid, termed tmRNA, which contains two functional domains. One domain mimics part of tRNAAla, whereas the second domain encodes a short polypeptide. tmRNA is first aminoacylated by alanyl-tRNA synthetase (AlaRS). It is then bound by EF-Tu and subsequently used in a ternary complex at the A site of stalled ribosomes. Peptidyl transfer occurs on the ribosome to join alanine to the C-terminal end of the stalled nascent protein; simultaneously, the mRNA present on the ribosome is replaced by the second domain of tmRNA. tmRNA then functions as a template for the synthesis of 10 additional amino acids, after which a stop codon is present to terminate translation and release the protein. The newly added C-terminal sequence then acts as a tag for subsequent recognition by proteases, which degrade the truncated protein. tmRNA thus functions as a quality-control mechanism to recycle stalled ribosomes and to remove truncated proteins that might otherwise accumulate.
The sequence of mRNA read in triplets in the 5′ to 3′ direction is related by the genetic code to the amino acid sequence of a polypeptide read from the N-terminus to the C-terminus. Of the 64 triplets, 61 encode amino acids and 3 provide termination signals. Synonymous codons that represent the same amino acids are related, often by a difference in the third base of the codon. This third-base degeneracy, coupled with a pattern in which chemically similar amino acids tend to be encoded by related codons, minimizes the effects of mutations. The genetic code is nearly universal and must have been established very early in evolution. Variations in the code in nuclear genomes are rare, but some changes have occurred during mitochondrial evolution.
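The counts quoted in this summary can be checked directly against the standard code table. The sketch below builds the table from the conventional U, C, A, G ordering and counts sense codons, stop codons, and the two-base prefixes for which the third base does not affect the meaning. The variable and function names are illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G ("*" marks a stop).
BASES = "UCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

sense = [codon for codon, aa in CODE.items() if aa != "*"]
stops = sorted(codon for codon, aa in CODE.items() if aa == "*")


def third_base_irrelevant(prefix):
    """True if all four codons sharing this two-base prefix mean the same thing."""
    return len({CODE[prefix + base] for base in BASES}) == 1


families = ["".join(p) for p in product(BASES, repeat=2)
            if third_base_irrelevant("".join(p))]

print(len(CODE), "codons:", len(sense), "sense,", len(stops), "stop", stops)
print(len(families), "of 16 prefixes are fully third-base degenerate:", families)
```

In half of the 16 two-base prefixes the third base is completely irrelevant; in most of the rest, the third base distinguishes only between a purine and a pyrimidine.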
Multiple tRNAs may recognize a particular codon. The set of tRNAs recognizing the various codons for each amino acid is distinctive for each organism. Codon–anticodon recognition involves wobbling at the first position of the anticodon (third position of the codon), which allows some tRNAs to recognize multiple codons. All tRNAs have modified bases, introduced by enzymes that recognize target bases in the tRNA structure. Codon–anticodon pairing is influenced by modifications of the anticodon itself and also by the context of adjacent bases, especially on the 3′ side of the anticodon. Taking advantage of codon–anticodon wobble allows vertebrate mitochondria to use only 22 tRNAs to recognize all codons, compared with the usual minimum of 31 tRNAs; this is assisted by the changes in the mitochondrial code.
Each amino acid is recognized by a particular aminoacyl-tRNA synthetase, which also recognizes all of the tRNAs that accept that amino acid. Some aminoacyl-tRNA synthetases have a proofreading function that scrutinizes the aminoacyl-tRNA products and hydrolyzes incorrectly joined aminoacyl-tRNAs.
Aminoacyl-tRNA synthetases vary widely but fall into two general groups featuring mutually exclusive sequence motifs and protein structures in their catalytic domains. The two groups of synthetases are also distinguished by the initial site of aminoacylation on the 3′-terminal tRNA ribose, by the orientation of binding of the tRNA acceptor helix, and by the rate-limiting step in aminoacylation. A defined set of nucleotides in the tRNA, termed the identity set, is selectively recognized by the synthetase using a combination of direct and indirect readout mechanisms. In many cases the identity set is localized at the anticodon and 3′-acceptor ends of the molecule.
Mutations may allow a tRNA to read different codons; the most common form of such mutations occurs in the anticodon itself. Alteration of the anticodon may allow a tRNA to suppress a mutation in a gene encoding a polypeptide. A tRNA that recognizes a termination codon provides a nonsense suppressor, whereas a tRNA that changes the amino acid inserted in response to a codon is a missense suppressor. Suppressors of UAG codons are more efficient than those of UAA codons, which is explained by the fact that UAA is the most commonly used natural termination codon. However, the efficiency of all suppressors depends on the context of the individual target codon.
Frameshifts of the +1 type may be caused by aberrant tRNAs that read “codons” of four bases. Frameshifts of either +1 or −1 may be caused by slippery sequences in mRNA that allow a peptidyl-tRNA to slip from its codon to an overlapping sequence that can also pair with its anticodon. Certain programmed frameshifts determined by the mRNA sequence may be required for expression of natural genes. Bypassing occurs when a ribosome stops translation and moves along mRNA with its peptidyl-tRNA in the P site until the peptidyl-tRNA pairs with an appropriate codon; then translation resumes. The use of tmRNA provides a quality-control mechanism to recycle stalled ribosomes and to remove undesirable truncated polypeptide products.
Jacks, T., Power, M. D., Masiarz, F. R., Luciw, P. A., Barr, P. J., and Varmus, H. E. (1988). Characterization of ribosomal frameshifting in HIV-1 gag-pol expression. Nature 331, 280–283.
Herr, A. J., Atkins, J. F., and Gesteland, R. F. (2000). Coupling of open reading frames by translational bypassing. Annu. Rev. Biochem. 69, 343–372.
Gallant, J. A., and Lindsley, D. (1998). Ribosomes can slide over and beyond “hungry” codons, resuming protein chain elongation many nucleotides downstream. Proc. Natl. Acad. Sci. USA 95, 13771–13776.
Huang, W. M., Ao, S. Z., Casjens, S., Orlandi, R., Zeikus, R., Weiss, R., Winge, D., and Fang, M. (1988). A persistent untranslated sequence within bacteriophage T4 DNA topoisomerase gene 60. Science 239, 1005–1012.
Samatova, E., Konevega, A. L., Wills, N. M., Atkins, J. F., and Rodnina, M. V. (2014). High-efficiency translational bypassing of non-coding nucleotides specified by mRNA structure and nascent peptide. Nat. Commun. 5, doi:10.1038/ncomms5459