RNA Metabolism
26.1 DNA-Dependent Synthesis of RNA
26.2 RNA Processing
26.3 RNA-Dependent Synthesis of RNA and DNA
Expression of the information in a gene generally involves
production of an RNA molecule transcribed from a DNA template.
Strands of RNA and DNA may seem quite similar at first glance,
differing only in that RNA has a hydroxyl group at the 2' position
of the aldopentose, and uracil instead of thymine. However, unlike
DNA, most RNAs carry out their functions as single strands, strands
that fold back on themselves and have the potential for much greater
structural diversity than DNA (Chapter 8). RNA is thus suited to a
variety of cellular functions.
RNA is the only macromolecule known to have a role both in the
storage and transmission of information and in catalysis, which has
led to much speculation about its possible role as an essential
chemical intermediate in the development of life on this planet. The
discovery of catalytic RNAs, or ribozymes, has changed the very
definition of an enzyme, extending it beyond the domain of proteins.
Proteins nevertheless remain essential to RNA and its functions. In
the modern cell, all nucleic acids, including RNAs, are complexed
with proteins. Some of these complexes are quite elaborate, and RNA
can assume both structural and catalytic roles within complicated
biochemical machines.
All RNA molecules except the RNA genomes of certain viruses are
derived from information permanently stored in DNA. During
transcription, an enzyme system converts the genetic information in
a segment of double-stranded DNA into an RNA strand with a base
sequence complementary to one of the DNA strands. Three major kinds
of RNA are produced. Messenger RNAs (mRNAs) encode the amino acid
sequence of one or more polypeptides specified by a gene or set of
genes. Transfer RNAs (tRNAs) read the information encoded in the
mRNA and transfer the appropriate amino acid to a growing polypeptide
chain during protein synthesis. Ribosomal RNAs (rRNAs) are
constituents of ribosomes, the intricate cellular machines that
synthesize proteins. Many additional specialized RNAs have regulatory
or catalytic functions or are precursors to the three main classes of
RNA. These special-function RNAs are no longer thought of as minor
species in the catalog of cellular RNAs. In vertebrates, RNAs that do
not fit into one of the classical categories (mRNA, tRNA, rRNA) seem
to vastly outnumber those that do.
During replication the entire chromosome is usually copied, but
transcription is more selective. Only particular genes or groups of
genes are transcribed at any one time, and some portions of the DNA
genome are never transcribed. The cell restricts the expression of
genetic information to the formation of gene products needed at any
particular moment. Specific regulatory sequences mark the beginning
and end of the DNA segments to be transcribed and designate which
strand in duplex DNA is to be used as the template. The transcript
itself may interact with other RNA molecules as part of the overall
regulatory program. The regulation of transcription is described in
detail in Chapter 28.
The sum of all the RNA molecules produced in a cell
under a given set of conditions is called the cellular
transcriptome. Given the relatively small fraction of
the human genome devoted to protein-encoding genes,
we might have expected that only a small part of the hu-
man genome is transcribed. This is not the case. Modern
microarray analysis of transcription patterns has re-
vealed that much of the genome of humans and other
mammals is transcribed into RNA. The products are pre-
dominantly not mRNAs, tRNAs, or rRNAs, but rather
special-function RNAs, a host of which are being discov-
ered. Many of these seem to be involved in regulation of
gene expression; however, the rapid pace of discovery
has forced the realization that we do not yet know what
many of these RNAs do.
In this chapter we examine the synthesis of RNA on
a DNA template and the postsynthetic processing and
turnover of RNA molecules. In doing so we encounter
many of the specialized functions of RNA, including cat-
alytic functions. Interestingly, the substrates for RNA
enzymes are often other RNA molecules. We also describe
systems in which RNA is the template and DNA
the product, rather than vice versa. The information
pathways thus come full circle, and reveal that template-
dependent nucleic acid synthesis has standard rules
regardless of the nature of template or product (RNA
or DNA). This examination of the biological intercon-
version of DNA and RNA as information carriers leads
to a discussion of the evolutionary origin of biological
information.
26.1 DNA-Dependent Synthesis of RNA
Our discussion of RNA synthesis begins with a comparison between
transcription and DNA replication (Chapter 25). Transcription
resembles replication in its fundamental chemical mechanism, its
polarity (direction of synthesis), and its use of a template. And
like replication, transcription has initiation, elongation, and
termination phases, though in the literature on transcription,
initiation is further divided into discrete phases of DNA binding and
initiation of RNA synthesis. Transcription differs from replication
in that it does not require a primer and, generally, involves only
limited segments of a DNA molecule. Additionally, within transcribed
segments only one DNA strand serves as a template for a particular
RNA molecule.
RNA Is Synthesized by RNA Polymerases
The discovery of DNA polymerase and its dependence on a DNA template
spurred a search for an enzyme that synthesizes RNA complementary to
a DNA strand. By 1960, four research groups had independently
detected an enzyme in cellular extracts that could form an RNA
polymer from ribonucleoside 5'-triphosphates. Subsequent work on the
purified Escherichia coli RNA polymerase helped to define the
fundamental properties of transcription (Fig. 26-1). DNA-dependent
RNA polymerase requires, in addition to a DNA template, all four
ribonucleoside 5'-triphosphates (ATP, GTP, UTP, and CTP) as
precursors of the nucleotide units of RNA, as well as Mg²⁺. The
protein also binds one Zn²⁺. The chemistry and mechanism of RNA
synthesis closely resemble those used by DNA polymerases (see
Fig. 25-5). RNA polymerase elongates an RNA strand by adding
ribonucleotide units to the 3'-hydroxyl end, building RNA in the
5'→3' direction. The 3'-hydroxyl group acts as a nucleophile,
attacking the α phosphate of the incoming ribonucleoside
triphosphate (Fig. 26-1b) and releasing pyrophosphate. The overall
reaction is
$$\underset{\text{RNA}}{(\text{NMP})_n} + \text{NTP} \longrightarrow \underset{\text{Lengthened RNA}}{(\text{NMP})_{n+1}} + \text{PP}_i$$
RNA polymerase requires DNA for activity and is most
active when bound to a double-stranded DNA. As noted
above, only one of the two DNA strands serves as a template.
The template DNA strand is copied in the 3'→5'
direction (antiparallel to the new RNA strand), just as
in DNA replication. Each nucleotide in the newly
formed RNA is selected by Watson-Crick base-pairing
interactions; U residues are inserted in the RNA to pair
with A residues in the DNA template, G residues are
inserted to pair with C residues, and so on. Base-pair
geometry (see Fig. 25-6) may also play a role in base
selection.
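
The pairing and polarity rules just described can be sketched in a few lines of Python. This is a minimal illustration rather than a model of the enzyme: the function name `transcribe` and the `PAIRING` table are hypothetical helpers introduced here, and the example sequences are the two DNA strands shown in Figure 26-2.

```python
# Watson-Crick pairing between a DNA template base and the incoming ribonucleotide.
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5'->3') copied from a DNA template written 3'->5'."""
    rna = []
    for base in template_3_to_5.upper():
        # Each new residue is appended to the 3' end of the growing chain.
        rna.append(PAIRING[base])
    return "".join(rna)


template = "GCGATATCGCAAA"  # template strand, written 3'->5'
coding = "CGCTATAGCGTTT"    # nontemplate (coding) strand, written 5'->3'

rna = transcribe(template)
print(rna)
# The transcript should equal the coding strand with U in place of T.
print(rna == coding.replace("T", "U"))
```

Reading the template 3'→5' while appending to the 3' end of the product is what makes the two strands antiparallel in this sketch.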
Unlike DNA polymerase, RNA polymerase does not require a primer to
initiate synthesis. Initiation occurs when RNA polymerase binds at
specific DNA sequences called promoters (described below). The
5'-triphosphate group of the first residue in a nascent (newly
formed) RNA molecule is not cleaved to release PPi, but instead
remains intact throughout the transcription process. During the
elongation phase of transcription, the growing end of the new RNA
strand base-pairs temporarily with the DNA template to form a short
hybrid RNA-DNA double helix, estimated to be 8 bp long (Fig. 26-1a).
The RNA in this hybrid duplex "peels off" shortly after its
formation, and the DNA duplex re-forms.
To enable RNA polymerase to synthesize an RNA
strand complementary to one of the DNA strands, the
DNA duplex must unwind over a short distance, form-
ing a transcription "bubble." During transcription, the
E. coli RNA polymerase generally keeps about 17 bp
unwound. The 8 bp RNA-DNA hybrid occurs in this
unwound region. Elongation of a transcript by E. coli
RNA polymerase proceeds at a rate of 50 to 90 nu-
cleotides/s. Because DNA is a helix, movement of a
transcription bubble requires considerable strand ro-
tation of the nucleic acid molecules. DNA strand rota-
tion is restricted in most DNAs by DNA-binding
proteins and other structural barriers. As a result, a
moving RNA polymerase generates waves of positive
supercoils ahead of the transcription bubble and negative
supercoils behind (Fig. 26-1c). This has been observed
both in vitro and in vivo (in bacteria). In the
cell, the topological problems caused by transcription
are relieved through the action of topoisomerases
(Chapter 24).
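
A rough calculation gives a sense of how much rotation is involved. The sketch below assumes a helical repeat of about 10.5 bp per turn for B-form DNA, a standard value that is not stated in this section, and uses a hypothetical 1,000-nucleotide transcript; the function names are illustrative only.

```python
BP_PER_TURN = 10.5  # assumed helical repeat of B-form DNA (not given in this section)


def turns_per_second(rate_nt_per_s: float, bp_per_turn: float = BP_PER_TURN) -> float:
    """Helical turns that must be unwound ahead of the polymerase each second."""
    return rate_nt_per_s / bp_per_turn


def seconds_to_transcribe(length_nt: int, rate_nt_per_s: float) -> float:
    """Time needed to elongate a transcript of the given length at a constant rate."""
    return length_nt / rate_nt_per_s


# Elongation rates quoted in the text: 50 to 90 nucleotides per second.
for rate in (50, 90):
    turns = turns_per_second(rate)
    seconds = seconds_to_transcribe(1000, rate)
    print(f"{rate} nt/s: {turns:.1f} turns/s, {seconds:.1f} s per 1,000 nt")
```

Under these assumptions the DNA would have to rotate several times per second relative to the polymerase, which is consistent with the need for topoisomerases to relieve the resulting strain.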
MECHANISM FIGURE 26-1 Transcription by RNA polymerase in E. coli.
For synthesis of an RNA strand complementary to one of two DNA
strands in a double helix, the DNA is transiently unwound.
(a) About 17 bp are unwound at any given time. RNA polymerase and the
transcription bubble move from left to right along the DNA as shown,
facilitating RNA synthesis. The DNA is unwound ahead and rewound
behind as RNA is transcribed. Red arrows show the direction in which
the DNA must rotate to permit this process. As the DNA is rewound,
the RNA-DNA hybrid is displaced and the RNA strand extruded. The RNA
polymerase is in close contact with the DNA ahead of the
transcription bubble, as well as with the separated DNA strands and
the RNA within and immediately behind the bubble. A channel in the
protein funnels new nucleoside triphosphates (NTPs) to the polymerase
active site. The polymerase footprint encompasses about 35 bp of DNA
during elongation.
(b) Catalytic mechanism of RNA synthesis by RNA polymerase. Note that
this is essentially the same mechanism used by DNA polymerases (see
Fig. 25-5b). The reaction involves two Mg²⁺ ions, coordinated to the
phosphate groups of the incoming NTP and to three Asp residues
(Asp⁴⁶⁰, Asp⁴⁶², and Asp⁴⁶⁴ in the β' subunit of the E. coli RNA
polymerase), which are highly conserved in the RNA polymerases of all
species. One Mg²⁺ ion facilitates attack by the 3'-hydroxyl group on
the α phosphate of the NTP; the other Mg²⁺ ion facilitates
displacement of the pyrophosphate; and both metal ions stabilize the
pentacovalent transition state.
(c) Changes in the supercoiling of DNA brought about by
transcription. Movement of an RNA polymerase along DNA tends to
create positive supercoils (overwound DNA) ahead of the transcription
bubble and negative supercoils (underwound DNA) behind it. In a cell,
topoisomerases rapidly eliminate the positive supercoils and regulate
the level of negative supercoiling (Chapter 24).
[Panel (a) labels: template strand, rewinding, unwinding, NTP channel, RNA-DNA hybrid (~8 bp), active site, RNA polymerase, direction of transcription. Panel (c) labels: negative supercoils, positive supercoils, direction of transcription.]
FIGURE 26-2 Template and nontemplate (coding) DNA strands.
The two complementary strands of DNA are defined by their
function in transcription. The RNA transcript is synthesized on the
template strand and is identical in sequence (with U in place of T)
to the nontemplate strand, or coding strand.
(5') CGCTATAGCGTTT (3')   DNA nontemplate (coding) strand
(3') GCGATATCGCAAA (5')   DNA template strand
(5') CGCUAUAGCGUUU (3')   RNA transcript
FIGURE 26-3 Organization of coding information in the adenovirus
genome. The genetic information of the adenovirus genome (a
conveniently simple example) is encoded by a double-stranded DNA
molecule of 36,000 bp, both strands of which encode proteins. The
information for most proteins is encoded by (that is, identical to)
the top strand, which by convention is the strand oriented 5' to 3'
from left to right. The bottom strand acts as template for these
transcripts. However, a few proteins are encoded by the bottom
strand, which is transcribed in the opposite direction (and uses the
top strand as template). Synthesis of mRNAs in adenovirus is actually
much more complex than shown here. Many of the mRNAs shown for the
upper strand are initially synthesized as a single, long transcript
(25,000 nucleotides), which is then extensively processed to produce
the separate mRNAs. Adenovirus causes upper respiratory tract
infections in some vertebrates.
KEY CONVENTION: The two complementary DNA strands have different
roles in transcription. The strand that serves as template for RNA
synthesis is called the template strand. The DNA strand
complementary to the template, the nontemplate strand, or coding
strand, is identical in base sequence to the RNA transcribed from
the gene, with U in the RNA in place of T in the DNA (Fig. 26-2). The
coding strand for a particular gene may be located in either strand
of a given chromosome (as shown in Fig. 26-3 for a virus). By
convention, the regulatory sequences that control transcription
(described later in this chapter) are designated by the sequences in
the coding strand.
The DNA-dependent RNA polymerase of E. coli is a large, complex
enzyme with five core subunits (α₂ββ'ω; Mr 390,000) and a sixth
subunit, one of a group designated σ, with variants designated by
size (molecular weight). The σ subunit binds transiently to the core
and directs the enzyme to specific binding sites on the DNA
(described below). These six subunits constitute the RNA polymerase
holoenzyme (Fig. 26-4). The RNA polymerase holoenzyme of E. coli thus
exists in several forms, depending on the type of σ subunit. The most
common subunit is σ⁷⁰ (Mr 70,000), and the upcoming discussion
focuses on the corresponding RNA polymerase holoenzyme.
RNA polymerases lack a separate proofreading 3'→5' exonuclease
active site (such as that of many DNA polymerases), and the error
rate for transcription is higher than that for chromosomal DNA
replication: approximately one error for every 10⁴ to 10⁵
ribonucleotides incorporated into RNA. Because many copies of an RNA
are generally produced from a single gene and all RNAs are eventually
degraded and replaced, a mistake in an RNA molecule is of less
consequence to the cell than a mistake in the permanent information
stored in DNA. Many RNA polymerases, including bacterial RNA
polymerase and the eukaryotic RNA polymerase II (discussed below), do
pause when a mispaired base is added during transcription, and they
can remove mismatched nucleotides from the 3' end of a transcript by
direct reversal of the polymerase reaction. But we do not yet know
whether this activity is a true proofreading function and to what
extent it may contribute to the fidelity of transcription.
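
To put the quoted error rate in perspective, the following sketch estimates how many errors a single transcript would carry. It assumes, purely for illustration, that every position is miscopied independently with the same probability and that the transcript is 1,000 nucleotides long; both function names are hypothetical.

```python
def expected_errors(length_nt: int, error_rate: float) -> float:
    """Mean number of misincorporated residues in one transcript."""
    return length_nt * error_rate


def fraction_error_free(length_nt: int, error_rate: float) -> float:
    """Probability that a transcript has no errors at all.

    Assumes each position is miscopied independently with the same probability.
    """
    return (1.0 - error_rate) ** length_nt


# Error rates from the text: one error per 10^4 to 10^5 nucleotides.
for rate in (1e-4, 1e-5):
    mean = expected_errors(1000, rate)
    clean = fraction_error_free(1000, rate)
    print(f"rate {rate:g}: {mean:.2f} errors expected, {clean:.3f} error-free")
```

Under these assumptions roughly 90% to 99% of such transcripts would be error-free, which fits the argument that occasional errors in a short-lived, multicopy RNA are tolerable.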
FIGURE 26-4 Structure of the RNA polymerase holoenzyme of the
bacterium Thermus aquaticus. (Derived from PDB ID 1IW7.) The overall
structure of this enzyme is very similar to that of the E. coli RNA
polymerase; no DNA or RNA is shown here. The β subunit is in gray,
the β' subunit is white; the two α subunits are different shades of
red; the ω subunit is yellow; the σ subunit is orange. The image on
the left is oriented as in Figure 26-6. When the structure is rotated
180° about the y axis (right) the small ω subunit is visible.
RNA Synthesis Begins at Promoters
Initiation of RNA synthesis at random points in a DNA molecule would
be an extraordinarily wasteful process. Instead, an RNA polymerase
binds to specific sequences in the DNA called promoters, which direct
the transcription of adjacent segments of DNA (genes). The sequences
where RNA polymerases bind can be quite variable, and much research
has focused on identifying the particular sequences that are critical
to promoter function.
In E. coli, RNA polymerase binding occurs within a region stretching
from about 70 bp before the transcription start site to about 30 bp
beyond it. By convention, the DNA base pairs that correspond to the
beginning of an RNA molecule are given positive numbers, and those
preceding the RNA start site are given negative numbers. The promoter
region thus extends between positions -70 and +30. Analyses and
comparisons of the most common class of bacterial promoters (those
recognized by an RNA polymerase holoenzyme containing σ⁷⁰) have
revealed similarities in two short sequences centered about positions
-10 and -35 (Fig. 26-5). These sequences are important interaction
sites for the σ⁷⁰ subunit. Although the sequences are not identical
for all bacterial promoters in this class, certain nucleotides that
are particularly common at each position form a consensus sequence
(recall the E. coli oriC consensus sequence; see Fig. 25-11). The
consensus sequence at the -10 region is (5')TATAAT(3'); the consensus
sequence at the -35 region is (5')TTGACA(3'). A third AT-rich
recognition element, called the UP (upstream promoter) element,
occurs between positions -40 and -60 in the promoters of certain
highly expressed genes. The UP element is bound by the α subunit of
RNA polymerase. The efficiency with which an RNA polymerase
containing σ⁷⁰ binds to a promoter and initiates transcription is
determined in large measure by these sequences, the spacing between
them, and their distance from the transcription start site.
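
The idea of scoring a sequence against the two consensus hexamers can be sketched as follows. This is a toy search, not a validated promoter-prediction method: the scoring scheme (simple match counting), the 15 to 19 bp spacer window, the function names, and the synthetic test sequence are all assumptions made for illustration.

```python
MINUS_35 = "TTGACA"  # consensus at the -35 region (from the text)
MINUS_10 = "TATAAT"  # consensus at the -10 region (from the text)


def matches(hexamer: str, consensus: str) -> int:
    """Count positions at which a 6-nt window agrees with the consensus."""
    return sum(a == b for a, b in zip(hexamer, consensus))


def best_promoter_hit(seq: str, spacers=range(15, 20)):
    """Return (score, -35 start index, -10 start index) for the best hexamer pair.

    The 15-19 bp spacer window is an assumption chosen for illustration.
    Returns None if the sequence is too short to hold both hexamers.
    """
    seq = seq.upper()
    best = None
    for i in range(len(seq) - 5):
        for gap in spacers:
            j = i + 6 + gap  # start of the candidate -10 hexamer
            if j + 6 > len(seq):
                continue
            score = matches(seq[i:i + 6], MINUS_35) + matches(seq[j:j + 6], MINUS_10)
            if best is None or score > best[0]:
                best = (score, i, j)
    return best


# Synthetic sequence (not a real promoter): perfect hexamers with a 17-nt spacer.
demo = "GG" + "TTGACA" + "C" * 17 + "TATAAT" + "GGGGGGA"
print(best_promoter_hit(demo))  # (12, 2, 25)
```

A score of 12 means both hexamers match the consensus exactly; natural promoters generally score lower, which is one simple way to picture why promoter strength varies.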
Many independent lines of evidence attest to the functional
importance of the sequences in the -35 and -10 regions. Mutations
that affect the function of a given promoter often involve a base
pair in these regions. Variations in the consensus sequence also
affect the efficiency of RNA polymerase binding and transcription
initiation. A change in only one base pair can decrease the rate of
binding by several orders of magnitude. The promoter sequence thus
establishes a basal level of expression that can vary greatly from
one E. coli gene to the next. A method that provides information
about the interaction between RNA polymerase and promoters is
illustrated in Box 26-1.
The pathway of transcription initiation and the fate of the σ
subunit are becoming much better defined (Fig. 26-6a). The pathway
consists of two major parts, binding and initiation, each with
multiple steps. First, the polymerase, directed by its bound σ
factor, binds to the promoter. A closed complex (in which the bound
DNA is intact) and an open complex (in which the bound DNA is intact
and partially unwound near the -10 sequence) form in succession.
Second, transcription is initiated within the complex, leading to a
conformational change that converts the complex to the elongation
form,
[Figure 26-5 rows: consensus sequence, rrnB P1, trp, lac, recA, araBAD]
FIGURE 26-5 Typical E. coli promoters recognized by an RNA polymerase
holoenzyme containing σ⁷⁰. Sequences of the nontemplate strand are
shown, read in the 5'→3' direction, as is the convention for
representations of this kind. The sequences vary from one promoter to
the next, but comparisons of many promoters reveal similarities,
particularly in the -10 and -35 regions. The sequence element UP, not
present in all E. coli promoters, is shown in the P1 promoter for the
highly expressed rRNA gene rrnB. UP elements, generally occurring in
the region between -40 and -60, strongly stimulate transcription at
the promoters that contain them. The UP element in the rrnB P1
promoter encompasses the region between -38 and -59. The consensus
sequence for E. coli promoters recognized by σ⁷⁰ is shown secon