5′ end cDNA amplification using classic RACE
Elizabeth Scotto–Lavino1,2, Guangwei Du2 & Michael A Frohman1–3
1Graduate Program in Molecular & Cellular Pharmacology; 2Department of Pharmacological Sciences & Center for Developmental Genetics, Stony Brook University,
Stony Brook, New York 11794, U.S.A. 3Correspondence should be addressed to M.A.F. (michael@pharm.stonybrook.edu).
Published online 29 December 2006; doi:10.1038/nprot.2006.480
The 5′ ends of transcripts provide important information about transcription initiation sites and the approximate locations of local
cis-acting enhancer elements; it is therefore important to establish the 5′ ends with some precision. RACE (rapid amplification of
cDNA ends) PCR is useful for quickly obtaining full-length cDNAs for mRNAs for which only part of the sequence is known and for
identifying alternative 5′ or 3′ ends of fully sequenced genes. The method consists of using PCR to amplify, from complex mixtures of
cellular mRNA, the regions between the known parts of the sequence and non-specific tags appended to the ends of the cDNA.
Whereas the poly(A) tail serves to provide such a tag at the 3′ end of the mRNA, an artificial one needs to be generated at the 5′ end,
and various approaches have been described to address this step. The classical scheme for 5′ RACE described here is simple, suffices
in many instances in which RACE is needed and can be performed in 1–3 days.
INTRODUCTION
Most attempts to identify and isolate a novel cDNA result in the
acquisition of clones that represent only a part of the mRNA’s
complete sequence. The ever-growing collections of sequenced
genomes and high-quality cDNA libraries can often facilitate
acquisition of the remainder of the transcript. For less well-
characterized organisms, or for low-abundance cDNAs even in
well-characterized organisms, such information is often not available,
particularly at the 5′ end of the transcript. Obtaining a full-length
cDNA at the 5′ end ensures that the entire protein-coding region has
been identified and yields information concerning the transcription
initiation site. In some instances, 5′ untranslated regions encode
structural information that is relevant to mRNA stability, restricted
subcellular localization or translational efficiency.
The missing sequence (cDNA ends) can be cloned by PCR, using
a technique variously called rapid amplification of cDNA ends
(RACE)1, anchored PCR2 or one-sided PCR3 (Fig. 1). Since
the initial reports describing this technique, many labs and com-
panies have developed significant improvements on the basic
approach4–15. A protocol for 5′ end cDNA amplification by
classic RACE is presented here. 3′ end cDNA amplification can
also be performed using a classic RACE protocol as described in a
separate protocol16. A more complex but also more powerful
approach (new RACE), which has evolved from the work of a
number of laboratories17–24 is also described in a separate proto-
col25. Commercial RACE kits and libraries are available from many
companies that are more convenient but often not as powerful
as the versions described here.
Classic RACE
PCR is used to amplify partial cDNAs that represent the region
between a single point in an mRNA transcript and its 3′ or 5′ end
(Figs. 1, 2). A short internal stretch of sequence must already be
known from the mRNA of interest. From this sequence, gene-
specific primers (GSPs) are chosen that are oriented in the direction
of the missing sequence. Extension of the partial cDNAs from the
unknown end of the message back to the known region is achieved
using primers that anneal to the pre-existing poly(A) tail (3′ end) or
an appended homopolymer tail or linker (5′ end). Using RACE,
enrichments in the order of 10⁶–10⁷-fold can be obtained. As a
result, relatively pure cDNA ‘ends’ are generated that can be easily
cloned or rapidly characterized using conventional techniques1.
To generate ‘3′ end’ partial cDNA clones, mRNA is reverse
transcribed using a ‘hybrid’ primer (Qtotal; QT) that consists of
two mixed bases (GATC or GAC), followed by (T)17, followed by a
unique 35-base oligonucleotide sequence (QI–QO; Fig. 2a,c).
Amplification is then performed using a primer containing part
of this sequence (Qouter, QO), which now binds to each cDNA at its
3′ end, and using a primer derived from the gene of interest (GSP1).
A second set of amplification cycles is then carried out using
‘nested’ primers (Qinner (QI) and GSP2) to quench the amplifica-
tion of non-specific products.
To generate ‘5′ end’ partial cDNA clones, reverse transcription
(primer extension) is carried out using a gene-specific primer
(GSP-RT; Fig. 2b) to generate first-strand products. Following
this, a poly(A) tail is appended using terminal deoxynucleotidyl-
transferase (Tdt) and dATP. Amplification is then achieved using
the hybrid primer QT to form the second strand of cDNA, the QO
primer, and a GSP upstream of the one used for reverse transcrip-
tion. Finally, a second set of PCR cycles is carried out using nested
primers (QI and GSP2) to increase the yield of specific product4.
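The relative placement of the three gene-specific primers used for 5′ RACE can be sketched in a few lines of Python. This is a minimal illustration and not part of the protocol: the toy sequence, the 20-nucleotide primer length, the chosen positions and the helper names (`revcomp`, `antisense_primer`) are all assumptions made for the example.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def antisense_primer(sense: str, start: int, length: int = 20) -> str:
    """Primer complementary to sense[start:start+length]; it extends toward the 5' end."""
    return revcomp(sense[start:start + length])


# Hypothetical known region of a transcript, written 5'->3' as DNA (invented for this sketch).
known = ("ATGGCTAGCAAGGAGCTGTTCACCGGCGTGGTCCCAATCCTGGTGGAGCT"
         "GGACGGCGACGTGAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGG")

# 0-based start positions on the sense strand, chosen arbitrarily for illustration.
pos = {"GSP2": 10, "GSP1": 40, "GSP-RT": 75}
primers = {name: antisense_primer(known, start) for name, start in pos.items()}

# GSP-RT lies furthest 3'; GSP1 is upstream of it; GSP2 is nested upstream of GSP1.
assert pos["GSP2"] < pos["GSP1"] < pos["GSP-RT"]

for name, seq in primers.items():
    # Each primer anneals to the sense strand, so its reverse complement occurs in `known`.
    site = known.find(revcomp(seq))
    print(f"{name:7s} 5'-{seq}-3' anneals at sense position {site}")
```

All three primers are antisense to the mRNA, so each one points toward the unknown 5′ end; the nesting order is what allows the second round of PCR to select against products primed non-specifically in the first round.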
Updated RACE Techniques
The most technically challenging step in classic 5′ RACE is to cajole
reverse transcriptase to copy the mRNA of interest in its entirety
into first-strand cDNA. Because prematurely terminated first-
strand cDNAs are tailed by terminal transferase just as effectively
as full-length cDNAs, cDNA populations that are composed largely
of prematurely terminated first strands will result primarily in the
amplification and recovery of cDNA ends that are not full length
(Fig. 3a). This problem is regularly encountered for vertebrate
© 2006 Nature Publishing Group http://www.nature.com/natureprotocols
Figure 1 | A schematic representation of the setting in which Classic RACE is
used. The figure shows an mRNA (5′ UTR, coding region and 3′ UTR) for which only
a partial, internal cDNA clone is available.
NATURE PROTOCOLS | VOL.1 NO.6 | 2006 | 2555
PROTOCOL
genes, which are often GC-rich at their 5′ ends and, therefore, often
contain sequences that hinder reverse transcription. A number of
laboratories and companies have developed steps or protocols that
are designed to overcome the problem17–24.
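The consequence of premature termination can be illustrated with a toy calculation. The sketch below is not taken from the protocol: it assumes a hypothetical, constant per-nucleotide probability that reverse transcriptase continues (`p_continue`) and compares the share of first strands that reach the 5′ end for different distances between the reverse transcription primer and that end. In classic RACE every tailed product, full length or not, is a template for amplification.

```python
def fraction_full_length(distance_nt: int, p_continue: float) -> float:
    """Fraction of first strands that reach the mRNA 5' end.

    Assumes, for illustration only, that reverse transcriptase continues past
    each nucleotide independently with probability `p_continue`.
    """
    if not 0.0 <= p_continue <= 1.0:
        raise ValueError("p_continue must be between 0 and 1")
    return p_continue ** distance_nt


# Hypothetical value; real processivity depends on enzyme, template and conditions.
P_CONTINUE = 0.998

for distance in (200, 500, 1000, 2000):
    full = fraction_full_length(distance, P_CONTINUE)
    # The remainder are truncated cDNAs, which terminal transferase tails equally well.
    print(f"Primer {distance:>4} nt from the 5' end: "
          f"{full:6.1%} full length, {1 - full:6.1%} truncated but still tailed")
```

Under this simplified model the proportion of full-length templates falls steeply with distance, which is one way to see why a reverse transcription primer placed close to the 5′ end of the known sequence is helpful.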
New RACE. One approach to force the specific acquisition of full-length
5′ cDNAs consists of ligating an anchor primer to the 5′ end
of the mRNA before performing the reverse transcription step
(Fig. 3b)17. Accordingly, subsequently generated cDNAs that do
not extend all the way to the 5′ end of the transcript fail to
incorporate the anchor sequence and do not get amplified in the
ensuing PCR mediated by the gene-specific and anchor primers.
This method, which is discussed in an accompanying protocol25, is
more powerful than the classic 5′ RACE protocol described here,
but is also more challenging to perform.
Cap-switching RACE. A simpler method to amplify only full-length
cDNA ends involves adapter addition during reverse transcription
(cap-switching RACE; Fig. 4). This method takes
advantage of the propensity of Moloney murine leukemia virus
reverse transcriptase to add an extra 2–4 cytosines to the 3′ ends of
newly synthesized cDNA strands upon reaching the cap structure
at the 5′ end of the mRNA template26,27. In the presence of a
primer terminating in multiple Gs at its 3′ end, annealing and then
complementary copying of the sequence of the annealed oligo takes
place, which adds a linker sequence to the cDNA terminus. Because
the template-independent addition of cytosines is cap-dependent,
the oligo is appended only to full-length cDNA ends. Also, because
this method involves fewer steps than classic and new RACE, it is
simpler; however, the presence of the dG-terminating (‘switch’)
primer can cause problems if it binds to C-rich sequences in the
mRNA of interest.
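One simple way to anticipate this problem is to scan the known part of the transcript for runs of cytosines before committing to a cap-switching strategy. The following Python sketch is only an illustration; the threshold of three consecutive Cs, the function name `c_rich_sites` and the example sequence are assumptions, not values from the protocol.

```python
import re


def c_rich_sites(mrna: str, min_run: int = 3) -> list[tuple[int, int]]:
    """Return (start, length) for every run of at least `min_run` cytosines.

    `mrna` may be written with U or T; positions are 0-based.
    """
    seq = mrna.upper().replace("U", "T")
    pattern = f"C{{{min_run},}}"  # e.g. "C{3,}" for min_run = 3
    return [(m.start(), len(m.group())) for m in re.finditer(pattern, seq)]


# Invented sequence for demonstration only.
example = "AUGGCCCCAGUUCCGACCCUGA"
for start, length in c_rich_sites(example):
    print(f"C run of {length} nt at position {start}: "
          f"possible off-target site for a 3'-G primer")
```

A hit does not mean the switch primer will misprime there; it only flags regions worth checking when unexpected short products appear.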
Making cDNA ends meet. Another variation of RACE allows for
the simultaneous amplification of both ends of a cDNA molecule,
eliminating the need for performing two separate 5′ and 3′ RACE
reactions21,28. One version of this approach28 is achieved through a
combination of a standard reverse transcription template switching
(TS) reaction — in which a so-called TS-oligo is added that allows
the reverse transcriptase enzyme to switch templates from the
mRNA to the oligonucleotide, creating a double-stranded molecule
— and inverse PCR, with a crucial ligation step between them. The
ligation reaction circularizes the double-stranded cDNA, allowing
primers that are directed away from the unknown sequence to be
used. This straightforward method has been reported to compare
very favorably to standard RACE techniques with respect to
sensitivity and specificity28.
Commercially available RACE kits and their limitations. Various
commercial RACE kits are available, including Clontech’s cap-finding
(switching) Smart RACE system, Invitrogen’s 5′ RACE
Figure 2 | A schematic representation of Classic RACE. Please see text for
details. (a) Amplification of 3′ partial cDNA ends. (b) Amplification of 5′
partial cDNA ends. (c) Schematic representation of the primers used in Classic
RACE. The 52-nucleotide QT primer (5′-QO-QI-TTTT-3′) contains a 17-nucleotide
oligo(dT) sequence at the 3′ end followed by a 35-nucleotide sequence
encoding Hind III, Sst I and Xho I recognition sites. The QI and QO primers
overlap by a single nucleotide; the QI primer contains all three of the
restriction enzyme recognition sites. Optionally, two additional nucleotides
can be added to the 3′ end of QT to force it to bind to the junction of the
cDNA and the poly(A) tail: (G, A or C, followed by G, A, T or C). Primers:
QT: 5′-CCAGTGAGCAGAGTGACGAGGACTCGAGCTCAAGCTTTTTTTTTTTTTTTTT-3′
QO: 5′-CCAGTGAGCAGAGTGACG-3′
QI: 5′-GAGGACTCGAGCTCAAGC-3′
GSP1, gene-specific primer 1; GSP2, gene-specific primer 2; GSP-RT,
gene-specific primer used for reverse transcription; *, GSP-Hyb/Seq (a
gene-specific primer for use in hybridization and sequencing reactions).
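The primer architecture described in the legend can be checked with a short Python sketch. The QO and QI sequences are taken from the legend; the variable names, the use of the standard recognition sequences for Xho I (CTCGAG), Sst I (GAGCTC) and Hind III (AAGCTT), and the enumeration of the optional two-base 3′ anchor are additions made here for illustration.

```python
QO = "CCAGTGAGCAGAGTGACG"  # outer primer, from the figure legend
QI = "GAGGACTCGAGCTCAAGC"  # inner primer, from the figure legend
OLIGO_DT = "T" * 17        # 17-nucleotide oligo(dT) tract

# QO and QI overlap by a single nucleotide (the 3' G of QO is the 5' G of QI).
assert QO[-1] == QI[0]
adapter = QO + QI[1:]      # adapter portion of QT
QT = adapter + OLIGO_DT    # full QT primer, written 5'->3'

print(f"adapter: {len(adapter)} nt, QT: {len(QT)} nt")

# Standard recognition sequences (an addition here; not spelled out in the legend).
sites = {"Xho I": "CTCGAG", "Sst I": "GAGCTC", "Hind III": "AAGCTT"}
for enzyme, site in sites.items():
    print(f"{enzyme:8s} site {site} found in QT at index {QT.find(site)}")

# Optional 3' anchor: (G, A or C) followed by (G, A, T or C) gives 12 primer variants.
anchored_variants = [QT + first + second for first in "GAC" for second in "GATC"]
print(f"{len(anchored_variants)} anchored QT variants, e.g. ...{anchored_variants[0][-8:]}")
```

Building QT from its parts rather than typing the full sequence makes the one-nucleotide overlap and the 35 + 17 = 52 nucleotide arithmetic explicit.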
Figure 3 | The advantage of new RACE over classic RACE. (a) In classic
RACE, premature termination in the reverse transcription step results in
polyadenylation of less-than-full-length first-strand cDNAs, all of which can
be amplified using PCR to generate less-than-full-length cDNA 5′ ends. The
asterisk indicates cDNA ends that will be amplified in the subsequent PCR.
(b) In new RACE, less-than-full-length cDNAs are also created, but only
full-length molecules are terminated by the RNA oligonucleotide (the
anchor sequence) and hence amplified in the subsequent PCR.
system, and Ambion’s First-Choice RLM-RACE kit. The Ambion
kit has been a popular choice for validating microRNA (miRNA)
cleavage sites29,30. Commercial systems are often geared toward the
construction of universal pools of full-length cDNAs (Fig. 4), in
which all of the mRNAs in the starting material become converted
to cDNA. The value of this approach is that a single reverse-transcription
pool can, in theory, be used to obtain the 5′ end of any
transcript. By contrast, non-commercial versions of RACE have
emphasized the use of a GSP to generate the first-strand cDNA
templates. Although it lacks universality, the latter approach is
more powerful because the reverse transcription step starts closer to
the 5′ end of the transcript, and the relative frequency of the desired
cDNA is increased >50-fold in the resulting pool. This greatly
increases the chances of the desired 5′ end being present in
sufficient quantity to be amplified using standard PCR methods.
Which approach should investigators choose? For the one-time
user or for those with limited molecular biology experience, the
most practical approach would be to obtain a commercial system
and, if possible, a pre-made pool of reverse-transcribed cDNAs.
Pools representing many human tissues are available — for
example, from Clontech or Origene. Failing that, using a GSP-RT
primer with the commercial kits will overcome the limitation
described above. The Clontech and Ambion systems are relatively
powerful and easy to use (both are variations on new RACE);
however, they may not be optimal for every purpose. Invitrogen’s
system is simpler and less powerful (a variation on classic RACE),
but may suffice for many needs. In addition, because the commer-
cial kits are relatively expensive, investigators who plan to use RACE
regularly will achieve substantial savings if they prepare the reagents
themselves.
Experimental design considerations for 5′ RACE
Reverse transcription reaction. In 5′ end cDNA amplification,
the efficiency of cDNA extension is crucial. In the classic 5′
procedure, each specific cDNA, no matter how short, is tailed
and becomes a potential target for amplification (Fig. 2a). Thus,
the quality of the final PCR products directly reflects that of the
reverse transcription reaction. The length of the first-strand cDNA
can be maximized by using clean, intact RNA, and by selecting a
reverse transcriptase primer that anneals near to the 5′ end of a
region of known sequence. Improvements can also be made, at least
in theory, by using a combination of SuperScript II and heat-stable
reverse transcriptase at multiple temperatures. At increased tem-
peratures the amount of secondary structure encountered in GC-
rich regions of the mRNA should be reduced. Incorporation of
cyclic compatible solutes such as homoectoine can also improve the
generation of first-strand cDNA or the subsequent PCR
amplification steps31,32.
Poly(A) tailing reaction. To attach a known sequence to the 5′
end of the first-strand cDNA, a homopolymeric tail is appended
using Tdt. It is preferable to add poly(A) tails4 rather than poly(C)
tails2 for a number of reasons. First, the 3′ end strategy is based on
the naturally occurring poly(A) tail; adding a poly(A) tail to the 5′
end allows the same adapter primer to be used for both ends, which
simplifies the protocol and reduces the cost. Second, because A:T
binding is weaker than G:C binding, longer stretches of A residues
(approximately two times longer) are required before the oli-
go(dT)-tailed QT primer will bind to the template. Internal
poly(A) tracts are rare so the chance of non-specific binding and
the production of truncated amplification products is reduced.
Third, vertebrate coding sequences and 5′ untranslated regions tend
to be biased toward G/C residues; therefore, use of a poly(A) tail
further decreases the likelihood of inappropriate amplification.
Unlike many other applications that use homopolymeric tails, the
actual length of the tail added here is unimportant, as long as it
exceeds 17 nucleotides. This is because the oligo(dT)-tailed primer
binds at the junction of the appended poly(A) tail and the cDNA
transcript. The conditions described in the procedure result in the
addition of 30–400 A residues.
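The "approximately two times longer" estimate for A:T versus G:C pairing is consistent with the Wallace rule of thumb for short oligonucleotides, Tm ≈ 2 °C × (A+T) + 4 °C × (G+C). The Python sketch below applies that rule to homopolymer tracts. The rule is a rough approximation introduced here for illustration and is not part of the protocol; the function names are hypothetical.

```python
def wallace_tm(seq: str) -> int:
    """Approximate melting temperature (deg C) by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("sequence must contain only A, C, G and T")
    return 2 * at + 4 * gc


def min_tract_length(base: str, target_tm: int) -> int:
    """Shortest homopolymer of `base` whose estimated Tm reaches `target_tm`."""
    length = 1
    while wallace_tm(base * length) < target_tm:
        length += 1
    return length


target = wallace_tm("T" * 17)  # the 17-nt oligo(dT) tract of QT
print("oligo(dT)17 estimate:", target, "deg C")
print("dT tract length needed:", min_tract_length("T", target))
print("dG tract length needed:", min_tract_length("G", target))
```

The absolute temperatures from this rule are crude, particularly for homopolymers; the point of the sketch is only the ratio between the tract lengths needed for A:T and G:C pairing.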
Many of the remarks made above apply also to the protocol on
amplifying 3′-end partial cDNAs16 and should be noted. There is,
however, one major difference. The annealing temperature in the
first step of 5′ RACE (48 °C) is lower than that used in successive
cycles (52–68 °C). This is because cDNA synthesis during the first
round depends on the interaction of the appended poly(A) tail and
the oligo(dT)-tailed QT primer. In all subsequent rounds, amplification
can proceed using the QO primer, which is composed of
~60% GC and which can anneal to its complementary target at a
Figure 4 | Schematic representation of cap-switching RACE. (a) Reverse
transcription, template switch and incorporation of adaptor sequences at the
3′ end of the first strand of cDNA. Biotin-labeled primer Ptotal is used to initiate
reverse transcription through hybridization of the poly(dT) tract with the
mRNA poly(A) tail. After reaching the 5′ end of the mRNA, oligo(dC) is added
by reverse transcriptase in a cap-dependent manner. Following this, through
template switch via base-pairing between the oligo(dC) and the oligo(dG)
at the end of the cap finder adaptor, the reverse complementary sequence of the
cap finder primer is incorporated into the first strand of the cDNA. Dotted
line, mRNA; solid line, cDNA; rectangle, primer. The bracket indicates the
known region. (b) The first round of PCR uses primers Uo and RGSP1 (reverse
gene-specific primer 1); the second round uses Ui and RGSP2. GSP-Hyb is also within
the known region, and it can be used to confirm the authenticity of the
RACE product. (c) 3′-RACE. Reprinted with permission from ref. 35.
much higher temperature. In practice, the ideal annealing
temperatures will vary with individual PCR machines made by different
companies.
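The GC content quoted for the QO primer (about 60%) can be checked directly from the primer sequences given in the legend of Figure 2. A minimal Python sketch follows; the function name `gc_percent` is an assumption made for the example.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C residues in a DNA sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


primers = {
    "QO": "CCAGTGAGCAGAGTGACG",
    "QI": "GAGGACTCGAGCTCAAGC",
    "oligo(dT)17": "T" * 17,
}
for name, seq in primers.items():
    print(f"{name}: {len(seq)} nt, {gc_percent(seq):.0f}% GC")
```

The contrast between the GC-free oligo(dT) tract and the adapter primers is the reason the first annealing step is run at a lower temperature than the later cycles.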
Here we provide a detailed protocol for classic RACE, describing
the reverse transcription steps, addition of the poly(A) tail and the
subsequent rounds of PCR amplification.
MATERIALS
REAGENTS
• dNTP solution (containing all four dNTPs, each at 10 mM)
• DTT (0.1 M)
• Tris–EDTA solution (10 mM Tris-HCl [pH 7.5], 1 mM EDTA [pH 8.0])
• Reverse transcription buffer, 5× (as supplied by manufacturer)
• RNase H
• RNasin
• SuperScript II reverse transcriptase (Invitrogen)
• Gene-specific primer, used for reverse transcription (GSP-RT primer; 100 ng/µl)
• Poly(A)+ RNA, or total RNA. Poly(A)+ RNA is used in preference to total RNA for reverse transcription to reduce background, but it is unnecessary to prepare it if only total RNA is available
• CoCl2 (25 mM)
• dATP solution (1 mM)
• Terminal deoxynucleotidyltransferase (Tdt, Invitrogen or Boehringer Mannheim)
• Tailing buffer, 5× (125 mM Tris-HCl, pH 6.6, 1 M potassium cacodylate, 1,250 µg/ml BSA)
• Hercules Hot-Start polymerase buffer (10×) CRITICAL If the buffer contains dNTPs already, do not add additional nucleotides to the mixture
• Common oligonucleotide primers (see REAGENT SETUP for primer design details); for example:
• QT: 5′-ccagtgagcagagtgacgaggactcgagctcaagcttttttttttttttttt-3′
• QO: 5′-ccagtgagcagagtgacg-3′
• QI: 5′-gaggactcgagctcaagc-3′
• Gene-specific oligonucleotide primers (user-specific, see REAGENT SETUP for primer design details)
EQUIPMENT
• Water baths or heating blocks preset to 37, 42, 50, 65, 70 and 80 °C
• QIAquick DNA clean-up spin columns (Qiagen) or equivalent
• Programmable thermal cycler
REAGENT SETUP
Primer design for 5′ RACE. QT is a multipurpose primer. It contains binding
sites for two, mostly non-overlapping smaller primers (QO (Qouter) and QI
(Qinner)) and an oligo dT sequence capable of annealing to the appended
poly(A) tail, terminated by a non-A nucleotide to force the primer to set at
the junction of the appended poly(A) tail and the bona fide cDNA sequence.
The oligo-dT needs to be at least 17 nucleotides in length to anneal at 48 °C.
The QO and QI primers should be designed to work well when