Maternal Plasma DNA Sequencing Reveals the Genome-Wide Genetic and Mutational Profile of the Fetus
Y. M. Dennis Lo, et al.
Sci Transl Med 2, 61ra91 (2010);
DOI: 10.1126/scitranslmed.3001720
A complete electronic version of this article and other services, including high-resolution figures, can be found at:
http://stm.sciencemag.org/content/2/61/61ra91.full.html

Supplementary Material can be found in the online version of this article at:
http://stm.sciencemag.org/content/suppl/2010/12/06/2.61.61ra91.DC1.html

This article cites 37 articles, 18 of which can be accessed free:
http://stm.sciencemag.org/content/2/61/61ra91.full.html#ref-list-1

Information about obtaining reprints of this article or about obtaining permission to reproduce this article in whole or in part can be found at:
http://www.sciencemag.org/about/permissions.dtl

Science Translational Medicine (print ISSN 1946-6234; online ISSN 1946-6242) is published weekly, except the last week in December, by the American Association for the Advancement of Science, 1200 New York Avenue NW, Washington, DC 20005. Copyright 2010 by the American Association for the Advancement of Science; all rights reserved. The title Science Translational Medicine is a registered trademark of AAAS.
Downloaded from stm.sciencemag.org on December 8, 2010
RESEARCH ARTICLE

PRENATAL DIAGNOSIS

Maternal Plasma DNA Sequencing Reveals the Genome-Wide Genetic and Mutational Profile of the Fetus

Y. M. Dennis Lo,1,2* K. C. Allen Chan,1,2 Hao Sun,1,2 Eric Z. Chen,1,2 Peiyong Jiang,1,2 Fiona M. F. Lun,1,2 Yama W. Zheng,1,2 Tak Y. Leung,3 Tze K. Lau,3 Charles R. Cantor,4 Rossa W. K. Chiu1,2

(Published 8 December 2010; Volume 2 Issue 61 61ra91)

Cell-free fetal DNA is present in the plasma of pregnant women. It consists of short DNA fragments among primarily maternally derived DNA fragments. We sequenced a maternal plasma DNA sample at up to 65-fold genomic coverage. We showed that the entire fetal and maternal genomes were represented in maternal plasma at a constant relative proportion. Plasma DNA molecules showed a predictable fragmentation pattern reminiscent of nuclease-cleaved nucleosomes, with the fetal DNA showing a reduction in a 166–base pair (bp) peak relative to a 143-bp peak, when compared with maternal DNA. We constructed a genome-wide genetic map and determined the mutational status of the fetus from the maternal plasma DNA sequences and from information about the paternal genotype and maternal haplotype. Our study suggests the feasibility of using genome-wide scanning to diagnose fetal genetic disorders prenatally in a noninvasive way.

INTRODUCTION

During pregnancy, a median of 10% of the DNA in the plasma of pregnant women is fetally derived (1, 2), offering opportunities for noninvasive prenatal diagnosis (3). Thus far, detection of paternally inherited traits [for example, sex (4) and rhesus D blood group status (5)] and fetal chromosomal aneuploidies (6, 7) is the main application. Yet, little is known about the physical and biological characteristics of fetal DNA in maternal plasma. Circulating fetal DNA is consistently reported to be shorter than maternal DNA (8), but the molecular basis of this observation is not known. Better understanding of this size difference might allow one to develop methods for the selective enrichment of fetal DNA from maternal plasma. It is also not known whether the entire fetal genome is represented in maternal plasma. Complete representation might make it possible to deduce a whole-genome genetic map, or even the entire genomic sequence, of a fetus noninvasively. However, this is a technically challenging task because most (about 90%) of the DNA in maternal plasma is derived from the mother, and the DNA molecules in plasma are short fragments (8). Here, we have used paired-end (PE) massively parallel sequencing to study the genomic sequence and size distribution of fetal DNA in maternal plasma. We further constructed a genome-wide genetic map of a fetus from the maternal plasma DNA sequences and from information about the paternal genotype and maternal haplotype.

RESULTS

Clinical case
We recruited a pregnant couple attending an obstetrics clinic for the prenatal diagnosis of β-thalassemia. The father was a carrier of the -CTTT 4–base pair (bp) deletion of codons 41/42, and the pregnant mother was a carrier of the A→G mutation at nucleotide −28 of the HBB gene (9). Blood samples were taken from the father and from the mother before chorionic villus sampling (CVS) at 12 weeks of gestation. A portion of the CVS DNA was stored for the study.

Single-nucleotide polymorphism genotyping
Genome-wide single-nucleotide polymorphism (SNP) genotyping for ~900,000 SNPs was performed for DNA extracted from paternal and maternal buffy coat samples, and the CVS sample, with the Affymetrix Genome-Wide Human SNP Array 6.0 system (table S1). The SNPs were classified into different categories (Fig. 1). We defined category 1 SNPs as those for which the father and mother were both homozygous, but for a different allele each. Category 2 SNPs were those in which the father and mother were both homozygous, but for the same allele. Category 3 SNPs were those in which the father was heterozygous and the mother was homozygous. Category 4 SNPs were those in which the father was homozygous and the mother was heterozygous. Category 5 SNPs were those in which both the father and the mother were heterozygous.

Sequencing of plasma DNA
We performed PE sequencing, 50 bp for each end, on DNA extracted from maternal plasma. Reads (3.931 billion), equivalent to an average of 65-fold coverage of a human genome, were aligned to the non–repeat-masked reference human genome (Hg18 NCBI.36). For each of the 45,392 category 1 SNPs in this family, the fetus was an obligate heterozygote. The fetal SNP allele inherited from the father should be readily detected as a unique sequence in maternal plasma

1Centre for Research into Circulating Fetal Nucleic Acids, Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, New Territories, Hong Kong SAR, China. 2Department of Chemical Pathology, The Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, New Territories, Hong Kong SAR, China. 3Department of Obstetrics and Gynaecology, The Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, New Territories, Hong Kong SAR, China. 4Sequenom Inc., San Diego, CA 92121–1331, USA.
*To whom correspondence should be addressed. E-mail: loym@cuhk.edu.hk

www.ScienceTranslationalMedicine.org 8 December 2010 Vol 2 Issue 61 61ra91 1
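The five parental SNP categories defined under "Single-nucleotide polymorphism genotyping" can be illustrated with a small classifier. This is a minimal sketch, not the authors' pipeline: the function name `snp_category` and the two-character genotype strings (for example "AA" or "AC") are assumptions made for illustration.

```python
def snp_category(father: str, mother: str) -> int:
    """Return the SNP category (1 to 5) from two parental genotypes.

    Genotypes are hypothetical two-character strings such as "AA" or "AC";
    the category definitions follow the text of the article.
    """
    if len(father) != 2 or len(mother) != 2:
        raise ValueError("genotypes must be two-character strings")

    father_het = father[0] != father[1]
    mother_het = mother[0] != mother[1]

    if not father_het and not mother_het:
        # Both homozygous: category 1 if the alleles differ, category 2 if identical.
        return 1 if father[0] != mother[0] else 2
    if father_het and not mother_het:
        return 3  # father heterozygous, mother homozygous
    if not father_het and mother_het:
        return 4  # father homozygous, mother heterozygous
    return 5      # both parents heterozygous


if __name__ == "__main__":
    for pair in [("AA", "CC"), ("CC", "CC"), ("AC", "CC"), ("CC", "AC"), ("AC", "AC")]:
        print(pair, "-> category", snp_category(*pair))
```

Under this scheme a category 1 locus makes the fetus an obligate heterozygote, which is why those loci are used in the article to measure how often the paternally inherited allele is seen in plasma.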
and could be used for studying the distribution of fetal DNA sequences across the genome in maternal plasma.

Figure 2A shows the number of times the paternally inherited fetal alleles for the category 1 SNPs were observed in maternal plasma as the depth of sequencing increased. With data from 3.931 billion reads, a fetal allele was observed at least once for 93.94% of these SNPs (table S2). These results were consistent with Poisson distribution predictions assuming that the whole fetal genome was evenly distributed in maternal plasma (fig. S1).

The fractional fetal DNA concentration in the maternal plasma, f, can be calculated from the sequencing data:

$$ f = \frac{2p}{p + q} $$

where p is the number of sequenced reads of the fetal-specific allele (the A allele for the category 1 SNP in Fig. 1) and q is the read count of the other allele, which is shared by the maternal and fetal genomes (the C allele for the category 1 SNP in Fig. 1). The values of f determined for every chromosome were highly consistent (Table 1). The depth of coverage of fetal and maternal sequences (in 1-Mb windows) across the genome is plotted in Fig. 2B. It correlated with the GC content of each genomic window (fig. S2). The number of fetal sequences as a proportion of the total sequences in each window was consistent with the fractional fetal DNA concentration determined on a chromosomal level (Table 1). These data indicated that the relative proportion of fetal and maternal DNA was largely constant across the entire genome. Previous data have suggested that the GC bias affecting the measurement of total DNA in maternal plasma is likely to be an analytical artifact related to the sequencing platform used (6, 7, 10, 11), rather than an indication of the differential representation of DNA molecules of different GC content. Our data therefore suggest that the distribution of the fetal and maternal genomes is relatively even in maternal plasma.

High-resolution plasma DNA size analysis
The size of each sequenced plasma DNA molecule can be deduced from the genome coordinates of the ends of the PE reads. The sizes of the fetal and total sequences were determined for the whole genome (Fig. 2C) and individually for each chromosome (fig. S3). The most abundant total sequences (predominantly maternal) were 166 bp in length. The most significant difference in the size distribution between the fetal and the total DNA was that fetal DNA exhibited a reduction in the 166-bp peak (Fig. 2C) and a relative prominence of the 143-bp peak. The latter likely corresponded to the trimming of a ~20-bp linker fragment from a nucleosome to its core particle of ~146 bp (12). From ~143 bp and below, the distributions of both fetal and total DNA demonstrated a 10-bp periodicity reminiscent of nuclease-cleaved nucleosomes (12). These data suggest that plasma DNA fragments are derived from the enzymatic processing of DNA from apoptotic cells. In contrast, size analysis of reads that mapped to the non–histone-bound mitochondrial genome did not show this nucleosomal pattern (Fig. 2C). These results provide a molecular explanation for the previously reported size differences between fetal and maternal DNA using Y chromosome and selected polymorphic genetic markers (8, 13, 14), and show that such size differences exist across the entire genome.

General principles for constructing a fetal genetic map
After having demonstrated that the entire fetal genome was evenly represented in maternal plasma, we attempted to construct a genome-wide genetic map of the fetus. Maternal plasma DNA molecules are short fragments and the fetal sequences are in the minority. Here, we used the genetic structure of the parental genomes as scaffolds for assembling the fetal genetic map from the maternal plasma DNA sequences. The map resolution depends on the known resolution of the parental genomes. First, we used the category 2 SNPs (Fig. 1), in which the father and mother were both homozygous for the same allele, to estimate the error

Fig. 1. Noninvasive fetal genomic analysis from maternal plasma DNA. Parental SNP combinations can be grouped into five categories. Categories 1, 2, and 3 allow the basic parameters for maternal plasma DNA sequencing to be established, including the percentage coverage of the fetal genome, fractional concentration of fetal DNA, and sequencing error rate. Category 3 also allows the fetal inheritance status of SNP alleles unique to the father to be studied. Mutations uniquely carried by the father can be regarded as category 3. Category 4 allows the inheritance status of the maternal haplotype to be studied. One application is the tracking of fetal inheritance of a haplotype block close to a mutation carried by the mother. Here, noninvasive fetal genomic analysis was carried out for a family undergoing prenatal diagnosis for β-thalassemia. Asterisk denotes that information on the maternal haplotype is required for the RHDO analysis. Category 5 SNPs were not analyzed in this study, but might be useful for the prenatal diagnosis of autosomal recessive disorders with consanguineous parents or genetic diseases with a strong founder effect.
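The fractional fetal DNA concentration given in the Results, f = 2p/(p + q), can be sketched in a few lines. The factor of 2 reflects that the fetus is heterozygous at category 1 SNPs, so the fetal-specific allele accounts for only half of the fetal DNA at such a locus. The function name and the read counts in the usage example are hypothetical; only the formula and the meaning of p and q come from the article.

```python
def fractional_fetal_concentration(p: int, q: int) -> float:
    """Compute f = 2p / (p + q).

    p: reads carrying the fetal-specific (paternally inherited) allele
    q: reads carrying the allele shared by the maternal and fetal genomes
    """
    if p < 0 or q < 0:
        raise ValueError("read counts must be non-negative")
    total = p + q
    if total == 0:
        raise ValueError("no reads cover this locus or window")
    return 2 * p / total


if __name__ == "__main__":
    # Hypothetical pooled counts, chosen only to illustrate the arithmetic.
    f = fractional_fetal_concentration(p=57, q=943)
    print(f"fractional fetal DNA concentration: {f:.1%}")
```

Pooling p and q over many category 1 SNPs (for example per chromosome) before applying the formula gives a more stable estimate than computing f at a single locus, where fetal-specific reads are few.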
rate of plasma DNA sequencing. For the 500,457 category 2 SNPs in this family, the fetus would be homozygous for the alleles concerned. The sequencing error rate was expressed as the number of reads with an unexpected allele as a proportion of all reads covering the category 2 SNP loci and was 0.303% (99,467/32,828,899). These unexpected alleles were seen in 4.04% of the category 2 SNP loci. Suppose that an allele must be seen more than once to be scored, only 0.55% of such SNPs had a false allele seen in at least two reads, resulting in a specificity of 99.45%. However, when applied to the detection of fetal-specific alleles, the requirement of two reads reduced the fetal allele detection sensitivity. The paternally inherited fetal allele was seen at least twice in only 81.06% of category 1 SNPs, where both parents were homozygous for different alleles (table S2).

Deducing the paternal inheritance of the fetus
We deduced the fetal inheritance from each parent in a stepwise fashion. To determine the paternal allele that the fetus had inherited, we studied the 129,835 category 3 SNPs where the father was heterozygous and the mother was homozygous for one of the alleles. Each of the two paternal alleles had a 50% chance of being inherited by the fetus.

Fig. 2. Sequencing of fetal and total DNA in maternal plasma. (A) Depth of coverage of fetal-specific SNP alleles versus the number of sequenced reads. (B) Sequencing depth and GC content across the whole genome. Chromosome ideograms (outer ring) are oriented pter-qter in a clockwise direction (centromeres are shown in yellow). Other tracks (from outside to inside): GC content (green; range, 30 to 55%), total sequencing depth (red; range, 40 to 100 reads per SNP), and fetal-specific read sequencing depth (blue; range, 1 to 8 reads per SNP). (C) Size distribution of fetal DNA (blue curve), total DNA (red curve), and mitochondrial DNA (green broken curve). Numbers denote the DNA size at the peaks. Schematic illustrations of the structural organization of a nucleosome are shown above the graph. From left to right, DNA double helix wound around a nucleosomal core unit with the sites for nuclease cleavage shown; a nucleosome core unit with ~146 bp of DNA (red tape) wound around it; and a nucleosomal core unit with an intact ~20-bp linker sequence.

Table 1. Fractional concentrations of fetal DNA calculated based on the analysis of category 1 SNPs for different chromosomes.

Chromosome      Fetal DNA concentration (%)
1               11.57
2               11.57
3               11.59
4               11.49
5               11.66
6               11.43
7               11.49
8               11.53
9               11.51
10              11.36
11              11.51
12              11.41
13              11.47
14              11.38
15              11.07
16              11.08
17              11.17
18              11.60
19              11.55
20              11.33
21              10.87
22              11.19
X               11.10
Whole genome    11.43
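The category 2 error-rate figures reported in the Results can be re-derived from the counts quoted in the text. The sketch below is illustrative arithmetic only; the function names are hypothetical. The one-read specificity is not stated in the article for category 2 loci; it is derived here on the assumption that specificity is one minus the fraction of loci showing a false allele.

```python
def sequencing_error_rate(unexpected_reads: int, total_reads: int) -> float:
    """Reads with an unexpected allele as a proportion of all reads at category 2 loci."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return unexpected_reads / total_reads


def specificity_from_false_fraction(false_locus_fraction: float) -> float:
    """Specificity taken as 1 minus the fraction of loci showing a false allele (assumption)."""
    if not 0.0 <= false_locus_fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    return 1.0 - false_locus_fraction


if __name__ == "__main__":
    # Counts quoted in the article for category 2 SNP loci.
    rate = sequencing_error_rate(99_467, 32_828_899)
    print(f"per-read error rate: {rate:.3%}")

    # 4.04% of loci showed an unexpected allele at least once,
    # and 0.55% showed it in at least two reads.
    print(f"one-read specificity (derived): {specificity_from_false_fraction(0.0404):.2%}")
    print(f"two-read specificity:           {specificity_from_false_fraction(0.0055):.2%}")
```

The comparison shows the trade-off described in the text: requiring two reads raises specificity, while the article reports that it lowers the detection rate of the paternally inherited allele at category 1 SNPs to 81.06%.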
The paternal-specific allele (as illustrated by the C allele in SNP category 3; Fig. 1) was detected at least once among the sequenced reads covering 63,962 category 3 loci and at least twice covering 53,070 loci. The CVS genotype data indicated that the fetus inherited the paternal-specific alleles in 65,018 category 3 SNPs. Such paternally inherited fetal alleles were observed at least once in 61,049 (that is, 93.90%) and at least twice in 52,697 (that is, 81.05%) loci, in good agreement with the category 1 SNP values (table S2). If we assume that the genotyping was perfect, the differences in genotyping and sequencing meant that sequenced paternal-specific alleles observed once or twice in 2913 and 373 category 3 loci, respectively, were false positives. Given that the CVS genotyping data indicated that the fetus inherited the same allele from the father as the homozygous maternal allele in 64,817 loci, the specificities for fetal allele detection using the one- and two-read criteria were 95.51 and 99.42%, respectively (table S2). These error rates are consistent with the category 2 SNP results.

Deducing the maternal inheritance of the fetus
For maternal inheritance, we analyzed the category 4 SNPs (Fig. 1), where the mother was heterozygous and the father was homozygous, and asked whether a slight allelic imbalance was present in maternal plasma. An imbalance would indicate that the fetus was homozygous for one maternal allele. This analysis could, in principle, be carried out for each SNP with locus-specific approaches such as digital polymerase chain reaction (PCR) (15). However, for genome-wide random sequencing, the depth of coverage needed and hence the costs would be prohibitive for clinical use. Using nearby SNP alleles on the same maternal chromosome as a haplotype, we developed a new approach to determine whether there was a relative haplotype dosage (RHDO) imbalance in maternal plasma. Because of meiotic recombination, the final maternally derived haplotype inherited by the fetus is a mosaic of the two original maternal haplotypes. Using RHDO analysis, the combination of alleles inherited by the fetus from its mother can then be deduced as a series of inheritance blocks. The resolution for detecting this depends on the number and distribution of genetic markers known for the mother’s genome.

In this proof-of-concept study, we deduced the maternal haplotype information needed for RHDO analysis with genotype information obtained from microarray analysis of the CVS. This precluded the direct observation of maternal meiotic recombinations, but we do show later in the study that the approach can detect “artificial” maternal meiotic recombinations. If RHDO is used clinically, the maternal haplotype can be deduced without any fetal information by comparison with genotype information for other family members.

Figure 3 shows the RHDO process. The two maternal haplotypes are Hap I and Hap II (Fig. 3A). Hap I is the actual

Fig. 3. Relative haplotype dosage (RHDO) analysis. (A) In type a SNPs, paternal alleles are identical to the maternal alleles on Hap I. In type b SNPs, paternal alleles are identical to the maternal alleles on Hap II. If the fetus inherits Hap I from the mother, it is homozygous for type a and heterozygous for type b SNPs. (B) For type a SNPs, Hap I is overrepresented in maternal plasma. (C) For type b SNPs, there is no significant difference between the cumulative counts for Hap I and Hap II SNPs. Given that the fetus in this case inherits Hap II from the father, the sequential probability ratio test (SPRT) deduces the inheritance of Hap I from the mother.
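The category 3 detection figures reported under "Deducing the paternal inheritance of the fetus" follow from the locus counts given there, treating the CVS genotype as the reference. The sketch below reproduces that arithmetic; the function and parameter names are hypothetical, and the definitions of sensitivity and specificity used here are the conventional ones, assumed to match the article's usage.

```python
def detection_metrics(detected: int, true_detected: int,
                      fetal_positive: int, fetal_negative: int):
    """Return (false_positives, sensitivity, specificity) for one read-count criterion.

    detected:        loci where the paternal-specific allele was seen in plasma
    true_detected:   detected loci where CVS confirms the fetus carries the allele
    fetal_positive:  loci where CVS shows the fetus inherited the paternal-specific allele
    fetal_negative:  loci where CVS shows the fetus inherited the other paternal allele
    """
    false_positives = detected - true_detected
    sensitivity = true_detected / fetal_positive
    specificity = 1 - false_positives / fetal_negative
    return false_positives, sensitivity, specificity


if __name__ == "__main__":
    # Counts quoted in the article for the 129,835 category 3 SNPs.
    criteria = {
        "at least one read": (63_962, 61_049),
        "at least two reads": (53_070, 52_697),
    }
    for label, (detected, true_detected) in criteria.items():
        fp, sens, spec = detection_metrics(detected, true_detected, 65_018, 64_817)
        print(f"{label}: false positives={fp}, "
              f"sensitivity={sens:.2%}, specificity={spec:.2%}")
```

The two reference counts sum to the 129,835 category 3 SNPs (65,018 + 64,817), which is a useful consistency check on the reconstructed numbers.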
Fig. 4. SPRT classification. (A and B) SPRT classification process for RHDO
type b SNPs in a region close to the pter of chromosome 1. The classi
direction from the telomeric end to the centromere. See also Tables 2 an
RHDO classifications for the type a and b SNPs, respectively (0.6 and 1.2% of these classifications). With the current sequencing coverage, the mean sizes of type a and b classification segments were 659,000 and 768,000 bp, respectively. The presence of two meiotic recombinations within such distances would be unlikely (19). Therefore, we proposed to accept a s