Chapter 9: Probability
SOLUTIONS FOR PART III
9. PROBABILITY
9.1. If A ⊂ B, then P(A) ≤ P(B)—TRUE. P(B) is the sum of the probabilities assigned to points in A plus the sum of the probabilities assigned to points in B − A.
9.2. If P(A) and P(B) are not zero, and P(A|B) = P(B|A), then P(A) =
P(B)—TRUE. The two quantities equal P(A∩ B)/P(B) and P(A∩ B)/P(A),
so they are equal if and only if the denominators are equal.
9.3. If P(A) and P(B) are not zero, and P(A|B) = P(B|A), then A and B are independent—FALSE. If A and B are the same event with 0 < P(A) < 1, then P(A ∩ B) = P(A) > P(A)P(B), and they are not independent.
9.4. If P(A) > 1/2 and P(B) > 1/2, then P(A ∩ B) > 0—TRUE. If P(A ∩
B) = 0, then A and B have no common points with positive probability, and
P(A ∪ B) = P(A) + P(B) > 1, which is impossible.
9.5. If A, B are independent, then A and B^c are independent—TRUE. P(A ∩ B^c) = P(A) − P(A ∩ B) = P(A) − P(A)P(B) = P(A)(1 − P(B)) = P(A)P(B^c).
9.6. If A, B are independent, then A^c and B^c are independent—TRUE. P(A^c ∩ B^c) = P(A^c) − P(A^c ∩ B) = 1 − P(A) − (P(B) − P(A ∩ B)) = 1 − P(A) − P(B) + P(A)P(B) = (1 − P(A))(1 − P(B)) = P(A^c)P(B^c).
9.7. An event and its complement are independent if and only if their probabilities are 0 and 1. The probability of the intersection of A and A^c is 0. If they are independent, then 0 = P(A ∩ A^c) = P(A)P(A^c) = P(A)(1 − P(A)). This requires that P(A) or 1 − P(A) be 0.
9.8. Restaurant menu items. Let S denote the event that what the man
orders is out of stock. Let A and F denote the events of ordering pasta
and fish, respectively. We are given that P(A) = P(F) = .5 and P(S|A) =
P(S|F) = .5. By direct computation, P(S) = P(S ∩ A) + P(S ∩ F) =
P(S|A)P(A) + P(S|F)P(F) = .5(.5) + .5(.5) = .5.
The answer makes sense, since the probability that the dish is out
of stock is .5 every day, no matter what he orders. Arbitrary values may arise for P(A), P(F), P(S|A), and P(S|F), in which case the conclusion follows from the same computation.
9.9. Conditional probability for drawing marbles. The contents of the con-
tainers are R R, B B, and RB. In choosing a random marble from a random
container, there are six equally likely outcomes, three R and three B. The
number of outcomes in which the selected ball and the other in its container are both black is two. The conditional probability of the other ball being black (event O), given that the selected ball is black (event B), is computed as P(O|B) = P(O ∩ B)/P(B) = (1/3)/(1/2) = 2/3.
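The six-outcome count can be checked mechanically; the following sketch (our labels, not from the text) enumerates the equally likely (selected, other) pairs:

```python
from fractions import Fraction

# Containers: RR, BB, RB; pick a container, then a marble, uniformly.
outcomes = []  # (selected, other) pairs, all six equally likely
for container in ["RR", "BB", "RB"]:
    for i in (0, 1):
        outcomes.append((container[i], container[1 - i]))

selected_black = [o for o in outcomes if o[0] == "B"]
both_black = [o for o in selected_black if o[1] == "B"]
p = Fraction(len(both_black), len(selected_black))
print(p)  # 2/3
```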
9.10. Rolling two dice, one red and one green. There are 36 equally likely
possible outcomes.
a) Given that the red die shows a six, the probability of double-sixes is
1/6. Given that the red die shows a six, the roll is double-sixes if and only
if the green die shows a six, which occurs with probability 1/6.
b) Given that at least one die shows a six, the probability of double-
sixes may be 1/6 or 1/11, depending on how the information was obtained.
If we saw one die and observed that it showed a six, then the probability
that the other die also shows a six is 1/6.
If someone tells us after seeing both dice that at least one shows a six,
then we can compute the conditional probability. The event "at least one
six" (B) consists of 11 equally likely rolls, of which exactly one is "double-
six" (A). The formal computation of conditional probability is P(A|B) =
P(A ∩ B)/P(B) = (1/36)/(11/36) = 1/11.
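Both conditional probabilities in (a) and (b) can be confirmed by enumerating the 36 equally likely rolls (an illustrative sketch):

```python
from fractions import Fraction

rolls = [(r, g) for r in range(1, 7) for g in range(1, 7)]

# (a) condition on the red die showing a six
red_six = [x for x in rolls if x[0] == 6]
p_a = Fraction(sum(x == (6, 6) for x in red_six), len(red_six))

# (b) condition on at least one six, information from seeing both dice
at_least_one = [x for x in rolls if 6 in x]
p_b = Fraction(sum(x == (6, 6) for x in at_least_one), len(at_least_one))

print(p_a, p_b)  # 1/6 1/11
```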
9.11. The TV game show problem. The prize must be behind the contes-
tant’s door (event C) or the other unopened door (event D). We seek the
conditional probabilities of these two events, given what has happened so
far. Let A be the door opened by the host. For each door, the probability that the prize is behind it is 1/3. Given event C, the conditional probability
of opening A is 1/2. Given event D, the conditional probability of open-
ing A is 1. Given that the prize is behind A, the conditional probability of
opening it is 0.
Thus the probability of opening A, given the door chosen by the contestant, is (1/3)(1/2) + (1/3)(1) + (1/3)(0) = 1/2. The probability
that C occurs and A is opened is (1/3)(1/2) = 1/6. The probability that
D occurs and A is opened is (1/3)(1) = 1/3. The probability that C occurs
given that A is opened is thus (1/6)/(1/2) = 1/3, and the probability that
D occurs given that A is opened is (1/3)/(1/2) = 2/3.
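These conditional probabilities can be recomputed exactly by enumerating (prize, opened door) pairs; the door numbering below is a hypothetical labeling in which the contestant always chooses door 1:

```python
from fractions import Fraction

# Prize uniform over doors 1-3; host opens a goat door other than door 1,
# choosing uniformly when there is a choice.
prob = {}  # (prize, opened) -> probability
for prize in (1, 2, 3):
    goat_doors = [d for d in (2, 3) if d != prize]
    for opened in goat_doors:
        prob[(prize, opened)] = Fraction(1, 3) * Fraction(1, len(goat_doors))

# Condition on the host opening door 3 (the event of opening A).
p_open3 = sum(p for (_, o), p in prob.items() if o == 3)
p_stay = prob[(1, 3)] / p_open3      # prize behind the contestant's door
p_switch = prob[(2, 3)] / p_open3    # prize behind the other closed door
print(p_stay, p_switch)  # 1/3 2/3
```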
9.12. Bertrand's Paradox. In the following probability models, let p be the probability that the length of the randomly generated chord of the unit circle exceeds √3. Note that √3 is the length of the sides of an equilateral triangle inscribed in the circle.
a) If the endpoints of the chord are generated by two random spins on the circumference of the circle, then p = 1/3. After spinning once, draw the inscribed triangle with a corner at the chosen point. The length of the chord will exceed √3 if and only if the second chosen point lies between the
other two corners of the triangle on the circle. Since the probability of lying in an arc is proportional to its length, the probability of this is 1/3.
b) If the midpoint of the chord is generated by throwing a dart at the circle, then p = 1/4. The chord is generated from its midpoint P by drawing the chord through P that is perpendicular to the segment from P to the center of the circle. The midpoint of a chord of length √3 has distance 1/2 from the center of the circle. Thus the length exceeds √3 if and only if the generated midpoint has distance less than 1/2 from the center. Since area is proportional to the square of the radius and probability is proportional to area, the circle of radius 1/2 has 1/4 of the probability.
c) A model where p = 1/2. We first pick the distance of the chord from the center, uniformly at random from the interval [0, 1]; then we choose a random chord at that distance from the center. The length of the chord exceeds √3 if and only if its distance from the center is less than 1/2.
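A Monte Carlo sketch of the three models (our implementations, under the stated assumptions) reproduces the three values approximately:

```python
import math
import random

rng = random.Random(0)
N = 200_000
THRESH = math.sqrt(3)

def chord_from_angles(t1, t2):
    # chord length between two angles on the unit circle
    return 2 * abs(math.sin((t1 - t2) / 2))

hits_a = hits_b = hits_c = 0
for _ in range(N):
    # (a) two random endpoints on the circumference
    if chord_from_angles(rng.uniform(0, 2 * math.pi),
                         rng.uniform(0, 2 * math.pi)) > THRESH:
        hits_a += 1
    # (b) random midpoint uniform in the disk (rejection sampling);
    # chord exceeds sqrt(3) iff the midpoint is within 1/2 of the center
    while True:
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x * x + y * y <= 1:
            break
    if x * x + y * y < 0.25:
        hits_b += 1
    # (c) distance from the center uniform on [0, 1]
    if rng.uniform(0, 1) < 0.5:
        hits_c += 1

print(hits_a / N, hits_b / N, hits_c / N)  # ≈ 1/3, 1/4, 1/2
```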
9.13. Triangles on triples among n equally spaced points on a circle. The C(n, 3) triples are equally likely. Thus we count how many yield three equal lengths, two equal lengths and one different length, and three different lengths.
When 3|n, there are n/3 triples that form equilateral triangles; otherwise there are none. To generate two equal segments, we choose their common point in n ways and then their common length in ⌊(n − 1)/2⌋ ways. When 3|n, we must subtract from n⌊(n − 1)/2⌋ the n/3 equilateral triangles. The resulting probabilities are listed below by the congruence class of n modulo 6.
n mod 6       0                     1            2            3                     4            5
equilateral   2/((n−1)(n−2))        0            0            2/((n−1)(n−2))        0            0
isosceles     (3n−8)/((n−1)(n−2))   3/(n−2)      3/(n−1)      (3n−5)/((n−1)(n−2))   3/(n−1)      3/(n−2)
other         (n−4)/(n−1)           (n−5)/(n−2)  (n−4)/(n−1)  (n−5)/(n−2)           (n−4)/(n−1)  (n−5)/(n−2)
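For n not divisible by 3 there are no equilateral triangles, and the isosceles entries 3/(n−2) and 3/(n−1) can be checked by brute force; two chords are equal exactly when their circular gaps agree (a sketch with our helper names):

```python
from fractions import Fraction
from itertools import combinations

def chord_class(i, j, n):
    # equal gaps around the circle give equal chord lengths
    g = abs(i - j) % n
    return min(g, n - g)

def isosceles_probability(n):
    count = 0
    for t in combinations(range(n), 3):
        lengths = sorted(chord_class(a, b, n) for a, b in combinations(t, 2))
        if lengths[0] == lengths[1] or lengths[1] == lengths[2]:
            count += 1
    total = n * (n - 1) * (n - 2) // 6  # C(n, 3)
    return Fraction(count, total)

print(isosceles_probability(7), isosceles_probability(8))  # 3/5 3/7
```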
9.14. In Bertrand's Ballot Problem with outcome (a, b) where a > b, the probability that A is always ahead of B is (a − b)/(a + b), and the probability that the score is tied at some point is 2b/(a + b). The two probabilities must sum to 1 when a > b. A fails to be always ahead if and only if B has at least as many votes at some point. Since A winds up ahead, B has at least as many votes at some point if and only if there is a tie at some point. Thus it suffices to find the probability that A is always ahead.
If A is always ahead of B, then A must get the first vote, and A must get at least as many votes as B in every initial segment of the remainder. The probability that A gets the first vote is a/(a + b). Given that A gets the first vote, the probability that A never trails in the remaining portion of the election is the solution to the Ballot Problem for an election ending at (a − 1, b). We have computed this to be (a − b)/a. Since both these events must occur, the probability that A is always ahead is [a/(a + b)] · [(a − b)/a] = (a − b)/(a + b).
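The formula (a − b)/(a + b) can be verified exhaustively for small elections by enumerating vote orders (illustrative values of a and b):

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def prob_always_ahead(a, b):
    n = a + b
    wins = 0
    for a_votes in combinations(range(n), a):  # positions of A's votes
        a_set = set(a_votes)
        lead = 0
        for i in range(n):
            lead += 1 if i in a_set else -1
            if lead <= 0:  # A not strictly ahead after this vote
                break
        else:
            wins += 1
    return Fraction(wins, comb(n, a))

print(prob_always_ahead(5, 3))  # 1/4
```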
9.15. a) If m 0s and n 1s are placed in some order around a circle, then
there are exactly m − n positions such that every arc of the circle starting at
that position and moving clockwise contains more 0s than 1s. This holds
trivially when n = 1. For m ≥ n > 0, there is some pair of 1s with at
least one 0 between them. Let S be a set of two positions consisting of a 0
followed immediately by a 1.
Neither position in S is good; the 1 comes too soon. A position outside
S is good if and only if it is good in the smaller arrangement obtained by
deleting S. The number of good starting places thus equals the number of
good starting places in the smaller arrangement, which by the induction
hypothesis is m − 1 − (n − 1) = m − n.
b) There are (1/(n + 1))·C(2n, n) ballot lists of length 2n. By part (a), every arrangement of n 1s and n + 1 0s can be cut in exactly one place to obtain a list in which all initial segments have more 0s than 1s. The first element (just after the cut) must be a 0. Deleting this 0 yields a ballot list, and the process is reversible. Thus the ballot lists and the cyclic arrangements of n 1s and n + 1 0s are equinumerous. Since n and n + 1 are relatively prime, the number of these cyclic arrangements (and ballot lists of length 2n) is exactly the Catalan number (1/(2n + 1))·C(2n + 1, n + 1) = (1/(n + 1))·C(2n, n) = Cn.
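Part (a)'s count of good starting positions can be checked exhaustively for a small case (a sketch; the encoding of the circular arrangement is ours):

```python
from itertools import permutations

def good_starts(arr):
    # count starts where every clockwise arc has more 0s than 1s
    n = len(arr)
    good = 0
    for s in range(n):
        lead = 0
        for i in range(n):
            lead += 1 if arr[(s + i) % n] == 0 else -1
            if lead <= 0:
                break
        else:
            good += 1
    return good

# every arrangement of four 0s and two 1s has exactly 4 - 2 = 2 good starts
for arr in set(permutations([0, 0, 0, 0, 1, 1])):
    assert good_starts(arr) == 2
print("ok")
```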
9.16. A conditional probability computation. We are given three independent random variables X1, X2, X3, each defined on [n] and equally likely to take on any value in [n]. The variable X1 + X2 is less than 4 only for the outcomes (1, 2), (1, 1), (2, 1), which together have probability 3/n². Thus P(X1 + X2 ≥ 4) = 1 − 3/n². The events X1 + X2 + X3 ≤ 6 and X1 + X2 ≥ 4 both occur if and only if X3 = 2 and (X1, X2) ∈ {(3, 1), (2, 2), (1, 3)} or X3 = 1 and (X1, X2) ∈ {(3, 1), (2, 2), (1, 3), (4, 1), (3, 2), (2, 3), (1, 4)}. Thus the probability that both occur is (1/n)·(3/n²) + (1/n)·(7/n²) = 10/n³. The desired conditional probability is (10/n³)/((n² − 3)/n²) = 10/(n(n² − 3)). The computation is valid only for n ≥ 4.
9.17. Application of Bayes’ Theorem. Against her four opponents, the ten-
nis player wins with probability .6, .5, .45, .4, respectively. These are the
conditional probabilities P(A|Bi ), where A is the event that she wins and Bi
is the event that she plays the ith opponent. We want to compute P(Bi |A)
for each i . Since she plays 30 matches against each of the first two and 20
matches against each of the last two opponents, we have P(Bi ) = .3, .3, .2,
.2, respectively. Thus we have all the data to compute P(Bi|A) = P(A|Bi)P(Bi)/P(A). The denominator, which equals P(A), is .6 · .3 + .5 · .3 + .45 · .2 + .4 · .2 = .5. Thus the answers are P(A|Bi)P(Bi) · 2, which equal .36, .30, .18, .16, respectively.
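The posterior computation is a direct translation of the solution's data into code:

```python
p_win_given = [0.6, 0.5, 0.45, 0.4]   # P(A | B_i)
p_opponent = [0.3, 0.3, 0.2, 0.2]     # P(B_i)

p_win = sum(w * b for w, b in zip(p_win_given, p_opponent))  # P(A)
posteriors = [w * b / p_win for w, b in zip(p_win_given, p_opponent)]
print(round(p_win, 2), [round(x, 2) for x in posteriors])  # 0.5 [0.36, 0.3, 0.18, 0.16]
```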
9.18. If half the females and one-third of the males in a class are smokers, and two-thirds of the students are male, then 3/7 of the smokers are female. Consider choosing a random student in the class. Let A be the event of being a smoker. Let B1, B2 be the events of being female or male, respectively. We seek P(B1|A). We are given P(A|B1) = 1/2, P(A|B2) = 1/3, and P(B2) = 2/3. By Bayes' Formula, we compute
P(B1|A) = P(A|B1)P(B1) / [P(A|B1)P(B1) + P(A|B2)P(B2)] = (1/2)(1/3) / [(1/2)(1/3) + (1/3)(2/3)] = 3/7.
9.19. Simpson's paradox: it is possible for A to have a higher batting average than B in day games and night games but for B to have a higher average overall. If the daytime hits/attempts are a/b and c/d, respectively,
and the nighttime hits/attempts are w/x and y/z, respectively, then in-
equalities characterizing the situation described are a/b > c/d, w/x > y/z,
and (c + y)/(d + z) > (a + w)/(b + x). The situation described occurs if and
only if these three inequalities are satisfied by eight integers also satisfying
0 ≤ a ≤ b, 0 ≤ c ≤ d, 0 ≤ w ≤ x , 0 ≤ y ≤ z. Equivalently, the three inequal-
ities can be written as ad > bc, wz > xy, and (c + y)(b + x) > (a +w)(d + z).
Note that if b = d and x = z, then the inequalities lead to a contradic-
tion and the paradox cannot occur. This suggests that one way in which
the paradox can occur is if the poor daytime performance of B and the good
nighttime performance of A are unimportant to their season averages, as
would occur if B hardly ever plays in daylight and A hardly ever plays at
night. For example, the paradox occurs with the numbers (a, b) = (3, 10)
and (c, d) = (0, 1) for daytime and (w, x) = (1, 1) and (y, z) = (7, 10) for
nighttime. For the season, the averages are 4/11 and 7/11.
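The example's numbers can be checked with exact rational arithmetic:

```python
from fractions import Fraction

def avg(hits, at_bats):
    return Fraction(hits, at_bats)

day_A, day_B = (3, 10), (0, 1)      # hits / at-bats
night_A, night_B = (1, 1), (7, 10)

# A leads in both splits...
assert avg(*day_A) > avg(*day_B) and avg(*night_A) > avg(*night_B)

# ...yet B leads for the season.
season_A = Fraction(day_A[0] + night_A[0], day_A[1] + night_A[1])
season_B = Fraction(day_B[0] + night_B[0], day_B[1] + night_B[1])
print(season_A, season_B)  # 4/11 7/11
```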
9.20. Simpson's Paradox with university faculties of equal size.
                Univ. H                       Univ. Y
                Total   Women   Fraction      Total   Women   Fraction
Asst. Profs.      40      25      .63           80      40      .50
Assoc. Profs.     30       5      .17           10       0      .00
Full Profs.       30       5      .17           10       0      .00
Total            100      35      .35          100      40      .40
9.21. A perfect game in bowling. If on each roll the probability of a strike is p, then the probability that 12 consecutive rolls are strikes is p^12. The value of p such that p^12 = .01 is approximately .68. This suggests that perfect games are quite rare but not unheard of among very good bowlers.
9.22. Probabilities for people in a line. We have A, B, and n other people in random order, meaning that all (n + 2)! permutations are equally likely. To form a permutation with exactly k people between A and B, form an arbitrary permutation of the others, decide which of A and B comes first, and insert A and B so they will be separated by k. There are n + 1 − k ways to make the insertion. Thus the probability is n!·2(n + 1 − k)/(n + 2)!, which simplifies to 2(n + 1 − k)/((n + 1)(n + 2)). To check that these sum to 1, we compute
Σ_{k=0}^{n} [2/(n + 2) − 2k/((n + 1)(n + 2))] = 2(n + 1)/(n + 2) − [2/((n + 1)(n + 2))]·[(n + 1)n/2] = (2n + 2 − n)/(n + 2) = 1.
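The distribution 2(n + 1 − k)/((n + 1)(n + 2)) can be verified by enumerating all (n + 2)! orders for a small n (a sketch with n = 3):

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

n = 3
people = ["A", "B"] + [f"x{i}" for i in range(n)]
counts = [0] * (n + 1)  # counts[k] = permutations with k people between A and B
for perm in permutations(people):
    k = abs(perm.index("A") - perm.index("B")) - 1
    counts[k] += 1

total = factorial(n + 2)
for k, c in enumerate(counts):
    assert Fraction(c, total) == Fraction(2 * (n + 1 - k), (n + 1) * (n + 2))
print(counts)  # [48, 36, 24, 12]
```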
9.23. Probability of first player getting first Head. The probability of heads is p on each flip. If x is the probability that the first player wins, then x is the sum of the probability of winning on the first flip and the probability of winning later. Winning later requires tails from each player on the first round, but then the rest of the game is the same as the original game. Thus x = p + (1 − p)²x, which yields x = p/[1 − (1 − p)²] = 1/(2 − p). When the coin is fair, the first player wins with probability 2/3. (Note: the probability that the first player wins on the ith try is p(1 − p)^{2(i−1)}. Thus also x = Σ_{i=1}^{∞} p(1 − p)^{2(i−1)}.)
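The series and the closed form 1/(2 − p) agree; a truncated exact-arithmetic check for a fair coin:

```python
from fractions import Fraction

p = Fraction(1, 2)
partial = sum(p * (1 - p) ** (2 * (i - 1)) for i in range(1, 60))
closed = 1 / (2 - p)  # 1/(2 - p)
assert abs(closed - partial) < Fraction(1, 10**30)  # geometric tail is tiny
print(closed)  # 2/3
```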
9.24. A spinner with regions 1, 2, . . . , n and payoff 2^k for region k.
a) The expected payoff per spin of the dial is (2^{n+1} − 2)/n. The payoff is 2^k with probability 1/n, so the expected payoff is (1/n)·Σ_{k=1}^{n} 2^k, which equals (2^{n+1} − 2)/n.
b) Given a coin flip to move to a neighboring region, the gambler should accept the gamble unless the current region is n. If the gambler already has the maximum payoff, then switching is a guaranteed loss. If the current payoff is the minimum, then switching is a guaranteed gain, with expected payoff (1/2)·2^n + (1/2)·4 = 2^{n−1} + 2. When 2 ≤ k ≤ n − 1, the expected payoff for switching is (1/2)·2^{k+1} + (1/2)·2^{k−1}. This equals 2^k + 2^{k−2} and exceeds 2^k.
The expected payoff under the optimal strategy is
(1/n)·[2^{n−1} + 2 + 2^n + Σ_{k=2}^{n−1} (2^k + 2^{k−2})] = (1/n)·(2^{n+1} + 2^{n−1} + 2^{n−2} − 3).
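The optimal strategy's expected payoff can be checked by enumerating the regions directly (a sketch with small n, treating the regions as a cycle):

```python
from fractions import Fraction

def optimal_expected_payoff(n):
    total = Fraction(0)
    for k in range(1, n + 1):
        stay = Fraction(2) ** k
        left = 2 ** (n if k == 1 else k - 1)   # neighbors on the dial
        right = 2 ** (1 if k == n else k + 1)
        switch = Fraction(left + right, 2)
        total += max(stay, switch)  # take the better of staying or switching
    return total / n

print(optimal_expected_payoff(4))  # 41/4
```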
9.25. Another gambler. The amounts in envelopes are a1 ≤ a2 ≤ · · · ≤ an. For 1 ≤ i ≤ n − 1 the probability is pi that the envelopes contain ai and ai+1
dollars. He opens one envelope and has the option to switch. If he sees a1
dollars, then he switches for a guaranteed gain. If he sees an dollars, then
switching would be a guaranteed loss.
For 2 ≤ k ≤ n −1, when he sees ak the other envelope may contain ak−1
or ak+1. Since he opened one of the two envelopes at random, the probabil-
ity that he sees ak and the other is ak−1 is pk−1/2, and the probability that
he sees ak and the other is ak+1 is pk/2. The probability that he sees ak is
the sum of these, (pk−1 + pk)/2. The conditional probability that the other
is ak−1 is pk−1/(pk−1 + pk), and the conditional probability that the other is
ak+1 is pk/(pk−1 + pk).
Thus the expected payoff for switching when he sees ak is (ak−1·pk−1 + ak+1·pk)/(pk−1 + pk).
He should accept the gamble to switch if this quantity exceeds ak .
9.26. If X is a random variable that takes values only in [n], then E(X) = Σ_{k=1}^{n} P(X ≥ k). Let pi = P(X = i). We compute
E(X) = Σ_{i=1}^{n} i·pi = Σ_{i=1}^{n} Σ_{k=1}^{i} pi = Σ_{k=1}^{n} Σ_{i=k}^{n} pi = Σ_{k=1}^{n} P(X ≥ k).
Essentially, we add up in two ways the probability pi for all pairs (k, i) such
that 1 ≤ k ≤ i ≤ n.
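The identity holds for any distribution on [n]; a quick check with an arbitrary (randomly weighted) pmf:

```python
from fractions import Fraction
from random import Random

rng = Random(1)
n = 6
weights = [rng.randint(1, 9) for _ in range(n)]
total = sum(weights)
p = [Fraction(w, total) for w in weights]  # p[i-1] = P(X = i)

expectation = sum(i * p[i - 1] for i in range(1, n + 1))
tail_sum = sum(sum(p[i - 1] for i in range(k, n + 1)) for k in range(1, n + 1))
print(expectation == tail_sum)  # True
```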
9.27. Expected number of selections to find the desired key.
a) Under random order without replacement, the expectation is (n +
1)/2. All permutations of the keys are equally likely for the selection order.
Thus the right key is equally likely to be in each position. Since the key is in position k with probability 1/n, the expected position is (1/n)·Σ_{k=1}^{n} k = (n + 1)/2.
b) Under random selection with replacement, the expectation is n. On
each selection, the probability of success is 1/n. Thus we seek the ex-
pectation for the geometric distribution with success probability 1/n. By
Proposition 9.29, the expectation is n.
9.28. The expected number of matching pairs among k socks pulled from a pile of n pairs of socks is C(k, 2)/(2n − 1). In each proof below, we express the number X of matching pairs as a sum of indicator variables, and then we use linearity of the expectation. In proof 1, we extract a random k-set of socks. In proofs 2 and 3, we assume that the k socks are extracted in some order. This produces every k-set in k! ways, so the results are the same when we make the ordered selections equally likely as when we make the k-sets equally likely.
Proof 1. Let Xi be 1 if pair i appears; otherwise Xi = 0. Of the C(2n, k) equally likely choices for the k socks, C(2n − 2, k − 2) include pair i. Hence
P(Xi = 1) = C(2n − 2, k − 2)/C(2n, k) = k(k − 1)/(2n(2n − 1)),
and E(Xi) = P(Xi = 1) = k(k − 1)/(2n(2n − 1)). Since X = Σ_{i=1}^{n} Xi, and since the variables X1, . . . , Xn have the same distribution, E(X) = nE(Xi) = C(k, 2)/(2n − 1).
Proof 2. Let Xi,j be 1 if the ith and jth socks match; otherwise Xi,j = 0. Any two socks are equally likely to be chosen ith and jth. Of the C(2n, 2) choices of two socks, n are matched pairs. Hence P(Xi,j = 1) = n/C(2n, 2) = 1/(2n − 1), and E(Xi,j) = P(Xi,j = 1) = 1/(2n − 1). Since X is the sum of C(k, 2) such variables, E(X) = C(k, 2)/(2n − 1).
Proof 3. Let Xj be 1 if the jth sock extracted completes a pair with an earlier sock, so Σ_{j=1}^{k} Xj counts the completed pairs of socks. The probability that the mate of the jth sock appears among the first j − 1 socks is (j − 1)/(2n − 1). Thus E(Xj) = (j − 1)/(2n − 1), and
E(X) = Σ_{j=1}^{k} E(Xj) = [1/(2n − 1)]·Σ_{j=1}^{k} (j − 1) = C(k, 2)/(2n − 1).
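All three proofs give C(k, 2)/(2n − 1), which matches a brute-force average over all k-subsets for small parameters (the pair labeling is ours):

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def expected_pairs(n, k):
    # average number of matched pairs over all k-subsets of 2n socks;
    # pair i consists of socks 2i and 2i+1
    total = Fraction(0)
    for chosen in combinations(range(2 * n), k):
        total += sum(1 for i in range(n)
                     if 2 * i in chosen and 2 * i + 1 in chosen)
    return total / comb(2 * n, k)

print(expected_pairs(4, 3))  # 3/7
```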
9.29. Given a set of n men and n women, the expected number of male-female couples in a random pairing is n²/(2n − 1). Let X be the random variable giving the number of male-female couples. We express X as the sum of n² indicator variables. Define Xi,j by Xi,j = 1 when the ith man is paired with the jth woman, and Xi,j = 0 otherwise. By the linearity of expectation, E(X) = n²E(Xi,j).
By symmetry, an individual is equally likely to be paired with any one of the other 2n − 1 people. Thus E(Xi,j) = 1/(2n − 1), and E(X) = n²/(2n − 1).
9.30. There are 34,650 arrangements of the letters in MISSISSIPPI. Among the 11 letters, there are four Is, four Ss, two Ps, and one M. Thus we compute the multinomial coefficient C(11; 4, 4, 2, 1), which equals 11!/(4!·4!·2!·1!) = 34,650.
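The multinomial count can be confirmed directly:

```python
from math import factorial

# 11!/(4! * 4! * 2! * 1!) for the letter multiplicities of MISSISSIPPI
word = "MISSISSIPPI"
arrangements = factorial(len(word))
for c in set(word):
    arrangements //= factorial(word.count(c))
print(arrangements)  # 34650
```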
9.31. Solution 9.40 overestimates the probability of hitting for the cycle. Not all appearances of a batter at the plate count as at-bats; other outcomes are possible, since walks and sacrifices are not counted as at-bats.
9.32. A polynomial p such that p(n) = 3^n for n = 0, 1, 2, 3, 4. The multinomial expansion of (1 + 1 + 1)^n is a sum of multinomial coefficients, since always 1^{ki} = 1. We write the coefficients as
C(n; k1, k2, k3) = n!/(k1!·k2!·(n − k1 − k2)!) = n(n − 1) · · · (n − k1 − k2 + 1)/(k1!·k2!).
For each choice of k1 and k2, this expression is a polynomial in n. If we sum all such expressions that are nonzero when 0 ≤ n ≤ 4, then the value at such n will be 3^n.