进化博弈
Evolutionary Games
第13章
Chapter 13
Slide 2
进化博弈
Evolutionary Games
目前为止我们学过了具有多种不同特征的博弈:
We have so far studied games with many
different features:
同时和序贯博弈
Simultaneous and sequential moves
零和与非零和博弈
Zero-sum and non-zero-sum payoffs
操纵未来博弈规则的策略性行动
Strategic moves to manipulate rules of games to
come
一次性和重复博弈
One-shot and repeated play
许多人同时进行的集体博弈
Games of collective action in which a large number of
people play simultaneously
Slide 3
进化博弈
Evolutionary Games
所有这些博弈中的参与者都是理性的:每个参
与者……
All the players in all these games are
rational: each player……
……具有内在一致的价值体系
has an internally consistent value system
……能够计算其策略选择的后果
can calculate the consequences of her
strategic choices
……作出最符合其利益的选择
makes the choice that best serves her interests
Slide 4
进化博弈
Evolutionary Games
对理性可能的替代方法可以从生物学的进化和进化动
力学中找到,在那里……
One possible alternative to rationality can be
found in the biological theory of evolution and
evolutionary dynamics, where……
……好的策略可以得到更多的奖励
good strategies will be rewarded with higher payoffs
……参与者可以观察或模仿成功者并试验新的策略
players can observe or imitate success and
experiment with new strategies
……随着参与者在参加博弈中获得经验,好的策略将会得到
更经常的使用,坏的策略得到更少的使用。
good strategies will be used more often and bad
strategies less often, as players gain experience
playing the game.
Slide 5
内容提要
Outline
*框架 The framework (*重点或难点)
*囚徒困境 Prisoners’ dilemma
小鸡 Chicken
保证博弈 The assurance game
*不同物种间作用 Interactions across species
鹰鸽博弈 The hawk-dove game
*种群中有三种表现型 Three Phenotypes in the Population
一般理论 Some General Theory
群体博弈(略) Playing the field
合作与利他的进化 Evolution of Cooperation and Altruism
Slide 6
框架
The Framework
生物学中的进化过程提供了社会科学家使用的
博弈论的平行物。
The process of evolution in biology
offers a parallel to the theory of games
used by social scientists.
这一理论建立在三个基本原则上:
This theory rests on three fundamentals:
异质性 Heterogeneity
适应性 Fitness
选择 Selection
Slide 7
框架
The Framework
动物行为的相当一部分是由基因决定的;一个或多个
基因的联合体(基因型)支配着某一特定的行为模式
(称为行为表现型)。
A significant part of animal behavior is
genetically determined; a complex of one or
more genes (genotype) governs a particular
pattern of behavior, called a behavioral
phenotype.
例子 Examples
鸟翅膀的空气动力学特征
Aerodynamic characteristics of a bird’s wings
好斗或者合作的行为
Aggressive or cooperative behavior
筑巢的位置
Locations of nesting sites
Slide 8
框架
The Framework
自然界基因库的多样性保证了种群当中表现型
的异质性。
Natural diversity of the gene pool
ensures a heterogeneity of phenotypes
in the population.
某些行为比其他行为更适合于当前环境,一种
表现型的成功可以用适应性来定量测量。
Some behaviors are better suited than
others to the prevailing conditions, and
the success of a phenotype is given a
quantitative measure called its fitness.
Slide 9
框架
The Framework
更加适应的表现型在下一代中就会比更不适应
的表现型数量更多。
The fitter phenotypes then become
relatively more numerous in the next
generation than the less fit phenotypes.
这一选择过程就是一个改变基因型和表现型的
比例并最终导致稳定状态的动态过程。
This process of selection is the dynamic
that changes the mix of genotypes and
phenotypes and perhaps leads
eventually to a stable state.
Slide 10
框架
The Framework
不时的,偶然因素导致新的基因变异。
From time to time, chance produces new genetic
mutations.
许多这样的突变产生的行为(表现型)都不适应环境,逐
渐消失。
Many of these mutations produce behaviors (that is,
phenotypes) that are ill suited to the environment,
and they die out.
偶尔地,变异导致的新的表现型更加适应环境。这样的变
异基因就能够成功地侵入种群,即扩展成为群体的重要部
分。
But occasionally a mutation leads to a new
phenotype that is fitter. Then such a mutant gene
can successfully invade a population – that is,
spread to become a significant proportion of the
population.
Slide 11
框架
The Framework
如果一个群体不能被任何变异侵入, 生物学家就将这一群
体的构成及其当前的表现型称为进化稳定的。
Biologists call a configuration of a population and
its current phenotypes evolutionarily stable if the
population cannot be invaded successfully by any
mutant.
这是一个静态的检验;通常应用的是一个更加动态的标准:
一个构成是进化稳定的,如果它是从群体的任意给定的表
现型构成开始的、动态选择过程的极限结果。
This is a static test; but often a more dynamic
criterion is applied: a configuration is evolutionarily
stable if it is the limiting outcome of the dynamic
of selection, starting from any arbitrary mixture of
phenotypes in the population.
Slide 12
框架
The Framework
一种表现型的适应性取决于个别生物体与环境的关系。
The fitness of a phenotype depends on the
relationship of the individual organism to its
environment.
它还取决于存在于环境中的不同表现型的比例。
It also depends on the whole complex of the
proportions of different phenotypes that exist
in the environment.
对于我们的目的来说,这一物种内部表现型之间的相
互作用是最有意思的部分。
For our purpose, this interaction between
phenotypes within a species is the most
interesting part of the story.
Slide 13
框架
The Framework
进化的生物过程和博弈论是非常平行的。
The biological process of evolution
finds a ready parallel in game
theory.
表现型与策略
Phenotype vs. Strategy
适应性与收益
Fitness vs. Payoffs
Slide 14
框架
The Framework
因为种群是表现型的混合,从种群中选出的不同配对
就将不同的策略组合带入他们的相互作用。
Because the population is a mix of phenotypes,
different pairs selected from it will bring to
their interactions different combinations of
strategies.
某一表现型的适应性的实际定量测度标准是它在与种
群中其他表现型的所有相互作用中得到的平均收益。
The actual quantitative measure of the fitness
of a phenotype is the average payoff that it
gets in all its interactions with others in the
population.
Slide 15
框架
The Framework
进化博弈理论看起来像是通往博弈论的新途径
的一个现成框架,它放松了理性行为的假设。
The theory of evolutionary games seems
a ready-made framework for a new
approach to game theory, relaxing the
assumption of rational behavior.
策略遗传的思想可以在生物学之外其他的理论
应用中得到更广泛的阐释。
The idea of inheritance of strategies can
be interpreted more broadly in
applications of the theory other than
biology.
Slide 16
框架
The Framework
在社会经济博弈中,策略“优胜劣汰”的原因有
别于生物学中严格的遗传机制:
The reasons that the fitter strategies
proliferate and the less fit ones die out
in socioeconomic games differ from the
strict genetic mechanism of biology:
观察和模仿
Observations and Imitations
有目的的思考和对以往经验方法的修改
Purposive thinking and revision of previous
rules of thumb
有意识的实验
Conscious experimentation
Slide 17
框架
The Framework
为什么参与者要出这样的策略?
Why does a player play a particular strategy?
理性选择 Rational choices
遗传 Genetics
社会化、文化背景、教育
Socialization, cultural background, education
依据过去经历的经验方法
A rule of thumb based on past experience
社会会不会最后变成所有的政治家都只关心重新当选,
所有企业都只关心利润?
Will society end up with a situation in which all
politicians are concerned with reelection, and
all firms with profit?
Slide 18
框架
The Framework
生物博弈的进化稳定构成可以有两种。
Evolutionarily stable configurations of biological
games can be of two kinds.
单态:单独一种表现型被证明比其他表现型更适应,种群变
为仅由它构成。
Monomorphism: A single phenotype proves fitter
than any others and the population comes to consist
of it alone.
在这种情况下,这个唯一主导的策略被称为进化稳定策略。
In this case, the unique prevailing strategy is called
an evolutionarily stable strategy (ESS).
多态:两个或更多表现型同样适应(并比其他没有出现的更
适应);因此他们可能以某种比例共存。
Polymorphism: Two or more phenotypes are equally
fit (and fitter than some others not played); so they
may be able to coexist in certain proportions.
Slide 19
框架
The Framework
组成进化博弈的完整设定是:
The whole set-up which constitutes an
evolutionary game is:
种群 The Population
其可能的表现型的集合
Its conceivable collection of phenotypes
表现型相互作用的收益矩阵
The payoff matrix for the interactions of the
phenotypes
与其适应性相关的、表现型在种群中比例的进化规则
The rule for the evolution of population proportions
of the phenotypes in relation to their fitness
种群的进化稳定的构成可以称为进化博弈的一个均衡。
An evolutionarily stable configuration of the
population can be called an equilibrium of the
evolutionary game.
Slide 20
囚徒困境
Prisoners’ Dilemma
假定种群由两种表现型组成:合作者和背叛者。
Suppose a population is made up of two
phenotypes: cooperators, defectors.
种群中的每一个体(合作者或者背叛者)被随
机地选择与另一个随机选择的对手竞争。
Each individual (either a cooperator or a
defector) in the population is chosen at
random to compete against another
random rival.
Slide 21
囚徒困境
Prisoners’ Dilemma
Payoffs listed as (ROW, COLUMN):
                        COLUMN
                        26 (Cooperate)    20 (Defect)
ROW  26 (Cooperate)     324, 324          216, 360
     20 (Defect)        360, 216          288, 288
Slide 22
囚徒困境
Prisoners’ Dilemma
用x表示种群中合作者的比例。
Let x be the proportion of cooperators in the
population.
则一个典型的合作者的预期收益为,
Therefore a typical cooperator’s expected
payoff is,
324x+216(1-x)
一个典型的背叛者的预期收益为,
A typical defector’s expected payoff is,
360x+288(1-x)
显然有,It is immediately apparent that,
360x+288(1-x)>324x+216(1-x),
for all x between 0 and 1.
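The inequality above can be checked numerically. What follows is a minimal Python sketch rather than part of the original slides: the names `PAYOFF` and `fitness` are illustrative, and the numbers are the payoffs from the table on Slide 21.

```python
from fractions import Fraction

# PAYOFF[mine][rival] is my payoff; "C" = Cooperate, "D" = Defect.
PAYOFF = {
    "C": {"C": 324, "D": 216},
    "D": {"C": 360, "D": 288},
}

def fitness(phenotype, x):
    """Expected payoff of `phenotype` when a share x of the population cooperates."""
    return PAYOFF[phenotype]["C"] * x + PAYOFF[phenotype]["D"] * (1 - x)

# Compare the two fitness levels on a grid of x values in [0, 1].
grid = [Fraction(k, 10) for k in range(11)]
defector_always_fitter = all(fitness("D", x) > fitness("C", x) for x in grid)
print(defector_always_fitter)
```

The gap between the two fitness levels is 72 - 36x, which stays positive over the whole interval; the grid check only illustrates this.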
Slide 23
囚徒困境
Prisoners’ Dilemma
因而背叛者有更高的预期收益,比合作者更适应。
Therefore a defector has a higher expected
payoff and is fitter than a cooperator.
这会导致背叛者比例的逐“代”上升(x下降),直到整
个种群都由背叛者组成。
This will lead to an increase in the proportion
of defectors (a decrease in x) from one
“generation” of players to the next, until the
whole population consists of defectors.
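The slides do not fix a specific rule for how proportions change from one generation to the next. The sketch below assumes, purely for illustration, a proportional (replicator-style) rule in which a type's share grows in line with its fitness relative to the population average. The function names and the starting share of 0.99 are hypothetical choices.

```python
def cooperator_fitness(x):
    return 324 * x + 216 * (1 - x)

def defector_fitness(x):
    return 360 * x + 288 * (1 - x)

def next_share(x):
    """Assumed update rule: new share of cooperators = old share * relative fitness."""
    f_c = cooperator_fitness(x)
    f_d = defector_fitness(x)
    mean = x * f_c + (1 - x) * f_d  # population-average fitness
    return x * f_c / mean

# Start with a population that is almost entirely cooperators.
x = 0.99
history = [x]
for _ in range(200):
    x = next_share(x)
    history.append(x)

print(history[0], history[50], history[-1])
```

Under this assumed rule the share of cooperators falls every generation, because the cooperators' fitness is always below the population average whenever any defectors are present.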
Slide 24
囚徒困境
Prisoners’ Dilemma
如果整个种群都由背叛者组成呢?
What if the population initially consists of all
defectors?
那么这种情况下不会有变异(试验性)的合作者可以
生存和繁殖以改变种群。
Then in this case no mutant (experimental)
cooperator will survive and multiply to take
over the population.
换句话说,背叛者的种群不能被变异的合作者成功侵
入。
In other words, the defector population cannot
be invaded successfully by mutant cooperators.
Slide 25
囚徒困境
Prisoners’ Dilemma
我们的分析表明背叛者比合作者有更高的适应性,一
个完全由背叛者组成的种群不能被变异的合作者侵入。
Our analysis shows both that defectors
have higher fitness than cooperators and that
an all-defector population cannot be invaded
by mutant cooperators.
因而种群的进化稳定构成是单态的,由单一的策略或
表现型“背叛”组成。
Thus the evolutionarily stable configuration of
the population is monomorphic, consisting of
the single strategy or phenotype Defect.
Slide 26
囚徒困境
Prisoners’ Dilemma
我们就把“背叛”称为这一进行困境博弈种群的
进化稳定策略。
We therefore call Defect the
evolutionarily stable strategy for this
population engaged in this dilemma
game.
如果博弈有一个严格的优势策略,那么该策略
也将是ESS。
If a game has a strictly dominant
strategy, that strategy will also be the
ESS.
Slide 27
重复囚徒困境
The Repeated Prisoner’s Dilemma
Twice-Repeated Prisoners’ Dilemma, payoffs listed as (ROW, COLUMN):
                            COLUMN
                            T (Tit-for-tat)    A (Always defect)
ROW  T (Tit-for-tat)        648, 648           504, 648
     A (Always defect)      648, 504           576, 576
576 = 288*2
648 = 324*2 = 360 + 288
504 = 216 + 288
Slide 28
重复囚徒困境
The Repeated Prisoner’s Dilemma
A只是弱优势的。容易看到A也是ESS。
A is only weakly dominant. And it is
easy to see that A is an ESS.
T是不是另一个ESS呢?
Is T another ESS?
注意到:(T, T)是该博弈的理性博弈理论分析的纳
什均衡。
Notice that: (T, T) is a Nash equilibrium in
the rational game theoretic analysis of this
game.
Slide 29
重复囚徒困境
The Repeated Prisoner’s Dilemma
如果种群一开始全是T,有少数几个变异者进入,那么变异者在大多数时
间内会遇到占统治地位的T型,在与T型的对决中,会和T型本身做得一
样好。
If the population is initially all T and a few A mutants enter,
then the mutants will meet the predominant T types most of
the time and will do as well as T does against another T.
但是,偶尔的,一个A变异者将会遇到另一个A变异者,在这一对决中,
她会比T遇到A时做得更好。
But occasionally an A mutant would meet another A mutant,
and in this match she does better than would a T against A.
因此,变异者会比占统治地位表现型中的成员有略高的适应性。
Thus the mutants have just slightly higher fitness than that of
a member of the predominant phenotype.
这一优势导致种群中变异者的比例增加(虽然较慢)。因而全T种群可以
被A变异者成功入侵;T不是ESS。
This advantage leads to an increase, albeit a slow one, in the
proportion of mutants in the population. Therefore an all-T
population can be invaded successfully by A mutants; T is not
an ESS.
Slide 30
重复囚徒困境
The Repeated Prisoner’s Dilemma
我们的推理依赖于对ESS的二重检验。
Our reasoning relies on two tests for an ESS.
首先我们看,当遇到占统治地位的类型时,变异者是否比占统
治地位的类型做得更好。
First we see if the mutant does better or worse than the
predominant phenotype when each is matched against
the predominant type.
如果这一主标准给出一个清楚的答案,问题就解决了。
If this primary criterion gives a clear answer, that settles
the matter.
如果该主标准给出平分,我们就使用一个“加时赛”,或次标准:
在遇到变异者时,变异者是否比占统治地位的类型做得更好?
But if the primary criterion gives a tie, then we use a tie-
breaking, or secondary, criterion: does the mutant fare
better or worse than a predominant phenotype when
each is matched against a mutant?
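The two tests can be written as a short function. This is a minimal sketch, not the slides' own notation: `can_invade` and the nested dictionary layout are illustrative, and the payoffs are those of the twice-repeated dilemma on Slide 27. A tie on both criteria is treated here as "cannot invade", which is an assumption about the neutral case.

```python
# payoff[mine][rival] is my payoff in the twice-repeated dilemma.
# "T" = Tit-for-tat, "A" = Always defect.
payoff = {
    "T": {"T": 648, "A": 504},
    "A": {"T": 648, "A": 576},
}

def can_invade(payoff, incumbent, mutant):
    """Apply the primary criterion, then the tie-breaking secondary criterion."""
    # Primary: how does each type fare against the predominant type?
    mutant_vs_incumbent = payoff[mutant][incumbent]
    incumbent_vs_incumbent = payoff[incumbent][incumbent]
    if mutant_vs_incumbent != incumbent_vs_incumbent:
        return mutant_vs_incumbent > incumbent_vs_incumbent
    # Secondary (used only on a tie): how does each fare against a mutant?
    return payoff[mutant][mutant] > payoff[incumbent][mutant]

print(can_invade(payoff, incumbent="T", mutant="A"))  # A mutants in an all-T population
print(can_invade(payoff, incumbent="A", mutant="T"))  # T mutants in an all-A population
```

For the all-T population the primary criterion ties (648 against 648), so the decision falls to the secondary criterion, where A earns 576 against another A while T earns only 504 against A.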
Slide 31
重复囚徒困境
The Repeated Prisoner’s Dilemma
Thrice-Repeated Prisoners’ Dilemma, payoffs listed as (ROW, COLUMN):
                            COLUMN
                            T (Tit-for-tat)    A (Always defect)
ROW  T (Tit-for-tat)        972, 972           792, 936
     A (Always defect)      936, 792           864, 864
864 = 288*3
972 = 324*3
792 = 216 + 288*2
936 = 360 + 288*2
Slide 32
重复囚徒困境
The Repeated Prisoner’s Dilemma
两种类型的相对适应性依赖于种群构成:每种类型在
它已经在种群中占统治地位时都更适应。
The relative fitness of the two types depends on
the composition of the population: each type is
fitter when it already predominates in the
population.
因而当种群全是A时,T不能成功侵入,反之亦然。
Therefore T cannot invade successfully when
the population is all A, and vice versa.
现在有两个可能的种群的进化稳定构成:全A或全T。
Now there are two possible evolutionarily stable
configurations of the population: all-A or all-T.
Slide 33
重复囚徒困境
The Repeated Prisoner’s Dilemma
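[Figure: fitness of the A type and the T type plotted against the proportion x of T types in the population (x from 0 to 1). The T-type line rises from 792 at x = 0 to 972 at x = 1; the A-type line rises from 864 at x = 0 to 936 at x = 1; the two lines cross at x* = 2/3.]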
Slide 34
重复囚徒困境
The Repeated Prisoner’s Dilemma
如果开始时种群中恰好是x=2/3?
What if the initial population has exactly
x=2/3?
一旦任何一种类型的变异者出现,该构成就不能维持。
Such a configuration can be sustained only until a
mutant of either type surfaces.
它可以被看成是一个不稳定的均衡;但从严格的生物
过程的逻辑来看,它根本不是一个均衡。
It can be regarded as an unstable
equilibrium; but in the strict logic of the
biological process, it is not an equilibrium at all.
不过,注意到双方出(2/3T, 1/3A)构成了该博弈的理性博弈
版本的混合策略纳什均衡。
Notice, however, that (2/3 T, 1/3 A) for both players forms
a mixed-strategy Nash equilibrium in the rational-
player version of this game.
Slide 35
重复囚徒困境
The Repeated Prisoner’s Dilemma
n-fold-Repeated Dilemma (n>2), payoffs listed as (ROW, COLUMN):
                            COLUMN
                            T (Tit-for-tat)      A (Always defect)
ROW  T (Tit-for-tat)        324n, 324n           288n-72, 288n+72
     A (Always defect)      288n+72, 288n-72     288n, 288n
Slide 36
重复囚徒困境
The Repeated Prisoner’s Dilemma
n=3, x*=2/3
n=4, x*=2/4=1/2
……
n=10, x*=2/10=0.2,
……
但博弈重复次数越多,合作就可以从越大范围的初始
条件中产生。
Cooperation emerges from a larger range of
the initial conditions when the game is
repeated more times.
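The crossover values listed on this slide follow from setting the two fitness levels equal, using the n-fold payoffs 324n, 288n-72, 288n+72 and 288n. The Python sketch below does that calculation with exact fractions; the function names `payoffs` and `crossover` are illustrative and not taken from the slides.

```python
from fractions import Fraction

def payoffs(n):
    """payoffs(n)[mine][rival]: my payoff in the n-fold repeated dilemma."""
    return {
        "T": {"T": 324 * n, "A": 288 * n - 72},
        "A": {"T": 288 * n + 72, "A": 288 * n},
    }

def crossover(n):
    """Share x* of T types at which T and A are equally fit."""
    p = payoffs(n)
    edge_vs_T = p["T"]["T"] - p["A"]["T"]       # T's advantage when the rival is T
    shortfall_vs_A = p["A"]["A"] - p["T"]["A"]  # T's disadvantage when the rival is A
    # Equal fitness: x * edge_vs_T = (1 - x) * shortfall_vs_A
    return Fraction(shortfall_vs_A, edge_vs_T + shortfall_vs_A)

for n in (3, 4, 10):
    print(n, crossover(n))
```

Algebraically the fitness difference between T and A is 36nx - 72, so T is the fitter type whenever x > 2/n. The threshold shrinks as n grows, which is why cooperation can emerge from a wider range of starting populations.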
Slide 37
比较进化和理性参与者模型
Comparing the Evolutionary and
Rational-player Models
一个ESS必然是有相同收益结构、由有意识的
理性参与者进行的博弈的纳什均衡解。
An ESS must be a Nash equilibrium of
the game played by consciously rational
players with the same payoff structure.
因而,进化方法提供了对理性方法的隐含的支
持。
Thus the evolutionary approach
provides a backdoor justification for the
rational approach.
Slide 38
比较进化和理性参与者模型
Comparing the Evolutionary and
Rational-player Models
虽然ESS必然是对应的理性参与者博弈的纳什
均衡,反过来却未必。
Although an ESS must be a Nash
equilibrium of the corresponding
rational-player game, the converse is
not true.
因而稳定的生物学概念可以帮助我们从理性博
弈的多重纳什均衡中进行选择。
Thus the biological concept of stability
can help us select from a multiplicity of
Nash equilibria of a rationally played
game.
Slide 39
小鸡
Chicken
Payoffs listed as (A, B):
                                B
                                Macho (Always straight)    Wimp (Always swerve)
A    Macho (Always straight)    -2, -2                     1, -1
     Wimp (Always swerve)       -1, 1                      0, 0
Slide 40
小鸡
Chicken
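[Figure: fitness of Macho and Wimp types plotted against the proportion x of Machos in the population (x from 0 to 1). The Macho line falls from 1 at x = 0 to -2 at x = 1; the Wimp line falls from 0 at x = 0 to -1 at x = 1; the two lines cross at x = 1/2.]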
50-50的混合是稳定的多态ESS。
The 50-50 mix will be the stable polymorphic ESS.
Slide 41
小鸡
Chicken
这一理性参与者的小鸡博弈有三个纳什均衡:两个纯
策略和一个混合策略。
The rational-player Chicken game has three
Nash equilibria: two in pure strategies and one
in mixed strategies.
均衡混合策略中的混合比例恰好是进化博弈中的种群
比例。
The mixture proportions in the equilibrium
mixed strategy are exactly the same as the
population proportions in the evolutionary
game.
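The 50-50 result can be reproduced from the Chicken payoffs (-2 and 1 for a Macho, -1 and 0 for a Wimp, against a Macho and a Wimp respectively). The following Python sketch is an illustration only; the function names are not from the slides, and x denotes the proportion of Machos.

```python
from fractions import Fraction

def macho_fitness(x):
    """Expected payoff of a Macho when a share x of the population is Macho."""
    return -2 * x + 1 * (1 - x)

def wimp_fitness(x):
    """Expected payoff of a Wimp when a share x of the population is Macho."""
    return -1 * x + 0 * (1 - x)

def fitter_type(x):
    m, w = macho_fitness(x), wimp_fitness(x)
    if m > w:
        return "Macho"
    if w > m:
        return "Wimp"
    return "equal"

for x in (Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)):
    print(x, macho_fitness(x), wimp_fitness(x), fitter_type(x))
```

Machos are fitter when they are rare (x below 1/2) and Wimps are fitter when Machos are common (x above 1/2), so selection pushes the population back toward x = 1/2 from either side. That restoring force is what makes the polymorphic mix stable.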
Slide 42
保证博弈
The Assurance Game
Payoffs listed as (MEN, WOMEN):
                          WOMEN
                          L (Local Latte)    S (Starbucks)
MEN  L (Local Latte)      2, 2               0, 0
     S (Starbucks)        0, 0               1, 1
Slide 43
保证博弈
The Assurance Game
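[Figure: fitness of L types and S types plotted against the proportion x of S types in the population (x from 0 to 1). The L-type line falls from 2 at x = 0 to 0 at x = 1; the S-type line rises from 0 at x = 0 to 1 at x = 1; the two lines cross at x = 2/3.]
仅有两种极端的单态种群构成是可能的进化稳态。
Only the two extreme monomorphic configurations of the
population are possible evolutionarily stable states.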
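The instability of the interior point can be illustrated with a short Python sketch. It assumes that x is the proportion of S types, which is what a crossover at 2/3 implies given payoffs of 2 for meeting at Local Latte and 1 for meeting at Starbucks. The function names are hypothetical, and the long-run labels assume only that the fitter type keeps gaining share.

```python
from fractions import Fraction

def l_fitness(x):
    """Expected payoff of an L type when a share x of the population is S (assumed)."""
    return 2 * (1 - x)

def s_fitness(x):
    """Expected payoff of an S type when a share x of the population is S (assumed)."""
    return 1 * x

def long_run_state(x):
    """Where selection leads if the currently fitter type keeps growing."""
    if s_fitness(x) > l_fitness(x):
        return "all S"
    if s_fitness(x) < l_fitness(x):
        return "all L"
    return "knife-edge"

for x in (Fraction(1, 2), Fraction(2, 3), Fraction(3, 4)):
    print(x, long_run_state(x))
```

Unlike in Chicken, each type here becomes fitter as it becomes more common, so any deviation from x = 2/3 is amplified rather than corrected.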
Slide 44
物种之间的相互作用
Interaction Across Species
假定“男人”和“女人”仍然对在星巴克(S)或地方小店
(L)相遇感兴趣——无法相遇每人收益都是0——但
现在每一类型偏爱的咖啡店不同:L带给女人的收益为
2,男人为1,S恰好相反。
Assume that “men” and “women” are still
interested in meeting at either Starbucks (S)
or Local Latte(L) – no meeting yields each a
payoff of 0 – but now each type prefers a
different cafe: L gives a payoff of 2 to women
and 1 to men, and S the other way around.
这些偏好将两种类型区分开来。在生物学语言中,他
们属于不同物种。
These preferences distinguish the two types.
In the language of biology, they belong to
different species.
Slide 45
物种之间的相互作用
Interaction Across Species
假定在每一种群内部,所有成员都一致同意他们对S,
L和不能相遇的评价(收益)。
Suppose within each population, all the
members agree among themselves about the
valuation (payoffs) of S, L and no meeting.
但一些成员为强硬者,另一些为折衷者。
But some members are hard-liners and others
are compromisers.
一个强硬者永远选择他(她)自己所属物种偏爱的咖
啡店。一个折衷者认识到对方物种想法相反,为了和
睦,去了对方的地方。
A hard-liner will always go to his or her species’
preferred cafe. A compromiser recognizes that
the other species wants the opposite and goes
to that location, to get along.
Slide 46
物种之间的相互作用
Interaction Across Species
Payoffs listed as (MEN, WOMEN):
                        WOMEN
                        Compromiser    Hard-liner
MEN  Compromiser        0, 0           1, 2
     Hard-liner         2, 1           0, 0
Slide 47
物种之间的相互作用
Interaction Across Species
用x表示男人中强硬者的比例,y表示女人中该比例。
Let x be the proportion of hard-liners among the men
and y that among the women.
某一强硬男人的预期收益(适应性)为:
A particular hard-liner man’s expected payoff (fitness) is,
y*0+(1-y)*2=2(1-y)
一个折衷男人的预期收益(适应性)为,
A compromising man’s expected payoff (fitness) is,
y*1+(1-y)*0=y
则强硬男人更适应和增长得更快(x上升),当:
Therefore the hard-liner man is fitter and reproduces
faster (x increases) when,
2(1-y)>y, or y<2/3
以此类推 And so on……
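The same comparison can be made for the women by symmetry: a hard-liner woman earns 2(1-x) and a compromising woman earns x, so hard-liner women are fitter when x < 2/3. The Python sketch below combines the two conditions into a direction of change for each (x, y). The women's formulas are derived from the payoff table rather than stated on the slide, and the names `sign` and `direction` are illustrative.

```python
from fractions import Fraction

def sign(v):
    """Return 1, 0 or -1 according to the sign of v."""
    return (v > 0) - (v < 0)

def direction(x, y):
    """Signs of the change in x (men) and y (women), the shares of hard-liners."""
    men_edge = 2 * (1 - y) - y    # hard-liner man's fitness minus compromiser man's
    women_edge = 2 * (1 - x) - x  # hard-liner woman's fitness minus compromiser woman's
    return sign(men_edge), sign(women_edge)

for x, y in [(Fraction(1, 2), Fraction(1, 2)),
             (Fraction(9, 10), Fraction(1, 10)),
             (Fraction(1, 10), Fraction(9, 10)),
             (Fraction(9, 10), Fraction(9, 10))]:
    print(x, y, direction(x, y))
```

A result of (1, -1) means x is rising and y is falling, so the population heads toward all hard-liner men and all compromiser women; (-1, 1) points to the opposite corner.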
Slide 48
物种之间的相互作用
Interaction Across Species
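[Figure: phase diagram on the unit square. Horizontal axis: proportion x of hard-liners among men (0 to 1); vertical axis: proportion y of hard-liners among women (0 to 1). The lines x = 2/3 and y = 2/3 divide the square into four regions with these directions of motion: x→ y↑ where x < 2/3 and y < 2/3; x← y↑ where x < 2/3 and y > 2/3; x← y↓ where x > 2/3 and y > 2/3; x→ y↓ where x > 2/3 and y < 2/3. The two ESS are at the corners (x = 1, y = 0) and (x = 0, y = 1).]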
Slide 49
鹰鸽博弈
The Hawk-Dove Game
该博弈不是由所指的两种鸟,而是由同一物种的两个
个体进行的,“鹰”和“鸽”只是他们策略的名字。
The game is played not by birds of these two
species, but by two animals of the same
species, and Hawk and Dove are merely the
names for their strategies.
鹰策略是侵略性的,试图抢夺价值为V的所有资源。
The Hawk strategy is aggressive and fights to
try to get the whole resource of value V.
鸽策略是愿意分享,但如遇战斗则退缩。
The Dove strategy is to offer to share but to
shrink from a fight.
Slide 50
鹰鸽博弈
The Hawk-Dove Game
Payoffs listed as (A, B):
                B
                Dove            Hawk
A    Dove       V/2, V/2        0, V
     Hawk       V, 0            (V-C)/2, (V-C)/2
Slide 51
理性策略选择与均衡
Rational Strategic Choice and
Equilibrium
如果V>C,那么该博弈为囚徒困境,每个参与者的优
势策略都是“鹰”,但(鸽,鸽)为双方都更好的结果。
If V>C, then the game is a prisoners’ dilemma
in which Hawk is the dominant strategy for
each, but (Dove, Dove) is the jointly better
outcome.
如果V>C(囚徒困境),鹰策略是唯一的ESS。
If V>C (prisoners’ dilemma), the Hawk
strategy is the only ESS.
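The V>C case ties back to the earlier rule that a strictly dominant strategy is also the ESS. The Python sketch below checks strict dominance of Hawk for sample values of V and C. The specific numbers (V = 4, C = 2 and V = 2, C = 4) are hypothetical illustrations, and the function names are not from the slides.

```python
def hawk_dove(V, C):
    """payoff[mine][rival]: my payoff in the Hawk-Dove game with resource V and fight cost C."""
    return {
        "Hawk": {"Hawk": (V - C) / 2, "Dove": V},
        "Dove": {"Hawk": 0, "Dove": V / 2},
    }

def hawk_strictly_dominant(V, C):
    """True if Hawk earns strictly more than Dove against every rival strategy."""
    p = hawk_dove(V, C)
    return all(p["Hawk"][rival] > p["Dove"][rival] for rival in ("Hawk", "Dove"))

print(hawk_strictly_dominant(V=4, C=2))  # a case with V > C
print(hawk_strictly_dominant(V=2, C=4))  # a case with V < C
```

When V > C (with V positive), Hawk earns (V-C)/2 > 0 against a Hawk and V > V/2 against a Dove, so it is strictly dominant. When V < C the first comparison fails and dominance no longer holds.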
如果V