CHAPTER 5

Generalizing the Extensive Form: Imperfect-Information Games

In Chapter 4, in our discussion of extensive-form games we allowed players to specify the action that they would take at every choice node of the game. This implies that players know the node they are in, and—recalling that in such games we equate nodes with the histories that led to them—all the prior choices, including those of other agents. For this reason we have called these perfect-information games.

We might not always want to make such a strong assumption about our players and our environment. In many situations we may want to model agents needing to act with partial or no knowledge of the actions taken by others, or even agents with limited memory of their own past actions. The sequencing of choices allows us to represent such ignorance to a limited degree; an “earlier” choice might be interpreted as a choice made without knowing the “later” choices. However, so far we could not represent two choices made in the same play of the game in mutual ignorance of each other.

5.1 DEFINITION

Imperfect-information games in extensive form address this limitation. An imperfect-information game is an extensive-form game in which each player’s choice nodes are partitioned into information sets; intuitively, if two choice nodes are in the same information set then the agent cannot distinguish between them.

Definition 5.1.1 (Imperfect-information game). An imperfect-information game (in extensive form) is a tuple (N, A, H, Z, χ, ρ, σ, u, I), where:

(N, A, H, Z, χ, ρ, σ, u) is a perfect-information extensive-form game; and

I = (I1, …, In), where Ii = (Ii,1, …, Ii,ki) is an equivalence relation on (i.e., a partition of) {h ∈ H : ρ(h) = i} with the property that χ(h) = χ(h′) and ρ(h) = ρ(h′) whenever there exists a j for which h ∈ Ii,j and h′ ∈ Ii,j.

FIGURE 5.1: An imperfect-information game.

Note that in order for the choice nodes to be truly indistinguishable, we require that the set of actions at each choice node in an information set be the same (otherwise, the player would be able to distinguish the nodes). Thus, if Ii,j ∈ Ii is an equivalence class, we can unambiguously use the notation χ(Ii,j) to denote the set of actions available to player i at any node in information set Ii,j.

Consider the imperfect-information extensive-form game shown in Figure 5.1. In this game, player 1 has two information sets: the set including the top choice node, and the set including the bottom choice nodes. Note that the two bottom choice nodes in the second information set have the same set of possible actions. We can regard player 1 as not knowing whether player 2 chose A or B when she makes her choice between ℓ and r.
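One way to make the definition concrete is to store the information-set partition alongside the action function and check the requirement stated above. The node and action names below are a hypothetical reading of Figure 5.1, not taken from the figure itself.

```python
# A minimal sketch of an information-set partition, using hypothetical node
# names consistent with the description of Figure 5.1.
info_sets = {
    ("player1", 1): ["root"],                # top choice node
    ("player1", 2): ["after_A", "after_B"],  # indistinguishable to player 1
}
actions = {
    "root": {"L", "R"},     # hypothetical action labels
    "after_A": {"l", "r"},  # same actions at both nodes of the set,
    "after_B": {"l", "r"},  # as Definition 5.1.1 requires
}

def well_formed(info_sets, actions):
    """Check that all nodes in each information set share one action set."""
    return all(
        len({frozenset(actions[h]) for h in nodes}) == 1
        for nodes in info_sets.values()
    )
```

The `well_formed` check is exactly the condition that lets us write χ(Ii,j) unambiguously.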

5.2 STRATEGIES AND EQUILIBRIA

A pure strategy for an agent in an imperfect-information game selects one of the available actions in each information set of that agent.

Definition 5.2.1 (Pure strategies). Let G = (N, A, H, Z, χ, ρ, σ, u, I) be an imperfect-information extensive-form game. Then the pure strategies of player i consist of the Cartesian product ∏_{Ii,j ∈ Ii} χ(Ii,j).
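The Cartesian product in Definition 5.2.1 can be enumerated directly. The action labels below are the hypothetical ones used for Figure 5.1, where player 1 has two information sets.

```python
from itertools import product

# Pure strategies of a player: one action chosen per information set,
# i.e., the Cartesian product of the information sets' action sets.
action_sets = [["L", "R"], ["l", "r"]]  # hypothetical, per Figure 5.1
pure_strategies = list(product(*action_sets))
# Four pure strategies: (L, l), (L, r), (R, l), (R, r)
```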

Thus perfect-information games can be thought of as a special case of imperfect-information games, in which every equivalence class of each partition is a singleton.

Consider again the Prisoner’s Dilemma game, shown as a normal-form game in Figure 1.1. An equivalent imperfect-information game in extensive form is given in Figure 5.2.

Note that we could have chosen to make player 2 choose first and player 1 choose second.

FIGURE 5.2: The Prisoner’s Dilemma game in extensive form.

Recall that perfect-information games were not expressive enough to capture the Prisoner’s Dilemma game and many other ones. In contrast, as is obvious from this example, any normal-form game can be trivially transformed into an equivalent imperfect-information game. However, this example is also special in that the Prisoner’s Dilemma is a game with a dominant strategy solution, and thus in particular a pure-strategy Nash equilibrium. This is not true in general for imperfect-information games. To be precise about the equivalence between a normal-form game and its extensive-form image we must consider mixed strategies, and this is where we encounter a new subtlety.

As we did for perfect-information games, we can define the normal-form game corresponding to any given imperfect-information game; this normal-form game is again defined by enumerating the pure strategies of each agent. Now, we define the set of mixed strategies of an imperfect-information game as simply the set of mixed strategies in its image normal-form game; in the same way, we can also define the set of Nash equilibria.1 However, we can also define the set of behavioral strategies in the extensive-form game. These are the strategies in which each agent’s (potentially probabilistic) choice at each node is made independently of his choices at other nodes. The difference is substantive, and we illustrate it in the special case of perfect-information games. For example, consider the game of Figure 4.2. A strategy for player 1 that selects A with probability .5 and G with probability .3 is a behavioral strategy. In contrast, the mixed strategy (.6(A, G), .4(B, H)) is not a behavioral strategy for that player, since the choices made by him at the two nodes are not independent (in fact, they are perfectly correlated).
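The correlation point can be seen by comparing marginal and joint probabilities. A sketch, using the mixed strategy (.6(A, G), .4(B, H)) from the example:

```python
# The mixed strategy as a distribution over complete pure strategies
# (one action per node, chosen jointly).
mixed = {("A", "G"): 0.6, ("B", "H"): 0.4}

# Marginal probability of each action at its own node.
p_A = sum(p for (first, _), p in mixed.items() if first == "A")   # 0.6
p_G = sum(p for (_, second), p in mixed.items() if second == "G")  # 0.6

# A behavioral strategy randomizes independently at each node, so it would
# assign probability p_A * p_G = 0.36 to the joint choice (A, G), not the
# 0.6 the mixed strategy specifies: the two choices are correlated.
```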

In general, the expressive power of behavioral strategies and the expressive power of mixed strategies are noncomparable; in some games there are outcomes that are achieved via mixed strategies but not any behavioral strategies, and in some games it is the other way around.

FIGURE 5.3: A game with imperfect recall.

Consider for example the game in Figure 5.3. In this game, when considering mixed strategies (but not behavioral strategies), R is a strictly dominant strategy for agent 1, D is agent 2’s strict best response, and thus (R, D) is the unique Nash equilibrium. Note in particular that in a mixed strategy, agent 1 decides probabilistically whether to play L or R in his information set, but once he decides he plays that pure strategy consistently. Thus the payoff of 100 is irrelevant in the context of mixed strategies. On the other hand, with behavioral strategies agent 1 gets to randomize afresh each time he finds himself in the information set. Noting that the pure strategy D is weakly dominant for agent 2 (and in fact is the unique best response to all strategies of agent 1 other than the pure strategy L), agent 1 computes the best response to D as follows. If he uses the behavioral strategy (p, 1 − p) (i.e., choosing L with probability p each time he finds himself in the information set), his expected payoff is

1 · p² + 100 · p(1 − p) + 2 · (1 − p).

The expression simplifies to −99p2 + 98p + 2, whose maximum is obtained at p = 98/198. Thus (R, D) = ((0, 1), (0, 1)) is no longer an equilibrium in behavioral strategies, and instead we get the equilibrium ((98/198, 100/198), (0, 1)).

There is, however, a broad class of imperfect-information games in which the expressive power of mixed and behavioral strategies coincides. This is the class of games of perfect recall. Intuitively speaking, in these games no player forgets any information he knew about moves made so far; in particular, he remembers precisely all his own moves. A formal definition follows.

Definition 5.2.2 (Perfect recall). Player i has perfect recall in an imperfect-information game G if for any two nodes h, h′ that are in the same information set for player i, for any path h0, a0, h1, a1, h2, …, hn, an, h from the root of the game to h (where the hj are decision nodes and the aj are actions) and for any path h0, a′0, h′1, a′1, h′2, …, h′m, a′m, h′ from the root to h′ it must be the case that:

1. n = m;

2. for all 0 ≤ j ≤ n, hj and h′j are in the same equivalence class for player i; and

3. for all 0 ≤ j ≤ n, if ρ(hj) = i (i.e., hj is a decision node of player i), then aj = a′j.

G is a game of perfect recall if every player has perfect recall in it.

Clearly, every perfect-information game is a game of perfect recall.

Theorem 5.2.3 (Kuhn, 1953). In a game of perfect recall, any mixed strategy of a given agent can be replaced by an equivalent behavioral strategy, and any behavioral strategy can be replaced by an equivalent mixed strategy. Here two strategies are equivalent in the sense that they induce the same probabilities on outcomes, for any fixed strategy profile (mixed or behavioral) of the remaining agents.

As a corollary we can conclude that the set of Nash equilibria does not change if we restrict ourselves to behavioral strategies. This is true only in games of perfect recall, and thus, for example, in perfect-information games. We stress again, however, that in general imperfect-information games, mixed and behavioral strategies yield noncomparable sets of equilibria.
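The mixed-to-behavioral direction of Theorem 5.2.3 amounts to conditioning: the behavioral probability of an action at an information set is the probability mass of the pure strategies that both reach the set and choose that action, divided by the mass of those that reach it. A sketch using the earlier mixed strategy (.6(A, G), .4(B, H)); the assumption that the second information set is reachable only after the first choice A is illustrative, not taken from the book's figure.

```python
from fractions import Fraction

# Mixed strategy over complete pure strategies (one action per info set).
mixed = {("A", "G"): Fraction(3, 5), ("B", "H"): Fraction(2, 5)}

def behavioral_prob(action, index, reaches=lambda s: True):
    """P(action at an information set | pure strategies that reach it)."""
    reach_mass = sum(p for s, p in mixed.items() if reaches(s))
    act_mass = sum(p for s, p in mixed.items()
                   if reaches(s) and s[index] == action)
    return act_mass / reach_mass

b_A = behavioral_prob("A", 0)  # marginal at the root: 3/5
# Assumed: the second set is reached only when the first choice is A.
b_G = behavioral_prob("G", 1, reaches=lambda s: s[0] == "A")  # 1
```

Conditioning on reachability is where perfect recall matters: it guarantees that all pure strategies reaching an information set agree on the player's own earlier moves, so the conditional is well defined.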

5.3 SEQUENTIAL EQUILIBRIUM

We have already seen that the Nash equilibrium concept is too weak for perfect-information games, and how the more selective notion of subgame-perfect equilibrium can be more instructive. The question is whether this essential idea can be applied to the broader class of imperfect-information games; it turns out that it can, although the details are considerably more involved.

Recall that in a subgame-perfect equilibrium we require that the strategy of each agent be a best response in every subgame, not only the whole game. It is immediately apparent that the definition does not apply in imperfect-information games, if for no other reason than we no longer have a well-defined notion of a subgame. What we have instead at each information set is a “subforest” or a collection of subgames. We could require that each player’s strategy be a best response in each subgame in each forest, but that would be both too strong a requirement and too weak. To see why it is too strong, consider the game in Figure 5.4.

FIGURE 5.4: Player 2 knows where in the information set he is.

The pure strategies of player 1 are {L, C, R} and of player 2 {U, D}. Note also that the two pure Nash equilibria are (L, U) and (R, D). But should either of these be considered “subgame perfect?” On the face of it the answer is ambiguous, since in one subtree U (dramatically) dominates D and in the other D dominates U. However, consider the following argument. R dominates C for player 1, and player 2 knows this. So although player 2 does not have explicit information about which of the two nodes he is in within his information set, he can deduce that he is in the rightmost one based on player 1’s incentives, and hence will go D. Furthermore player 1 knows that player 2 can deduce this, and therefore player 1 should go R. Thus, (R, D) is the only subgame-perfect equilibrium.
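The deduction can be traced step by step in code. The payoffs below are hypothetical numbers chosen only to match the structure described in the text (R strictly dominates C for player 1; U dominates in the C-subtree; D dominates in the R-subtree); they are not taken from Figure 5.4.

```python
# (player 1 payoff, player 2 payoff); illustrative values only.
u = {
    ("L", None): (2, 0),                       # L ends the game
    ("C", "U"): (0, 100), ("C", "D"): (1, 0),  # U dominates for player 2
    ("R", "U"): (2, 0),   ("R", "D"): (3, 1),  # D dominates for player 2
}

# Step 1: R strictly dominates C for player 1.
r_dominates_c = all(u[("R", a)][0] > u[("C", a)][0] for a in ("U", "D"))

# Step 2: knowing this, player 2 concludes he is at the R-node and plays
# his best action there.
best_for_2 = max(("U", "D"), key=lambda a: u[("R", a)][1])

# Step 3: player 1, anticipating D, picks his best option against it.
best_for_1 = max(("L", "C", "R"),
                 key=lambda s: u[(s, None if s == "L" else "D")][0])
```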

This example shows how a requirement that a sub-strategy be a best response in all subgames is too simplistic. However, in general it is not the case that subtrees of an information set can be pruned as in the previous example so that all remaining ones agree on the best strategy for the player. In this case the naive application of the SPE intuition would rule out all strategies.

There have been several related proposals that apply the intuition underlying subgame-perfection in more sophisticated ways. One of the more influential notions has been that of sequential equilibrium (SE). It shares some features with the notion of trembling-hand perfection, discussed in Section 3.6. Note that indeed trembling-hand perfection, which was defined for normal-form games, applies here just as well; just think of the normal form induced by the extensive-form game. However, this notion makes no reference to the tree structure of the game. SE does, but at the expense of additional complexity.

Sequential equilibrium is defined for games of perfect recall. As we have seen, in such games we can restrict our attention to behavioral strategies. Consider for the moment a fully mixed-strategy profile.2 Such a strategy profile induces a positive probability on every node in the game tree. This means in particular that every information set is given a positive probability. Therefore, for a given fully mixed-strategy profile, one can meaningfully speak of i’s expected utility, given that he finds himself in any particular information set. (The expected utility of starting at any node is well defined, and since each node is given positive probability, one can apply Bayes’ rule to aggregate the expected utilities of the different nodes in the information set.) If the fully mixed-strategy profile constitutes an equilibrium, it must be that each agent’s strategy maximizes his expected utility in each of his information sets, holding the strategies of the other agents fixed.

All of the preceding discussion is for a fully mixed-strategy profile. The problem is that equilibria are rarely fully mixed, and strategy profiles that are not fully mixed do not induce a positive probability on every information set. The expected utility of starting in information sets whose probability is zero under the given strategy profile is simply not well defined. This is where the ingenious device of SE comes in. Given any strategy profile S (not necessarily fully mixed), imagine a probability distribution µ(h) over the nodes of each information set h. µ has to be consistent with S, in the sense that for information sets reached with positive probability under S, this distribution is precisely the one defined by Bayes’ rule. However, for other information sets, it can be any distribution. Intuitively, one can think of these distributions as the new beliefs of the agents, if they are surprised and find themselves in a situation they thought would not occur. This means that agents’ expected utility is now well defined in any information set, including those having measure zero. For information set h belonging to agent i, with the associated probability distribution µ(h), the expected utility under strategy profile S is denoted by ui(S | h, µ(h)).
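For an information set reached with positive probability, the Bayes' rule step is just normalization of the nodes' reach probabilities. A sketch with hypothetical numbers:

```python
# Hypothetical reach probabilities of the two nodes of one information set
# under a strategy profile S (each node's probability is the product of the
# action probabilities along the path to it).
reach = {"node_after_A": 0.2 * 0.7, "node_after_B": 0.2 * 0.3}

# Bayes' rule: condition on the event that the set is reached at all.
total = sum(reach.values())
mu = {node: p / total for node, p in reach.items()}
# mu assigns 0.7 and 0.3 to the two nodes, summing to 1.
```

When `total` is zero, this division is undefined; that is exactly the case in which SE allows µ(h) to be an arbitrary distribution.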

With this, the precise definition of SE is as follows.

Definition 5.3.1 (Sequential equilibrium). A strategy profile S is a sequential equilibrium of an extensive-form game G if there exist probability distributions µ(h) for each information set h in G, such that the following two conditions hold:

1. (S, µ) = limn→∞ (Sn, µn) for some sequence (S1, µ1), (S2, µ2), …, where Sn is fully mixed, and µn is consistent with Sn (in fact, since Sn is fully mixed, µn is uniquely determined by Sn); and

2. For any information set h belonging to agent i, and any alternative strategy S′i of i, we have that

ui(S | h, µ(h)) ≥ ui((S′i, S−i) | h, µ(h)).

Analogously to subgame-perfect equilibria in games of perfect information, sequential equilibria are guaranteed to exist.

Theorem 5.3.2. Every finite game of perfect recall has a sequential equilibrium.

Finally, while sequential equilibria are defined for games of imperfect information, they are obviously also well defined for the special case of games of perfect information. This raises the question of whether, in the context of games of perfect information, the two solution concepts coincide. The answer is that they almost do, but not quite.

Theorem 5.3.3. Every subgame-perfect equilibrium is a sequential equilibrium, but the converse is not true in general.


1Note that we have defined two transformations—one from any normal-form game to an imperfect-information game, and one in the other direction. However, the first transformation is not one-to-one, and so if we transform a normal-form game to an extensive-form one and then back to normal form, we will not in general get back the same game we started out with. However, we will get a game with identical strategy spaces and equilibria.

2Again, recall that a strategy is fully mixed if, at every information set, each action is given some positive probability.
