Recall that in the last lecture we looked at the following example of a stage game.

A B
A \((3,3)\) \((0,5)\)
B \((5,0)\) \((1,1)\)

We have shown that \((s_G,s_G)\) is a Nash equilibrium for the restricted pure strategy set \(\{s_A,s_B,s_G\}\), when \(\delta>\frac{1}{2}\).

But is it a subgame perfect Nash equilibrium?

The next two calculations, being rather wordy, will be carried out on the slides rather than by hand.

At any point in the game, the future stages of the game (ie a subgame) are formally equivalent to the entire game.

However, how we arrive at these various subgames can vary.

These possible subgames can be divided into 4 classes:

  • Neither players has played \(B\);
  • Both players have played \(B\);
  • Player 1 used \(A\) in the last stage but player 2 did not;
  • Player 2 used \(A\) in the last stage but player 1 did not.

In case 1 neither player’s opponent has played \(B\) so the strategy \(s_G\) specifies that cooperation should continue until defection occurs. So the strategy pair that is specified in case 1 is \((s_G,s_G)\) which is a Nash equilibrium as it was one for the entire game.

In case 2 both players have already defected so the Nash equilibrium strategy \((s_G,s_G)\) specifies that both players should defect forever. That is, the strategy adopted for the subgame is \((s_B,s_B)\), which is a Nash equilibrium since it was one for the entire game.

In case 3 \(s_G\) specifies that player 1 should switch to \(B\) forever as her opponent has just player \(B\). But player 1 has not yet played \(B\) so player 2 should still use \(s_G\).

Thus the strategy implies that for the subgame the players should play \((s_B,s_G)\). But this is not a Nash equilibrium for the subgame as player 2 could get a better payoff from the pair \((s_B,s_B)\). (Case 4 is similar.) So \((s_G,s_G)\) is not a subgame perfect Nash equilibrium.

Although \((s_G,s_G)\) is not a subgame perfect Nash equilibrium, a very similar strategy does give a subgame perfect Nash equilibrium.

Let \(s_g\) be the strategy “Start by playing \(A\) and continue to cooperate until either player defects to \(B\), then defect forever”.

The pair \((s_g,s_g)\) is a Nash equilibrium for the same reason that \((s_G,s_G)\) was, and is subgame perfect because in cases 3 and 4 above it now specifies that the players should play \((s_B,s_B)\) in the subgame, which is a Nash equilibrium.

So far we have restricted our repeated games to a limited set of strategies when showing that we have a Nash equilibrium. Is there a way to deal with more general families of strategies?

Definition 5.14

A pair of strategies \((s,t)\) for a repeated stage game satisfies the one stage deviation condition if neither player can increase their payoff by deviating (unilaterally) from their strategy in any single stage and returning to the specified strategy thereafter.

Theorem 5.15: (The {one stage deviation principle})

A pair of strategies is a subgame perfect Nash equilibrium for a discounted repeated game if and only if it satisfies the one stage deviation condition.

Example 5.16

Show that the pair \((s_g,s_g)\) satisfies the one stage deviation condition.

At any given stage the game will be in one of two classes of subgame:

  • Both players have always cooperated on \(A\).
  • At least one player has defected to \(B\) in a previous round.

Case 1: If both players have always cooperated then \(s_g\) specifies cooperation at this stage.

If either player changes to play \(B\) then \(s_g\) specifies using \(B\) thereafter. So the expected future payoff for the player making the change is \[5+\frac{\delta}{1-\delta}\] which as we saw in Example 5.13 is no better than playing \(s_g\) if \(\delta\geq 1/2\).

Case 2: If either player plays \(B\) in the past then \(s_g\) specifies play \(B\) at this stage and thereafter.

If either player changes to \(A\) at this stage, \(s_g\) still specifies playing \(B\) thereafter. So the expected future payoff for the player making the change is \[0+\frac{\delta}{1-\delta}\] which is less than \[1+\frac{\delta}{1-\delta}.\]

In both cases a one stage deviation is pointless, provided that \(\delta\geq 1/2\).

So far we have only considered one example of a repeated stage game, based on a variant of the Prisoners’ Dilemma.

We have seen that with finitely many repeats there was no incentive to cooperate as the Nash equilibrium was just made up of repeated copies of the unique Nash equilbrium for the stage game.

To see how we might introduce cooperation, we then considered an infinite repeated version of the game, where we had to use a discounted version to calculate meaningful payoffs. In this version we could induce cooperation as a sensible strategy.

It may have seemed somewhat unsatisfactory to have to resort to an infinite repeated game. We will now see that for certain games we can encourage cooperation even in a finite game.

Rather than considering a repeated game, we will consider a two stage game, where stage one is given by the game

M F
M \((4,4)\) \((-1,5)\)
F \((5,-1)\) \((1,1)\)

and stage two is given by the game

L G
L \((0,0)\) \((-4,-1)\)
G \((-1,-4)\) \((-3,-3)\)

We will also assume that there is a discount factor of \(\delta\) (although this is not required).

The first game is a version of the Prisoners’ Dilemma, and has a unique Nash equilibrium \((F,F)\) with payoff \((1,1)\).

The second game has three Nash equilbria: \((L,L)\) with payoff \((0,0)\), \((G,G)\) with payoff \((-3,-3)\), and a mixed Nash equilibria \((\underline{x},\underline{x})\) where \(\underline{x}=(0.5,0.5)\).

We would like to use the second game as a threat (or reward) to encourage the players to cooperate in the first game and get the Pareto-optimal first game payoff of \((4,4)\).

Unlike for the iterated Prisoner’s dilemma, we now have multiple Nash equilibria in the final game, with different payoffs, and we will use this fact to help us develop cooperation.

Our goal is to find a subgame perfect Nash equilibrium for this two stage game where at the first stage both players play \(M\).

We know that in the final stage of the game the strategy must reduce to a Nash equilibrium for the second game (as otherwise we do not have a subgame perfect strategy).

We want to use the “good” equilibrium \((L,L)\) as an incentive for the players to cooperate in stage 1, with the “bad” equilbrium \((G,G)\) as a punishment.

Let \(s\) be the strategy where a player plays \(M\) in the first stage, and in the second stage plays \(L\) if \((M,M)\) was played in stage 1 but plays \(G\) otherwise.

Example 5.17

For the two stage discounted game described above, show that \((s,s)\) is a subgame perfect Nash equilibrium, provided that \(\delta>\frac{1}{3}\).

Example completed by hand in the lecture

In general we can encourage cooperation in multistage games in a similar way, provided

  • There are at least two Nash equilbria in the final stage which can be used as a punishment and reward.
  • The discount factor is large enough that the future punishment or reward outweighs the usual defection strategy at earlier stages.

Unfortunately, there are other subgame perfect Nash equilibria for this two stage game which are not so desirable.

For example, we can find an equilibrium where player 1 plays \(F\) and player 2 plays \(M\)! (See Exercise Sheet 9, Question 6 for details.)