New Dilemmas for the Prisoner
Dictators and Extortionists
One mischievous strategy might be called the dictator: It unilaterally sets the other player’s long-term average score to any value between the mutual-defection payment and the mutual-cooperation payment. (For the standard payoff values 0, 1, 3, 5, that means anywhere between 1 and 3.)
Consider the strategy (4/5, 2/5, 2/5, 1/5), where the four numbers again indicate the probability of choosing to cooperate after cc, cd, dc, dd, respectively. If Y plays this strategy against X, then X’s average score per round will converge on the value 2.0 after a sufficiently long series of games, no matter what strategy X chooses to play. In the lower illustration on the previous page X responds to Y’s coercion with four different strategies, but in each case X’s average score gravitates ineluctably toward 2.0. It should be emphasized that dictating X’s score does not require Y to make any active adjustments or responses as the game proceeds. Y can set the four probabilities and then “go to lunch,” as Press and Dyson put it.
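The convergence claim can be checked numerically. The sketch below is only an illustration under stated assumptions: it tracks the probability of each joint outcome round by round, rather than playing random games, and averages X’s expected payoff. The function name `average_score_x`, the choice of four opponents, and the assumption that both players cooperate in the first round are all mine, not part of Press and Dyson’s presentation.

```python
# Payoff to X in each joint state (X's move, Y's move): cc, cd, dc, dd.
PAYOFF_X = (3, 0, 5, 1)

def average_score_x(p, q, rounds=20000):
    """Expected average payoff to X over `rounds` rounds.

    p: X's cooperation probabilities after cc, cd, dc, dd (X's own move first).
    q: Y's cooperation probabilities, listed from Y's point of view.
    Assumption: both players cooperate in the first round.
    """
    # Y's "cd" (Y cooperated, X defected) is the joint state dc, and vice versa.
    q_joint = (q[0], q[2], q[1], q[3])
    dist = [1.0, 0.0, 0.0, 0.0]  # probability of each joint state this round
    total = 0.0
    for _ in range(rounds):
        total += sum(d * s for d, s in zip(dist, PAYOFF_X))
        nxt = [0.0] * 4
        for state, weight in enumerate(dist):
            px, qy = p[state], q_joint[state]
            nxt[0] += weight * px * qy              # both cooperate
            nxt[1] += weight * px * (1 - qy)        # X cooperates, Y defects
            nxt[2] += weight * (1 - px) * qy        # X defects, Y cooperates
            nxt[3] += weight * (1 - px) * (1 - qy)  # both defect
        dist = nxt
    return total / rounds

DICTATOR = (4/5, 2/5, 2/5, 1/5)

# Four illustrative opponents (an assumption; not necessarily those in the figure).
OPPONENTS = {
    "always cooperate": (1, 1, 1, 1),
    "always defect": (0, 0, 0, 0),
    "tit-for-tat": (1, 0, 1, 0),
    "Pavlov": (1, 0, 0, 1),
}

for name, p in OPPONENTS.items():
    print(name, round(average_score_x(p, DICTATOR), 3))
```

Whatever X does, the printed averages should sit very close to 2; the small residue comes from the opening rounds and shrinks as `rounds` grows.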
A second form of mischief manipulates the ratio between the two players’ gains over the mutual-defection payoff. If S_X and S_Y are the players’ long-term average scores, the strategy allows Y to enforce the linear relation S_Y = 1 + M(S_X - 1), where M is an arbitrary constant greater than 1. X has the option of playing an always-defect strategy, which consigns both players to the minimal payoff of one point per round. But if X takes any steps to improve this return, every increment to S_X will increase S_Y by M times as much. Press and Dyson call the technique extortion. As an example they cite the strategy (11/13, 1/2, 7/26, 0), which sets M = 3. If Y adopts this rule, X can play always-defect (or tit-for-tat) to limit both players to one point per round. When X chooses other strategies, however, Y comes out ahead. If X plays Pavlov, the scores are approximately S_X = 1.45 and S_Y = 2.36. To maximize his or her score, X must cooperate unconditionally, earning an average of 1.91 points, but then Y gets 3.73 points.
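The extortion figures can be reproduced exactly by solving for the stationary distribution of the four-state Markov chain that the two strategies define. The following is a minimal sketch, assuming that chain has a single stationary distribution (true for the opponents tried here); the helper name `long_run_scores` and the Gauss-Jordan solver are my own scaffolding, not Press and Dyson’s determinant method.

```python
from fractions import Fraction as F

# Payoffs in each joint state (X's move, Y's move): cc, cd, dc, dd.
PAYOFF_X = (3, 0, 5, 1)
PAYOFF_Y = (3, 5, 0, 1)

def long_run_scores(p, q):
    """Exact long-term average scores for X and Y, as a pair of Fractions.

    p: X's cooperation probabilities after cc, cd, dc, dd (X's own move first).
    q: Y's cooperation probabilities, listed from Y's point of view.
    Assumes the chain has exactly one stationary distribution.
    """
    q_joint = (q[0], q[2], q[1], q[3])  # re-index Y's rule by joint state
    P = []
    for px, qy in zip(p, q_joint):
        px, qy = F(px), F(qy)
        P.append([px * qy, px * (1 - qy), (1 - px) * qy, (1 - px) * (1 - qy)])
    # Unknowns v[0..3]: three balance equations from v = vP, plus sum(v) = 1.
    A = [[P[i][j] - (1 if i == j else 0) for i in range(4)] + [F(0)]
         for j in range(3)]
    A.append([F(1)] * 5)
    for col in range(4):  # Gauss-Jordan elimination in exact arithmetic
        pivot = next(r for r in range(col, 4) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        lead = A[col][col]
        A[col] = [a / lead for a in A[col]]
        for r in range(4):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    v = [row[4] for row in A]
    s_x = sum(w * s for w, s in zip(v, PAYOFF_X))
    s_y = sum(w * s for w, s in zip(v, PAYOFF_Y))
    return s_x, s_y

EXTORTION = (F(11, 13), F(1, 2), F(7, 26), F(0))
M = 3

OPPONENTS = {
    "always defect": (0, 0, 0, 0),
    "tit-for-tat": (1, 0, 1, 0),
    "Pavlov": (1, 0, 0, 1),
    "always cooperate": (1, 1, 1, 1),
}

for name, p in OPPONENTS.items():
    s_x, s_y = long_run_scores(p, EXTORTION)
    # Y's surplus over 1 should be exactly M times X's surplus over 1.
    print(name, float(s_x), float(s_y), s_y - 1 == M * (s_x - 1))
```

Against unconditional cooperation the exact scores are 21/11 for X and 41/11 for Y; against Pavlov they are 16/11 and 26/11. In every case Y’s gain over one point is exactly three times X’s.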
The discovery of dictatorial and extortionate strategies came as a great surprise, and yet there were precedents. Aspects of the discovery were anticipated in the 1990s by Maarten C. Boerlijst, Martin A. Nowak, and Karl Sigmund. Moreover, not all of the zero-determinant strategies are exotic ideas that no one ever thought of trying. Tit-for-tat, the most famous of all IPD rules, is in fact a zero-determinant strategy: It is the limiting case of extortion with M = 1, which forces equality of scores.
Watching the coercive strategies in action (or playing against them), I can’t help feeling there is something uncanny going on. In a game whose structure is fully symmetrical, how can one contestant wield such power over the other? In the case of the dictatorial strategies, the symmetry isn’t so much broken as transformed: When I take control of your score, I lose control of my own; although there’s nothing you can do to alter your own score, you have the power to set mine.
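The first half of that trade can be seen in the simplest possible setting. When X makes the same move every round, Y’s behavior reduces to a two-state chain (Y cooperated last round, or Y defected), and the long-run scores have a closed form. The sketch below assumes the standard payoffs 0, 1, 3, 5; the function name `scores_vs_unconditional` is hypothetical, and the formula breaks down if Y’s rule can lock into either move forever.

```python
from fractions import Fraction as F

def scores_vs_unconditional(q, x_cooperates):
    """Long-run scores for X and Y when X makes the same move every round.

    q: Y's cooperation probabilities after cc, cd, dc, dd (Y's own move first).
    """
    if x_cooperates:
        stay, enter = q[0], q[2]  # Y only ever sees cc or dc
    else:
        stay, enter = q[1], q[3]  # Y only ever sees cd or dd
    # Stationary probability that Y cooperates in the two-state chain.
    y_coop = enter / (1 - stay + enter)
    if x_cooperates:
        return 3 * y_coop, 3 * y_coop + 5 * (1 - y_coop)
    return 5 * y_coop + (1 - y_coop), 1 - y_coop

DICTATOR = (F(4, 5), F(2, 5), F(2, 5), F(1, 5))

print(scores_vs_unconditional(DICTATOR, x_cooperates=True))
print(scores_vs_unconditional(DICTATOR, x_cooperates=False))
```

X’s score is 2 in both cases, but Y’s own score swings from 11/3 when X always cooperates down to 3/4 when X always defects. Y has fixed X’s payoff at the cost of leaving its own in X’s hands.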
The extortionate strategies can’t be explained away so easily. There really is an asymmetry, with one player grabbing an unfair share of the spoils, and the only defense is to retreat to the policy of universal defection that leaves everyone impoverished. IPD seems to be back in the same dreary jail cell where it all began.