Nash equilibrium

Situations in which a game has an equilibrium in dominant strategies are quite rare, and not every game can be solved by discarding strictly dominated strategies. An example of such a game is presented in Table 16.8.

The second player will choose strategy A if he expects the first to choose strategy Z, while strategy B is preferable for him if the first chooses Y.

Table 16.8.

It is natural to assume that in the absence of dominant strategies for all players, each player's choice depends on expectations of what the others' choices will be. Next we will look at a solution concept based on this idea.

16.2.4 Nash equilibrium

In addition to the situations discussed in the previous section, there are situations14 that are natural to model based on the following assumptions:

When making decisions, players are guided by the expected actions of their partners;

expectations are equilibrium (coincide with the actions actually chosen by the partners).

If we assume that all players are rational, so that each chooses the strategy that gives him the greatest payoff given his expectations, then these assumptions lead to a solution concept called Nash equilibrium. In equilibrium, no player has a reason to revise his expectations.

Formally, the Nash equilibrium is defined as follows.

Definition 90:

A set of strategies $x \in X$ is a Nash equilibrium15 if

1) the strategy $x_i$ of each player is his best response to the expected strategies of the other players $x^e_{-i}$:

$$u_i(x_i, x^e_{-i}) = \max_{x_i' \in X_i} u_i(x_i', x^e_{-i}), \quad i = 1, \ldots, n;$$

14 One can imagine a population of type A players (say, a cat) and type B players (say, a mouse). A player of type A, when meeting a player of type B, has expectations regarding the behavior of a partner of type B, justified by his own or someone else’s experience, and is guided by them in advance (and vice versa). However, this is not the only type of situation in which the approach under consideration is adequate.

15 American mathematician John Nash received the Nobel Prize in Economics in 1994 together with J. Harsanyi and R. Selten "for their pioneering analysis of equilibria in the theory of non-cooperative games." The equilibrium concept was proposed in the following articles: J. F. Nash: Equilibrium Points in N-Person Games, Proceedings of the National Academy of Sciences of the United States of America 36 (1950): 48–49; J. F. Nash: Non-Cooperative Games, Annals of Mathematics 54 (1951): 286–295 (Russian translation: J. Nash: Non-Cooperative Games, in: Matrix Games, N. N. Vorobyov (ed.), M.: Fizmatgiz, 1961: 205–221).

It should be noted that Nash himself did not introduce expectations into the definition. Nash's original definition coincides with the property discussed below.

2) expectations coincide with the strategies actually chosen:

$$x^e_{-i} = x_{-i}, \quad i = 1, \ldots, n.$$

Note that when the Nash equilibrium is used to model game situations, questions about whether the players know the goals of their partners, whether they know about the rationality of their partners, whether they can carry out the necessary calculations, and so on, fade into the background. The way in which expectations are formed is left outside the scope of the analysis; all that matters here is that the expectations are in equilibrium.

But if the analysis of Nash equilibrium does not care whether a player knows the goals of the other players, one may question the validity of considering the Nash concept in the context of games with complete information. The point is that the term "complete information" has a rather narrow meaning in game theory: it only implies completeness of information about the types of the partners (the term "player type" is explained in the section on Bayesian games).

As is easy to see, the above definition of the Nash equilibrium is equivalent to the following property, which is usually used as a definition:

A set of strategies $x \in X$ is a Nash equilibrium if the strategy $x_i$ of each player is his best response to the strategies of the other players $x_{-i}$:

$$u_i(x_i, x_{-i}) = \max_{x_i' \in X_i} u_i(x_i', x_{-i}), \quad i = 1, \ldots, n.$$

This property can also be written in terms of so-called response functions (maps).

Definition 91:

The response mapping of the i-th player,

$$R_i : X_{-i} \mapsto X_i,$$

assigns to each set of strategies of the other players, $x_{-i} \in X_{-i}$, the set of strategies of the i-th player each of which is a best response to $x_{-i}$. In other words,

$$u_i(y_i, x_{-i}) = \max_{x_i \in X_i} u_i(x_i, x_{-i}) \quad \forall x_{-i} \in X_{-i}, \; \forall y_i \in R_i(x_{-i}).$$

The introduction of response mappings allows us to write the definition of a Nash equilibrium more compactly: a set of strategies $x \in X$ is a Nash equilibrium if

$$x_i \in R_i(x_{-i}), \quad i = 1, \ldots, n.$$

If each player's response mapping is single-valued (i.e., is a function), then the set of Nash equilibria coincides with the set of solutions of the system of equations:

$$x_i = R_i(x_{-i}), \quad i = 1, \ldots, n.$$

In Table 16.8, the players' response mappings are depicted by underlining the payoffs corresponding to optimal actions. The Nash equilibrium in this game is the cell (B, Y), since both players' payoffs in it are underlined.
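This best-response check is easy to carry out mechanically. The sketch below (in Python; the payoff numbers are hypothetical, since the payoffs of Table 16.8 are not reproduced here) marks a cell as a Nash equilibrium when each player's payoff in it is maximal given the other player's choice:

```python
import numpy as np

# Hypothetical 2x2 bimatrix game: rows are player 1's strategies,
# columns are player 2's strategies; u1 and u2 are the players' payoffs.
u1 = np.array([[1.0, 3.0],
               [2.0, 0.0]])
u2 = np.array([[0.0, 2.0],
               [3.0, 1.0]])

def pure_nash_equilibria(u1, u2):
    """Return all cells (i, j) in which both payoffs are best responses."""
    equilibria = []
    for i in range(u1.shape[0]):
        for j in range(u1.shape[1]):
            best_for_1 = u1[i, j] >= u1[:, j].max()  # player 1 cannot gain by changing the row
            best_for_2 = u2[i, j] >= u2[i, :].max()  # player 2 cannot gain by changing the column
            if best_for_1 and best_for_2:
                equilibria.append((i, j))
    return equilibria

print(pure_nash_equilibria(u1, u2))  # -> [(0, 1), (1, 0)] for these hypothetical payoffs
```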

Let's illustrate the use of response functions using the example of a game in which players have a continuum of strategies.

Game 5. “International Trade”

Two countries simultaneously choose the level of customs duties, τi. The volume of trade between countries16, x, depends on the established duties as

x = 1 − τ1 − τ2

The goal of each country is to maximize income ui = τi x.

We maximize the winnings of the 1st country,

τ1 (1 − τ1 − τ2 )

according to τ1, considering the duty level set by the 2nd country to be fixed. The first order condition has the form

1 − 2τ1 − τ2 = 0

Since the function being maximized is strictly concave, the first-order condition corresponds to a global maximum.

The first order condition for the problem of maximizing the gain of the 2nd country is found similarly:

1 − τ1 − 2τ2 = 0

Having solved a system of two linear equations, we find the Nash equilibrium:

τ1 = τ2 = 1/3

The optimal response of the 1st country to the level of customs duty established by the 2nd country is described by the function

$$\tau_1(\tau_2) = \frac{1 - \tau_2}{2}.$$

Similarly, the response function of the 2nd country is

$$\tau_2(\tau_1) = \frac{1 - \tau_1}{2}.$$

To find the Nash equilibrium, you need to solve the system of equations

$$\tau_1(\tau_2) = \tau_1, \qquad \tau_2(\tau_1) = \tau_2.$$

The search for the Nash equilibrium is shown graphically in Fig. 16.3. The points lying on the optimal response curves τ1 (τ2) and τ2 (τ1) are characterized by the fact that the tangents to the players’ indifference curves are parallel to the corresponding coordinate axis. Recall that the indifference curve is the set of points at which the utility of the individual in question is the same (ui (x) = const). Equilibrium is found as the point of intersection of the response curves.
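As a rough numerical illustration of this fixed-point logic (a sketch, not part of the original derivation), one can start from arbitrary duty levels and repeatedly apply the best-response functions; the iterations converge to the equilibrium τ1 = τ2 = 1/3 found above:

```python
def best_response(tau_other):
    """Best response in the trade game: maximize tau * (1 - tau - tau_other) over tau."""
    return (1.0 - tau_other) / 2.0

tau1, tau2 = 0.9, 0.1           # arbitrary starting duty levels
for _ in range(50):             # iterate best responses until (approximate) convergence
    tau1, tau2 = best_response(tau2), best_response(tau1)

print(round(tau1, 6), round(tau2, 6))  # both approach 1/3
```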

The advantage of using the Nash equilibrium concept is that it is possible to find a solution in those games in which discarding dominated strategies does not allow this. However, the concept itself may be more controversial because it relies on strong assumptions about player behavior.

The relationship between the introduced solution concepts is described by the following statements:

16 In this game, for simplicity, we do not distinguish between exports and imports.

Fig. 16.3. Nash equilibrium in the game "International Trade"

Theorem 151:

If x = (x1, . . . , xm) is a Nash equilibrium in a certain game, then none of its constituent strategies can be discarded as a result of applying the procedure for sequentially discarding strictly dominated strategies.

The converse theorem is true in the case of uniqueness.

Theorem 152:

If, as a result of sequential rejection of strictly dominated strategies, each player is left with a single strategy, xi, then x = (x1, . . . , xm) is the Nash equilibrium in this game.

Proofs of these two statements are given in Appendix B (p. 641). What matters for us here is that Nash's concept does not contradict the ideas of rationality embodied in the procedure of discarding strictly dominated strategies.

Apparently, it is natural to assume that a reasonably defined equilibrium cannot be discarded by successively discarding strictly dominated strategies. The first of the theorems can be seen as confirmation that Nash's concept is quite reasonable. Note that this result applies only to strict dominance. One can give an example of a Nash equilibrium with one or more weakly dominated strategies (see, for example, Table 16.11 on p. 652).

16.2.5 Nash equilibrium in mixed strategies

It is not difficult to construct examples of games in which there is no Nash equilibrium. The following game provides an example of such a situation.

Game 6. “Inspection”

In this game, the first player (the taxpayer being audited) faces a choice: to pay or not to pay income tax. The second, the tax inspector, decides whether or not to audit this particular taxpayer. If the inspector "catches" an unscrupulous taxpayer, he imposes a fine on him and receives a promotion that more than compensates for his costs; if he audits an honest taxpayer, the inspector receives no reward but still bears the costs associated with the audit. The payoff matrix is presented in Table 16.9.

Table 16.9.

                               Inspector
                         check            don't check
Taxpayer   violate      (−1, 1*)           (1*, 0)
           don't violate (0*, −1)           (0, 0*)

(The first number in each cell is the taxpayer's payoff, the second the inspector's; best responses are marked with an asterisk.)

If the inspector is confident that the taxpayer will choose not to pay the tax, then it is beneficial for the inspector to audit him. On the other hand, if the taxpayer is confident that he will be audited, then he is better off paying the tax. Similarly, if the inspector is confident that the taxpayer will pay the tax, then it is not worthwhile for him to run an audit, and if the taxpayer is confident that he will not be audited, then he will prefer not to pay the tax. Optimal responses are marked in the table. Obviously, no cell can be a Nash equilibrium, since in no cell are both payoffs marked at the same time.

In such a game, each player is interested in ensuring that his partner cannot guess which strategy he chose. This can be achieved by introducing an element of uncertainty into the choice of strategy.

The strategies we considered earlier are usually called pure strategies. In static games pure strategies essentially coincide with the players' actions. But in some games it is natural to introduce mixed strategies as well. A mixed strategy is a probability distribution over pure strategies. In the special case when the set of pure strategies of each player is finite,

$$X_i = \{x_i^1, \ldots, x_i^{n_i}\}$$

(the corresponding game is called finite), the mixed strategy is represented by a vector of probabilities of the corresponding pure strategies:

$$\mu_i = (\mu_i^1, \ldots, \mu_i^{n_i}).$$

Let us denote the set of mixed strategies of the i-th player by Mi:

$$M_i = \{\mu_i : \mu_i^k \geq 0,\ k = 1, \ldots, n_i;\ \mu_i^1 + \cdots + \mu_i^{n_i} = 1\}.$$

As we have already noted, the standard assumption of game theory (as of economic theory) is that if the payoff is a random variable, then players prefer actions that bring them the greatest expected payoff. The expected payoff of the i-th player corresponding to the set of mixed strategies of all players, $(\mu_1, \ldots, \mu_m)$, is calculated by the formula

$$U_i(\mu_1, \ldots, \mu_m) = \sum_{k_1=1}^{n_1} \cdots \sum_{k_m=1}^{n_m} \mu_1^{k_1} \cdots \mu_m^{k_m}\, u_i(x_1^{k_1}, \ldots, x_m^{k_m}).$$

The expectation is calculated under the assumption that players choose strategies independently (in a statistical sense).
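For a finite two-player game this expected payoff is simply a probability-weighted average over the cells of the payoff matrix. A minimal sketch (the payoff matrix and the probabilities below are placeholders, not taken from the text):

```python
import numpy as np

# Player 1's payoff matrix u1[k, l] for pure strategies k (player 1) and l (player 2).
u1 = np.array([[4.0, 0.0],
               [2.0, 3.0]])

mu1 = np.array([0.5, 0.5])    # player 1's mixed strategy
mu2 = np.array([0.25, 0.75])  # player 2's mixed strategy

# Independent randomization: E[u1] = sum over k, l of mu1[k] * mu2[l] * u1[k, l]
expected_u1 = mu1 @ u1 @ mu2
print(expected_u1)  # 1.875 for these placeholder numbers
```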

Mixed strategies can be represented as the result of a player’s randomization of his actions, that is, as a result of their random choice. For example, to choose each of two possible strategies with equal probability, a player might flip a coin.

This interpretation implies that the choice of strategy depends on some signal that the player himself can observe, but his partners cannot17. For example, a player can choose a strategy depending on his mood, if he knows the probability distribution of his moods, or on what foot he got up on that day18.

Definition 92:

A set of mixed strategies $\mu = (\mu_1, \ldots, \mu_m)$ is a Nash equilibrium in mixed strategies if

1) the strategy $\mu_i$ of each player is his best response to the expected strategies of the other players $\mu^e_{-i}$:

$$U_i(\mu_i, \mu^e_{-i}) = \max_{\mu_i' \in M_i} U_i(\mu_i', \mu^e_{-i}), \quad i = 1, \ldots, n;$$

2) expectations coincide with the actually chosen strategies:

$$\mu^e_{-i} = \mu_{-i}, \quad i = 1, \ldots, n.$$

Note that the Nash equilibrium in mixed strategies is the ordinary Nash equilibrium in the so-called mixed extension of the game, i.e., a game whose pure strategies are mixed strategies of the original game.

Let us find the Nash equilibrium in mixed strategies in Game 6 "Inspection".

Let us denote by µ the probability that the taxpayer does not pay the income tax, and by ν the probability that the tax inspector audits the taxpayer. In this notation the taxpayer's expected payoff equals

$$U_1(\mu, \nu) = \mu[\nu \cdot (-1) + (1 - \nu) \cdot 1] + (1 - \mu)[\nu \cdot 0 + (1 - \nu) \cdot 0] = \mu(1 - 2\nu),$$

and the inspector's expected payoff is

$$U_2(\mu, \nu) = \nu[\mu \cdot 1 + (1 - \mu) \cdot (-1)] + (1 - \nu)[\mu \cdot 0 + (1 - \mu) \cdot 0] = \nu(2\mu - 1).$$

If the probability of an audit is small (ν < 1/2), then it is profitable for the taxpayer not to pay the tax, i.e., to choose µ = 1. If the probability of an audit is large (ν > 1/2), then it is profitable for the taxpayer to pay the tax, i.e., to choose µ = 0. If ν = 1/2, the taxpayer is indifferent between paying and not paying and can choose any probability µ from the interval [0, 1]. Thus, the taxpayer's response mapping has the form

$$\mu(\nu) = \begin{cases} 1, & \text{if } \nu < 1/2, \\ [0, 1], & \text{if } \nu = 1/2, \\ 0, & \text{if } \nu > 1/2. \end{cases}$$

Arguing similarly, we find the response mapping of the tax inspector:

$$\nu(\mu) = \begin{cases} 0, & \text{if } \mu < 1/2, \\ [0, 1], & \text{if } \mu = 1/2, \\ 1, & \text{if } \mu > 1/2. \end{cases}$$
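These response mappings can be checked against the definition directly: since each player's expected payoff is linear in his own probability, it suffices to verify that neither player gains by deviating to a pure strategy. A small sketch using the expected-payoff formulas derived above:

```python
# Expected payoffs in the Inspection game, as derived above:
#   U1(mu, nu) = mu * (1 - 2*nu)   (taxpayer)
#   U2(mu, nu) = nu * (2*mu - 1)   (inspector)
def U1(mu, nu):
    return mu * (1 - 2 * nu)

def U2(mu, nu):
    return nu * (2 * mu - 1)

mu_star, nu_star = 0.5, 0.5

# Neither player can gain by deviating to a pure strategy (probability 0 or 1):
assert all(U1(mu_star, nu_star) >= U1(mu, nu_star) for mu in (0.0, 1.0))
assert all(U2(mu_star, nu_star) >= U2(mu_star, nu) for nu in (0.0, 1.0))
print("(mu, nu) = (1/2, 1/2) is a mixed-strategy Nash equilibrium")
```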

17 If the signals observed by players are statistically dependent, then this can help players coordinate their actions. This leads to the concept of correlated equilibrium.

18 Subsequently we will consider how the randomization effect can be achieved within the framework of Bayesian equilibrium.

The response graphs of both players are presented in Fig. 16.4. The axes on this diagram represent probabilities (ν and µ, respectively). They have a single common point (1/2, 1/2). This point corresponds to the Nash equilibrium in mixed strategies. In this equilibrium, as is always the case in equilibria with nondegenerate mixed strategies (that is, in equilibria in which no strategy is chosen with probability 1), each player randomizes strategies that provide him with the same expected utility. The probabilities of using the corresponding pure strategies chosen by the player are determined not by the payoff structure of this player, but by the payoff structure of his partner, which can cause certain difficulties in interpreting this decision.

Fig. 16.4. Response mappings in the game "Inspection"

Unlike equilibrium in pure strategies, equilibrium in mixed strategies in finite games always exists19, which is a consequence of the following general statement.

Theorem 153:

Suppose that in the game $G = \langle I, (X_i)_{i \in I}, (u_i)_{i \in I} \rangle$ the set of strategies $X_i$ of every player is non-empty, compact and convex, and the payoff function $u_i(\cdot)$ is concave in $x_i$ and continuous. Then the game G has a Nash equilibrium (in pure strategies).

The existence of a mixed strategy Nash equilibrium in games with a finite number of pure strategies is a consequence of the fact that a mixed strategy equilibrium is a pure strategy equilibrium in a mixed extension of the game.

Theorem 154 (Corollary (Nash's Theorem)):

A Nash equilibrium in mixed strategies exists in any finite game.

Note that the existence of an equilibrium in a game in pure strategies does not exclude the existence of an equilibrium in non-degenerate mixed strategies.

Consider, in Game 16.2.1 "Choosing a Computer", the case where the benefits of compatibility are significant, i.e. a < c and b < c. In this variant of the game there are two equilibria in pure strategies: (IBM, IBM) and (Mac, Mac). Denote by µ and ν the probabilities that the first and the second player, respectively, choose the IBM PC. The expected payoff of the 1st player equals

$$U_1(\mu, \nu) = \mu[\nu \cdot (a + c) + (1 - \nu) \cdot a] + (1 - \mu)[\nu \cdot 0 + (1 - \nu) \cdot c] = \mu[\nu \cdot 2c - (c - a)] + (1 - \nu)c,$$

and his response mapping has the form

$$\mu(\nu) = \begin{cases} 0, & \text{if } \nu < (c - a)/2c, \\ [0, 1], & \text{if } \nu = (c - a)/2c, \\ 1, & \text{if } \nu > (c - a)/2c. \end{cases}$$

The expected payoff of the 2nd player is

$$U_2(\mu, \nu) = \nu[\mu \cdot c + (1 - \mu) \cdot 0] + (1 - \nu)[\mu \cdot b + (1 - \mu) \cdot (b + c)] = \nu[\mu \cdot 2c - (b + c)] + b + (1 - \mu)c,$$

and his response mapping has the form

$$\nu(\mu) = \begin{cases} 0, & \text{if } \mu < (b + c)/2c, \\ [0, 1], & \text{if } \mu = (b + c)/2c, \\ 1, & \text{if } \mu > (b + c)/2c. \end{cases}$$

Graphs of the response mappings and the points corresponding to the three equilibria are shown in Fig. 16.5. As can be seen, in the game under consideration, in addition to the two equilibria in pure strategies, there is one equilibrium in non-degenerate mixed strategies. The corresponding probabilities are

$$\mu = \frac{b + c}{2c} \quad \text{and} \quad \nu = \frac{c - a}{2c}.$$
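A numerical sanity check of these expressions (the parameter values a = 1, b = 1, c = 2 are placeholders chosen to satisfy a < c and b < c; the payoff formulas are the ones derived above):

```python
a, b, c = 1.0, 1.0, 2.0       # sample values with a < c and b < c

def U1(mu, nu):
    # Expected payoff of player 1 when he chooses IBM with probability mu
    # and player 2 chooses IBM with probability nu.
    return mu * (nu * (a + c) + (1 - nu) * a) + (1 - mu) * (1 - nu) * c

def U2(mu, nu):
    return nu * mu * c + (1 - nu) * (mu * b + (1 - mu) * (b + c))

mu_star = (b + c) / (2 * c)   # = 3/4 for these values
nu_star = (c - a) / (2 * c)   # = 1/4 for these values

# In the mixed equilibrium each player is indifferent between his pure strategies,
# so no deviation to a pure strategy (probability 0 or 1) is profitable:
assert all(U1(mu_star, nu_star) >= U1(mu, nu_star) - 1e-12 for mu in (0.0, 1.0))
assert all(U2(mu_star, nu_star) >= U2(mu_star, nu) - 1e-12 for nu in (0.0, 1.0))
print(mu_star, nu_star)       # 0.75 0.25
```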

Fig. 16.5. The case when the game "Choosing a Computer" has three equilibria, one of which is an equilibrium in non-degenerate mixed strategies

Appendix A


Theorem 155:

Suppose that in the game $G = \langle I, (X_i)_{i \in I}, (u_i)_{i \in I} \rangle$ the set of strategies $X_i$ of every player is non-empty, compact and convex, and the payoff function $u_i(\cdot)$ is concave in $x_i$ and continuous. Then the game has a Nash equilibrium.

Proof: Let us prove that the response mapping $R_i(\cdot)$ of each player is upper semicontinuous and that its value for each $x_{-i} \in X_{-i}$ is non-empty and convex. Non-emptiness follows from the Weierstrass theorem (a continuous function on a compact set attains its maximum).


Let us prove convexity. Let $z', z'' \in R_i(x_{-i})$. Obviously, $u_i(z', x_{-i}) = u_i(z'', x_{-i})$. From the concavity of the function $u_i(\cdot)$ in $x_i$ it follows that for $\alpha \in [0, 1]$

$$u_i(\alpha z' + (1 - \alpha)z'', x_{-i}) \geq \alpha u_i(z', x_{-i}) + (1 - \alpha)u_i(z'', x_{-i}) = u_i(z', x_{-i}) = u_i(z'', x_{-i}).$$

Since the function $u_i(\cdot)$ attains its maximum at the points $z'$ and $z''$, strict inequality here is impossible. Thus,

$$\alpha z' + (1 - \alpha)z'' \in R_i(x_{-i}).$$

Let us now prove the upper semicontinuity of the mapping $R_i(\cdot)$. Consider a sequence $x_i^n$ converging to $\bar{x}_i$ and a sequence $x_{-i}^n$ converging to $\bar{x}_{-i}$, with $x_i^n \in R_i(x_{-i}^n)$. Note that due to the compactness of the sets $X_j$ we have $\bar{x}_i \in X_i$ and $\bar{x}_{-i} \in X_{-i}$. We need to prove that $\bar{x}_i \in R_i(\bar{x}_{-i})$. By the definition of the response mapping,

$$u_i(x_i^n, x_{-i}^n) \geq u_i(x_i, x_{-i}^n) \quad \forall x_i \in X_i, \; \forall n.$$

From the continuity of the function $u_i(\cdot)$ it follows that

$$u_i(\bar{x}_i, \bar{x}_{-i}) \geq u_i(x_i, \bar{x}_{-i}) \quad \forall x_i \in X_i.$$

Thus, according to the definition of the response mapping introduced above, $\bar{x}_i \in R_i(\bar{x}_{-i})$.

Based on the just proven properties of the mappings $R_i(\cdot)$ and on Kakutani's fixed-point theorem, let us prove the existence of a Nash equilibrium, that is, of a set of strategies $x \in X$ for which

$$x_i \in R_i(x_{-i}), \quad i = 1, \ldots, n.$$

We define the mapping R(·) from X to X as follows:

$$R(x) = R_1(x_{-1}) \times \cdots \times R_n(x_{-n}).$$

Note that this mapping satisfies the same properties as each of the mappings Ri (·), since it is their Cartesian product.

The mapping $R(\cdot)$ and the set $X$ satisfy the conditions of Kakutani's theorem. Thus, the mapping has a fixed point, i.e., a point $x \in X$ such that $x \in R(x)$. Obviously, the point $x$ is a Nash equilibrium.

Appendix B

In this appendix, we formally prove statements about the connection between the Nash equilibrium and the procedure for sequentially discarding strictly dominated strategies.

First, we formally define a procedure for sequentially discarding strictly dominated strategies. Let the original game be given as

$$G = \langle I, (X_i)_{i \in I}, (u_i)_{i \in I} \rangle.$$

Let us define a sequence of games $(G^{[t]})_{t=0,1,2,\ldots}$, each of which is obtained from the preceding one by discarding strictly dominated strategies. The games differ from each other in the sets of admissible strategies:

$$G^{[t]} = \langle I, (X_i^{[t]})_{i \in I}, (u_i)_{i \in I} \rangle.$$

The procedure starts with $G^{[0]} = G$.

The set of admissible strategies of the i-th player at step t + 1 of the procedure is taken equal to the set of his strategies that are not strictly dominated in the game of step t. We denote the set of not strictly dominated strategies by $ND_i$ (see the definition of strictly dominated strategies, Definition 89, p. 631). Formally,

$$ND_i = \{x_i \in X_i \mid \nexists\, y_i \in X_i : u_i(y_i, x_{-i}) > u_i(x_i, x_{-i}) \ \forall x_{-i} \in X_{-i}\}.$$

Thus, a step of the procedure under consideration can be written as

$$X_i^{[t+1]} = ND_i^{[t]},$$

where $ND_i^{[t]}$ is the set of not strictly dominated strategies in the game $G^{[t]}$.
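A compact sketch of this procedure for a finite two-player game is given below (the payoff matrices are hypothetical; the loop stops when no strategy of either player can be discarded):

```python
import numpy as np

def eliminate_strictly_dominated(u1, u2):
    """Iteratively discard strictly dominated pure strategies in a bimatrix game.
    Returns the indices of the surviving rows (player 1) and columns (player 2)."""
    rows = list(range(u1.shape[0]))
    cols = list(range(u1.shape[1]))
    changed = True
    while changed:
        changed = False
        # Player 1: row i is strictly dominated if some row k beats it against every surviving column.
        for i in rows[:]:
            if any(all(u1[k, j] > u1[i, j] for j in cols) for k in rows if k != i):
                rows.remove(i)
                changed = True
        # Player 2: column j is strictly dominated if some column l beats it against every surviving row.
        for j in cols[:]:
            if any(all(u2[i, l] > u2[i, j] for i in rows) for l in cols if l != j):
                cols.remove(j)
                changed = True
    return rows, cols

# Hypothetical 3x3 payoffs; here the procedure leaves a single strategy for each player,
# which by Theorem 152 must then be a Nash equilibrium.
u1 = np.array([[3, 2, 4], [1, 1, 2], [2, 0, 3]])
u2 = np.array([[2, 3, 1], [1, 2, 0], [3, 4, 2]])
print(eliminate_strictly_dominated(u1, u2))  # -> ([0], [1])
```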

Let us now present the proofs of Theorems 151 and 152 (p. 636). Theorem 151 states the following:

: If x = (x1, . . . , xm) is a Nash equilibrium in some game, then none of its constituent strategies can be discarded as a result of applying the procedure of sequentially discarding strictly dominated strategies.

In the notation just introduced, Theorem 151 states that if x is a Nash equilibrium of the original game G, then at every step t

$$x_i \in X_i^{[t]}, \quad i \in I, \; t = 1, 2, \ldots,$$

or, equivalently, $x \in X^{[t]}$, $t = 1, 2, \ldots$

Proof (Proof of Theorem 151): Suppose there is a step τ at which the strategy $x_i$ of some player $i \in I$ must be discarded, and that no strategies from x were discarded at the previous steps:

$$x \in X^{[t]}, \quad t = 1, \ldots, \tau.$$

By the definition of strict dominance, player i then has another strategy $x_i' \in X_i^{[\tau]}$ which gives him in the game $G^{[\tau]}$ a higher payoff against any choices of the other players:

$$u_i(x_i', x_{-i}') > u_i(x_i, x_{-i}') \quad \forall x_{-i}' \in X_{-i}^{[\tau]}.$$

In particular, this relation must hold for $x_{-i}$ itself, since we assumed that the strategies $x_{-i}$ were not discarded at the previous steps of the procedure ($x_{-i} \in X_{-i}^{[\tau]}$). Hence $u_i(x_i', x_{-i}) > u_i(x_i, x_{-i})$, which contradicts the assumption that $x_i$ is a best response to $x_{-i}$, i.e., that x is a Nash equilibrium.

: If, as a result of sequentially discarding strictly dominated strategies, each player is left with a single strategy, xi, then x = (x1, . . . , xm) is the Nash equilibrium in this game.

This theorem applies to the case when, in the process of discarding strictly dominated strategies, starting from some step $\bar{t}$ only one set of strategies remains, i.e. $X^{[t]} = \{x\}$ for all $t \geq \bar{t}$. The theorem states that x is the unique Nash equilibrium of the original game.

Proof (Proof of Theorem 152): Since, according to the theorem just proved, none of the Nash equilibria can be discarded, we only have to prove that the specified set of strategies x is a Nash equilibrium. Let's assume that this is not the case. This means that there is a strategy x˜i of some player i such that

$$u_i(x_i, x_{-i}) < u_i(\tilde{x}_i, x_{-i}).$$

By assumption, the strategy $\tilde{x}_i$ was discarded at some step τ, since it does not coincide with $x_i$. Thus, there is some strategy $x_i' \in X_i^{[\tau]}$ that strictly dominates it, so that

$$u_i(x_i', x_{-i}') > u_i(\tilde{x}_i, x_{-i}') \quad \forall x_{-i}' \in X_{-i}^{[\tau]}.$$

In particular, this inequality holds for $x_{-i}$ itself:

$$u_i(x_i', x_{-i}) > u_i(\tilde{x}_i, x_{-i}).$$

The strategy $x_i'$ cannot coincide with the strategy $x_i$, since in that case the above inequalities would contradict each other. In turn, it follows that there must be a strategy $x_i''$ that strictly dominates $x_i'$ at some step $\tau' > \tau$, i.e.

$$u_i(x_i'', x_{-i}') > u_i(x_i', x_{-i}') \quad \forall x_{-i}' \in X_{-i}^{[\tau']}.$$

In particular,

$$u_i(x_i'', x_{-i}) > u_i(x_i', x_{-i}).$$

It can again be argued that the strategy $x_i''$ cannot coincide with the strategy $x_i$, since otherwise the above inequalities would contradict each other.

Continuing this argument, we obtain an infinite sequence of steps $\tau < \tau' < \tau'' < \ldots$ and corresponding admissible strategies $x_i', x_i'', x_i''', \ldots$, none of which coincides with $x_i$. This contradicts the assumption that, starting from some step, $x_i$ is the only admissible strategy of player i, which proves the theorem.

/ 667. Two players place an object on the plane, that is, they choose its coordinates (x, y). Player 1 is located at the point (x1, y1) and player 2 at the point (x2, y2). Player 1 chooses the x coordinate and player 2 the y coordinate. Each wants the object to be as close to himself as possible. Show that in this game each player has a strictly dominant strategy.

/ 668. Prove that if in some game each player has a strictly dominant strategy, then these strategies constitute the only Nash equilibrium.

/ 669. Explain why the dominant strategy equilibrium must also be a Nash equilibrium. Give an example of a game in which there is a dominant strategy equilibrium and, in addition, there are Nash equilibria that do not coincide with the dominant strategy equilibrium.

Find all Nash equilibria in the following games.

/ 670. Game 16.2.1 (p. 625), whose payoffs are presented in Table ??.

/ 671. "Nuts"

Two players share 4 nuts. Each makes a bid for nuts: xi = 1, 2 or 3. If x1 + x2 ≤ 4, then each gets what he asked for; otherwise both get nothing.

/ 672. Two teachers from the Faculty of Economics are writing a textbook. The quality of the textbook (q) depends on their efforts (e1 and e2 respectively) according to the function

q = 2(e1 + e2 ).

The objective function of each has the form

ui = q − ei ,

i.e. quality minus effort. You can select effort level 1, 2 or 3.

/ 673. "The Third Wheel". Each of the three players chooses one of the sides of a coin: "heads" or "tails". If all the players' choices coincide, then each player is given 1 ruble. If the choice of one of the players differs from the choice of the other two, then he pays them 1 ruble.

/ 674. Three players choose one of three alternatives: A, B or C. The alternative is selected by majority vote. Each player votes for one and only one alternative. If none of the alternatives receives a majority, then alternative A will be chosen. The players' payoffs, depending on the chosen alternative, are as follows:

u1 (A) = 2, u2 (A) = 0, u3 (A) = 1,

u1 (B) = 1, u2 (B) = 2, u3 (B) = 0,

u1 (C) = 0, u2 (C) = 1, u3 (C) = 2.

/ 675. Two electoral blocs are being formed that will compete for seats in the legislative assembly of the city of N-sk. Each bloc can choose one of three orientations: "left" (L), "right" (R) and "ecological" (E). These orientations can attract 50, 30 and 20% of the voters, respectively. It is known that if the orientation a group of voters is interested in is not represented in the elections, then the voters of that group do not vote. If the blocs choose different orientations, then each receives the corresponding share of the votes. If the blocs choose the same orientation, then the votes of the corresponding group of voters are divided equally between them. The goal of each bloc is to obtain the greatest number of votes.

/ 676. Two players place a point on the plane. One player chooses the abscissa, the other the ordinate. Their payoffs are given by the functions:

a) $u_x(x, y) = -x^2 + x(y + a) + y^2$, $u_y(x, y) = -y^2 + y(x + b) + x^2$;

b) $u_x(x, y) = -x^2 - 2ax(y + 1) + y^2$, $u_y(x, y) = -y^2 + 2by(x + 1) + x^2$;

c) $u_x(x, y) = -x - y/x + y^2/2$, $u_y(x, y) = -y - x/y + x^2/2$

(a, b are coefficients).

/ 677. "Ice Cream Men on the Beach"

Two ice cream vendors sell ice cream on the beach on a hot day. The beach can be thought of as a unit segment. The vendors choose where on the beach to stand, that is, each chooses a coordinate xi. Customers are spread evenly along the beach and buy ice cream from the vendor closest to them. If x1 < x2, then the first serves the share (x1 + x2)/2 of the beach, and the second the share 1 − (x1 + x2)/2. If the vendors locate at the same point (x1 = x2), the customers are divided equally between them. Each vendor strives to serve as large a share of the beach as possible.

/ 678. "Auction". Consider an auction like the one described in Game 16.2.2, under the condition that the winner of the auction pays the price he himself named.

/ 679. Analyze Game 16.2.1 “Selecting a Computer” (p. 624) and find answers to the following questions:

a) Under what conditions on parameters a, b and c will there be an equilibrium in the dominant strategies? What will this equilibrium be like?

b) Under what conditions on the parameters is the outcome when both choose IBM a Nash equilibrium? When is this equilibrium the only one? Could it also be an equilibrium in dominant strategies?

/ 680. Each of two neighbors in an apartment-building entrance chooses whether or not to sweep the entrance once a week. Suppose each values the benefit to himself of the entrance being cleaned twice a week at a > 0 monetary units, the benefit of it being cleaned once a week at b > 0 units, the benefit of an uncleaned entrance at 0, and his costs of personally participating in the cleaning at c > 0. For what relationships between a, b and c will the game have equilibria of the form: (0) no one cleans, (1) one cleans, (2) both clean?

/ 681. Suppose that in a certain game of two players, each of whom has 2 strategies, there is a unique Nash equilibrium. Show that in this game at least one of the players has a dominant strategy.

/ 682. Each of the two players (i = 1, 2) has 3 strategies: a, b, c and x, y, z, respectively. Taking your first name as an infinite sequence of characters, e.g. ivanivanivan..., set the payoffs of the first player as follows: u1(a, x) = "i", u1(a, y) = "v", u1(a, z) = "a", u1(b, x) = "n", u1(b, y) = "i", u1(b, z) = "v", u1(c, x) = "a", u1(c, y) = "n", u1(c, z) = "i". Instead of each letter of the name, substitute its number in the alphabet, using Table 16.10. Using your last name in the same way, define the payoffs of the second player, u2(·).

1) Are there dominant and strictly dominant strategies in your game? If so, do they form an equilibrium in the dominant strategies?

2) What will be the result of the consistent rejection of strictly dominated strategies?

3) Find the Nash equilibria of this game.

Table 16.10.

/ 683. Make up a matrix game of three players, each of whom has 2 strategies, using your first name, last name and patronymic. Answer the questions of the previous exercise.

/ 684. Fill in the missing payoffs in the following table so that the resulting game has...

(0) no Nash equilibrium,

(1) one Nash equilibrium,

(2) two Nash equilibria,

(3) three Nash equilibria,

(4) four Nash equilibria.

/ 685. 1) Explain why in any Nash equilibrium the payoff of the i-th player cannot be less than

$$\min_{x_{-i} \in X_{-i}} \max_{x_i \in X_i} u_i(x_i, x_{-i}).$$

2) Explain why in any Nash equilibrium the payoff of the i-th player cannot be less than

$$\max_{x_i \in X_i} \min_{x_{-i} \in X_{-i}} u_i(x_i, x_{-i}).$$

Game theory is a science that uses mathematical methods to study the behavior of participants in situations involving decision-making. The subject of this theory is game situations with pre-established rules. In the course of a game various joint actions are possible: coalitions of players, conflicts...

It is often noted that oligopoly is really a game: one in which, just as in chess or poker, each player must anticipate the opponent's actions (his bluff, his counter-moves, his counter-bluff) as well as possible. Economists studying oligopoly were therefore delighted by the appearance in 1944 of a voluminous and highly mathematical book entitled Theory of Games and Economic Behavior.

The players' strategies are determined by the objective function, which shows each participant's gain or loss. These games take many forms. The simplest variety is a game with two participants. If a game involves at least three players, coalitions may form, which complicates the analysis. In terms of the payoff sum, games are divided into two groups: zero-sum and non-zero-sum. Zero-sum games are also called antagonistic games: the gain of some players exactly equals the loss of the others, so the total sum of the payoffs is 0. By the nature of preliminary agreements, games are divided into cooperative and non-cooperative.

The most famous example of a non-cooperative non-zero-sum game is the prisoner's dilemma.

So: two thieves were caught red-handed and charged with a number of thefts. Each of them faces a dilemma: whether or not to confess to the old (unproven) thefts. If only one of the thieves confesses, then the one who confesses receives the minimum term of 1 year, while the other receives the maximum of 10 years. If both thieves confess at the same time, then both receive a small leniency, 6 years each; but if neither confesses, both are punished only for the last theft, 3 years each. The prisoners sit in different cells and cannot come to an agreement with each other. This is a non-cooperative game with a non-zero (negative) sum. A characteristic feature of this game is that it is disadvantageous for both participants to be guided solely by their private interests. The "prisoner's dilemma" vividly shows the features of oligopolistic pricing.

3.1. Nash equilibrium

Named after John Forbes Nash, a Nash equilibrium in game theory is a kind of solution of a game of two or more players in which no participant can increase his payoff by unilaterally changing his decision while the other participants do not change theirs. Such a set of strategies chosen by the participants, together with their payoffs, is called a Nash equilibrium.

The concept of Nash equilibrium (NE) was not coined by Nash alone; Antoine Augustin Cournot showed how to find what we now call a Nash equilibrium in the Cournot game. Accordingly, some authors call it the Nash-Cournot equilibrium. However, Nash was the first to show, in his dissertation Non-Cooperative Games (1950), that Nash equilibria must exist for all finite games with any number of players. Before Nash this had only been proven for 2-player zero-sum games, by John von Neumann and Oskar Morgenstern (1947).

Formal definition.

Suppose $G = (N, S, u)$ is a game of n players in normal form, where $S = S_1 \times \cdots \times S_n$ is the set of pure-strategy profiles and $u = (u_1, \ldots, u_n)$ the set of payoff functions. When each player $i$ selects a strategy $s_i$ in the strategy profile $s = (s_1, \ldots, s_n)$, player $i$ receives the payoff $u_i(s)$. Note that the payoff depends on the entire profile of strategies: not only on the strategy chosen by the player himself, but also on the strategies of the others. The strategy profile $s^*$ is a Nash equilibrium if changing one's strategy unilaterally is not beneficial to any player, that is, for any $i$:

$$u_i(s^*) \geq u_i(s_i, s^*_{-i}) \quad \text{for all } s_i \in S_i.$$

A game can have a Nash equilibrium in pure strategies or in mixed ones (that is, choosing a pure strategy stochastically with a fixed frequency). Nash proved that if mixed strategies are allowed, then in every game of n players there will be at least one Nash equilibrium.

Throughout his life a person has to make decisions on a wide variety of issues, from everyday disputes, such as who will clean the rooms at home or how to improve one's city, to international negotiations, multimillion-dollar auctions and even military actions. In all these situations a person seeks to maximize his own gain. But at the same time he always has to choose: to cooperate with other people, or to think only about his own benefit without caring about the benefit of others. A classic example showing that the pursuit of personal gain does not always lead to the best result is the "Prisoner's Dilemma".

Two prisoners A and B are suspected of committing a crime for which they face up to 10 years in prison. But there is no direct evidence yet. Therefore, the investigation offers each of the prisoners to make a deal - to confess to what they did and blame the initiative for the crime on another. If one confesses and the other prisoner remains silent, the first prisoner will have his sentence reduced to three years for assisting the investigation, and the second one will be imprisoned for 10 years.

If both make a deal with the investigation and confess to their crime, then each will receive 5 years. However, if both remain silent, then due to lack of evidence, they will be released. The prisoners are in different cells so that they cannot agree with each other and coordinate their behavior during interrogation. Neither of them knows exactly what the other will do. What decision will each of them make? What will happen?

Every prisoner has a choice: remain silent or confess. This is the prisoner’s dilemma: should he incriminate another or should he try his luck and not confess, taking great risks? Depending on the prisoners' choices in this situation, there are four possible outcomes.

Let's look at them:

1. If both prisoners confess, each of them receives five years in prison;

2. If prisoner A remains silent, and prisoner B testifies against him, then the first will be imprisoned for 10 years, and the second - for three years;

3. And vice versa, if prisoner A confesses, and prisoner B remains silent, then the first will be imprisoned for three years, and the second - for 10 years;

4. And if both remain silent, then due to lack of evidence they will be released.

Which of these outcomes is the most realistic? To answer this question, you need to know how each of them reasons. This is how Prisoner A reasons:

"Let us say that prisoner B confesses. If I also confess, I will get 5 years. If I remain silent, I will get 10 years. So, if prisoner B confesses, it is better for me to confess as well.

If prisoner B remains silent, what should I do? If I confess, I will get 3 years. And if I also remain silent, I will go free. Of course, that would be the perfect option, but I am not sure that prisoner B will remain silent; I do not trust him. So I had better testify.

So, no matter what Prisoner B does, I better confess.”

Prisoner B's reasoning is similar, and he also comes to the conclusion that it is better for him to confess, regardless of what Prisoner A does.

What happens? Each of the prisoners chose a strategy that, although it does not lead to the best result (release to freedom), is the best for each of them for any behavior of the opponent. Since the goal of each prisoner is to minimize his sentence without caring about the other prisoner, then confessing and incriminating the other is the most profitable strategy for each of them. Simply put, no matter what the other does, everyone will gain more by betraying. Therefore, the prisoners will choose the “Confess” strategy and will receive 5 years in prison.

So, in this example, we saw that the decision made by one player affects the decision of another (and vice versa) and ultimately affects the final outcome of the game.

Other examples of games in which people with divergent (opposite) goals participate, when the result depends on the decisions of all participants, are poker, chess, penalties in football and many other games.

But along with traditional games there are also serious relationships between people, such as market competition, arms races, environmental pollution, elections, trade, etc. For example, companies competing in a market must take the actions of competitors into account when making decisions. Another illustrative example is the arms race between the Soviet Union and the USA in the 1950s-1990s. For almost half a century the two great countries spent enormous sums on weapons, keeping pace with each other. If there had been trust between them, they would not have spent so much money on weapons but would have spent it on things of greater benefit (education, healthcare, pensions, etc.), and both sides would have benefited. Instead, each country, not trusting the other, continued to produce weapons, and neither benefited from this.

All these serious relationships are also called games, because in them, as in ordinary games, the result depends on the decisions (strategies) of all participants. The science that studies these serious relationships is called game theory. The word "game" should therefore not confuse you in this case: the concept is interpreted more broadly in game theory than in everyday life.

Nash equilibrium

So, in the Prisoner's Dilemma, the situation develops in such a way that, acting individually rationally and reasonably, the prisoners end up getting five years in prison. However, as we have already noted, this is not the most optimal outcome. There is a better option: to be released if both remain silent.

Surely each of the prisoners, when making a decision, reasoned like this: “If we both remain silent, we will go free. Of course, this is better than being jailed for five years. But where is the guarantee that the second one will also remain silent? After all, if I remain silent and someone else testifies, then I will go to prison for 10 years! No, I’d rather admit what I did.”

It is obvious that mutual distrust of each other does not allow a situation to materialize when everyone is released. In addition, the prisoners are sitting in different cells and each makes a decision without knowing about the other's decision and each is tempted to testify against the other and get 3 years instead of 5 or 10 years. It turns out that the best outcome - to be released - is unreliable and unstable. That is why the prisoners chose strategies that led, if not to the best outcome, but to a reliable one that eliminated the risk of deception and betrayal. This outcome is called a Nash equilibrium.

A Nash equilibrium is a combination of the players' strategies and their payoffs such that none of the players can increase his payoff by changing his strategy while the other participants do not change theirs. Note: Nash equilibria exist in games in which the players act independently of each other and cannot team up and coordinate their actions.

In simple terms, a Nash equilibrium is a situation where each player's strategy is the best response to the other players' strategies and it is disadvantageous for any individual player to change his strategy.

The Nash equilibrium is not the best possible outcome, but in a situation where everyone plays for themselves, it is the optimal outcome for each player because it eliminates the risks and losses each player would have if the other player decided to deceive or betray him.

The Nash equilibrium is a stable equilibrium because it is to the players' advantage to maintain it, since any change will make them worse off. But if cooperation appears in the relationship between the players, the Nash equilibrium ceases to be an equilibrium, because it becomes possible to achieve a better outcome. For example, if in the “Prisoner’s Dilemma” the players had the opportunity to agree on cooperation, namely, the two of them should remain silent, or if they had no doubts that the other would not betray and would also remain silent, then the situation could end for both with a better outcome - release.

Conclusion: The Nash equilibrium shows that each player can win more if there is cooperation, trust and honesty between the players, and each player, by doing better for others, will do better for himself.


John von Neumann and Oskar Morgenstern became the founders of a new and interesting branch of mathematics called "game theory". In the 1950s the young mathematician John Nash became interested in this area. Equilibrium theory became the topic of his dissertation, which he wrote at the age of 21. This is how a new solution concept called the "Nash equilibrium" was born, which earned the Nobel Prize many years later, in 1994.

The long gap between the dissertation and general recognition became a test for the mathematician. Genius without recognition resulted in serious mental disorders, but John Nash managed to cope with this thanks to his excellent logical mind. His "Nash equilibrium" theory won a Nobel Prize, and his life was portrayed in the film "A Beautiful Mind".

Briefly about game theory

Since the theory of Nash equilibrium explains human behavior in settings of interaction, it is worth first reviewing the basic concepts of game theory.

Game theory studies the behavior of participants (agents) who interact with each other as in a game, where the outcome depends on the decisions and behavior of several people. A participant makes decisions based on his predictions about the behavior of the others; such a plan of action is called the player's strategy.

There is also the notion of a dominant strategy, with which a participant obtains the optimal result regardless of how the other participants behave. This is the player's best strategy no matter what the others do.

The Prisoner's Dilemma and Scientific Breakthrough

The prisoner's dilemma is a game in which the participants are forced to make rational decisions towards a common goal in the face of conflicting alternatives. The question is which option each of them will choose, weighing his personal interest against the common interest, given that he cannot have both. The players are, as it were, locked into strict game conditions, which sometimes forces them to think very productively.

This dilemma was studied by an American mathematician. The equilibrium he derived was revolutionary in its kind. This new idea has especially clearly influenced the opinion of economists about how market players make choices, taking into account the interests of others, with close interaction and intersection of interests.

The best way to study game theory is through concrete examples, since this mathematical discipline is itself not drily theoretical.

Prisoner's dilemma example

For example, two people committed a robbery, fell into the hands of the police and are being interrogated in separate cells. At the same time, the police officers offer each participant profitable terms, under which he will be released if he testifies against his partner. Each criminal has the following set of strategies that he will consider:

  1. Both testify at the same time and receive 2.5 years in prison.
  2. Both remain silent at the same time and receive 1 year each, since in this case the evidence base for their guilt will be small.
  3. One testifies and gets freedom, while the other remains silent and gets 5 years in prison.

Obviously, the outcome of the case depends on the decision of both participants, but they cannot come to an agreement, since they are sitting in different cells. The conflict of their personal interests in the struggle for common interests is also clearly visible. Each prisoner has two options for action and 4 possible outcomes.

Chain of logical conclusions

So, criminal A considers the following options:

  1. I am silent and my partner is silent - we will both get 1 year in prison.
  2. I hand over my partner and he turns me in - we both get 2.5 years in prison.
  3. I remain silent, and my partner rats me out - I get 5 years in prison, and he gets freedom.
  4. I hand over my partner, but he remains silent - I get freedom, and he gets 5 years in prison.

Let us present the matrix of possible decisions and outcomes for clarity.

Table of possible outcomes of the prisoner's dilemma (each cell shows the years in prison for A and for B):

                         B is silent      B testifies
A is silent               (1, 1)           (5, 0)
A testifies               (0, 5)           (2.5, 2.5)

The question is, what will each participant choose?

"Stay silent, you can't speak" or "You can't stay silent, speak"

To understand the participant’s choice, you need to follow the chain of his thoughts. Following the reasoning of criminal A: if I remain silent and my partner remains silent, we will receive a minimum sentence (1 year), but I cannot know how he will behave. If he testifies against me, then I better testify as well, otherwise I could go to prison for 5 years. It’s better for me to serve 2.5 years than 5 years. If he remains silent, then all the more I need to testify, because this way I will get freedom. Participant B argues in exactly the same way.

It is not difficult to understand that the dominant strategy for each of the criminals is to testify. The optimal point of this game comes when both criminals testify and receive their “prize” - 2.5 years in prison. Nash game theory calls this an equilibrium.
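This conclusion can be verified mechanically. A small sketch using the sentence lengths from this example (years in prison, so each prisoner prefers the smaller number):

```python
# years[(a_choice, b_choice)] = (years for A, years for B),
# where 0 = "stay silent" and 1 = "testify"
years = {
    (0, 0): (1.0, 1.0),    # both stay silent
    (0, 1): (5.0, 0.0),    # A silent, B testifies (B goes free)
    (1, 0): (0.0, 5.0),    # A testifies, B silent
    (1, 1): (2.5, 2.5),    # both testify
}

def is_nash(a, b):
    # A cannot reduce his sentence by changing only his own choice, and neither can B.
    a_ok = all(years[(a, b)][0] <= years[(alt, b)][0] for alt in (0, 1))
    b_ok = all(years[(a, b)][1] <= years[(a, alt)][1] for alt in (0, 1))
    return a_ok and b_ok

print([profile for profile in years if is_nash(*profile)])  # -> [(1, 1)]: both testify
```

The Pareto-better outcome (0, 0), where both stay silent, is not an equilibrium: each prisoner can cut his own sentence by testifying.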

A suboptimal optimal solution

What is revolutionary about Nash's view is that such an equilibrium is not optimal if we consider an individual participant and his personal interest. After all, the best outcome for each prisoner individually would be to go free, and the best joint outcome would be for both to remain silent.

A Nash equilibrium is a point of convergence of interests, where each participant chooses the option that is optimal for him given the strategies chosen by the other participants.

Considering the option in which both criminals remain silent and receive only 1 year each, we can call it the Pareto-optimal option. However, it is possible only if the criminals could come to an agreement in advance. But even that would not guarantee this outcome, since the temptation to deviate from the agreement and avoid punishment is great. The lack of complete trust in each other and the danger of getting 5 years force each to choose confession. It is simply irrational to expect that the participants will stick to silence when they act without coordination. This conclusion can be drawn by studying the Nash equilibrium. The examples only prove the point.

Selfish or rational

The theory of Nash equilibrium produced striking findings that overturned previously existing principles. Adam Smith, for example, viewed the behavior of each participant as entirely self-interested, which was supposed to bring the system into balance. This theory was called the "invisible hand of the market".

John Nash saw that if all participants acted in pursuit of only their own interests, this would never lead to an optimal group outcome. Given that rational thinking is inherent in each participant, the choice offered by the Nash equilibrium strategy is more likely.

A purely male experiment

A prime example is the Blonde Paradox game, which, although seemingly inappropriate, is a clear illustration of how Nash game theory works.

In this game you need to imagine that a group of single guys has come to a bar. Nearby is a group of girls, one of whom is preferable to the others, say a blonde. How should the guys behave so that each finds the best partner for himself?

The guys' reasoning: if everyone starts courting the blonde, then most likely no one will get her, and afterwards her friends will not want to meet them either, because nobody wants to be the second choice. But if the guys avoid the blonde, then the probability that each of them finds a good partner among the girls is high.

The situation in which everyone pursues only his own selfish interest, so that everyone chooses the blonde, is not optimal for the guys: pursuing only selfish interests would amount to the collapse of the group's interests. A Nash equilibrium means that each guy acts in a way that serves his own interest and is consistent with the interest of the whole group. This is not the best option for each personally, but it is optimal for everyone given the overall strategy for success.

Our whole life is a game

Making decisions in the real world is very similar to a game in which you expect certain rational behavior from the other participants: in business, at work, in a team, in a company, even in relationships with the opposite sex. From big deals to ordinary life situations, everything obeys one law or another.

Of course, the game situations considered above, with the criminals and the bar, are simply vivid illustrations of the Nash equilibrium. Dilemmas of this kind arise very often in real markets, especially in cases where two monopolists control the market.

Mixed strategies

Often we are involved not in one but in several games at once. Choosing one of the options in one game according to a rational strategy, you end up in another game. After several rational decisions you may find that you are not happy with the result. What should you do?

Let's consider two types of strategy:

  • A pure strategy is the behavior of a participant that comes from thinking about the possible behavior of other participants.
  • A mixed strategy or random strategy is the alternation of pure strategies at random or the selection of a pure strategy with a certain probability. This strategy is also called randomized.

Considering this kind of behavior gives us a new view of the Nash equilibrium. Whereas earlier it was said that a player chooses a strategy once and for all, another kind of behavior can be imagined: players may choose strategies randomly, with certain probabilities. Games that have no Nash equilibrium in pure strategies always have one in mixed strategies.

The Nash equilibrium in mixed strategies is called a mixed equilibrium. This is an equilibrium where each participant chooses the optimal frequency of choosing his strategies, provided that other participants choose their strategies with a given frequency.

Penalties and mixed strategy

An example of a mixed strategy can be given in the game of football. The best illustration of a mixed strategy is perhaps the penalty shootout. So, we have a goalkeeper who can only jump into one corner, and a player who will take the penalty.

So, if the first time the player chooses to shoot into the left corner and the goalkeeper dives into that corner and catches the ball, how can events develop the next time? If the player shoots into the opposite corner, that is probably too obvious, but shooting into the same corner again is no less obvious. Therefore both the goalkeeper and the kicker have no choice but to rely on random choice.

Thus, by alternating random selection with a specific pure strategy, the player and goalkeeper try to get the maximum result.
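The penalty situation can be modeled as a matching-pennies type game: if the kicker and the goalkeeper pick the same corner, the goalkeeper wins; otherwise the kicker does. A simplified simulation sketch (the all-or-nothing scoring rule below is an assumption made purely for illustration) shows that against a goalkeeper who randomizes 50/50 the kicker scores about half the time whatever he does, so he cannot gain by deviating; this is exactly the logic of a mixed equilibrium:

```python
import random

# Simplified assumption: a kick into the corner the goalkeeper did not choose always scores,
# and a kick into the corner he chose is always saved.
def kicker_scores(kick_corner, dive_corner):
    return kick_corner != dive_corner

def scoring_rate(p_kick_left, p_dive_left, trials=100_000):
    """Fraction of goals when the kicker goes left with probability p_kick_left
    and the goalkeeper dives left with probability p_dive_left."""
    goals = 0
    for _ in range(trials):
        kick = "left" if random.random() < p_kick_left else "right"
        dive = "left" if random.random() < p_dive_left else "right"
        goals += kicker_scores(kick, dive)
    return goals / trials

# Against a 50/50 goalkeeper, any kicking strategy scores roughly half the time,
# so the kicker cannot gain by deviating (and symmetrically for the goalkeeper).
for p in (0.0, 0.5, 1.0):
    print(p, round(scoring_rate(p, 0.5), 2))
```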



Pareto efficiency


Pareto optimality is a state of a system in which the value of each particular criterion describing the state of the system cannot be improved without worsening the position of other elements.

Thus, in Pareto's own words: "Any change that causes no loss to anyone and benefits some people (in their own estimation) is an improvement." This means that the right is recognized to make any change that brings no additional harm to anyone.

The set of Pareto-optimal states of a system is called the "Pareto set", the "set of Pareto-optimal alternatives", or the "set of alternatives optimal in the Pareto sense".

The situation when Pareto efficiency is achieved is a situation when all the benefits from the exchange have been exhausted.

Pareto efficiency is one of the central concepts of modern economic science. The First and Second Fundamental Welfare Theorems are built on this concept. One application of Pareto optimality is the so-called Pareto distribution of resources (labor and capital) under international economic integration, that is, the economic unification of two or more states. Interestingly, the Pareto distribution before and after international economic integration has been described mathematically (Dalimov R. T., 2008). The analysis showed that the value added of sectors and the income of labor resources move in opposite directions in accordance with the well-known heat-conduction equation, similarly to a gas or a liquid in space, which makes it possible to apply the analytical methods used in physics to economic problems of the migration of economic parameters.

The Pareto optimum states that the welfare of society reaches a maximum, and the distribution of resources becomes optimal, if any change in this distribution worsens the welfare of at least one subject of the economic system.

A Pareto-optimal state of the market is a situation in which it is impossible to improve the position of any participant in the economic process without simultaneously reducing the well-being of at least one other participant.

According to the Pareto criterion (a criterion for the growth of social welfare), movement towards the optimum is possible only with such a distribution of resources that increases the welfare of at least one person without harming anyone else.
