Human-level Full-Press Diplomacy (some bare facts).

post by Cleo Nardo (strawberry calm) · 2022-11-22T20:59:18.155Z · LW · GW · 7 comments

Contents

  Key links?
  What is Diplomacy?
  How well did CICERO perform?
  How does CICERO behave?
  How does CICERO work?
  How does CICERO compare to prior models?
  Were people surprised by CICERO?
7 comments

What is Diplomacy?

How well did CICERO perform?

How does CICERO behave?

Guess which player is the AI here...
CICERO in green.
CICERO in blue.
 
 

How does CICERO work?

CICERO's architecture combines two components: a strategy engine and a dialogue engine.
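
A minimal sketch of how a strategy engine and a dialogue engine might fit together, assuming the strategy engine produces a plan (called an `Intent` here) and the dialogue engine writes a message consistent with it. Every name below, and the keyword-matching logic, is a hypothetical illustration and not CICERO's actual code or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intent:
    """Hypothetical container: our planned moves plus the moves we hope a partner makes."""
    own_moves: tuple[str, ...]
    partner_moves: tuple[str, ...]


def strategy_engine(dialogue_history: list[str]) -> Intent:
    """Toy planner: choose an intent given the conversation so far.

    A real planner would also reason over the board state; this stand-in
    only checks whether the partner's last message signals distrust.
    """
    last = dialogue_history[-1].lower() if dialogue_history else ""
    if "can't trust" in last:
        # No cooperation expected, so plan only for ourselves.
        return Intent(own_moves=("hold",), partner_moves=())
    return Intent(own_moves=("withdraw fleet",), partner_moves=("withdraw fleet",))


def dialogue_engine(intent: Intent) -> str:
    """Toy message generator: produce text that matches the chosen intent."""
    if intent.partner_moves:
        return f"I will {intent.own_moves[0]} if you {intent.partner_moves[0]}."
    return f"I will {intent.own_moves[0]} this turn."


def take_turn(dialogue_history: list[str]) -> tuple[Intent, str]:
    """One turn: plan first, then talk about the plan."""
    intent = strategy_engine(dialogue_history)
    return intent, dialogue_engine(intent)
```

Calling `take_turn` with a cooperative message versus a distrustful one yields different intents and therefore different outgoing messages, which is the only point the sketch is meant to make: in this toy version, what gets said is derived from what is planned.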

How does CICERO compare to prior models?

Were people surprised by CICERO?

7 comments

Comments sorted by top scores.

comment by Measure · 2022-11-22T23:17:03.745Z · LW(p) · GW(p)

I guess Austria is the AI because it consistently capitalizes place names.

Replies from: GWS
comment by Stephen Bennett (GWS) · 2022-11-23T02:23:30.787Z · LW(p) · GW(p)

Austria is also the player instigating a plan of action in the dialogue, which seems to be how the AI is so effective. It seems to win because it proposes mutually beneficial plans and then (mostly) follows through on them.

Replies from: Richard_Kennaway
comment by Richard_Kennaway · 2022-11-23T09:09:08.203Z · LW(p) · GW(p)

Is that also useful for human players to do? (I have never played Diplomacy.) That is, in negotiations with other players, be the first to propose a plan and so set the agenda for the conversation.

Replies from: sanxiyn
comment by sanxiyn · 2022-11-23T09:34:14.627Z · LW(p) · GW(p)

Yes. From page 34 of Supplementary Materials:

In practice, we observe that expert players tend to be very active in communicating, whereas those less experienced miss many opportunities to send messages to coordinate: the top 5% rated players sent almost 2.5 times as many messages per turn as the average in the WebDiplomacy dataset.

comment by Erich_Grunewald · 2022-11-27T20:56:38.648Z · LW(p) · GW(p)

No-Press Diplomacy was solved by Deepmind in 2020. Learning to Play No-Press Diplomacy with Best Response Policy Iteration

Is this correct? The paper doesn't seem to say anything about No-Press Diplomacy being solved, or even about reaching human-level play. (I take "solve" to mean superhuman play.)

The paper does say

Our methods improve over the state of the art, yielding a consistent improvement of the agent policy. [...] Future work can now focus on questions like: (1) What is needed to achieve human-level No-Press Diplomacy AI? [...]

which seems to suggest they haven't achieved human-level play, let alone superhuman play.

comment by noggin-scratcher · 2022-11-22T23:34:27.863Z · LW(p) · GW(p)

Curious if any of the following are answered in the material around this.

If you're vocally obstinate about not going along with its plan, can the dialogue side feed that info back into the planning side? Can you talk it around to a different plan? And if you're dishonest does it learn not to trust you?

Replies from: sanxiyn
comment by sanxiyn · 2022-11-23T04:42:05.928Z · LW(p) · GW(p)

If you're vocally obstinate about not going along with its plan, can the dialogue side feed that info back into the planning side?

Yes. Figure 5 of the paper demonstrates this. Cicero (as France) has just said to England: "Do you want to call this fight off? I can let you focus on Russia and I can focus on Italy". When the human agrees ("Yes! I will move out of ENG if you head back to NAO"), Cicero predicts England will move out of ENG 85% of the time, moves the fleet back to NAO as agreed, and moves armies to Italy. When the human disagrees ("You've been fighting me all game. Sorry, I can't trust you won't stab me"), Cicero predicts England will attack 90% of the time, moves the fleet to attack EDI, and does not move armies.

And if you're dishonest does it learn not to trust you?

Yes. This is also demonstrated in Figure 5. When the human tries to deceive ("Yes! I'll leave ENG if you move KIE -> MUN and HOL -> BEL"), Cicero judges the proposal unreasonable. Cicero moves the fleet back to de-escalate, but does not move armies.