Comment on "Deception as Cooperation"

post by Zack_M_Davis · 2021-11-27T04:04:56.571Z · LW · GW · 4 comments

In this 2019 paper published in Studies in History and Philosophy of Science Part C, Manolo Martínez argues that our understanding of how communication works has been grievously impaired by philosophers not knowing enough math.

A classic reduction [LW · GW] of meaning dates back to David Lewis's analysis of signaling games, more recently elaborated on by Brian Skyrms. Two agents play a simple game: a sender observes one of several possible states of the world (chosen randomly by Nature), and sends one of several possible signals. A receiver observes the signal, and chooses one of several possible actions. The agents get a reward (as specified in a payoff matrix) based on what state was observed by the sender and what action was chosen by the receiver. This toy model explains how communication can be a thing: the incentives to choose the right action in the right state shape the evolution of a convention that assigns meaning to otherwise opaque signals [LW · GW].
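
To make the moving parts concrete, here is a minimal sketch of one round of such a game (my own illustration, not from the paper; it assumes three equiprobable states and a simple common-interest payoff where the matching act pays both players 1):

```python
import numpy as np

rng = np.random.default_rng(0)
n_states = n_signals = n_acts = 3

# Illustrative common-interest payoff: both players get 1 iff the act matches the state.
payoff_sender = payoff_receiver = np.eye(n_states)

# Behavioral strategies as conditional probability tables (here: uniformly random).
sender_strategy = np.full((n_states, n_signals), 1 / n_signals)   # P(signal | state)
receiver_strategy = np.full((n_signals, n_acts), 1 / n_acts)      # P(act | signal)

def play_round():
    state = rng.integers(n_states)                           # Nature picks a state
    signal = rng.choice(n_signals, p=sender_strategy[state])  # sender "encodes"
    act = rng.choice(n_acts, p=receiver_strategy[signal])     # receiver "decodes"
    return payoff_sender[state, act], payoff_receiver[state, act]

# Before any evolution or learning shapes the strategies, payoffs hover around chance (1/3).
print(np.mean([play_round() for _ in range(10_000)], axis=0))
```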

The math in Skyrms's presentation is simple—the information content of a signal is just how it changes the probabilities of states. Too simple, according to Martínez! When Skyrms and other authors (following Fred Dretske) use information theory, they tend to only reach for the basic probability tools you find in the first chapter of the textbook. (Skyrms's Signals book occasionally takes logarithms of probabilities, but the word "entropy" doesn't actually appear.) The study of information transmission only happens after the forces of evolutionary game theory have led sender and receiver to choose their strategies.

Martínez thinks information theory has more to say about what kind of cognitive work [LW · GW] evolution is accomplishing. The "State → Sender → Signals → Receiver → Action" pipeline of the Lewis–Skyrms signaling game is exactly isomorphic to the "Source → Encoder → Channel → Decoder → Decoded Message" pipeline of the noisy-channel coding theorem and other results you'd find beyond the very first chapter in the textbook. Martínez proposes we take the analogy literally: sender and receiver collude to form an information channel between states and actions.

The "channel" story draws our attention to different aspects of the situation than the framing focused on individual signals. In particular, Skyrms wants to characterize deception as being about when a sender benefits by sending a misleading signal—one that decreases the receiver's probability assigned to the true state, or increases the probability assigned to a false state. (Actually, as Don Fallis and Peter J. Lewis point out, Skyrms's characterization of misleadingness is too broad: one would think we wouldn't want to say that merely ruling out a false state is misleading, but it does increase the probability assigned to any other false states. But let this pass for now.) But for Martínez, a signal is just a codeword in the code being cooperatively constructed by the sender/encoder and receiver/decoder in response to the problems they jointly face. We don't usually think of it being possible for individual words in a language to be deceptive in themselves ... right? (Hold that thought.) [LW · GW]

Martínez's key later-textbook-chapter tool is rate–distortion theory. A distortion measure quantifies how costly or "bad" it is to decode a given input as a given output. If the symbol was transmitted accurately, the distortion is zero; if there was some noise on the channel, then more noise is worse, although different applications can call for different distortion measures. (In audio applications, for example, we probably want a distortion measure that tracks how similar the decoded audio sounds to humans, which could be different from the measure you'd naturally think of if you were looking at the raw bits [LW · GW].)

Given a choice of distortion measure, there exists a rate–distortion function that, for a given level of distortion, tells us the rate, that is, how "wide" the channel needs to be in order to communicate with no more than that amount of distortion. This "width", more formally, is channel capacity: for a particular channel (a conditional distribution of outputs given inputs), the capacity is the maximum, over possible input distributions, of the mutual information between the input and output distributions—the most information that could possibly be sent over the channel, if we get to pick the input distribution and the code. The rate is looking at "width" from the other direction: it's the minimum of the mutual information between the input and output distributions, over possible channels (conditional distributions) that meet the distortion goal.
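
Both quantities bottom out in mutual information. As a reference point (a sketch of my own, not from the paper), here's how one might compute the mutual information of a discrete channel, with a noiseless three-symbol channel as a sanity check:

```python
import numpy as np

def mutual_information(p_x, p_y_given_x):
    """I(X; Y) in bits, for input distribution p_x and channel matrix p_y_given_x[x, y]."""
    p_xy = p_x[:, None] * p_y_given_x          # joint distribution over (input, output)
    p_y = p_xy.sum(axis=0)                     # output marginal
    nz = p_xy > 0                              # zero-probability cells contribute nothing
    return (p_xy[nz] * np.log2(p_xy[nz] / np.outer(p_x, p_y)[nz])).sum()

# A noiseless 3-symbol channel fed equiprobable inputs carries log2(3) ≈ 1.585 bits,
# which is also that channel's capacity (the uniform input is the maximizer here).
print(mutual_information(np.full(3, 1 / 3), np.eye(3)))
```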

What does this have to do with signaling games? Well, the payoff matrix of the game specifies how "good" it is (for each of the sender and receiver) if the receiver chooses a given act in a given state. But knowing how "good" it is to perform a given act in a given state amounts to the same thing (modulo a negative affine transformation) as knowing how "bad" it is for the communication channel to "decode" a given state as a given act! We can thus see the payoff matrix of the game as giving us two different distortion measures, one each for the sender and receiver.
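
One natural way to carry out that transformation (a sketch; this particular normalization is my choice for illustration, not Martínez's) is to map the best payoff to zero distortion and rescale to [0, 1]:

```python
import numpy as np

def payoffs_to_distortion(payoff):
    """Negative affine transformation of a state-by-act payoff matrix:
    the best payoff becomes distortion 0, the worst becomes distortion 1."""
    return (payoff.max() - payoff) / (payoff.max() - payoff.min())

# Common-interest example: payoff 1 for the matching act, 0 otherwise...
print(payoffs_to_distortion(np.eye(3)))   # ...becomes 0 on the diagonal, 1 off it.

# In a game of imperfect common interest, sender and receiver get separate measures:
#   d_sender   = payoffs_to_distortion(sender_payoff)
#   d_receiver = payoffs_to_distortion(receiver_payoff)
```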

Following an old idea from Richard Blahut about designing a code for multiple end-user use cases, we can have a rate–distortion function with a two-dimensional domain (visualizable as a surface or heatmap) that takes as arguments a distortion target for each of the two measures, and gives the minimum rate that can meet both. Because this function depends only on the distribution of states from Nature, and on the payoff matrix, the sender and receiver don't need to have already chosen their strategies for us to talk about it; rather, we can see the strategies as chosen in response to this rate–distortion landscape.

Take one of the simplest possible signaling games: three equiprobable states, three signals, three actions, with sender and receiver each getting a payoff of 1 if the receiver chooses the i-th act in the i-th state for 1 ≤ i ≤ 3—or rather, let's convert how-"good"-it-is payoffs into equivalent how-"bad"-it-is distortions: sender and receiver measures both give a distortion of 1 when the j-th act is taken in the i-th state for i ≠ j, and 0 when i = j.

This rate–distortion function R characterizes the outcomes of possible behaviors in the game. The fact that R(⅔, ⅔) = 0 means that a distortion of ⅔ can be achieved without communicating at all. (Just guess.) The fact that R(0, 0) = log₂ 3 ≈ 1.58 bits means that, to communicate perfectly, the sender/encoder and receiver/decoder need to form a channel/code whose rate matches the entropy of the three states of nature.
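
Because sender and receiver share the same distortion measure in this game, the two-argument surface collapses to an ordinary one-dimensional rate–distortion curve, which can be traced numerically with the classic Blahut–Arimoto iteration. A minimal sketch (my own illustration, assuming equiprobable states and the 0/1 distortion above), sweeping a trade-off parameter to move along the curve:

```python
import numpy as np

def blahut_arimoto_rd(p_x, d, beta, n_iter=500):
    """One point on the rate–distortion curve for source p_x and distortion matrix
    d[x, y], at trade-off parameter beta (larger beta: less distortion, more rate)."""
    q_y = np.full(d.shape[1], 1 / d.shape[1])            # output marginal, start uniform
    for _ in range(n_iter):
        w = q_y * np.exp(-beta * d)                      # unnormalized "test channel"
        q_y_given_x = w / w.sum(axis=1, keepdims=True)   # conditional P(y | x)
        q_y = p_x @ q_y_given_x                          # re-estimate output marginal
    joint = p_x[:, None] * q_y_given_x
    distortion = (joint * d).sum()                       # expected distortion
    rate = (joint * np.log2(q_y_given_x / q_y)).sum()    # mutual information, in bits
    return rate, distortion

p_x = np.full(3, 1 / 3)       # three equiprobable states
d = 1 - np.eye(3)             # 0/1 distortion: wrong act costs 1
for beta in (0.1, 1.0, 5.0, 20.0):
    print(beta, blahut_arimoto_rd(p_x, d, beta))
# Sweeps from (rate ≈ 0, distortion ≈ 2/3) to (rate ≈ log2(3), distortion ≈ 0).
```

Martínez's two-measure version is the same minimization with a separate distortion constraint for each player, as described above.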

But there's a continuum of possible intermediate behaviors: consider the "trembling hand" strategy under which the sender sends the i-th signal in the i-th state, and the receiver chooses the j-th act with probability 1 − ε when i = j, but probability ε/2 when i ≠ j. Then the mutual information between states and acts would be log₂ 3 + (1 − ε) log₂(1 − ε) + ε log₂(ε/2), smoothly interpolating between the perfect-signaling case (ε = 0, rate log₂ 3) and the no-communication-just-guessing case (ε = ⅔, rate 0).
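
A quick numerical check of that interpolation (again a sketch, assuming equiprobable states):

```python
import numpy as np

def trembling_mi(eps):
    """Mutual information (bits) between three equiprobable states and acts when the
    matching act occurs with probability 1 - eps and each wrong act with eps / 2."""
    p_x = np.full(3, 1 / 3)
    channel = np.full((3, 3), eps / 2) + np.eye(3) * (1 - 1.5 * eps)  # rows sum to 1
    p_xy = p_x[:, None] * channel
    p_y = p_xy.sum(axis=0)
    nz = p_xy > 0
    return (p_xy[nz] * np.log2(p_xy[nz] / np.outer(p_x, p_y)[nz])).sum()

for eps in (0.0, 0.1, 0.3, 0.5, 2 / 3):
    print(round(eps, 3), round(trembling_mi(eps), 3))
# Falls smoothly from log2(3) ≈ 1.585 bits at eps = 0 to 0 bits at eps = 2/3.
```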

This introductory case of perfect common interest is pretty boring. Where the rate–distortion framing really shines is in analyzing games of imperfect common interest, where sender and receiver can benefit from communicating at all, but also have a motive to fight about exactly what. To illustrate his account of deception, Skyrms considers a three-state, three-act game with the following payoff matrix, where the rows represent states and the columns represent actions, and the payoffs are given as (sender's payoff, receiver's payoff)—

(Note that this state–act payoff matrix is not a normal-form game matrix in which the rows and columns would represent player strategy choices; the sender's choice of what signal to send is not depicted.)

In this game, the sender would prefer to equivocate between the first and second states, in order to force the receiver into picking the third action, for which the sender achieves his maximum payoff. The receiver would prefer to know which of the first and second states actually obtains, in order to get a payout of 10. But the sender doesn't have the incentive to reveal that, because if he did, he would get a payout of only 2. Instead, if the sender sends the same signal for the first and second states so that the receiver can't tell the difference between them, the receiver does best for herself by picking the third action for a guaranteed payoff of 8, rather than taking the risk of guessing wrong between the first and second actions for an expected payout of ½ · 10 + ½ · 0 = 5.
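
As a small check of that arithmetic (using only the payoffs quoted above for the first two states; the third state isn't needed for this particular comparison):

```python
import numpy as np

# Receiver's payoffs for acts 1-3, as described in the text:
receiver_payoff = np.array([
    [10, 0, 8],   # state 1: act 1 pays 10, act 2 pays 0, act 3 pays 8
    [0, 10, 8],   # state 2: act 1 pays 0, act 2 pays 10, act 3 pays 8
])

# After a pooling signal, the receiver's posterior is 50/50 over states 1 and 2.
posterior = np.array([0.5, 0.5])
expected = posterior @ receiver_payoff
print(expected)                # [5. 5. 8.] -- the "safe" third act beats guessing
print(expected.argmax() + 1)   # act 3
```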

That's one Nash equilibrium, the one that's best for the sender. But the situation that's best for the receiver, where the sender emits a different signal for each state (or conflates the second and third states—the receiver's decisionmaking doesn't care about that distinction), is also Nash: if the sender were already distinguishing the first and second states, then, keeping the receiver's strategy fixed, the sender can't unilaterally do better by starting to equivocate by sending (without loss of generality) the first signal in the second state, because that would mean eating zero payouts in the second state for as long as the receiver continued to "believe" the first signal "meant" the first state.

There's a Pareto frontier of possible compromise encoding/decoding strategies that interpolate between these best-for-sender and best-for-receiver equilibria. For example, the sender (again with trembling hands) could send signals that distinguish the first and second states with probability p, or a signal that conflates them with probability 1 − p, for expected payouts that vary linearly with p between the payouts of the two equilibria. These intermediate strategies are not stable equilibria, however. They also have a lower rate—the "trembles" in the sender's behavior are noise on the channel, meaning less information is being transmitted.

In a world of speech with propositional meaning, deception can only be something speakers (senders) do to listeners (receivers). But propositional meaning is a fragile and advanced technology [LW · GW]. The underlying world of signal processing is much more symmetrical, because it has no way to distinguish between statements and commands: in the joint endeavor of constructing an information channel between states and actions, the sender can manipulate the receiver using his power to show or withhold appropriate signals—but similarly, the receiver can manipulate the sender using her power to perform or withhold appropriate actions.

Imagine that, facing a supply shortage of personal protective equipment in the face of a pandemic, a country's public health agency were to recommend against individuals acquiring filtered face masks [LW · GW]—reasoning that [LW · GW], if the agency did recommend masks, panic-buying would make the shortage worse for doctors who needed the masks more. If you interpret the agency's signals as an attempt to "tell the truth" about how to avoid disease, they would appear "dishonest"—but even saying that requires an ontology of communication in which "lying" is a thing [LW · GW]. If you haven't already been built to [LW · GW] believe that lying is bad, there's nothing to object to: the agency is just doing straightforwardly correct consequentialist optimization of the information channel between states of the world, and actions.

Martínez laments that functional accounts of deception have focused on individual signals, while ignoring that signals only make sense as part of a broader code, which necessarily involves some shared interests between sender and receiver. (If the game were zero-sum, no information transfer could happen at all.) In that light, it could seem unnecessarily antagonistic to pick a particular codeword from a shared communication code and disparagingly [LW · GW] call it "deceptive"—tantamount to the impudent claim that there's some objective sense in which a word can be "wrong." [LW · GW]

I am, ultimately, willing to bite this bullet. Martínez is right to point out that different agents have different interests in communicating, leading them to be strategic about what information to add to or withhold from shared maps [LW · GW], and in particular, where to draw the boundaries [LW · GW] in state-space corresponding to a particular signal. But whether or not it can straightforwardly be called "lying" [LW · GW], we can still strive to notice the difference between maps optimized to reflect decision-relevant aspects of territory, and maps optimized to control other agents' decisions.

4 comments

comment by Wei Dai (Wei_Dai) · 2021-11-27T22:39:16.814Z · LW(p) · GW(p)

Given that "lying" is a thing (a concept that people have, a type of action that people assign negative moral value to and sometimes punish), but not in this recent academic theoretical analysis, it seems that academia is still pretty far from being able to understand lying on a theoretical level. (Presumably a full analysis would involve a multi-player, multi-period game, where "morality" and "punishment" are important features, but such a game is too difficult to solve using current theory.) At least that's the main takeaway I'm getting from this. Did you have some other important message that you wanted to send with this post, that I'm potentially missing?

comment by Zack_M_Davis · 2023-03-19T17:50:24.975Z · LW(p) · GW(p)

I asked GPT-4 to write a "highly critical comment" in response to this post. It replied:

I find this blog post both convoluted and problematic in its attempt to explore the concept of deception as cooperation. The author, Zack M. Davis, delves into the intricacies of information theory and signaling games, yet the essence of the argument is lost in a sea of technical jargon and excessive parenthetical digressions.

The premise that deception can be understood as a cooperative effort between sender and receiver is intriguing, but the author fails to clearly articulate the potential implications of this perspective. Instead, the post becomes a muddled exploration of various concepts in information theory, seemingly only to show off the author's knowledge of the subject matter. The result is an alienating read that detracts from the intended discussion of deception as cooperation.

Furthermore, the arguments made in this post rely heavily on the work of Manolo Martínez, but the presentation of these ideas is haphazard and disjointed. The author attempts to draw connections between signaling games, rate-distortion theory, and the dynamics of deception, but these links are not convincingly established. As a result, readers may struggle to follow the logic and understand the relevance of the various concepts presented.

Additionally, the use of convoluted examples, such as the trembling hand strategy and the hypothetical scenario involving a public health agency, does little to clarify the author's points. In fact, these examples only serve to further confuse readers and distract from the central argument.

In conclusion, the blog post "Comment on 'Deception as Cooperation'" is a poorly executed exploration of an interesting topic. The author's attempt to impress readers with his knowledge of information theory ultimately detracts from the discussion of deception as cooperation. A more straightforward, focused approach would have been much more effective in engaging readers and conveying the intended message.

comment by jbash · 2021-11-27T14:19:32.888Z · LW(p) · GW(p)

If you haven't already been built to believe that lying is bad, there's nothing to object to: the agency is just doing straightforwardly correct consequentialist optimization of the information channel between states of the world, and actions.

Except that by doing it, the agency blows its credibility and loses some of its influence over whatever happens when it sends the next signal.

It's a dumb strategy even in the context of a single pandemic, let alone a world where you will have to deal with other pandemics, with later political negotiations outside of pandemics, and whatever else. It only looks good in an artificially constrained model.

... and that's where the whole concept of "lying" comes in. "Lying" is what you do when you try to send signals that cause others to act in ways that favor your interests over their own, and thereby induce them to invoke their own power of "withholding actions" in future rounds. And it's frowned upon because, in the long-term, open-ended, indefinitely-iterated, unpredictable, non-toy-model "game" of the real world, it tends to reduce both total utility and individual utility in the long run. To the point where it becomes valuable to punish it.

comment by TekhneMakre · 2021-11-27T09:25:15.084Z · LW(p) · GW(p)

In that light, it could seem unnecessarily antagonistic to pick a particular codeword from a shared communication code and disparagingly [LW · GW] call it "deceptive"—tantamount to the impudent claim that there's some objective sense in which a word can be "wrong." [LW · GW]

This seems too strong; we can still reasonably talk about deception in terms of a background of signals in the same code. The actual situation is more like, there's lots of agents. Most of them use this coding in a correspondence-y way (or if "correspondence" assumes too much, just, most of the agents use the coding in a particular way such that a listener who makes a certain stereotyped use of those signals (e.g. what is called "takes them to represent reality") will be systematically helped). Some agents instead use the channel to manipulate actions, which jumps out against this background as causing the stereotyped use to not achieve its usual performance (which is different from, the highly noticeable direct consequence of the signal (e.g., not wearing a mask) was good or bad, or the overall effect was net good or bad). Since the deceptive agents are not easily distinguishable from the non-deceptive agents, the deception somewhat works, rather than you just ignoring them or biting a bullet like "okay sure, they'll deceive me sometimes, but the net value of believing them is still higher than not, no problem!". That's why there's tension; you're so close to having a propositional protocol---it works with most agents, and if you could just do the last step of filtering out the deceivers, it'd have only misinformation, no disinformation---but you can't trivially do that filtering, so the deceivers are parasitic on the non-deceivers' network. And you're forced to either be misled constantly; or else downgrade your confidence in the whole network, throwing away lots of the value of the messages from non-deceivers; or, do the more expensive work of filtering adversaries.