Optimized Propaganda with Bayesian Networks: Comment on "Articulating Lay Theories Through Graphical Models"

zack_m_davis

post by Zack_M_Davis · 2020-06-29T02:45:08.145Z · LW · GW · 10 comments

Derek Powell, Kara Weisman, and Ellen M. Markman's "Articulating Lay Theories Through Graphical Models: A Study of Beliefs Surrounding Vaccination Decisions" (a conference paper from CogSci 2018) represents an exciting advance in marketing research, showing how to use causal graphical models to study why ordinary people have the beliefs they do, and how to intervene to make them be less wrong.

The specific case our authors examine is that of childhood vaccination decisions: some parents don't give their babies the recommended vaccines, because they're afraid that vaccines cause autism. (Not true.) This is pretty bad—not only are those unvaccinated kids more likely to get sick themselves, but declining vaccination rates undermine the population's herd immunity, leading to new outbreaks of highly-contagious diseases like the measles in regions where they were once eradicated.

What's wrong with these parents, huh?! But that doesn't have to just be a rhetorical question—Powell et al. show how we can use statistics to make the rhetorical hypophorical and model specifically what's wrong with these people! Realistically, people aren't going to just have a raw, "atomic" dislike of vaccination for no reason: parents who refuse to vaccinate their children do so because they're (irrationally) afraid of giving their kids autism, and not afraid enough of letting their kids get infectious diseases. Nor are beliefs about vaccine effectiveness or side-effects uncaused, but instead depend on other beliefs.

To unravel the structure of the web of beliefs, our authors got Amazon Mechanical Turk participants to take surveys about vaccination-related beliefs, rating statements like "Natural things are always better than synthetic alternatives" or "Parents should trust a doctor's advice even if it goes against their intuitions" on a 7-point Likert-like scale from "Strongly Agree" to "Strongly Disagree".

Throwing some off-the-shelf Bayes-net structure-learning software at a training set from the survey data, plus some ancillary assumptions (more-general "theory" beliefs like "skepticism of medical authorities" can cause more-specific "claim" beliefs like "vaccines have harmful additives", but not vice versa) produces a range of probabilistic models that can be depicted with graphs where nodes representing the different beliefs are connected by arrows that show which beliefs "cause" others: an arrow from a naturalism node (in this context, denoting a worldview that prefers natural over synthetic things) to a parental expertise node means that people think parents know best because they think that nature is good, not the other way around.
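To make the "theory can cause claim, but not vice versa" assumption concrete, here is a minimal sketch of how such a constraint could be written down as a list of forbidden edges, which structure-learning tools commonly accept in some form (often called a blacklist). This is not the authors' actual setup: the node labels, the tier assignments, and the names `TIER` and `forbidden_edges` are all hypothetical, chosen only for illustration.

```python
from itertools import product

# Hypothetical belief nodes, each tagged as a general "theory" or a specific "claim".
# These labels and tier assignments are illustrative assumptions, not the paper's list.
TIER = {
    "naturalism": "theory",
    "skepticism of medical authorities": "theory",
    "vaccines have harmful additives": "claim",
    "vaccines are effective": "claim",
}


def forbidden_edges(tier):
    """Return every directed edge (src, dst) that runs from a claim to a theory."""
    return [
        (src, dst)
        for src, dst in product(tier, repeat=2)
        if tier[src] == "claim" and tier[dst] == "theory"
    ]


# Edges the structure search would not be allowed to propose.
blacklist = forbidden_edges(TIER)
for src, dst in blacklist:
    print(f"forbidden: {src} -> {dst}")
```

Everything not on the list (theory to theory, theory to claim, claim to claim) stays available to the search, so the constraint narrows the space of graphs without fixing any particular arrow.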

Learning these kinds of models is feasible because not all possible causal relationships are consistent with the data: if $A$ and $B$ are statistically independent of each other, but each is correlated with $C$ (and $A$ and $B$ become conditionally dependent given the value of $C$), it's kind of hard to make sense of this except to posit that $A$ and $B$ are causes with the common effect $C$.
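The collider signature is easy to check in simulation. The sketch below uses made-up Gaussian variables (nothing here comes from the survey data): $A$ and $B$ are drawn independently, $C$ is their sum plus noise, and "conditioning on $C$" is approximated by keeping only samples where $C$ falls in a narrow window around zero. The helper `pearson` and all parameter values are assumptions for illustration.

```python
import random
import statistics


def pearson(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20_000

# A and B are independent causes; C is their common effect plus noise.
a = [rng.gauss(0, 1) for _ in range(n)]
b = [rng.gauss(0, 1) for _ in range(n)]
c = [x + y + rng.gauss(0, 0.5) for x, y in zip(a, b)]

r_ab = pearson(a, b)  # marginal: A and B should look unrelated
r_ac = pearson(a, c)  # each cause is correlated with the effect
r_bc = pearson(b, c)

# Approximate conditioning on C by keeping samples with C close to zero.
window = [(x, y) for x, y, z in zip(a, b, c) if abs(z) < 0.25]
r_ab_given_c = pearson([x for x, _ in window], [y for _, y in window])

print(f"corr(A, B)          = {r_ab:+.3f}")
print(f"corr(A, C)          = {r_ac:+.3f}")
print(f"corr(B, C)          = {r_bc:+.3f}")
print(f"corr(A, B | C near 0) = {r_ab_given_c:+.3f}")
```

With these parameters, the marginal correlation between $A$ and $B$ should come out near zero, while the correlation within the window should be strongly negative: once you know the sum is about zero, learning that $A$ is high tells you $B$ must be low.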

Simpler models with fewer arrows might sacrifice a little bit of predictive accuracy for the benefit of being more intelligible to humans. Powell et al. ended up choosing a model that can predict responses from the test set at r = .825, explaining 68.1% of the variance. Not bad?!—check out the full 14-node graph in Figure 2 on page 4 of the PDF.
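Those two figures look like the same fact stated twice, assuming the variance explained is just the squared correlation between predicted and observed responses:

$$r^2 = 0.825^2 = 0.680625 \approx 68.1\%$$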

Causal graphs are useful as a guide for planning interventions: the graph encodes predictions about what would happen if you changed some of the variables. Our authors point out that since previous work showed that people's beliefs about vaccine dangers were difficult to influence, that suggests trying to intervene on the other parents of the intent-to-vaccinate node in the model: if the hoi polloi won't listen to you when you tell them the costs are minimal (vaccines are safe), instead tell them about the benefits (diseases are really bad and vaccines prevent disease).

To make sure I really understand this, I want to adapt it into a simpler example with made-up numbers where I can do the arithmetic myself. Let me consider a graph with just three nodes—

vaccines are safe → vaccinate against measles ← measles are dangerous

Suppose this represents a structural equation model where an anti-vaxxer-leaning parent-to-be's propensity-to-vaccinate-against-measles $C$ is expressed in terms of belief-in-vaccine-safety $A$ and belief-in-measles-danger $B$ as—

$C = 0.7 \cdot A + 0.3 \cdot B$

And suppose that we're a public health authority trying to decide whether to spend our budget (or what's left of it after recent funding cuts) on a public education initiative that will increase $A$ by 0.1, or one that will increase $B$ by 0.3.

We should choose the program that intervenes on $B$, because $(0.3)(0.3) = 0.09$ is bigger than $(0.7)(0.1) = 0.07$. That's actionable advice that we couldn't have derived without a quantitative model of how the lay audience thinks. Exciting!
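The same arithmetic as a few lines of code, in case it helps to see the decision rule spelled out. This is a sketch of the toy example only: the program names and the helpers `effect_on_c` and `best_program` are made up, and the linear model means the change in $C$ is just the weight times the change in the targeted belief.

```python
# Weights from the toy structural equation C = 0.7 * A + 0.3 * B.
WEIGHTS = {"A": 0.7, "B": 0.3}

# Each hypothetical program targets one belief and shifts it by a fixed amount.
PROGRAMS = {
    "vaccine-safety campaign": ("A", 0.1),
    "measles-danger campaign": ("B", 0.3),
}


def effect_on_c(weights, target, delta):
    """Change in C from shifting one parent belief by delta (linear model)."""
    return weights[target] * delta


def best_program(weights, programs):
    """Name of the program with the largest effect on C."""
    return max(programs, key=lambda name: effect_on_c(weights, *programs[name]))


for name, (target, delta) in PROGRAMS.items():
    print(f"{name}: change in C = {effect_on_c(WEIGHTS, target, delta):.2f}")

choice = best_program(WEIGHTS, PROGRAMS)
print("choose:", choice)
```

Note that nothing in `best_program` looks at what the beliefs are about; it only sees weights and shift sizes.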

At this point, some readers may be wondering why I've described this work as "marketing research" about constructing "optimized propaganda." A couple of those words usually have negative connotations, but educating people about the importance of vaccines is a positive thing. What gives?

The thing is, "Learn the causal graph of why they think that and compute how to intervene on it to make them think something else" is a symmetric weapon—a fully general persuasive technique that doesn't depend on whether the thing you're trying to convince them of is true.

In my simplified example, the choice to intervene on $B$ was based on numerical assumptions that amount to the claim that it's sufficiently easier to change $B$ than it is to change $A$, such that intervening on $B$ is more effective at changing $C$ than intervening on $A$ (even though $C$ depends on $A$ more than it does on $B$). But this methodology is completely indifferent to what $A$, $B$, and $C$ mean. It would have worked just as well, and for the same reasons, if the graph had been:

Coca-Cola isn't unhealthy → drink Coca-Cola ← Coca-Cola tastes great

Suppose that we're advertising executives for the Coca-Cola Company trying to decide how to spend our budget (or what's left of it after recent funding cuts). If consumers won't listen to us when we tell them the costs of drinking Coke are minimal (lying that it isn't unhealthy), we should instead tell them about the benefits (Coke tastes good).

Or, with different assumptions about the parameters, say $C = 0.8 \cdot A + 0.2 \cdot B$, intervening to increase belief in "Coca-Cola isn't unhealthy" would be the right move (because $(0.8)(0.1) = 0.08 > 0.06 = (0.2)(0.3)$). The marketing algorithm that just computes what belief changes will flip the decision node doesn't have any way to notice or care whether those belief changes are in the direction of more or less accuracy.
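To see where the recommendation flips, write the weights as $w_A$ and $w_B$ and assume (as in both toy examples, though nothing forces it) that $w_A + w_B = 1$, with the achievable shifts held at $\Delta A = 0.1$ and $\Delta B = 0.3$. Intervening on $B$ wins exactly when:

$$w_B \, \Delta B > w_A \, \Delta A \iff 0.3\,(1 - w_A) > 0.1\, w_A \iff 0.3 > 0.4\, w_A \iff w_A < 0.75$$

So $w_A = 0.7$ falls on the "push $B$" side of the threshold and $w_A = 0.8$ on the "push $A$" side, and the threshold is computed from weights and shift sizes alone.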

To be clear (and I really shouldn't have to say this), this is not a criticism of Powell–Weisman–Markman's research! The "Learn the causal graph of why they think that" methodology is genuinely really cool! It doesn't have to be deployed as a marketing algorithm: the process of figuring out which belief change would flip some downstream node is the same thing as what we call locating a crux.[1] The difference is just a matter of forwards or backwards direction: whether you first figure out if the measles vaccine or Coca-Cola is safe and then use whatever answer you come up with to guide your decision, or whether you write the bottom line first.

Of course, most people on most issues don't have the time or expertise to do their own research. For the most part, we can only hope that the sources we trust as authorities are doing their best to use their limited bandwidth to keep us genuinely informed, rather than merely computing what signals to emit in order to control our decisions.

If that's not true, we might be in trouble, perhaps increasingly so, if technological developments grant new advantages to the propagation of disinformation over the discernment of truth. In a possible future world where most words are produced by AIs running a "Learn the causal graph of why they think that and intervene on it to make them think something else" algorithm hooked up to a next-generation GPT, even reading plain text from an untrusted source could be dangerous.

[1] Thanks to Anna Salamon for this observation.

10 comments

comment by noggin-scratcher · 2020-06-29T10:08:30.568Z · LW(p) · GW(p)

plus some ancillary assumptions (more-general "theory" beliefs like "skepticism of medical authorities" can cause more-specific "claim" beliefs like "vaccines have harmful additives", but not vice versa)

This jumped out to me because it seems potentially untrue; I would expect there to exist at least some instances where people's belief about the specifics comes prior to, and is what causes, their beliefs about the general theory.

comment by waveman · 2020-06-29T03:52:47.359Z · LW(p) · GW(p)

how to use causal graphical models to study why ordinary people have the beliefs they do, and how to intervene to make them be less wrong.

Note to self - a good reason to listen carefully to people's reasons for their beliefs, even/especially when they are nonsensical. They may have structure that can be exploited.

Replies from: Kaj_Sotala, MrMind

↑ comment by Kaj_Sotala · 2020-06-30T17:04:34.408Z · LW(p) · GW(p)

He was bitter. He was angry. He wouldn’t look at her. And, she could recognize, what he was saying didn’t make sense. She recognized, what he was saying didn’t fit with what he had just said. ... What Crowley was saying didn’t make sense, not on the surface. The individual pieces of what he said were incoherent, which, she knew, meant that there must be some other layer, some deeper layer, where they did make sense.

↑ comment by MrMind · 2020-06-30T08:46:27.676Z · LW(p) · GW(p)

How probable is it that someone knows their internal belief structure? How probable is it that someone who knows their internal belief structure tells you that truthfully instead of using a self-serving lie?

comment by johnswentworth · 2020-06-29T15:22:45.862Z · LW(p) · GW(p)

(and are conditionally independent given the value of $C$ )

I believe this should say "conditionally dependent".

Replies from: Zack_M_Davis

↑ comment by Zack_M_Davis · 2020-06-30T04:28:41.440Z · LW(p) · GW(p)

Thanks, you are right and the thing I actually typed was wrong. (For the graph A → C ← B, the collider C blocks the path between A and B, but conditioning on the collider un-blocks it.) Fixed.

comment by Matt Goldenberg (mr-hire) · 2020-06-30T14:39:56.606Z · LW(p) · GW(p)

Hmm, this is cool!

It seems sort of like a group version of consideration factoring.

Since this is a symmetric weapon, do you think that it's best to spread it globally (because someone using the symmetric weapons to point at truth will outcompete those using them to point at falsehoods), to try to keep the spread to champions of the truth, or something else?

Trying to figure out how people are orienting to the idea of symmetric strategies.

comment by ryan_b · 2020-06-29T17:46:59.850Z · LW(p) · GW(p)

The thing is, "Learn the causal graph of why they think that and compute how to intervene on it to make them think something else" is a symmetric weapon—a fully general persuasive technique that doesn't depend on whether the thing you're trying to convince them of is true.

Is there a technical term for the difference between locally-symmetric and symmetric-under-repetition? It remains the case that the people who use the method to prevent vaccinations will have a lot more casualties on their side, which would normally be telling over time.

Replies from: Pattern

↑ comment by Pattern · 2020-07-02T20:17:41.540Z · LW(p) · GW(p)

Why deploy the method against your side?

comment by Mart_Korz (Korz) · 2020-06-30T17:32:32.797Z · LW(p) · GW(p)

Up-voted for thoroughly putting the idea into Less Wrong context. I enjoyed being reminded of all the related ideas.

A thought: I am a bit surprised that one can distil a single belief network explaining a whole lot of the variance of beliefs across many people. This makes me take the idea more seriously that a large number of people regularly do have very similar beliefs (down to the argumentative structure). Remembering You Have About Five Words, this surprises me, as I would expect a less reliable transmission of beliefs. (It might well be that I am just misunderstanding something.)
