[Linkpost] Vague Verbiage in Forecasting

post by trevor (TrevorWiesinger) · 2024-03-22T18:05:53.902Z · LW · GW · 9 comments

This is a link post for https://goodjudgment.com/vague-verbiage-forecasting/

Contents

  1. Language is open to interpretations. Numbers are not.
  2. Language avoids accountability. Numbers embrace it.
  3. Language can’t provide feedback to demonstrate a track record. Numbers can.
9 comments

“What does a ‘fair chance’ mean?”

It is a question posed to a diverse group of professionals—financial advisers, political analysts, investors, journalists—during one of Good Judgment Inc’s virtual workshops. The participants have joined the session from North America, the EU, and the Middle East. They are about to get intensive hands-on training to become better forecasters. Good Judgment’s Senior Vice President Marc Koehler, a Superforecaster and former diplomat, leads the workshop. He takes the participants back to 1961. The young President John F. Kennedy asks his Joint Chiefs of Staff whether a CIA plan to topple the Castro government in Cuba would be successful. They tell the president the plan has a “fair chance” of success.

The workshop participants are now asked to enter a value between 0 and 100—what do they think is the probability of success of a “fair chance”?

When they compare their numbers, the results are striking. Their answers range from 15% to 75%, with a median value of 60%.

Figure 1. Meanings behind vague verbiage according to a Good Judgment poll. Source: Good Judgment.

It sure would be nice if we could get one of these with the numbers based on actual results, rather than on the subjects' impressions of the numbers. You'd need a lot of data from a wide variety of people, and it would need to cover a pretty diverse variety of events.
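As a rough illustration of what a results-based version might involve, here is a minimal Python sketch. It assumes you already have a log of (phrase, did the event happen) pairs. The function name `observed_frequency_by_phrase` and the sample records are made up for illustration and are not real data from Good Judgment or anyone else.

```python
from collections import defaultdict


def observed_frequency_by_phrase(records):
    """Map each phrase to (fraction of times the event occurred, sample count).

    `records` is an iterable of (phrase, happened) pairs, where `happened`
    is True if the predicted event actually occurred.
    """
    counts = defaultdict(lambda: [0, 0])  # phrase -> [hits, total]
    for phrase, happened in records:
        counts[phrase][0] += int(happened)
        counts[phrase][1] += 1
    return {p: (hits / total, total) for p, (hits, total) in counts.items()}


# Hypothetical records, invented purely to show the shape of the data.
records = [
    ("fair chance", True),
    ("fair chance", False),
    ("fair chance", False),
    ("fair chance", False),
    ("likely", True),
    ("likely", True),
    ("likely", False),
]

frequencies = observed_frequency_by_phrase(records)
for phrase, (rate, n) in frequencies.items():
    print(f"{phrase!r}: occurred {rate:.0%} of the time across {n} statements")
```

With only a handful of statements per phrase the rates would be very noisy, which is why the amount and diversity of data matter so much here.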

The story of the 1961 Bay of Pigs invasion is recounted in Good Judgment co-founder Philip Tetlock’s Superforecasting: The Art and Science of Prediction (co-authored with Dan Gardner). The advisor who wrote the words “fair chance,” the story goes, later said what he had in mind was only a 25% chance of success. But like many of the participants in the Good Judgment workshop some 60 years later, President Kennedy took the phrase to imply a more positive assessment of success. By using vague verbiage instead of precise probabilities, the analysts failed to communicate their true evaluation to the president. The rest is history: The Bay of Pigs plan he approved ended in failure and loss of life.

Vague verbiage is pernicious in multiple ways.

1. Language is open to interpretations. Numbers are not.

According to research published in the Journal of Experimental Psychology, “maybe” ranges from 22% to 89%, meaning radically different things to different people under different circumstances. Survey research by Good Judgment shows the implied ranges for other vague terms, with “distinct possibility” ranging from 21% to 84%. Yet, “distinct possibility” was the phrase used by White House National Security Adviser Jake Sullivan on the eve of the Russian invasion of Ukraine.

Figure 2. How people interpret probabilistic words. Source: Andrew Mauboussin and Michael J. Mauboussin in Harvard Business Review.

Other researchers have found equally dramatic perceptions of probability that people attach to vague terms. In a survey of 1,700 respondents, Andrew Mauboussin and Michael J. Mauboussin found, for instance, that the probability range that most people attribute to an event with a “real possibility” of happening spans about 20% to 80%.

2. Language avoids accountability. Numbers embrace it.

Pundits and media personalities often use such words as “may” and “could” without even attempting to define them because these words give them infinite flexibility to claim credit when something happens (“I told you it could happen”) and to dodge blame when it does not (“I merely said it could happen”).

“I can confidently forecast that the Earth may be attacked by aliens tomorrow,” Tetlock writes. “And if it isn’t? I’m not wrong. Every ‘may’ is accompanied by an asterisk and the words ‘or may not’ are buried in the fine print.”

Those who use numbers, on the other hand, contribute to better decision-making.

“If you give me a precise number,” Koehler explains in the workshop, “I’ll know what you mean, you’ll know what you mean, and then the decision-maker will be able to decide whether or not to proceed with the plan.”

Tetlock agrees. “Vague expectations about indefinite futures are not helpful,” he writes. “Fuzzy thinking can never be proven wrong.”

If we are serious about making informed decisions about the future, we need to stop hiding behind hedge words of dubious value.

3. Language can’t provide feedback to demonstrate a track record. Numbers can.

In some fields, the transition away from vague verbiage is already happening. In sports, coaches use probability to understand the strengths and weaknesses of a particular team or player. In weather forecasting, the standard is to use numbers. We are much better informed by “30% chance of showers” than by “slight chance of showers.” Furthermore, since weather forecasters get ample feedback, they are exceptionally well calibrated: When they say there’s a 30% chance of showers, there will be showers three times out of ten—and no showers the other seven times. They are able to achieve that level of accuracy by using numbers—and we know what they mean by those numbers.
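The calibration idea in the paragraph above can be sketched in a few lines of Python: group forecasts by the stated probability, then compare each group's stated number with how often the event actually happened. This is only a sketch under stated assumptions. The function name `calibration_table`, the rounding to one decimal place, and the sample forecasts are all hypothetical, not taken from the article or from any weather service.

```python
from collections import defaultdict


def calibration_table(forecasts):
    """Return {stated probability: observed frequency}.

    `forecasts` is an iterable of (probability, occurred) pairs. Stated
    probabilities are rounded to one decimal place to form buckets
    (an assumption made for this sketch).
    """
    buckets = defaultdict(list)
    for probability, occurred in forecasts:
        buckets[round(probability, 1)].append(1 if occurred else 0)
    return {p: sum(hits) / len(hits) for p, hits in sorted(buckets.items())}


# Hypothetical forecasts: ten days forecast at 30% (showers on three of them)
# and five days forecast at 80% (showers on four of them).
forecasts = (
    [(0.3, True)] * 3 + [(0.3, False)] * 7
    + [(0.8, True)] * 4 + [(0.8, False)] * 1
)

table = calibration_table(forecasts)
for stated, observed in table.items():
    print(f"stated {stated:.0%} -> observed {observed:.0%}")
```

A well-calibrated forecaster is one whose observed frequencies sit close to the stated probabilities in every bucket. No such table can be built from phrases like "slight chance", because there is no stated number to compare against.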

Another well-calibrated group of forecasters are the Superforecasters at Good Judgment Inc, an international team of highly accurate forecasters selected for their track record among hundreds and hundreds of others. When assessing questions about geopolitics or the economy, the Superforecasters use numeric probabilities that they update regularly, much like weather forecasters do. This involves mental discipline, Koehler says. When forecasters are forced to translate terms like “serious possibility” or “fair chance” into numbers, they have to think carefully about how they are thinking, to question their assumptions, and to seek out arguments that can prove them wrong. And their track record is available for all to see. All this leads to better informed and accurate forecasts that decision-makers can rely on.

9 comments

Comments sorted by top scores.

comment by Dagon · 2024-03-22T19:29:27.117Z · LW(p) · GW(p)

In my experience, it’s also contextual - a fair chance of snow next week is a different distribution than a fair chance of getting a ticket if you park illegally in front of a cop.

I strongly suspect that the ambiguity is intentional (or at least useful) in most cases. In many cases, the lack of precision means that one is willing to make the statement at all, because it can’t be tracked or come back to haunt you. There are plenty of times I’ll give a vague prediction in a sort of “don’t make plans that ignore this possibility” way, but would plead ignorance if someone demanded a number.

Replies from: TrevorWiesinger
comment by trevor (TrevorWiesinger) · 2024-03-22T19:45:12.487Z · LW(p) · GW(p)

I can see that: language evolving plausible deniability over time, due to the immense instinctive focus on the fear of being called out for making a mistake.

comment by CstineSublime · 2024-03-22T23:48:52.836Z · LW(p) · GW(p)

How does one manage the need for expedience and find the point where increasing precision has diminishing returns? As ambiguous as some of these modal adverbs are they are usually precise enough for the statements one might try to make. If I say "It'll likely rain tomorrow, best to take an umbrella" whether I think it's 55% or 98% is not really that important as it has exceeded the threshold I have for "umbrella weather". In other cases though such ambiguity is unacceptable.

As a side note, "Fair" is a particularly ambiguous adjective as it is often[1] employed to mean a uniform probability distribution (i.e. the most equitable), or in accordance with custom or moral imperatives (e.g. "He adjudicated fairly"), a large or advantageous degree or amount (e.g. "Hulkenberg got a fair amount of laps in before the red flag"), something which is pleasing to look at (e.g. if you want to employ pseudo-medieval tropes, make sure to refer to a young woman as a 'fair maiden') and finally, and least relevant, something pale, as in "fair complexion". I'm sure etymologically these are all examples of drift from one original meaning. However someone uses the phrase "fair chance", their usage is likely coloured by at least one of these meanings.

  1. ^

    I'm aware of the irony of using a word like "often" in a discussion about the ambiguity of chance related words. Here I mean each variation on the meaning of "fair" is used in discourse frequently enough to earn entries in respected dictionaries, however: you try concisely putting that in a sentence. 

comment by Nathan Young · 2024-03-22T21:28:39.284Z · LW(p) · GW(p)

I would not put the whole thing in quotes. I find it harder to read.

Replies from: TrevorWiesinger
comment by trevor (TrevorWiesinger) · 2024-03-22T22:42:48.741Z · LW(p) · GW(p)

That's strange, I looked closely but couldn't see how that would cause an issue. Could you describe the issue so I can see what you're getting at? I put a poll up in case there's a clear consensus that this makes it hard to read.

I'm on PC, is this some kind of issue with mobile? I really, really, really don't think people should be using smartphones for browsing Lesswrong.

Replies from: Nathan Young, epirito
comment by Nathan Young · 2024-03-22T23:37:13.240Z · LW(p) · GW(p)

I don't like looking at it. Also it's basically the whole article so it feels unnecessary. Something about it narrowing the text. 

I'm on a laptop. But I think people should be able to use their phones to visit almost any website.

Replies from: TrevorWiesinger
comment by trevor (TrevorWiesinger) · 2024-03-22T23:57:44.384Z · LW(p) · GW(p)

Now that I think about it, I can see it being a preference difference: the bar might be more irksome for some people than others, and some people might prefer to go to the original site to read it whereas others would rather read it on LW if it's short. I'll think about that more in the future.

comment by Epirito (epirito) · 2024-03-22T23:42:07.934Z · LW(p) · GW(p)

Oh, not the client device police!

comment by avancil · 2024-03-23T00:10:23.115Z · LW(p) · GW(p)

Ironic that "Maybe" seems to have one of the narrower ranges of probabilities...