Dishonest Update Reporting

post by Zvi · 2019-05-04T14:10:00.742Z · score: 55 (14 votes) · LW · GW · 22 comments

Related to: Asymmetric Justice, Privacy, Blackmail

Previously (Paul Christiano): Epistemic Incentives and Sluggish Updating

The starting context here is the problem of what Paul calls sluggish updating. Bob is asked to predict the probability of a recession this summer. He said 75% in January, and now believes 50% in February. What to do? Paul sees Bob as thinking roughly this:

If I stick to my guns with 75%, then I still have a 50-50 chance of looking smarter than Alice when a recession occurs. If I waffle and say 50%, then I won’t get any credit even if my initial prediction was good. Of course if I stick with 75% now and only go down to 50% later then I’ll get dinged for making a bad prediction right now—but that’s little worse than what people will think of me immediately if I waffle.

Paul concludes that this is likely:

Bob’s optimal strategy depends on exactly how people are evaluating him. If they care exclusively about evaluating his performance in January then he should always stick with his original guess of 75%. If they care exclusively about evaluating his performance in February then he should go straight to 50%. In the more realistic case where they care about both, his optimal strategy is somewhere in between. He might update to 70% this week.

This results in a pattern of “sluggish” updating in a predictable direction: once I see Bob adjust his probability from 75% down to 70%, I expect that his “real” estimate is lower still. In expectation, his probability is going to keep going down in subsequent months. (Though it’s not a sure thing—the whole point of Bob’s behavior is to hold out hope that his original estimate will turn out to be reasonable and he can save face.)
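To make the predictability concrete, here is a toy sketch of that pattern. It is not Paul's model, only an illustration under assumptions of my own: Bob's private belief stays fixed at 50% (no new evidence arrives), and each month his public report closes a fixed fraction of the remaining gap. The function name and the 20% step are hypothetical, chosen so the first move matches the 75% to 70% example.

```python
def sluggish_reports(previous_report, true_belief, step_fraction, periods):
    """Public reports that close only a fixed fraction of the gap each period.

    Toy assumption: the private belief does not change over the periods.
    """
    reports = []
    report = previous_report
    for _ in range(periods):
        # Move part of the way from the last public number toward the real belief.
        report = report + step_fraction * (true_belief - report)
        reports.append(report)
    return reports


# Bob said 75% in January, privately believes 50%, and closes 20% of the gap monthly.
reports = sluggish_reports(previous_report=0.75, true_belief=0.50,
                           step_fraction=0.2, periods=4)
for month, report in zip(["Feb", "Mar", "Apr", "May"], reports):
    print(f"{month}: {report:.3f}")
```

Every report in this sketch sits above the one that follows it, so an observer who sees the first drop can predict the direction of the next. That predictability is exactly what honest reporting would remove.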

This isn’t ‘sluggish’ updating, of the type we talk about when we discuss the Aumann Agreement Theorem and its claim that rational parties can’t agree to disagree. It’s dishonest update reporting. As Paul says, explicitly.

I think this kind of sluggish updating is quite common—if I see Bob assign 70% probability to something and Alice assign 50% probability, I expect their probabilities to gradually inch towards one another rather than making a big jump. (If Alice and Bob were epistemically rational and honest, their probabilities would immediately take big enough jumps that we wouldn’t be able to predict in advance who will end up with the higher number. Needless to say, this is not what happens!)

Unfortunately, I think that sluggish updating isn’t even the worst case for humans. It’s quite common for Bob to double down with his 75%, only changing his mind at the last defensible moment. This is less easily noticed, but is even more epistemically costly.

When Paul speaks of Bob’s ‘optimal strategy’ he does not include a cost to lying, or a cost to others getting inaccurate information.

This is a world where all one cares about is how one is evaluated, and lying and deceiving others is free as long as you’re not caught. You’ll get exactly what you incentivize.

What that definitely won’t get you is accurate probability estimates, and it costs you a lot more than just those.

The only way to get accurate probability estimates from Bob-who-is-happy-to-strategically-lie is to use a mathematical formula to reward Bob based on his log likelihood score. Or to have Bob bet in a prediction market, or another similar robust method. And then use that as the entirety of how one evaluates Bob. If human judgment is allowed in the process, the value of that will overwhelm any desire on Bob’s part to be precise or properly update.
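As a minimal sketch of why a log score rewards honesty, assume Bob is scored only by the log of the probability he gave to what actually happened, and that he privately believes 50%. The function names below are illustrative, not from any library. Under those assumptions, his expected score is highest when he reports the number he actually believes.

```python
import math


def log_score(reported_probability, event_happened):
    """Log of the probability the forecaster assigned to the realized outcome."""
    p = reported_probability if event_happened else 1.0 - reported_probability
    return math.log(p)


def expected_log_score(reported_probability, true_belief):
    """Expected log score of a report, averaged over Bob's private belief."""
    return (true_belief * log_score(reported_probability, True)
            + (1.0 - true_belief) * log_score(reported_probability, False))


true_belief = 0.50                # what Bob privately thinks
candidates = [0.50, 0.70, 0.75]   # honest, sluggish, and stick-to-guns reports
for report in candidates:
    print(f"report {report:.2f}: expected score "
          f"{expected_log_score(report, true_belief):.3f}")

best_report = max(candidates, key=lambda r: expected_log_score(r, true_belief))
```

This only holds if the formula is the entirety of the evaluation. Add any human judgment on top and the calculation changes.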

Since Bob is almost certainly in a human context where humans are evaluating him based on human judgments, that means all is mostly lost.

As Paul notes, consistency is crucial in how one is evaluated. Even bigger is avoiding mistakes. 

Given the asymmetric justice of punishing mistakes and inconsistency that can be proven and identified, the strategic actor must seek cognitive privacy. The more others know about the path of your beliefs, the easier it will be for them to spot an inconsistency or a mistake. It’s hard enough to give a reasonable answer once, but updating in a way that can never be shown to have made a mistake or been inconsistent? Impossible.

Mistakes and inconsistencies are the bad things one must avoid getting docked points for.

Thus, Bob’s full strategy, in addition to choosing probabilities that sound best and give the best cost/benefit payoffs in human intuitive evaluations of performance, is to avoid making any clear statements of any kind. When he must do so, he will do his best to be able to deny having done so. Bob will seek to destroy the historical record of his predictions and statements, and their path. And also prevent the creation of any common knowledge, at all. Any knowledge of the past situation, or the present outcome, could be shown to not be consistent with what Bob said, or what we believe Bob said, or what we think Bob implied. And so on.

Bob’s optimal strategy is full anti-epistemology. He is opposed to knowledge.

In that context, Paul’s suggested solutions seem highly unlikely to work.

His first suggestion is to exclude information – to judge Bob only by the aggregation of all of Bob’s predictions, and ignore any changes. Not only does this throw away vital information, it also isn’t realistic. Even if it was realistic for some people, others would still punish Bob for updating.

Paul’s second suggestion is to make predictions about others’ belief changes, which he himself notes ‘literally wouldn’t work.’ And that it is ‘a recipe for epistemic catastrophe.’ The whole thing is convoluted and unnatural at best.

Paul’s third and final suggestion is social disapproval of sluggish updating. As he notes, this twists social incentives potentially in good ways but likely in ways that make things worse:

Having noticed that sluggish updating is a thing, it’s tempting to respond by just penalizing people when they seem to update sluggishly. I think that’s a problematic response:

Bob already isn’t excited about updating. He’d prefer to not update at all. He’s upset about having had to give that 75% answer, because now if there’s new information (including others’ opinions) he can’t keep saying ‘probably’ and has to give a new number, again giving others information to use as ammunition against him.

The reason he updated visibly, at all, was that not updating would have been inconsistent or otherwise punished. Punish updates for being too small on top of already looking bad for changing at all, and the chance you get the incentives right here is almost zero. Bob will game the system, one way or another. And now, you won’t know how Bob is doing it. Before, you could know that Bob moving from 75% to 70% meant going to something lower, perhaps 50%. Predictable bad calibration is much easier to fix. Twist things into knots and there’s no way to tell.

Meanwhile, Bob is going to reliably get evaluated as smarter and more capable than Alice, who for reasons of principle is going around reporting her probability estimates accurately. Those observing might even punish Alice further, as someone who does not know how the game is played, and would be a poor ally.

The best we can do, under such circumstances, if we want insight from Bob, is to do our best to make Bob believe we will reward him for updating correctly and reporting that update honestly, then consider Bob’s incentives, biases and instincts, and attempt as best we can to back out what Bob actually believes.

As Paul notes, we can try to combat non-epistemic incentives with equal and opposite other non-epistemic incentives, but going deep on that generally only makes things more complex and rewards more attention to our procedures and how to trick us, giving Bob an even bigger advantage over Alice.

A last-ditch effort would be to give Bob sufficient skin in the game. If Bob directly benefits enough from us having accurate models, Bob might report more accurately. But outside of very small groups, there isn’t enough skin in the game to go around. And that still assumes Bob thinks the way for the group to succeed is to be honest and create accurate maps. Whereas most people like Bob do not think that is how winners behave. Certainly not with vague things that don’t have direct physical consequences, like probability estimates.

What can be done about this?

Unless we care enough, very little. We lost early. We lost on the meta level. We didn’t Play in Hard Mode.

We accepted that Bob was optimizing for how Bob was evaluated, rather than Bob optimizing for accuracy. But we didn’t evaluate Bob on that basis. We didn’t place the virtues of honesty and truth-seeking above the virtue of looking good sufficiently to make Bob’s ‘look good’ procedure evolve into ‘be honest and seek truth.’ We didn’t work to instill epistemic virtues in Bob, or select for Bobs with or seeking those virtues.

We didn’t reform the local culture.

And we didn’t fire Bob the moment we noticed.

Game over.

I once worked for a financial firm that made this priority clear. On the very first day. You need to always be ready to explain and work to improve your reasoning. If we catch you lying, about anything at all, ever, including a probability estimate, that’s it. You’re fired. Period.

It didn’t solve all our problems. More subtle distortionary dynamics remained, and some evolved as reactions to the local virtues, as they always do. For these and other reasons, that I will not be getting into here or in the comments, it ended up not being a good place for me. Those topics are for another day.

But they sure as hell didn’t have to worry about the likes of Bob.

22 comments


comment by Davidmanheim · 2019-05-05T05:33:24.391Z · score: 34 (8 votes) · LW · GW

There is a strategy that is almost mentioned here, but not pursued, that I think is near-optimal - explaining your reasoning as a norm. This is the norm I have experienced in the epistemic community around forecasting. (I am involved in both Good Judgment, where I was an original participant, and have resumed work, and on Metaculus's AI instance. Both are very similar in that regard.)

If such explanation is a norm, or even a possibility, the social credit for updated predictions will normally be apportioned based on the reasoning as much as the accuracy. And while individual Brier scores are useful, forecasters who provide mediocre calibration but excellent public reasoning and evidence which others use are more valuable for an aggregate forecast than excellent forecasters who explain little or nothing.

If Bob wants social credit for his estimate in this type of community, he needs to publicly explain his model - at least in general. (This includes using intuition as an input - there are superforecasters who I update towards based purely on claims that the probability seems too low / high.) Similarly, if Bob wants credit for updating, he needs to explain his updated reasoning - including why he isn't updating based on evidence that prompted Alice's estimate, which would usually have been specified, or updated based on Alice's stated model and her estimate itself. If Bob said 75% initially, but now internally updates to think 50%, it will often be easier to justify a sudden change based on an influential datapoint, rather than a smaller one using an excuse.

comment by Zvi · 2019-05-05T10:45:30.844Z · score: 10 (6 votes) · LW · GW

Right. I kinda implied it was part of the solution but didn't say it explicitly enough, and may edit.

The problem for implementation, of course, is that explaining your reasoning is toxic in worlds with the models we describe. It's the opposite of not taking positions, staying hidden and destroying records. It opens you up to being blamed for any aspect of your reasoning. That's pretty terrible. It's doubly terrible if you're in any sort of double-think equilibrium (see SSC here). Because now, you can't explain your reasoning.

comment by Davidmanheim · 2019-05-06T07:17:53.912Z · score: 3 (2 votes) · LW · GW

Political contexts are poisonous, of course, in this and so many other ways, so politics should be kept as small as possible. In most contexts, however, including political ones, the solution is to give no credit to those who don't explain, or even to assign negative credit for punditry that isn't demonstrably more accurate than the crowd - which leads to a wonderful incentive to shut up unless you can say something more than "I think X will happen."

And in collaborative contexts, people are happy to give credit for mostly correct thinking that assists their own, rather than attack for mistakes. We should stay in those contexts and build them out where possible - positive sum thinking is good, and destroying, or at least ignoring, negative sum contexts is often good as well.

comment by orthonormal · 2019-05-05T23:41:37.310Z · score: 21 (7 votes) · LW · GW

The ideal thing is to judge Bob as if he were making the same prediction every day until he makes a new one, and log-score all of them when the event is revealed. (That is, if Bob says 75% on January 1st and 60% on February 1st, and then on March 1st the event is revealed to have happened, Bob's score equals 31*log(.75) + 28*log(.6).) Then Bob's best strategy is to update his prediction to his actual current estimate as often as possible; past predictions are sunk costs.
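A rough sketch of that scheme, under the assumption that each day is scored by the log of the probability Bob gave to the outcome that actually occurred, and that a forecast stands until it is replaced. The function names are hypothetical. The second half checks the incentive claim for a Bob who privately believes 60% in February: each day he leaves 75% on the record costs him in expectation.

```python
import math


def time_weighted_log_score(forecasts, event_happened):
    """Sum of daily log scores; forecasts is a list of (probability, days_held)."""
    total = 0.0
    for probability, days_held in forecasts:
        # Score the probability assigned to the outcome that actually occurred.
        p_outcome = probability if event_happened else 1.0 - probability
        total += days_held * math.log(p_outcome)
    return total


def expected_daily_score(reported_probability, true_belief):
    """Expected one-day log score of a report, given Bob's private belief."""
    return (true_belief * math.log(reported_probability)
            + (1.0 - true_belief) * math.log(1.0 - reported_probability))


# 75% held for the 31 days of January, 60% for the 28 days of February,
# and the event is revealed on March 1st to have happened.
score = time_weighted_log_score([(0.75, 31), (0.60, 28)], event_happened=True)
print(f"realized score: {score:.3f}")

# In February Bob privately believes 60%. Compare reporting that honestly
# with leaving the stale 75% on the record.
honest_feb = 28 * expected_daily_score(0.60, true_belief=0.60)
stale_feb = 28 * expected_daily_score(0.75, true_belief=0.60)
print(f"expected February score, honest 60%: {honest_feb:.3f}")
print(f"expected February score, stale 75%:  {stale_feb:.3f}")
```

January's 31 days are scored the same either way, which is the sense in which past predictions are sunk costs.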

The real-world version is remembering to dock people's bad predictions more, the longer they persisted in them. But of course this is hard.

538 did do this with their self-evaluation, which is a good way to try and establish a norm in the domain of model-driven reporting.

comment by Zvi · 2019-05-06T12:12:36.917Z · score: 3 (3 votes) · LW · GW

Yes, that seems right, if it can be used as the sole criterion, and be properly normalized for the time frames and questions involved. There are big second-level Goodhart traps lying in wait if people care about this metric.

comment by Vladimir_Nesov · 2019-05-04T16:04:38.882Z · score: 19 (4 votes) · LW · GW

In a prediction market your belief is not shared, but contributes to the consensus (market price of a futures). Many traders become agnostic about a question (close their position) before the underlying fact of the matter is revealed (delivery), perhaps shortly after stating the direction in which they expect the consensus to move (opening the position), to contribute (profit from) their rare knowledge while it remains rare. Requiring traders to own up to a prediction (hold to delivery) interferes with efficient communication of rare information into common knowledge (market price).

So consider declaring that the consensus is shifting in a particular direction, without explaining your reasoning, and then shortly after bow out of the discussion (taking note of how the consensus shifted in the interim). This seems very strange when compared to common norms, but I think something in this direction could work.

comment by Zvi · 2019-05-04T23:56:44.463Z · score: 7 (4 votes) · LW · GW

A key active ingredient here seems to be that exact ability to disguise your true position. Even if someone knows your trades, they don't know why you did them. You could have a different fair value (probability estimate), you could be hedging risk, you could expect the price to move in a direction without thinking that move is going to be accurate, and so on.

By not requiring the trader to be pinned down to anything (except profit and loss) we potentially extract more information.

And all of that applies to non-prediction markets, too.

comment by Davidmanheim · 2019-05-06T07:23:18.990Z · score: 5 (3 votes) · LW · GW

Note that most markets don't have any transparency about who buys or sells, and external factors are often more plausible reasons than a naive outsider expects. A drop in the share price of a retailer could be reflecting lower confidence in their future earnings, or result from a margin call on a fund that made a big bet on the retailer and needed to unwind it, or even be because a firm that was optimistic about the retailer decided to double down, and move a large call options position out 6 months, so that their counterparty sold to hedge their delta - there is no way to tell the difference. (Which is why almost all market punditry is not only dishonest, but laughable once you've been on the inside.)

comment by Dagon · 2019-05-04T18:13:03.398Z · score: 3 (2 votes) · LW · GW

In a (deep enough, which is an unsolved problem) prediction market, there is a clear mechanism to be rewarded for indicating that your private beliefs differ from the consensus. When they no longer differ, it doesn't matter whether you close out your position or not.

In fact, you're right that you're really publishing a difference between current consensus and your private beliefs about future consensus, which may differ from truth, but that difference is opportunity for future participants who will get paid when the prediction resolves.

comment by Vladimir_Nesov · 2019-05-04T18:56:27.976Z · score: 7 (4 votes) · LW · GW

Holding to delivery is already familiar for informal communication. But short-term speculation is a different mode of contributing rare knowledge into consensus that doesn't seem to exist for discussions of beliefs that are not on prediction markets, and breaks many assumptions about how communication should proceed. In particular it puts into question the virtues of owning up to your predictions and of regularly publishing updated beliefs.

comment by Dagon · 2019-05-04T19:15:53.236Z · score: 3 (2 votes) · LW · GW

I'm confused whether we're talking about informal communication, where holding to delivery is the norm because nobody actually cares about the results, or about endorsed public predictions that we want to make decisions based on. I don't think the problems nor their solutions are the same for these different kinds of predictions.

comment by Vladimir_Nesov · 2019-05-04T19:31:17.849Z · score: 3 (2 votes) · LW · GW

By "informal" I meant that the belief is not on a prediction market, so you can influence consensus only by talking, without carefully keeping track of transactions. (I disagree with it being appropriate not to care about results in informal communication, so it's not a distinction I was making.)

comment by Dagon · 2019-05-04T22:40:43.017Z · score: 2 (1 votes) · LW · GW

exploring here, not sure where it'll go.

What is the value, to whom, of the predictions being correct? The interesting cases are ones where there is something performing the function of a prediction market in feeding back some value for correct and surprising predictions. All else is "informal" and mostly about signaling rather than truth.

comment by Vladimir_Nesov · 2019-05-04T23:50:26.603Z · score: 7 (2 votes) · LW · GW

The value of caring about informal reasoning is in training the same skills that apply for knowably important questions, and in seemingly unimportant details adding up in ways you couldn't plan for. Existence of a credible consensus lets you use a belief without understanding its origin (i.e. without becoming a world-class expert on it), so doesn't interact with those skills.

When correct disagreement of your own beliefs with consensus is useful at scale, it eventually shifts the consensus, or else you have a source of infinite value. So almost any method of deriving significant value from private predictions being better than consensus is a method of contributing knowledge to consensus.

(Not sure what you were pointing at, mostly guessing the topic.)

comment by Dagon · 2019-05-05T15:59:10.869Z · score: 3 (2 votes) · LW · GW

For oneself, caring about reasoning and correct predictions is well worthwhile. And it requires some acknowledgement that your beliefs are private, and that they are separate from your public claims. Forgetting that this applies to others as well as yourself seems a bit strange.

I may be a bit too far on the cynicism scale, but I start with the assumption that informal predictions are both oversimplified to fit the claimant's model of their audience, and adjusted in direction (from the true belief) to have a bigger impact on their audience.

That is, I think most public predictions are of the form "you should have a higher credence in X than you seem to", but for greater impact STATED as "you should believe X".

comment by romeostevensit · 2019-05-04T18:55:41.009Z · score: 14 (9 votes) · LW · GW

I don't like reifying this as dishonesty when the outside view on taking ideas seriously says that it's pretty reasonable to update slowly as you gather more kinds of evidence than just logical argument.

comment by Zvi · 2019-05-04T23:48:21.345Z · score: 4 (6 votes) · LW · GW

I think it's definitely not dishonest to actually update too slowly versus what would be ideal. As you say, almost everyone does it.

What's dishonest is for Bob to think 50% and say 70% (or 75%) because it will look better.

comment by romeostevensit · 2019-05-05T00:51:45.523Z · score: 3 (2 votes) · LW · GW

agree, in this situation he should state that he feels incentivized to state 70% and that that's a problem.

comment by Benquo · 2019-05-05T05:58:53.056Z · score: 0 (0 votes) · LW · GW

.

comment by Dagon · 2019-05-04T17:26:38.613Z · score: 6 (3 votes) · LW · GW

This is an important line of thought, but I find myself very distracted by use of the word "updating" when you actually mean "publishing". In my mind, "updating a belief" strongly implies an internal state change, which may or may not be externally visible. It's a completely separate question of whether publishing or communicating a partial set of beliefs (because we can't yet publish our entire belief state) is helpful or harmful to one's goals.

All human interaction is a mix of cooperative and adversarial motives. Looking for mechanisms to increase cooperation and limit competitive motives is excellent, but we need to be clear that this isn't about updating beliefs, it's about broader human goal alignment.

comment by Zvi · 2019-05-04T23:50:11.078Z · score: 6 (3 votes) · LW · GW

Agreed. Changed to dishonest update reporting.

comment by Dagon · 2019-05-05T17:09:11.306Z · score: 2 (3 votes) · LW · GW

My experience has been that everyone is Bob, at least some of the time in some contexts, and that leads to many situations being comprised mostly of Bobs. Bob is simply correct - he has a more accurate map than you seem to - on the topic of whether sharing his true predictions will improve or harm his future experiences.

I don't even know how to formulate the problem statement that describes this - it feels like "humans are barely-evolved apes and consistently optimize for local/individual benefit at the expense of cooperative potential outcomes" is a bit too big to take on, but any narrower definition is missing an important root cause.

Designing mechanisms to align individual reward with the designers' goals is one way to approach this, and prediction markets are the best suggestion I've heard on the topic. And they fall prey to the same underlying problem: most people aren't seeking to improve group consensus of truth, so don't really want to participate in activities where they don't have some comparative advantage.