Can We Place Trust in Post-AGI Forecasting Evaluations?


post by ozziegooen · 2019-02-17T19:20:41.446Z · LW · GW · 16 comments


TLDR
Think "A prediction market, where most questions are evaluated shortly after an AGI is developed." We could probably answer hard questions more easily post-AGI, so delaying them would have significant benefits.

Motivation

Imagine that select pre-AGI legal contracts stay valid post-AGI. Then a lot of things are possible.

There are a few different scenarios for economic and political consistency post-AGI, but I believe there is at least a legitimate chance (>20%) that legal contracts will continue to exist for a significant time (>2 human-experiential years).

If these contracts stay valid, then we could have contracts set up to ensure that prediction evaluations [LW · GW] and prizes happen.

This could be quite interesting because post-AGI evaluations could be a whole lot better than pre-AGI evaluations. They should be less expensive and possibly far more accurate.

One of the primary expenses now with forecasting setups is the evaluation specification and execution. If these could be pushed off while keeping relevance, that could be really useful.

Idea

What this could look like is something like a Prediction Tournament or Prediction Market where many of the questions will be evaluated post-AGI. Perhaps there would be a condition that the questions would only be evaluated if AGI happens within 30 years, and in those cases, the evaluations would happen once a specific threshold is met.

If we expect a post-AGI world to allow for incredible reasoning and simulation abilities, we could assume that it could make incredibly impressive evaluations.

Some example questions:

To what degree is each currently-known philosophical system accurate?
What was the expected value of Effective Altruist activity Y, based on the information available at the time to a specific set of humans?
How much value has each Academic field created, according to a specific philosophical system?
What would the GDP of the U.S. have been in 2030, conditional on them doing policy X in 2022?
What were the chances of AGI going well, based on the information available at the time to a specific set of humans?

Downsides

My guess is that many people would find this quite counterintuitive. Forecasting systems are already weird enough.

There's a lot of uncertainty around the value systems and epistemic states of authoritative agencies, post-AGI. Perhaps they would be so incredibly different from us now that any answers they could give us would seem arcane and useless. Similar to how it may become dangerous to extrapolate one's volition "too far", it may also be dangerous to be "too smart" when making evaluations defined by less intelligent beings.

That said, the really important thing isn't how the evaluations will actually happen, but rather what forecasters will think of it. Whatever evaluation system motivates forecasters to be as accurate and useful as possible (while minimizing cost) is the one to strive for.

My guess is that it's worth trying out, at least in a minor capacity. There should, of course, be related forecasts for things like, "In 2025, will it be obvious that post-AGI forecasts are a terrible idea?"

Questions for Others

This all leaves a lot of questions open. Here are a few specific ones that come to mind:

What kinds of legal structures could be most useful for post-AGI evaluations?
What, in general, would people think of post-AGI evaluations? Could any prediction community take them seriously and use them for additional accuracy?
What kinds of questions would people want to see forecasted, if we could have post-AGI evaluations?
What other factors would make this a good or bad thing to try out?

16 comments

Comments sorted by top scores.

comment by Donald Hobson (donald-hobson) · 2019-02-17T21:01:29.480Z · LW(p) · GW(p)

Because your belief about how well AGI is likely to go affects both the likelihood of a bet being evaluated and the chance of winning, bets about AGI are likely to give dubious results. I also have substantial uncertainty about the value of money in a post-singularity world. Most obviously, if everyone gets turned into paperclips, no one has any use for money. If we get a friendly singleton super-intelligence, everyone is living in paradise, whether or not they had money before. If we get an economic singularity, where libertarian ASI(s) try to make money without cheating, then money could be valuable. I'm not sure how we would get that, as an understanding of the control problem good enough to not wipe out humans and fill the universe with bank notes should be enough to make something closer to friendly.

Even if we do get some kind of ascendant economy, given the amount of resources in the solar system (let alone the wider universe), it's quite possible that pocket change would be enough to live in luxury for aeons.

Given how unclear it is whether the bet will get paid, and how much the cash would be worth if it were, I doubt that the betting will produce good info. If everyone thinks that money is more likely than not to be useless to them after ASI, then almost no one will be prepared to lock their capital up until then in a bet.

Replies from: ozziegooen

↑ comment by ozziegooen · 2019-02-17T22:03:32.564Z · LW(p) · GW(p)

Thanks for the considered comment.

I think the main crux here is how valuable money will be post-AGI. My impression is that it will still be quite valuable. Unless there is a substantial redistribution effort (which would have other issues), I imagine economic growth will make the rich more money than the poor. I'd also think that even though it would be "paradise", many people would care about how many resources they have. Having one-millionth of all human resources may effectively give you access to one-millionth of everything produced by future AGIs.

Scenarios where AGI is friendly (not killing us) could be significantly more important to humans than ones in which it is not. Even if it has a 1% chance of being friendly, in that scenario, it's possible we could be alive for a really long time.

Last, it may not have to be the case that everyone thinks money will be valuable post-AGI, but that some people with money think so. In those cases, they could exchange with others pre-AGI to take that specific risk.

So I generally agree there's a lot of uncertainty, but think it's less than you do. That said, this is, of course, something to apply predictions to.

comment by ChristianKl · 2020-11-23T12:23:24.977Z · LW(p) · GW(p)

That said, the really important thing isn't how the evaluations will actually happen, but rather what forecasters will think of it.

No, empirical feedback is important for getting good predictions.

comment by rossry · 2019-02-18T13:15:17.761Z · LW(p) · GW(p)

Assuming that your AI timelines are well-approximated by "likely more than three years", Zvi's post on prediction market desiderata [LW · GW] suggests that post-AGI evaluation is pretty dead-on-arrival for creating liquid prediction markets. Even laying aside the conditional-on-AGI dimension, the failures of "quick resolution" (years) and "probable resolution" (~20%, by your numbers) are crippling for the prospect of professionals or experts investing serious resources in making profitable predictions.

Replies from: Radamantis, ozziegooen

↑ comment by NunoSempere (Radamantis) · 2020-11-23T09:48:37.466Z · LW(p) · GW(p)

the failures of "quick resolution" (years)

Note that you can solve this by chaining markets together, i.e., having a market every year asking what the next market will predict, where the last market is 1y before AGI. This hasn't been tried much in reality, though.

Replies from: rossry

↑ comment by rossry · 2020-11-26T06:27:45.719Z · LW(p) · GW(p)

Clever, but it hasn't been tried for a good reason. If, say, the next five years of markets are all untethered from reality (but consistent with each other), there's no way to get paid for bringing them into line with expected reality except by putting on the trades and holding them for five years. (The natural one-year trade will just resolve to the unfair market price of the next-year-market market and there's nothing to do about it except wait for longer.)

The chained markets end up being no more fair than if they all settled to the final expiry directly.
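A minimal sketch of the objection above, using made-up prices. Each yearly market settles to the price of the next market, and only the last one settles to the expected final value. The function name `one_year_profits` and every number here are illustrative assumptions, not data from any real market.

```python
def one_year_profits(prices):
    """Profit per share from buying each yearly market and holding it to settlement.

    prices[t] is the purchase price of market t. Market t settles to prices[t + 1];
    the final entry stands in for the expected value at final resolution.
    """
    return [round(nxt - cur, 10) for cur, nxt in zip(prices, prices[1:])]


# Hypothetical case: the fair value is 0.60, but five chained markets all trade at 0.30.
untethered = [0.30, 0.30, 0.30, 0.30, 0.30, 0.60]
profits = one_year_profits(untethered)

print(profits)       # profit of each one-year trade, in order
print(sum(profits))  # total profit from rolling the position every year
```

Under these assumed prices, the first four one-year trades earn nothing, and the entire gain arrives only when the last market settles. That is the same payoff profile as holding a single contract to final expiry.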

Replies from: Radamantis

↑ comment by NunoSempere (Radamantis) · 2020-11-26T09:28:20.915Z · LW(p) · GW(p)

Yes, I can imagine cases where this setup wouldn't be enough.

Though note that you could still buy the shares the last year. Also, if the market corrects by 10% each year (i.e., a value of a share of yes increases from 10 to 20% to 30% to 40%, etc. each year), it might still be worth it (note that the market would resolve each year to the value of a share, not to 0 or 100).

Also note that the current way in which prediction markets are structured is, as you point out, dumb: you bet 5 depreciating dollars which then go into escrow, rather than $5 worth of, say, S&P 500 shares, which increase in value. But this could change.

↑ comment by ozziegooen · 2019-02-18T13:55:27.257Z · LW(p) · GW(p)

I'd agree this would work poorly in traditional Prediction Markets. Not so sure about Prediction Tournaments, or other Prediction Market systems that could exist. Others could be heavily subsidized, and the money on hold could be invested in more standard asset classes.

*(Note: I said >20%, not exactly 20%)

Replies from: rossry

↑ comment by rossry · 2019-02-18T14:39:01.178Z · LW(p) · GW(p)

I understand Zvi's points as being relatively universal to systems where you want to use rewards to incentivize participants to work hard to get good answers.

No matter how the payouts work, a p% chance that your questions don't resolve is (to first order) equivalent to a p% tax on investment in making better predictions, and a years-long tie-up kills iterative growth and selection/amplification cycles, as well as limiting the return-on-investment-in-general-prediction-skill to a one-shot game. I don't think these issues go away if you reward predictions differently, since they're general features of the relation between the up-front investment in making better predictions and the to-come potential reward for doing so well.

(A counterpoint I'll entertain is Zvi's caveat to "quick resolution" -- which also caveats "probable resolution" -- that sufficient liquidity can substitute for resolution. But bootstrapping that liquidity itself seems like a Hard Problem, so I'd need to further be convinced that it's tractable here.)
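The "p% tax" claim above can be sketched as a first-order expected-value calculation. This is a rough illustration only: the function name, the reward and cost figures, and the 80% non-resolution probability are all assumptions chosen for the example, and risk aversion and discounting are ignored.

```python
def expected_net_return(reward, cost, p_no_resolution):
    """Expected net payoff of forecasting effort when the question may never resolve.

    The cost is paid up front with certainty; the reward is only collected
    in the fraction of worlds where the question actually resolves.
    """
    return (1.0 - p_no_resolution) * reward - cost


# Hypothetical numbers: effort costing 30 improves the expected reward by 100.
always_resolves = expected_net_return(reward=100, cost=30, p_no_resolution=0.0)
rarely_resolves = expected_net_return(reward=100, cost=30, p_no_resolution=0.8)

print(always_resolves)
print(rarely_resolves)
```

With these assumed figures, the same effort that is clearly worthwhile when resolution is certain becomes a losing proposition when the question resolves only one time in five, exactly as if the reward had been taxed at 80%.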

Replies from: ozziegooen

↑ comment by ozziegooen · 2019-02-18T15:20:00.961Z · LW(p) · GW(p)

If the reason your questions won't resolve is that you are dead or that none of your money at all will be useful, I think things are a bit different.

That said, one major ask is that the forecasters believe the AGI will happen in between, which seems to me like an even bigger issue :)

I'd estimate there's a 2% chance of this being considered "useful" in 10 years, and in those cases would estimate it to be worth $10k to $20 million of value (90% CI). Would you predict <0.1%?

Replies from: rossry, rossry

↑ comment by rossry · 2019-02-19T02:11:42.887Z · LW(p) · GW(p)

I'm still thinking about what quantitative estimates I'd stand behind. I think I'd believe that a prize-based competitive prediction system with all eval deferred until and conditioned on AGI is <4% to add more than $1mln of value to [just pay some smart participants for their best-efforts opinions].

(If I thought harder about corner-cases, I think I could come up with a stronger statement.)

↑ comment by rossry · 2019-02-18T23:39:13.713Z · LW(p) · GW(p)

If the reason your questions won't resolve is that you are dead or that none of your money at all will be useful, I think things are a bit different.

I'm confused; to restate the above, I think that a p% chance that your predictions don't matter (for any reason: game rained out, you're dead, your money isn't useful) is (to first order) equivalent to a p% tax on investment in making better predictions. What do you think is different?

one major ask is that the forecasters believe the AGI will happen in between, which seems to me like an even bigger issue

Sure, that's an issue, but I think that requiring participants to all assume short AGI timelines is tractable in a way that the delayed/improbable resolution issues are not.

I can imagine that a market without resolution issues, one that assumes all participants believe in short AGI timelines, could support 12 semi-professional traders subsidized by interested stakeholders. I don't believe that a market with resolution issues as above can elicit serious investment in getting its answers right from half that many. (I recognize that I'm eliding my definition of "serious investment" here.)

Replies from: ozziegooen

↑ comment by ozziegooen · 2019-02-19T00:04:00.139Z · LW(p) · GW(p)

For the first question, I'm happy we identified this as an issue. I think it is quite different. If you think there's a good chance you will die soon, then your marginal money will likely not be that valuable to you. It's a lot more valuable in the case that you survive.

For example, say you found out tomorrow that there's a 50% chance everyone will die in one week. (Gosh, this is a downer example.) You also get to place an investment for $50 that will pay out $70 in two weeks. Is the expected value of the bet really equivalent to (70/2)-50 = -$15? If you don't expect to spend all of your money in one week, I think it's still a good deal.
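The two ways of valuing this bet can be written out as a small sketch. The function names are hypothetical, and the second function rests on one stated assumption: the stake would not have been spent before the payout date, so it is worth nothing to you in the worlds where everyone dies.

```python
def naive_ev(stake, payout, p_survive):
    """Treats the stake as a certain loss and the payout as received only on survival."""
    return p_survive * payout - stake


def survival_weighted_ev(stake, payout, p_survive):
    """Counts both the stake and the payout only in worlds where you survive.

    Assumption: money you would not have spent before the payout date
    has no value to you in worlds where you die first.
    """
    return p_survive * (payout - stake)


naive = naive_ev(stake=50, payout=70, p_survive=0.5)
weighted = survival_weighted_ev(stake=50, payout=70, p_survive=0.5)

print(naive)     # -15.0
print(weighted)  # 10.0
```

The gap between the two results comes entirely from whether the $50 is counted as a loss in the worlds where it could never have been used.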

I'd note that Superforecasters have performed better than Prediction Markets, in what I believe are relatively small groups (<20 people). While I think that Prediction Markets could theoretically work, I'm much more confident in systems like those of Superforecasters, where they wouldn't have to make explicit bets. That said, you could argue that their time is the cost, so the percentage chance still matters. (Of course, the alternative, of giving them money to enjoy for 5-15 years before 50% death, also seems pretty bad)

comment by Matt Goldenberg (mr-hire) · 2019-02-18T12:47:13.268Z · LW(p) · GW(p)

I think I'm missing a key inferential step here.

I'm having trouble seeing the benefit of something like this over simply a regular prediction market/poll with long time horizons. Any existing prediction market/poll will by definition become a post-AGI prediction market/poll once AGI is developed. This of course won't be able to ask questions dependent on AGI being developed (without explicitly stating those in the question), but many of the questions in your examples don't seem to be those sorts of dependent questions.

I'm also having trouble seeing how you would resolve some of the questions you asked in a traditional prediction market/poll system. At that point it seems more like just asking an AGI what its probabilities are on specific things, without being able to measure its accuracy. It seems like having a list of questions that it would be useful to ask an AGI is a worthwhile goal in itself, but it seems like you have something else in mind that I'm not quite getting.

Replies from: mr-hire

↑ comment by Matt Goldenberg (mr-hire) · 2019-02-18T13:03:41.581Z · LW(p) · GW(p)

So after rereading, it seems like what you're saying is: have the AGI do the resolutions? That means people would be predicting what an AGI's probabilities will be on hard questions (assuming the AGI isn't omniscient, it will still only be able to give probabilities on these items, not certainties). This makes a bit more sense in that instead of a resolution date it gives a resolution event. However, you lose the ability to weight people's answers by their accuracy, since nothing ever gets resolved till the AGI comes, and it seems to fall prey to the "predicting what someone smarter than me would do" problem.

Replies from: ozziegooen

↑ comment by ozziegooen · 2019-02-18T13:53:22.591Z · LW(p) · GW(p)

I'm saying that the AGI would be helpful to do the resolutions; any world post-AGI could be significantly better at answering such questions. I'm not sure if it's a useful distinction though between "The AGI evaluates the questions" and "An evaluation group uses the AGI to evaluate the questions."

You're right it has the issue of "predicting what someone smarter than me would do." Do you know of much other literature on that one issue? I'm not sure how much of an issue to expect it to be.
