A Richly Interactive AGI Alignment Chart

post by lisperati · 2022-09-02T00:44:20.646Z · LW · GW · 6 comments

Hi, most of you have probably seen the AGI Alignment charts generated by Rob Bensinger and Michaël Trazzi. I thought those charts could help people connect with others on Twitter who share similar (or very different) views on the two central questions on this topic ("AGI when?" and "AGI safe?"). For that to work, however, people would need to be able to add themselves to the chart, and there would need to be richer tools for filtering and browsing the avatars on it.

Here is my own chart that adds all these features: https://agialignment.com

Please let me know if you have any difficulty adding yourself to the chart, or updating or deleting your info if I already put you on there (based on your existing public predictions, or on my own crude guesstimates if I thought you had said enough on the subject for this to be worth attempting).

I know the security-minded folks among you are going to be suspicious of the "log in with twitter" feature of my site, but I think that upon reflection you would agree this is a rare, appropriate use case for it. Note that I only ask for minimal rights and then do my best to guide users to immediately and fully disconnect my app from Twitter again. All the code is available on twitter and is open source.

FYI, if you are in an AI alignment org and want to collaborate on improving this chart and/or take stewardship of it, get in touch.

Here is the announcement thread for this interactive chart on Twitter, in case useful info turns up in that discussion later: https://twitter.com/lisperati/status/1565496833677201409

6 comments

Comments sorted by top scores.

comment by Shiroe · 2022-09-02T03:41:53.076Z · LW(p) · GW(p)

I'm always amused whenever p(doom)s like "3.5%" are categorized as low risk.

Replies from: lisperati
comment by lisperati · 2022-09-03T00:54:33.385Z · LW(p) · GW(p)

I agree, but a chart like this needs to make some compromises on design for the sake of clarity.

They are "low risk" relative to the risk estimates most other people are giving.

comment by Thomas Kwa (thomas-kwa) · 2022-09-02T04:15:56.842Z · LW(p) · GW(p)

Surely Paul Christiano has shorter timelines than 2048, and Demis Hassabis has a credence in x-risk lower than 45%?

Replies from: lisperati
comment by lisperati · 2022-09-03T00:53:47.715Z · LW(p) · GW(p)

Yeah, looking into this more I agree on both points. I don't see good data on those axes for either of those two.

You want to take a stab at a guesstimate for what new values would be better, so I can update the chart?

Replies from: thomas-kwa
comment by Thomas Kwa (thomas-kwa) · 2022-09-03T01:47:20.036Z · LW(p) · GW(p)

Ajeya's median of 2040 for Paul? No idea for Demis. It might be better to not include people you don't have data for, because including your guesses could be misleading. Or at least indicate they're guesses somehow...

Replies from: lisperati
comment by lisperati · 2022-09-04T17:24:11.467Z · LW(p) · GW(p)

I'll adjust Paul. I agree it would be better if there were more clarity on guesses. If there is enough interest in the site, I will push an update in a couple of months and add indicators showing which values are rough guesses.

Replies from: alex-k-chen