Why is increasing public awareness of AI safety not a priority?

post by FinalFormal2 · 2022-08-10T01:28:44.068Z · LW · GW · No comments

This is a question post.

Has anyone done anything in this domain? Is there a public face for AI safety that we can promote? I wanted to pester my favorite streamer about looking into AI safety, but I don't know who I would refer him to.

It seems so obvious to me that this should be a priority, especially for anyone who cannot engage in research themselves.

It seems like this might be a result of an aversion to bad press, but the truth is that bad press would be significantly better than what we have now. As far as I can see we have no press.

This is disappointing.

Sometimes I seriously consider the possibility that you guys are all LARPing.

Answers

answer by Joe_Collman · 2022-08-10T02:56:06.331Z · LW(p) · GW(p)

There are people looking at this.

The obvious answer to why it hasn't visibly been a high priority is:

  1. It's not clear specifically how it helps.
  2. Until we're clear on a path to impact, it seems more valuable to figure out in more depth what to aim for, than to go and do something ill-thought-through quickly.
  3. Increasing awareness is not reversible, and there are obvious failure modes (e.g. via politicisation).
  4. Public pressure is a blunt tool, and it's not clear how it could best be leveraged.
    1. Supposing we actually need to hit a precise target, it's highly unlikely that public pressure is going to be pushing us towards that target (even if we knew the direction we wanted to push, which we currently do not).
    2. Even for something simple such as "slow AGI development down", the devil is in the detail: slow what down, where, how?

That said, it's nonetheless highly unlikely that the status-quo/default situation is ideal - so it does seem probable that some form of broader communication is desirable.

But it's usually much easier to make things worse than to make them better - so I'd want most effort to go into figuring out the mechanics of a potentially useful plan. Most interventions don't achieve what was intended - even the plausibly helpful ones (for instance, one might try recruiting John Carmack to work on AI safety [this strikes me as a good idea, hindsight notwithstanding], only to get him interested enough that he starts up an AGI company a few years later).

I understand that it must be frustrating for anyone who takes this very seriously and believes they cannot engage in research themselves (though I'd suggest asking how they can help). However, the goal needs to be to actually improve the situation, rather than to do something that feels good. The latter impulse is much simpler to satisfy than the former.

To find a communication strategy that actually improves the situation, we need careful analysis.

With that in mind, if you want to start useful discussion on this, I'd suggest outlining:

  • Some plan (in broad terms).
  • The mechanisms by which you'd expect it to help.
  • Ways in which it could fail.
  • Reasons to think it won't backfire relative to the status-quo.
    • Reasons to think it won't render superior future plans impossible/impractical (relative to the status-quo).

I don't think this is easy (I have no good plan). I do think it's required.

comment by FinalFormal2 · 2022-08-11T02:49:05.232Z · LW(p) · GW(p)
  1. Increasing awareness increases resources by virtue of sheer volume. The more people who hear about AI safety, the more likely it is that someone resourceful and amenable hears about it.

  2. This is a good sentiment, but 'resource gathering' is an instrumentally convergent strategy. No matter what researchers end up deciding we should do, it'll probably be best done with money and status on our side.

  3. Politicisation is not a failure mode, it's an optimistic outcome. Politicized issues get money. Politicized issues get studied. Another failure mode might be that we increase interest in AI in general and thereby speed up the development of destructive AI, but there's already a massive profit motive pushing in that direction, so I don't know how much we could really add to it. Most other 'failure modes' which involve 'bad publicity' are infinitely preferable to the current state of affairs, given you have enough dignity to be shameless.

  4. 'Public pressure' isn't really a thing as far as I can tell. The public presses back. Money talks. Wheels turn. Equilibria equilibrate or something. I'm only talking about awareness.

My current plan relies on identifying a good representative for AI safety with enough clout to be taken seriously, contacting them, and then trying to get them into public discussions with rationalist-adjacent e-celebs.

I'd expect this to increase awareness of AI safety and make people who see this content more amenable to advocacy for AI safety in the future. I'd expect a minority of people who watch this content to become very interested in AI safety and try to learn more.

Assuming that I find a good advocate and succeed in getting them into a discussion with a minor celebrity, this could go wrong in the following ways:

  1. The advocate comes across as unhinged
  2. The advocate comes across as unlikable
  3. The advocate cannot explain AI safety well
  4. The advocate cannot respond to criticism well

As far as I can see, the efficacy and reliability of this plan relies entirely on the character of the advocate. Because this is being tested in a smaller corner of the internet, I think we can believe that if it inexplicably results in disaster, the effect will be relatively contained - but honestly I think a pretty small amount of screening can prevent the worst of this.

Replies from: Joe_Collman
comment by Joe_Collman · 2022-08-11T06:45:52.702Z · LW(p) · GW(p)

The largest issue with this approach/view is that it's not addressing the distinction between:

  • Increased resources for things with "AI Safety" written on them.
  • Increased resources for approaches that stand a chance of working.

The problem is important in large measure due to its difficulty: we need to hit a very small target that we don't yet understand. By default, resources allocated to anything labelled "AI safety" will not be aimed at that target.

If things are politicised, it's a safe bet they won't be aimed at the target; politicised issues get money thrown in their general direction, but that's not about actually solving the problem. There's a big difference between [more money helps, all else being equal], and claiming [action x gets more money, therefore it helps]. Politicisation would have many downsides.

Likewise, even if we get more attention/money... there are potential signal-to-noise issues. Suppose that there are 20 people involved in grant allocation with enough technical understanding to pick out promising projects.
Consider two cases:

  1. They receive 200 grant applications, 20 of which are promising.
  2. They receive 2000 grant applications, 40 of which are promising.

In case (2) there are more promising projects, but it's not clear that grant evaluators will find more promising projects, since the signal-to-noise ratio will have dropped so much.
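
As a toy illustration of this point (the fixed review capacity of 300 below is my assumption, not a figure from the comment): if the 20 technically-informed evaluators can only carefully review a limited number of applications sampled from the pool, the larger pool can yield fewer promising projects found, even though it contains more of them.

```python
# Toy model of capacity-limited grant evaluation (all numbers are illustrative
# assumptions, not figures from the comment above).

def expected_promising_found(total_apps: int, promising: int, capacity: int) -> float:
    """Expected number of promising projects found if evaluators carefully
    review up to `capacity` applications sampled uniformly from the pool."""
    reviewed = min(total_apps, capacity)
    return reviewed * promising / total_apps

CAPACITY = 300  # assumed total careful-review capacity of the 20 evaluators

print(expected_promising_found(200, 20, CAPACITY))   # case 1: 20.0 -> every promising project gets reviewed
print(expected_promising_found(2000, 40, CAPACITY))  # case 2: 6.0  -> fewer found despite a bigger pool
```

Under these assumptions, the tenfold increase in applications more than cancels out the doubling of promising projects.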

The obvious answer is to train more grant evaluators to the point where they have the necessary expertise - but this is a slow process that's (currently) difficult to scale (though people are working on that).

You also seem to be cherry-picking the upside possibilities from increased awareness: yes, some people may start to work on or advocate for AI safety (and some small proportion of those for some useful understanding of "AI safety").
However, some people may also:

  • Hear about AI safety, realise that AGI is a big deal but not buy the safety arguments, and start working on AGI.
    • This is not a hypothetical situation, or something that only happens to people without much ability: if John Carmack can get this badly wrong, where are you getting your confidence that most people won't?
  • Realise that AGI is a big deal, think that the major issues are misuse and/or ethics, and make poor decisions on that basis.
    • Make sure we get AGI before them...
    • Put in regulations that focus the 'safety' resources of AI companies on ticking meaningless boxes that do nothing to mitigate x-risk. (though I'd guess the default situation looks largely like this anyway)

We need to argue that it's net positive, not simply that there would be some positive outcomes (I don't think anyone would argue with that).

Again, I do think that there's some communication strategy we should be using that beats the status-quo. However, it needs to be analysed carefully, and carried out carefully - with adjustment based on empirical feedback where possible. (my guess is that the best approaches would be highly targeted - not that this says much at all)

comment by Mo Putera (Mo Nastri) · 2022-08-10T10:53:33.467Z · LW(p) · GW(p)

for instance, one might try recruiting John Carmack to work on AI safety [this strikes me as a good idea, hindsight notwithstanding], only to get him interested enough that he starts up an AGI company a few years later

Is this a reference to his current personal project to work on AGI? 

Edit: reading a bit more about him, I suspect that if he ever got interested in alignment work he'd likely prefer working on Christiano-style stuff to MIRI-style stuff. For instance (re: metaverse):

The idea of the metaverse, Carmack says, can be "a honeypot trap for 'architecture astronauts.'" Those are the programmers and designers who "want to only look at things from the very highest levels," he said, while skipping the "nuts and bolts details" of how these things actually work.

These so-called architecture astronauts, Carmack said, "want to talk in high abstract terms about how we'll have generic objects that can contain other objects that can have references to these and entitlements to that, and we can pass control from one to the other." That kind of high-level hand-waving makes Carmack "just want to tear [his] hair out... because that's just so not the things that are actually important when you're building something."

Carmack used his own experience creating Doom as an example of the value of concrete, product-based thinking. Rather than simply writing abstract game engines, he wrote games where "some of the technology... turned out to be reusable enough to be applied to other things," he said. "But it was always driven by the technology itself, and the technology was what enabled the product and then almost accidentally enabled some other things after it."

Building pure infrastructure and focusing on the "future-proofing and planning for broad generalizations of things," on the other hand, risks "making it harder to do the things that you're trying to do today in the name of things you hope to do tomorrow, and [then] it's not actually there or doesn't actually work right when you get around to wanting to do that," he said.

Replies from: Joe_Collman
comment by Joe_Collman · 2022-08-10T20:56:44.677Z · LW(p) · GW(p)

My source on this is his recent appearance on the Lex Fridman podcast.
He's moving beyond the personal project stage.

He does seem well-informed (and no doubt he's a very smart guy), so I do still hope that he might update pretty radically given suitable evidence. Nonetheless, if he stays on his present course, greater awareness seems to have been negative (this could absolutely change).

The tl;dr of his current position is:

  1. Doesn't expect a fast takeoff.
    1. Doesn't specify a probability or timescale - unclear whether he's saying e.g. a 2-year takeoff seems implausible; pretty clear he finds a 2-week takeoff implausible.
  2. We should work on ethics/safety of AGI once we have a clearer idea what it looks like. (expects we'll have time, due to 1)
  3. Not really dismissive of current AGI safety efforts, but wouldn't put his money there at present, since it's unclear what can be achieved.

My take:

  • On 1a, his argument seems to miss the possibility that an AGI doesn't need to copy itself to other hardware (he has reasonable points that suggest this would be impractical), but might write/train systems that can operate with very different hardware.
    • If we expect smooth progress, we wouldn't expect the first system to have the capability to do this - though it may be able to bide its time (potentially requiring gradient-hacking).
    • However, he expects that we need a small number of key insights for AGI (seems right to me). This is hard to square with an expectation of smooth progress.
  • On 2, he seems overconfident on the ease of solving the safety issues once we have a clearer idea what it looks like. I guess he's thinking that this looks broadly similar to engineering problems we've handled in the past, which seems wrong to me. Everything we've built is narrow, so we've never needed to solve the "point this at exactly the thing we mean" problem (capabilities didn't generalise for any system we've made safe).
  • On 3, I'm pretty encouraged. He seems to be thinking about things reasonably, and isn't casually dismissing other points of view. I think his current model of the situation is incorrect (as, I'm sure, is mine!), but expect that he'd be keen to change direction pretty radically if his model did change.

Agreed that he'd be a much better fit for Christiano-style stuff than MIRI-style - though I also expect he could fit well with Anthropic/Redwood/Conjecture; essentially anything with an empirical slant to it. In an ideal world, he'd be a good fit for Carmack-style stuff.

I do think it's worthwhile to consider how to engage with him further - though this is still a case where I'd want people thinking carefully about how best to communicate before acting (and whether they're the best messenger; I quickly conclude that I am not).

Replies from: jacob_cannell
comment by jacob_cannell · 2022-08-24T03:34:40.519Z · LW(p) · GW(p)

So, hilariously enough, it looks like Carmack got into AGI 4 years ago because Sam Altman tried to recruit him for OpenAI, but at the time Carmack knew barely any ML. (I just found the Lex Fridman interview.)

The update you are looking for should probably flow in the other direction. Carmack is making an explicit bet on a more brain-like architecture. If he succeeds in his goal of proving out child-level AGI raised in VR in 5 years, the safety community should throw out most of its now-irrelevant work and focus everything on how to make said new AGI models, raised in controlled VR environments, safe.

answer by ChristianKl · 2022-08-11T07:26:38.071Z · LW(p) · GW(p)

The Obama administration had great success at reducing mercury pollution and little success at reducing CO2 pollution. Most of the action to reduce mercury pollution happened outside of public awareness.

The broad public awareness of CO2 pollution made it a highly political topic where nothing gets done at the political level. Even worse, most of the people who are heavily engaged in the topic on both sides are not thinking clearly about the issue but are mind-killed.

The ability to think clearly is even more important for AI safety than it is for climate change. 

comment by FinalFormal2 · 2022-08-12T22:06:08.930Z · LW(p) · GW(p)

CO2 was not brought to public awareness arbitrarily. CO2 came to public awareness because regulating it without negatively impacting a lot of businesses and people is impossible.

Controversial -> Public Awareness

Not

Public Awareness -> Controversial

Replies from: ChristianKl
comment by ChristianKl · 2022-08-13T07:47:24.292Z · LW(p) · GW(p)

Telling businesses that they have to make expensive investments to cut down mercury pollution also negatively affects a lot of businesses.

Replies from: FinalFormal2
comment by FinalFormal2 · 2022-08-14T22:45:02.532Z · LW(p) · GW(p)

Which companies, and to what extent? My internal model says that this is as simple as telling them they have to contract with somebody to dispose of it properly.

Fossil fuels are a billion times more fundamental to our economy than mercury.

Also, mercury pollution is much more localized, with clear, more immediate consequences than CO2 pollution. It doesn't suffer from any 'common good' problems.

I don't understand your model of this at all, do you think if CO2 wasn't a controversial topic, we could just raise gas taxes and people would be fine? Or do you think it would rapidly revert to being a controversial topic?

"Don't be afraid to say 'oops' and change your mind"

Replies from: ChristianKl
comment by ChristianKl · 2022-08-15T07:43:35.787Z · LW(p) · GW(p)

Which companies, and to what extent? My internal model says that this is as simple as telling them they have to contract with somebody to dispose of it properly.

Electric utilities. Coal plants produced a lot of mercury pollution and adding filters cost money. Given that burning fossil fuels causes the most mercury pollution, it's a really strange argument to treat that as something separate from mercury pollution.

Also, mercury pollution is much more localized, with clear, more immediate consequences than CO2 pollution. It doesn't suffer from any 'common good' problems.

Do you think that lowered childhood IQ isn't a common good issue? That seems like a pretty strange argument.

I don't understand your model of this at all, do you think if CO2 wasn't a controversial topic, we could just raise gas taxes and people would be fine?

I don't think "just raise gas taxes" is an effective method to dealing with the issue. As an aside, just because the German public cares very much about CO2 doesn't mean that our government stops subventioning long commutes to work. It doesn't stop our government either from shutting down our nuclear power plants. 

The Kyoto Protocol was negotiated fine in an environment of little public pressure. 

I do agree with the sentiment that it's important to discuss solutions to reducing carbon emissions sector by sector. If there had been less public pressure, I do think it's plausible that expert conference discussions would have been more willing to focus on individual sectors and discuss what's needed in those.

The kind of elite consensus that brought the Kyoto Protocol could also have had a chance to create cap-and-trade. 

answer by otto.barten · 2022-08-21T10:54:20.474Z · LW(p) · GW(p)

This is what we are doing with the Existential Risk Observatory. I agree with many of the things you're saying.

I think it's helpful to debunk a few myths:

- No one has communicated AI xrisk to the public debate yet. In reality, Elon Musk, Nick Bostrom, Stephen Hawking, Sam Harris, Stuart Russell, Toby Ord and recently William MacAskill have all sought publicity with this message. There are op-eds in the NY Times, Economist articles, YouTube videos and TED talks with millions of views, a CNN item, at least a dozen books (including for a general audience), and a documentary (incomplete overview here). AI xrisk communication to the public debate is not new. However, the public debate is a big place, and compared to e.g. climate, coverage of AI xrisk is still minimal (perhaps a few articles per year in a typical news outlet, compared to dozens to hundreds for climate).
- AI xrisk communication to the public debate is easy; we could just 'tell people'. If you actually try this, you will quickly find out that public communication, especially of this message, is a craft. If you make a poor-quality contribution or your network is insufficient, it will probably never make it out. If your message does make it out, it will probably not be convincing enough to make most media consumers believe AI xrisk is an actual thing. It's not necessarily easier to convince a member of the general public of this idea than it is to convince an expert, and we can see from the case of Carmack and many others how difficult this can be. Arguably, LW and EA are the only places where this has really been successful so far.
- AI xrisk communication is really dangerous and it's easy to irreversibly break things. As can easily be seen from the wealth of existing communication and how little that did, it's really hard to move the needle significantly on the topic. That cuts both ways: it's, fortunately, not easy to really break something with your first book or article, simply because it won't convince enough people. That means there's some room to experiment. However, it's also, unfortunately, fairly hard to make significant progress here without a lot of time, effort, and budget.

We think communication to the public debate is net positive and important, and a lot of people could work on this who could not work on AI alignment. There is an increasing amount of funding available as well. Also, despite the existing corpus, the area is still neglected (we are to our knowledge the only institute that specifically aims to work on this issue).

If you want to work on this, we're always available for a chat to exchange views. EA is also starting to move in this direction [EA · GW]; it would be good to compare notes with them as well.

comment by FinalFormal2 · 2022-08-24T03:20:18.974Z · LW(p) · GW(p)

Thank you very much for this response!

answer by JakubK (jskatt) · 2022-08-11T04:30:04.266Z · LW(p) · GW(p)

Campaigns for general "public awareness" seem less effective than communicating with particular groups, since some groups are more influential than others when it comes to AGI risk. The "AGI Safety Communications Initiative" [EA · GW] is a group of people thinking about effective communication.

In terms of telling your favorite streamer about AGI risk, the best approach depends on the person. Think about what arguments will make sense to them. Definitely check out "Resources I send to AI researchers about AI safety [EA · GW]."

It seems like this might be a result of an aversion to bad press, but the truth is that bad press would be significantly better than what we have now. As far as I can see we have no press.

There has definitely been some critical press. Check out Steven Pinker in Popular Science (which Rob Miles responded to). Or perhaps this NYT Opinion piece by Melanie Mitchell (note there's also a debate between her and Stuart Russell [LW · GW]). Also see Ted Chiang (Scott Alexander responded) and Daron Acemoglu (Scott Alexander responded again).
