If no near-term alignment strategy, research should aim for the long-term
post by harsimony · 2022-06-09T19:10:34.423Z · LW · GW · 1 comment
This is a small point about alignment strategy that gets mentioned occasionally but hasn't been stated explicitly as far as I can tell [1].
If there are no paths to alignment that can be implemented in the near-term, research should focus on building for the long-term. This conclusion holds regardless of AI timelines or the level of existential risk.
In other words, if a researcher was convinced that:
- AI will lead to a high chance of existential catastrophe in the near-term
- There are no approaches that can reduce the chance of existential catastrophe in the near-term
Then they are better off focusing on bigger projects that may pan out over a longer time period, even if they are unlikely to complete those projects due to an existential catastrophe. This is because, by assumption, work on near-term approaches is useless [2].
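A minimal expected-value sketch of this reasoning (the symbols p and V are illustrative and not from the original post): let p > 0 be the probability that a long-term project survives long enough to be completed, and V > 0 its value if completed. Then, under the assumptions above,

\mathbb{E}[\text{near-term work}] = 0 \quad \text{(by assumption)}, \qquad \mathbb{E}[\text{long-term work}] = p \cdot V > 0 \ \text{ for any } p > 0,

so any nonzero chance of the long-term work paying off beats near-term work with zero expected value.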
I think this implies that establishing a research community, recruiting researchers, building infrastructure for the field, and doing foundational work all have value even if we are pessimistic about AI risk.
That being said, it's uncertain how promising different approaches are, and some may plausibly be implemented in the near-term, so it makes sense to fund a diverse portfolio of projects [3]. It's also better to attempt near-term projects, even unpromising ones, than to do nothing to avert a catastrophe.
Notes:
- For an example of this point being made in passing, consider the end of rvnnt's comment here [LW(p) · GW(p)].
- I don't actually hold this view.
- Assuming these projects don't interfere with each other, but that's a problem for another day.
1 comment
comment by Zac Hatfield-Dodds (zac-hatfield-dodds) · 2022-06-11T18:03:39.089Z · LW(p) · GW(p)
I don't think confident belief that "There are no approaches that can reduce the chance of existential catastrophe in the near-term" can be justified; it's not that many such approaches have been tried and found wanting, but rather that almost no work has been done.
That said, I'd still encourage people to follow personal aptitude [EA · GW] when choosing what to work on; you're very unlikely to do high-impact research otherwise, and it's worth having a portfolio approach.