Encultured AI, Part 1 Appendix: Relevant Research Examples

post by Andrew_Critch, Nick Hay (nickjhay) · 2022-08-08T22:44:50.375Z · 1 comment

Contents

  Appendix 1: “Trending” AI x-safety research areas
  Appendix 2: “Emerging” AI x-safety research areas
    Cooperative AI
    Multi-stakeholder control of AI systems
    Culturally-grounded AI

Also available on the EA Forum.
Appendix to: Encultured AI, Part 1: Enabling New Benchmarks
Followed by: Encultured AI, Part 2: Providing a Service

Appendix 1: “Trending” AI x-safety research areas

We mentioned a few areas of “trending” AI x-safety research above; below are some more concrete examples of what we mean:

Appendix 2: “Emerging” AI x-safety research areas

In this post, we classified cooperative AI, multi-stakeholder control of AI systems, and culturally-grounded AI as “emerging” topics in AI x-safety.  Here’s more about what we mean, and why:

Cooperative AI

This area is “emerging” in x-safety because there’s plenty of attention to the issue of cooperation from both policy-makers and AI researchers, but not yet much among folks focused on x-risk.

Existential safety attention on cooperative AI:

AI research on cooperative AI:

AI research motivated by x-safety, on cooperative AI:

Multi-stakeholder control of AI systems

This area is “emerging” in x-safety because there seems to be attention to the issue of multi-stakeholder control from both policy-makers and AI researchers, but not yet much among AI researchers overtly attentive to x-risk:

Existential safety attention on multi-stakeholder control of AI:

Many authors and bloggers discuss the problem of aligning AI systems with the values of humanity-as-a-whole, e.g., Eliezer Yudkowsky’s coherent extrapolated volition concept.  However, these discussions have not culminated in practical algorithms for sharing control of AI systems, unless you count the S-process algorithm for grant-making or the Robust Rental Harmony algorithm for rent-sharing, which are not AI systems by most standards.
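To make the kind of primitive such control-sharing mechanisms build on more concrete, here is a minimal sketch (ours, not taken from the S-process or Robust Rental Harmony) of choosing a joint action by a Nash-bargaining-style rule: maximize the product of each stakeholder's gain over their disagreement payoff. All names and numbers are illustrative.

```python
# Minimal sketch (ours, not from the post): one primitive that
# control-sharing schemes build on. Choose a joint action by a
# Nash-bargaining-style rule: maximize the product of each
# stakeholder's gain over their disagreement payoff.

def nash_bargaining_choice(actions, utilities, disagreement):
    """actions: list of candidate joint actions.
    utilities: dict mapping stakeholder -> {action: payoff}.
    disagreement: dict mapping stakeholder -> payoff if no agreement."""
    def nash_product(action):
        product = 1.0
        for stakeholder, payoffs in utilities.items():
            gain = payoffs[action] - disagreement[stakeholder]
            if gain <= 0:
                # An action that leaves any stakeholder no better off
                # than their fallback is never an "agreement".
                return float("-inf")
            product *= gain
        return product
    return max(actions, key=nash_product)

# Hypothetical example: two stakeholders, three candidate actions.
actions = ["a", "b", "c"]
utilities = {
    "alice": {"a": 3.0, "b": 2.0, "c": 0.5},
    "bob":   {"a": 1.5, "b": 2.5, "c": 3.0},
}
disagreement = {"alice": 1.0, "bob": 1.0}
print(nash_bargaining_choice(actions, utilities, disagreement))  # "b"
```

Requiring every gain to be strictly positive is what makes the outcome an "agreement": any stakeholder who would end up worse off than their fallback can effectively veto the action.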

Also, AI policy discussions surrounding existential risk frequently invoke the importance of multi-stakeholder input into human institutions involved in AI governance (as do discussions of governance on all topics), such as:

However, so far there has been little advocacy in x-safety for AI technologies to enable multi-stakeholder input directly into AI systems, with the exception of:

The following position paper is not particularly x-risk themed, but is highly relevant:

Computer science research on multi-stakeholder control of decision-making:

There is a long history of applicable research on the implementation of algorithms for social choice, which could be used to share control of AI systems in various ways, but most of this work does not come from sources overtly attentive to existential risk:
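As a toy illustration of the kind of rule this literature implements, here is a Borda count over hypothetical candidate policies for an AI system; each voter submits a full ranking, and points decrease linearly down each ballot. The example data is ours.

```python
# Toy illustration (ours) of a classic social-choice rule, the Borda
# count. Each voter ranks all candidates; a candidate earns
# (number_of_candidates - 1 - position) points per ballot.

from collections import defaultdict

def borda_winner(ballots):
    """ballots: list of rankings, each a list of candidates, best first."""
    scores = defaultdict(int)
    for ranking in ballots:
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            scores[candidate] += n - 1 - position
    return max(scores, key=scores.get)

# Hypothetical candidate policies for an AI system:
ballots = [
    ["cautious", "balanced", "aggressive"],
    ["balanced", "cautious", "aggressive"],
    ["aggressive", "balanced", "cautious"],
]
print(borda_winner(ballots))  # "balanced"
```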

AI research on multi-stakeholder control of AI systems is sparse, but present.  Notably, Ken Goldberg’s “telegardening” platform allows many web users to simultaneously control a gardening robot: https://goldberg.berkeley.edu/garden/Ars/ 
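We haven't inspected the telegardening control stack, so the following is only a hypothetical sketch of one simple way many simultaneous users can share control of a single robot: each control cycle, tally the users' requested commands and execute the majority choice.

```python
# Hypothetical sketch; the actual telegardening control scheme may
# differ. Each control cycle, tally all users' requested commands and
# execute the majority choice.

from collections import Counter

def shared_control_step(user_commands):
    """user_commands: one discrete command per connected user,
    e.g. ["water", "water", "plant"]."""
    command, votes = Counter(user_commands).most_common(1)[0]
    return command, votes / len(user_commands)

command, support = shared_control_step(["water", "water", "plant"])
print(command, support)  # water 0.666...
```

A rule this simple degrades gracefully: with one user it reduces to direct teleoperation, and with many users it filters out minority inputs.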

AI research motivated by x-safety, on multi-stakeholder control of AI, is hard to find.  Critch has worked on a few papers on negotiable reinforcement learning (Critch, 2017a; Critch, 2017b; Desai, 2018; Fickinger, 2020).  MIRI researcher Abram Demski has a blog post on comparing utility functions across agents, which is highly relevant to aggregating preferences (Demski, 2020).
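As we understand the negotiable RL results, a Pareto-optimal policy for principals with differing beliefs behaves like a weighted-sum maximizer of their utilities, with each principal's weight rescaled over time by how well that principal's beliefs predicted the observations. The sketch below is our illustration of that dynamic, not code from the papers, and all names are ours.

```python
# Our illustration of the core dynamic in negotiable RL, not code from
# the papers. A Pareto-optimal policy for two principals acts like a
# weighted-sum maximizer of their utilities, where each principal's
# weight is rescaled by the probability their beliefs assigned to the
# observations seen so far.

def update_weights(weights, likelihoods):
    """weights: dict principal -> current bargaining weight.
    likelihoods: dict principal -> probability that principal's
    beliefs assigned to the latest observation."""
    new = {p: weights[p] * likelihoods[p] for p in weights}
    total = sum(new.values())
    return {p: w / total for p, w in new.items()}  # normalized for readability

def aggregate_utility(weights, utilities, action):
    """The weighted-sum objective the shared policy maximizes."""
    return sum(weights[p] * utilities[p][action] for p in weights)

# Hypothetical example: Alice's beliefs predicted the latest
# observation better than Bob's, so her influence grows.
weights = update_weights({"alice": 0.5, "bob": 0.5},
                         {"alice": 0.8, "bob": 0.2})
utilities = {"alice": {"go": 1.0, "stay": 0.0},
             "bob":   {"go": 0.0, "stay": 1.0}}
print(weights)  # {'alice': 0.8, 'bob': 0.2}
print(max(["go", "stay"],
          key=lambda a: aggregate_utility(weights, utilities, a)))  # "go"
```

The bargaining consequence is that a principal whose beliefs turn out to be accurate gains influence over the shared system, which is part of what makes the arrangement acceptable to both parties ex ante.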

AI x-safety research on multi-stakeholder control of AI — i.e., technical research directly assessing the potential efficacy of AI control-sharing mechanisms in mitigating x-risk — basically doesn’t exist.

Culturally-grounded AI

This area is missing in technical AI x-safety research, but has received existential safety attention and AI research attention, as well as considerable attention in public discourse:

*** END APPENDIX ***


1 comment


comment by aviv (avivo) · 2022-08-09T04:37:08.192Z

Thanks for sharing this well-organized appendix and links!

As someone working on ~ the multi-stakeholder problem (likely closest to multi/single in ARCHES), it's interesting to have a summary of what you see as the most relevant research.