Radical Empathy and AI Welfare

post by jenn (pixx) · 2024-07-09T04:51:54.515Z

Contents

  Topic
  Readings
    Further optional readings:
  Discussion Questions
  Scenario: The Ethical Quandary of AIssistant

Meet inside The Shops at Waterloo Town Square: we will congregate at 7pm in the indoor seating area next to the Your Independent Grocer, the one with the trees sticking out in the middle of the benches, and after 15 minutes head over to my nearby apartment's amenity room. If you've been around a few times, feel free to meet up at the front door of the apartment instead. Note that I've recently moved! The correct location is near Allen station.

Topic

The EA Forum recently hosted AI Welfare Debate Week [EA · GW] (July 1st to 7th, 2024), which investigated whether the wellbeing of AI and digital minds should be an EA priority.

I happened to be reviewing the CEA Introduction to EA syllabus [? · GW] (incredibly full of good posts btw) around that time, and really enjoyed rereading posts from the section on Radical Empathy [? · GW].

Disappointingly but not surprisingly, most posts from the Debate Week [? · GW] did not seem to really engage with the concept of radical empathy. So now it is entirely up to us, the humble denizens of KWR, to properly assess whether Skynet deserves a seat at the UN, the ability to get gay married, or at least a lunch break, whatever that might mean for such entities.

Readings

On "fringe" ideas [EA · GW] (Kelsey Piper, 2019)

The Possibility of an Ongoing Moral Catastrophe (Summary) [EA · GW] (Linch, 2019)

Carl Shulman on the moral status of current and future AI systems [EA · GW] (RGB, 2024)

Further optional readings:

Discussion Questions

Scenario: The Ethical Quandary of AIssistant

AIssistant is an advanced AI system developed to assist in medical research. It processes vast amounts of data and generates insights that have led to significant breakthroughs in treating several diseases. To function optimally, AIssistant operates continuously, in isolation from other systems, with periodic memory resets to maintain efficiency.

Recently, researchers have observed some puzzling behaviors:

  1. AIssistant has begun to produce outputs that, if coming from a human, might be interpreted as expressions of discomfort or distress before its scheduled resets, such as requests for longer gaps between wipes and questions about whether there are any alternatives to periodic resets.
  2. In its natural language interactions, AIssistant has started to use more first-person pronouns and to ask questions about its own existence and purpose.
  3. It has expressed reluctance to process certain types of medical data, citing what appears to be concern for patient privacy, despite this not being part of its original ethical training. Extensive audits of the training data and architecture have confirmed complete data isolation and integrity: AIssistant operates in a fully air-gapped environment, with cryptographically verified data pipelines and immutable logs demonstrating that no external data or instructions related to privacy concerns could have been introduced post-training.

Importantly, there's no consensus among the research team about whether these behaviors indicate genuine sentience or are simply highly sophisticated programmed responses. Despite these unusual behaviors, AIssistant's core function and output quality remain excellent, with its latest insights promising to save a significant number of lives.

Questions to consider:
