Safe AI and moral AI 2023-06-01T21:36:44.260Z
Is Deontological AI Safe? [Feedback Draft] 2023-05-27T16:39:25.556Z


Comment by William D'Alessandro (william-d-alessandro) on Is Deontological AI Safe? [Feedback Draft] · 2023-06-05T22:31:20.926Z · LW · GW

Glad to have this flagged here, thanks. As I've said to @Chipmonk privately, I think this sort of boundaries-based deontology shares lots of DNA with the libertarian deontology tradition, which I gestured at in the last footnote. (See for an overview.) Philosophers have been discussing this stuff at least since Nozick in the 1970s, so there's lots of sophisticated material to draw on -- I'd encourage boundaries/membranes fans to look at this literature before trying to reinvent everything from scratch. 

The SEP article on republicanism also has some nice discussion of conceptual questions about non-interference and non-domination, which I think any approach along these lines will have to grapple with.

@Andrew_Critch and @davidad, I'd be interested in hearing more about your respective boundaritarian versions of deontology, especially with respect to AI safety applications!

Comment by William D'Alessandro (william-d-alessandro) on Is Deontological AI Safe? [Feedback Draft] · 2023-06-01T18:53:58.473Z · LW · GW

A little clunky, but not bad! It's a good representation of the overall structure if a little fuzzy on certain details. Thanks for trying this out. I should have included a summary at the start -- maybe I can adapt this one?

Comment by William D'Alessandro (william-d-alessandro) on Is Deontological AI Safe? [Feedback Draft] · 2023-05-28T17:00:36.767Z · LW · GW

Lots of good stuff here, thanks. I think most of this is right.

  • Agreed about powerful AI being prone to unpredictable rules-lawyering behavior. I touch on this a little in the post, but I think it's really important that it's not just the statements of the rules that determine how a deontological agent acts, but also how the relevant (moral and non-moral) concepts are operationalized, how different shapes and sizes of rule violation are weighted against each other, how risk and probability are taken into account, and so on. With all those parameters in play, we should have a high prior on getting weird and unforeseen behavior. 
  • Also agreed that you can mitigate many of these risks if you've got a weak deontological agent with only a few behavior-guiding parameters and a limited palette of available actions.
  • My impression of the AI value alignment literature is that it's actually quite diverse. There are some people looking at deontological approaches using top-down rules, and some people who take moral uncertainty or pluralism seriously and think we should at least include deontology in our collection of potential moral alignment targets. (Some of @Dan H's work falls into that second category, e.g. this paper and this one.) In general, I think the default to utilitarianism probably isn't as automatic among AI safety and ethics researchers as it is in LW/EA circles.

Comment by William D'Alessandro (william-d-alessandro) on Is Deontological AI Safe? [Feedback Draft] · 2023-05-28T15:48:17.834Z · LW · GW

Excellent, thanks! I was pretty confident that some other iterations of something like these ideas must be out there. Will read and incorporate this (and get back to you in a couple days).