william-d-alessandro

Posts

Safe AI and moral AI 2023-06-01T21:36:44.260Z

Is Deontological AI Safe? [Feedback Draft] 2023-05-27T16:39:25.556Z

Comments

Comment by William D'Alessandro (william-d-alessandro) on Wei Dai's Shortform · 2024-08-27T19:31:42.533Z · LW · GW

Another academic philosopher, directed here by @Simon Goldstein. Hello Wei!

It's not common to switch entirely to metaphilosophy, but I think lots of us get more interested in the foundations and methodology of at least our chosen subfields as we gain experience, see where progress is(n't) being made, start noticing deep disagreements about the quality of different kinds of work, and so on. It seems fair to describe this as awakening to a need for better tools and a greater understanding of methods. I recently wrote a paper about the methodology of one of my research areas, philosophy of mathematical practice, for pretty much these reasons.
Current LLMs are pretty awful at discussing the recent philosophy literature, so I think anyone who'd like AI tools to serve as useful research assistants would be happy to see at least some improvement here! I'm personally also excited about the prospects of using language models with bigger context windows for better corpus analysis work in empirical and practice-oriented parts of philosophy.
I basically agree with Simon on this.
I don't think this is uncommon. You might not see these reversals in print often, because nobody wants to publish, and few people want to read, a paper that just says "I retract my previous claims and no longer have a confident positive view to offer". But my sense is that philosophers often give up on projects because the problems are piling up and they no longer see an appealing way forward. Sometimes this happens more publicly. Hilary Putnam, one of the most influential philosophers of the later 20th century, was famous for changing his mind about scientific realism and other basic metaphysical issues. Wesley Salmon gave up his influential "mark transmission" account of causal explanation due to counterexamples raised by Kitcher (as you can read here). It would be easy enough to find more examples.

Comment by William D'Alessandro (william-d-alessandro) on Is Deontological AI Safe? [Feedback Draft] · 2023-06-05T22:31:20.926Z · LW · GW

Glad to have this flagged here, thanks. As I've said to @Chipmonk privately, I think this sort of boundaries-based deontology shares lots of DNA with the libertarian deontology tradition, which I gestured at in the last footnote. (See https://plato.stanford.edu/entries/ethics-deontological/#PatCenDeoThe for an overview.) Philosophers have been discussing this stuff at least since Nozick in the 1970s, so there's lots of sophisticated material to draw on -- I'd encourage boundaries/membranes fans to look at this literature before trying to reinvent everything from scratch.

The SEP article on republicanism also has some nice discussion of conceptual questions about non-interference and non-domination (https://plato.stanford.edu/entries/republicanism), which I think any approach along these lines will have to grapple with.

@Andrew_Critch and @davidad, I'd be interested in hearing more about your respective boundaritarian versions of deontology, especially with respect to AI safety applications!

Comment by William D'Alessandro (william-d-alessandro) on Is Deontological AI Safe? [Feedback Draft] · 2023-06-01T18:53:58.473Z · LW · GW

A little clunky, but not bad! It's a good representation of the overall structure if a little fuzzy on certain details. Thanks for trying this out. I should have included a summary at the start -- maybe I can adapt this one?

Comment by William D'Alessandro (william-d-alessandro) on Is Deontological AI Safe? [Feedback Draft] · 2023-05-28T17:00:36.767Z · LW · GW

Lots of good stuff here, thanks. I think most of this is right.

Agreed about powerful AI being prone to unpredictable rules-lawyering behavior. I touch on this a little in the post, but I think it's really important that it's not just the statements of the rules that determine how a deontological agent acts, but also how the relevant (moral and non-moral) concepts are operationalized, how different shapes and sizes of rule violation are weighted against each other, how risk and probability are taken into account, and so on. With all those parameters in play, we should have a high prior on getting weird and unforeseen behavior.
Also agreed that you can mitigate many of these risks if you've got a weak deontological agent with only a few behavior-guiding parameters and a limited palette of available actions.
My impression of the AI value alignment literature is that it's actually quite diverse. There are some people looking at deontological approaches using top-down rules, and some people who take moral uncertainty or pluralism seriously and think we should at least include deontology in our collection of potential moral alignment targets. (Some of @Dan H's work falls into that second category, e.g. this paper and this one.) In general, I think the default to utilitarianism probably isn't as automatic among AI safety and ethics researchers as it is in LW/EA circles.

Comment by William D'Alessandro (william-d-alessandro) on Is Deontological AI Safe? [Feedback Draft] · 2023-05-28T15:48:17.834Z · LW · GW

Excellent, thanks! I was pretty confident that some other iterations of something like these ideas must be out there. Will read and incorporate this (and get back to you in a couple days).
