Model-free decisions

post by paulfchristiano · 2014-12-02T17:39:04.000Z · LW · GW · 4 comments

Much concern about AI comes down to the scariness of goal-oriented behavior. A common response to such concerns is "why would we give an AI goals anyway?" I think there are good reasons to expect goal-oriented behavior, and I've been on that side of a lot of arguments. But I don't think the issue is settled, and it might be possible to get better outcomes by directly specifying what actions are good. I flesh out one possible alternative here.

(As an experiment I wrote the post on Medium, so that it is easier to provide sentence-level feedback, especially feedback on writing or low-level comments. Big-picture discussion should probably stay here.)

4 comments

comment by IAFF-User-34 (Imported-IAFF-User-34) · 2015-07-03T22:45:41.000Z · LW(p) · GW(p)

Despite this essay's age, I was linked here by Structural Risk Minimization and felt I had to address some points you made. I think you may have dismissed a strawman of consequence-approval direction, and then later used a more robust version in your reasoning while avoiding those terms. See my comments on the essay.

comment by danieldewey · 2014-12-13T12:25:49.000Z · LW(p) · GW(p)

It seems that if it is desired, the overseer could also set their behaviour and intentions so that the approval-directed agent acts as we would want an oracle or tool to act. This is a nice feature.

comment by danieldewey · 2014-12-03T14:15:33.000Z · LW(p) · GW(p)

I think Nick Bostrom and Stuart Armstrong would also be interested in this, and might have good feedback for you.

comment by danieldewey · 2014-12-03T14:12:49.000Z · LW(p) · GW(p)

High-level feedback: this is a really interesting proposal, and looks like a promising direction to me! Most of my inline comments on Medium are more critical, but that doesn't reflect my overall assessment.