Jack O'Brien's Shortform

post by Jack O'Brien (jack-o-brien) · 2022-12-01T08:58:32.177Z · LW · GW · 2 comments



comment by Jack O'Brien (jack-o-brien) · 2022-12-01T08:58:32.415Z · LW(p) · GW(p)

Let's be optimistic and try to prove that an agentic AI will be beneficial for the long-term future of humanity. We would probably need to prove these three premises:

Premise 1: Training story X will create an AI model that approximates agent formalism A
Premise 2: Agent formalism A is computable and has a set of alignment properties P
Premise 3: An AI with a set of alignment properties P will be beneficial for the long-term future.
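The way these premises chain together can be sketched formally. Here is a minimal Lean sketch, where all the type and predicate names (`approximates`, `hasProps`, `beneficial`, etc.) are hypothetical placeholders invented for illustration; the hard part, of course, is discharging the premises, not composing them:

```lean
-- Hypothetical sketch: the three premises compose into the conclusion.
variable (AI Formalism PropSet : Type)
variable (approximates : AI → Formalism → Prop)  -- premise 1: training yields this
variable (hasProps : AI → PropSet → Prop)        -- premise 2: A confers P
variable (beneficial : AI → Prop)                -- premise 3: P suffices for benefit

theorem beneficial_of_premises
    (m : AI) (A : Formalism) (P : PropSet)
    (p1 : approximates m A)
    (p2 : ∀ m', approximates m' A → hasProps m' P)
    (p3 : ∀ m', hasProps m' P → beneficial m') :
    beneficial m :=
  p3 m (p2 m p1)
```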

And so far, I'm not happy with our answers to any of these.

comment by Isabella Barber (isabella-barber) · 2022-12-01T10:02:33.298Z · LW(p) · GW(p)

Maybe there is no set of properties P that can produce alignment, hmm.