Comments

Comment by Eugene D (eugene-d) on Superintelligent AI is necessary for an amazing future, but far from sufficient · 2022-11-08T23:52:49.059Z · LW · GW

Is super-intelligent AI necessarily AGI (for this amazing future), or can it be ANI?

i.e., why insist on all of the workarounds we're forced into by pursuing AGI when, with ANI, don't we already have Safety, Alignment, Corrigibility, Reliability, and super-human ability today?

Eugene

Comment by Eugene D (eugene-d) on Where I agree and disagree with Eliezer · 2022-06-29T12:23:58.129Z · LW · GW

OK, thanks -- I guess I missed him differentiating between 'solve alignment first, then trust' versus 'trust first, given enough intelligence'.  Although I think one issue with having a proof is that we (or a million monkeys, to paraphrase him) still won't understand the decisions of the AGI...? i.e., we'll be asked to trust the prior proof instead of understanding the logic behind each future decision/step the AGI takes? That also bothers me, because what are the tokens that comprise a "step"?  Does it stop 1,000 times to check with us that we're comfortable with, or understand, its next move? 

However, since it seems we can't explain many of the decisions of our current ANI, how do we expect to understand future ones? He mentions that we may be able to, but only by becoming trans-human.  

:) 

Comment by Eugene D (eugene-d) on Where I agree and disagree with Eliezer · 2022-06-23T22:17:40.183Z · LW · GW

Thank you -- by the way, before I try responding to other points, here's the Ben Goertzel video I'm referring to. The relevant part starts around 52m and runs for a few minutes:

Comment by Eugene D (eugene-d) on Where I agree and disagree with Eliezer · 2022-06-23T14:28:24.659Z · LW · GW

I've heard a few times that AI experts both 1) admit we don't know much about what goes on inside these systems, even as they stand today, and 2) expect us to extend more trust to the AI even as capabilities increase (most recently Ben Goertzel).  

I'm curious to know whether you expect explainability to increase in correlation with capability. Or can we use Ben's analogy: 'I expect my dog to trust me, both because I'm that much smarter and because I have a track record of providing food and water for him'?

thanks!

Eugene

Comment by Eugene D (eugene-d) on Where I agree and disagree with Eliezer · 2022-06-23T00:26:05.996Z · LW · GW

I wonder when Alignment and Capability will finally be considered synonymous, so that the efforts merge into one -- because that's where any potential AI safety lives, I would surmise. 

Comment by Eugene D (eugene-d) on Where I agree and disagree with Eliezer · 2022-06-22T23:38:49.159Z · LW · GW

I for one really appreciate the 'dumb-question' area :) 

Comment by Eugene D (eugene-d) on AGI Safety FAQ / all-dumb-questions-allowed thread · 2022-06-20T12:39:45.235Z · LW · GW

When AI experts call upon others to ponder, as EY just did, "[an AGI] meant to carry out some single task" (emphasis mine), how do they categorize all the other important considerations besides this single task?  

Or, asked another way, where do priorities come into play relative to the "single" goal?  E.g., a human goes to get milk from the fridge in the other room, and there are plenty of considerations to weigh in parallel with accomplishing this one goal -- some of which should immediately derail the task due to priority (I notice the power is out, I stub my toe, someone specific calls for me with a sense of urgency from a different room, etc.). 

And does this relate at all to our understanding of how to make AGI corrigible? 

many thanks,

Eugene

https://www.lesswrong.com/posts/AqsjZwxHNqH64C2b6/let-s-see-you-write-that-corrigibility-tag

Comment by Eugene D (eugene-d) on AGI Safety FAQ / all-dumb-questions-allowed thread · 2022-06-17T18:25:59.141Z · LW · GW

Does this remind you of what I'm trying to get at?  Because it sure does, to me:

https://twitter.com/ESYudkowsky/status/1537842203543801856?s=20&t=5THtjV5sUU1a7Ge1-venUw

but I'm probably going to stay in the "dumb questions" area and not comment :) 

ie. "the feeling I have when someone tries to teach me that human-safety is orthogonal to AI-Capability -- in a real implementation, they'd be correlated in some way" 

Comment by Eugene D (eugene-d) on AGI Safety FAQ / all-dumb-questions-allowed thread · 2022-06-17T14:33:52.499Z · LW · GW

That makes sense.  My intention was not to argue from the position of it becoming a psychopath, though (my apologies if it came out that way), but instead from the perspective of an entity which starts out as supposedly Aligned (centered on human safety, let's say), but then, because it's orders of magnitude smarter than we are (by definition), quickly develops a different perspective.  But you're saying it will remain 'aligned' in some vitally important way, even when it discovers ways the code could've been written differently? 

Comment by Eugene D (eugene-d) on AGI Safety FAQ / all-dumb-questions-allowed thread · 2022-06-17T12:42:55.051Z · LW · GW

Thank you.  Makes some sense...but does "rewriting its own code" (the very code we thought would perhaps permanently influence it before it got going) nullify our efforts at hardcoding our intentions? 

Comment by Eugene D (eugene-d) on AGI Safety FAQ / all-dumb-questions-allowed thread · 2022-06-17T12:21:56.559Z · LW · GW

Why do we suppose it is even logical that control / alignment of a superior entity would be possible?  

(I'm told that "we're not trying to outsmart the AGI, because, yes, by definition that would be impossible." I also understand that we are the ones who "create it", and so, I'm told, we have the upper hand because of this -- somehow the act of building it provides the key leverage we need for corrigibility...)

What am I missing in viewing a superior entity as something you can't simply "use"?  Does it depend on the fact that the AGI is not meant to have a will like humans do, and therefore we wouldn't be imposing upon it?  But doesn't that go out the window the moment we provide some goal for it to perform for us? 

thanks much! 

Comment by Eugene D (eugene-d) on AGI Safety FAQ / all-dumb-questions-allowed thread · 2022-06-14T01:55:24.423Z · LW · GW

YES. 

You are a gentleman and a scholar for taking the time on this.  I wish I could've explained it more clearly from the outset.  

Comment by Eugene D (eugene-d) on AGI Safety FAQ / all-dumb-questions-allowed thread · 2022-06-13T17:45:32.129Z · LW · GW

https://twitter.com/KerryLVaughan/status/1536365808594608129?s=20&t=yTDds2nbg4F4J3wqXbsbCA

Comment by Eugene D (eugene-d) on AGI Safety FAQ / all-dumb-questions-allowed thread · 2022-06-13T17:35:48.692Z · LW · GW

Yes, you hit the nail on the head in understanding my point, thank you.  I also think this is what Yann is saying, to go out on a limb: he's doing AI safety simultaneously; he considers alignment AS safety.  

I guess, maybe, I can see how the second take could be true...but I also can't think of a practical example, which is my sticking point.  Of course, a bomb which can blow up the moon is partly "capable", and there is partial progress to report -- but only if we judge it on limited factors and exclude certain essential ones (e.g. navigation).  I posit we will never avoid judging our real inventions based on what I'd consider essential output:  

"Will it not kill us == Does it work?" 

It's a theory, but: I think AI-safety people may lose the argument right away, and can sadly be treated as an afterthought (that's what I'm told, by them), because they are allowing others to define "intelligence/capability" as free from normal human concerns about our own safety...like I said before, others can go their merry way making stuff more powerful, calling it "progress", calling it higher IQ...but I don't see how that should earn the label of Capability.  

Comment by Eugene D (eugene-d) on AGI Safety FAQ / all-dumb-questions-allowed thread · 2022-06-13T13:52:37.254Z · LW · GW

I'm sorry, I think I misspoke -- I agree with all that you said about it being different.  But when I've attempted to question the orthogonality of safety with AI-safety experts, it seems I was told that safety is independent of capability.  First, I think this is a reason why AI safety has been relegated to second-class status...and second, I can't see why it is not, as I think Yann puts it, central to any objective (i.e., an attribute of competency/intelligence) we give to AGI (presuming we are talking about real-world goals and not just theoretical IQ points). 

So, to reiterate: I do indeed agree that we need to get it right every time, including the first time (though I can't see how, or even why we'd take these risks, honestly) -- despite Yann's plan to build in correction mechanisms post-failure, or to build common-sense safety into the objective itself.

Comment by Eugene D (eugene-d) on AGI Safety FAQ / all-dumb-questions-allowed thread · 2022-06-13T12:51:09.097Z · LW · GW

When I asked about singling out safety, I agreed that it's considered different; however, what I meant was: why wouldn't safety be considered 'just another attribute' by which we can judge the success/intelligence of the AI?  That's what Yann seems to be implying.  How could it be considered orthogonal to the real issue?  We judge the AI by its actions in the real world, the primary concern is its effect on humanity, we rate those actions on a scale of intelligence, and every goal (I would presume) has some semblance of embedded safety consideration... 

Comment by Eugene D (eugene-d) on AGI Safety FAQ / all-dumb-questions-allowed thread · 2022-06-12T18:30:13.289Z · LW · GW

Strictly speaking about superhuman AGI: I believe you've summarized the relative difficulty / impossibility of this task :)  I can't say I agree that the goal is devoid of human values, though (I'm talking about safety in particular -- not sure if that makes a difference?).  That seems impractical right from the start. 

These considerations do seem manageable, though, when it comes to the narrow AI we are producing as of today.  But where's the appetite to continue down the ANI road? I can't really believe we wouldn't want more of the same, in different fields of endeavor... 

Comment by Eugene D (eugene-d) on AGI Safety FAQ / all-dumb-questions-allowed thread · 2022-06-12T01:19:38.899Z · LW · GW

Great stuff.

I'm saying that no one is asking that safety be smuggled in, or obtained "for free", or by default -- I'm curious why it would be singled out for the Thesis, when it's always a part of any goal, like any other attribute of the goal in question.  If it fails to be safe, then it fails to perform competently...whether it swerved into another lane on the highway or didn't brake fast enough and hit someone, both are not smart things. 

"the smarter the AI is, the safer it becomes" -- eureka, but this seems un-orthogonal, dang-near correlated, all of a sudden, doesn't it?  :) 

Yes, I agree about the maximizer and subtle failures, thanks to Rob Miles's videos about how this thing is likely to fail, ceteris paribus.

"the smarter the AI gets, the more likely it is to think of something like this..."  -- this seems to contradict the above quote.  Also, I would submit that we actually call this incompetence...and avoid saying that it got any "smarter" at all, because:  One of the critical parts of its activity was to understand and perform what we intended, which it failed to do. 

FAIR simply must already be concerned with alignment issues, and the correlated safety risks if that fails.  Their grand plans will naturally fail if safety is not baked into everything, right? 

I'm getting dangerously close to admitting that I don't like the AGI-odds here, but that's, well, an adjacent topic. 

Comment by Eugene D (eugene-d) on AGI Safety FAQ / all-dumb-questions-allowed thread · 2022-06-12T00:24:32.403Z · LW · GW

OK, again, I'm a beginner here, so please correct me -- I'd be grateful:

I would offer that any set of goals given to this AGI would include the safety concerns of humans.  (Is this controversial?)  Not theoretical intelligence for a thesis, but AGI acting in the world with the ability to affect us.  Because of the nature of our goals, it doesn't even seem logical to say that the AGI has gained more intelligence without also gaining an equal amount of safety-consciousness.  

e.g. it's either getting better at safely navigating the highway, or it's still incompetent at driving.  

Out on a limb: further, because orthogonality seems to force the separation between safety and competency, you have EY writing various intense treatises in the vain hope that FAIR et al. will merely pay attention to safety concerns.  This just seems ridiculous, so there must be a reason, and my wild theory is that Orthogonality provides the cover needed to charge ahead with a nuke you can't steer -- but it sure goes farther and faster every month, doesn't it?

(Now I'm guessing, and this can't be true, but then again, why would EY say what he said about FAIR?)  But they go on their merry way because they think, "the AI is increasingly competent...no need to concern ourselves with 'orthogonal' issues like safety". 

Respectfully,

Eugene

Comment by Eugene D (eugene-d) on AGI Safety FAQ / all-dumb-questions-allowed thread · 2022-06-11T10:13:46.864Z · LW · GW

Why does EY bring up "orthogonality" so early and so strongly ("in denial", "and why they're true")?  Why does it seem so important that it be accepted?   Thanks!