Posts

Reducing the risk of catastrophically misaligned AI by avoiding the Singleton scenario: the Manyton Variant 2023-08-06T14:24:04.774Z

Comments

Comment by GravitasGradient (Bll) on autonomy: the missing AGI ingredient? · 2022-05-29T18:22:13.583Z · LW · GW

Perhaps "agency" is a better term here? In the strict sense of an agent acting in an environment?

And yeah, it seems we have shifted focus away from that.

Thankfully, our natural play instincts have given us a wonderful collection of ready-made training environments: video games. I think the field needs a new challenge in which an agent plays video games while receiving its instructions only in natural language.

Comment by GravitasGradient (Bll) on Another (outer) alignment failure story · 2021-08-01T17:11:50.873Z · LW · GW

These stories always assume that an AI would be dumb enough to not realise the difference between measuring something and the thing measured.

Every AGI is a drug addict, unaware that its high is a false one.

Why? Just for drama?

Comment by GravitasGradient (Bll) on [AN #136]: How well will GPT-N perform on downstream tasks? · 2021-02-04T13:41:27.934Z · LW · GW

Is the predicted cost for GPT-N parameter improvements based on the "classical Transformer" architecture? Recent variants like the Performer should require substantially less compute, and therefore cost less.
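
A rough back-of-envelope sketch of where the saving would come from, assuming the usual accounting: softmax attention scales quadratically with sequence length, while a Performer-style linear attention scales linearly once keys and queries are mapped to a fixed number of random features. The function names, the feature count of 256, and the example sizes are illustrative assumptions, not figures from the newsletter or the Performer paper. The count covers only the two large matrix products per head and ignores the feature maps, softmax, and normalization.

```python
def softmax_attention_cost(seq_len: int, head_dim: int) -> int:
    """Multiply-adds for one head of standard attention (illustrative count).

    Q @ K^T costs seq_len * seq_len * head_dim, and applying the resulting
    weights to V costs the same again.
    """
    return 2 * seq_len * seq_len * head_dim


def linear_attention_cost(seq_len: int, head_dim: int, num_features: int) -> int:
    """Multiply-adds for one head of kernelized (linear) attention.

    phi(K)^T @ V costs seq_len * num_features * head_dim, and
    phi(Q) @ (phi(K)^T @ V) costs the same again. The cost of computing
    the feature maps phi(Q) and phi(K) is deliberately left out.
    """
    return 2 * seq_len * num_features * head_dim


if __name__ == "__main__":
    seq_len, head_dim, num_features = 2048, 64, 256  # assumed sizes
    quadratic = softmax_attention_cost(seq_len, head_dim)
    linear = linear_attention_cost(seq_len, head_dim, num_features)
    print(f"softmax attention: {quadratic:,} multiply-adds")
    print(f"linear attention:  {linear:,} multiply-adds")
    print(f"ratio: {quadratic / linear:.1f}x")
```

Under this accounting the ratio is simply the sequence length divided by the number of random features, so the gap widens with longer contexts. Attention is only one part of each layer's compute, so the saving for a whole model could be considerably smaller than this ratio suggests.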

Comment by GravitasGradient (Bll) on What do we *really* expect from a well-aligned AI? · 2021-01-06T10:48:59.935Z · LW · GW

but indeed human utility functions will have to be aggregated in some manner

I do not see why that should be the case. Assuming virtual heavens, why couldn't each individual's personal preferences be fulfilled?

Comment by GravitasGradient (Bll) on To what extent is GPT-3 capable of reasoning? · 2020-07-22T07:42:30.382Z · LW · GW

It seems pretty undeniable to me from these examples that GPT-3 can reason to an extent.

However, it doesn't seem able to do so consistently.

Maybe this is analogous to people with mental and/or brain issues who have times of clarity and times of confusion?

If we can find a way to isolate the pattern of activity in GPT-3 that relates to reasoning, we might be able to enforce that state permanently?