LessWrong 2.0 Reader
Instances in history in which private companies (or any individual humans) have intentionally turned down huge profits and power are the exception, not the rule.
OpenAI wasn't a private company (i.e., for-profit) at the time of the OP grant, though.
seth-herd on Instruction-following AGI is easier and more likely than value aligned AGI
In the near term, AI and search are blurred, but that's a separate topic. This post was about AGI as distinct from AI. There's no sharp line between them, but there are important distinctions, and I'm afraid we're confused as a group because of that blurring. More above [LW(p) · GW(p)], and it's worth its own post and some sort of new clarifying terminology. The term AGI has been watered down to include LLMs that are fairly general, rather than the original and important meaning: AI that can think about anything, implying the ability to learn, and therefore almost necessarily to have explicit goals and agency. This post was about that type of "real" AGI, which is still hypothetical even though increasingly plausible in the near term.
alex_altair on Fund me please - I Work so Hard that my Feet start Bleeding and I Need to Infiltrate University
Hey Johannes, I don't quite know how to say this, but I think this post is a red flag about your mental health. "I work so hard that I ignore broken glass and then walk on it" is not healthy.
I've been around the community a long time and have seen several people have psychotic episodes. This is exactly the kind of thing I start seeing before they do.
I'm not saying it's 90% likely, or anything. Just that it's definitely high enough for me to need to say something. Please try to seek out some resources to get you more grounded.
arthur-conmy on Language Models Model Us
They emailed some people about this: https://x.com/brianryhuang/status/1763438814515843119
The reason is that it may allow unembedding matrix weight stealing: https://arxiv.org/abs/2403.06634
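For intuition, here is a minimal runnable sketch of the core observation behind the linked paper: logit vectors all lie in a low-dimensional subspace, so full logit access leaks the unembedding matrix up to a linear transform. The "API" below is a simulated toy model, the sizes are illustrative assumptions, and the real attack reconstructs logits from more constrained APIs rather than reading them directly:

    # Sketch of the rank argument from arXiv:2403.06634 on a toy model.
    # All sizes and the get_logits stand-in are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(0)
    VOCAB_SIZE = 1000   # toy vocabulary size (real models: ~50k-200k)
    HIDDEN_DIM = 64     # secret hidden dimension the attacker wants
    N_QUERIES = 256     # prompts queried; must exceed HIDDEN_DIM

    W_unembed = rng.normal(size=(VOCAB_SIZE, HIDDEN_DIM))  # secret weights

    def get_logits(prompt_id: int) -> np.ndarray:
        """Stand-in for an API returning full final-token logits."""
        hidden_state = rng.normal(size=HIDDEN_DIM)  # pretend forward pass
        return W_unembed @ hidden_state

    # 1. Collect logit vectors for many prompts into an (N, V) matrix.
    Q = np.stack([get_logits(i) for i in range(N_QUERIES)])

    # 2. Since logits = W_unembed @ hidden_state, rank(Q) <= HIDDEN_DIM:
    #    the count of significant singular values reveals the hidden dim.
    s = np.linalg.svd(Q, compute_uv=False)
    est_dim = int((s > 1e-6 * s[0]).sum())
    print(f"estimated hidden dimension: {est_dim}")  # expect: 64

    # 3. The top right-singular vectors span W_unembed's column space,
    #    i.e. the attacker recovers W_unembed up to an unknown
    #    HIDDEN_DIM x HIDDEN_DIM linear transform.
    _, _, Vt = np.linalg.svd(Q, full_matrices=False)
    W_recovered = Vt[:est_dim].T  # shape: (VOCAB_SIZE, HIDDEN_DIM)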
seth-herd on Instruction-following AGI is easier and more likely than value aligned AGI
Yes, we do see such "values" now, but that's a separate issue IMO.
There's an interesting thing happening in which we're mixing discussions of AI safety and AGI x-risk. There's no sharp line, but I think they are two importantly different things. This post was intended to be about AGI, as distinct from AI. Most of the economic and other concerns around the "alignment" of AI are not relevant to the alignment of AGI.
This thesis could be right or wrong, but let's keep it distinct from theories about AI in the present and near future. My thesis here (and a common thesis) is that we should be most concerned about AGI that is an entity with agency and goals, like humans have. AI as a tool is a separate thing. It's very real and we should be concerned with it, but not let it blur into categorically distinct, goal-directed, self-aware AGI.
Whether or not we actually get such AGI is an open question that should be debated, not assumed. I think the answer is very clearly that we will, and soon; as soon as tool AI is smart enough, someone will make it agentic, because agents can do useful work, and they're interesting. So I think we'll get AGI with real goals, distinct from the pseudo-goals implicit in current LLMs' behavior.
The post addresses such "real" AGI that is self-aware and agentic but whose sole goal is doing what people want; that is pretty much a third thing, and it's somewhat counterintuitive.
bec-hawk on Ilya Sutskever and Jan Leike resign from OpenAI [updated]
Is that not what Altman is referring to when he talks about vested equity? My understanding was that employees had no other form of equity besides PPUs, in which case he's talking non-misleadingly about the non-narrow case of vested PPUs, i.e., the thing people were alarmed about, right?
linch on Ilya Sutskever and Jan Leike resign from OpenAI [updated]
OpenAI has something called PPUs ("Profit Participation Units"), which in theory are supposed to act like RSUs, albeit with capped profit and no voting rights, but in practice are entirely a new legal invention, and we don't really know how they work.
ryan_greenblatt on DeepMind's "Frontier Safety Framework" is weak and unambitious
I agree with 1 and think that race dynamics make the situation considerably worse when we only have access to prosaic approaches. (Though I don't think this is the biggest issue with these approaches.)
I think I expect a period substantially longer than several months by default due to slower takeoff than this. (More like 2 years than 2 months.)
Insofar as the hope was for governments to step in at some point, I think the best and easiest point for them to step in is actually when AIs are already becoming very powerful.
So, I don't really see very compelling alternatives to push on at the margin as far as "metastrategy" (though I'm not sure I know exactly what you're pointing at here). Pushing for bigger asks seems fine, but probably less leveraged.
I actually don't think control is a great meme for the interests of labs that purely optimize for power, as it is a relatively legible ask that is potentially considerably more expensive than just "our model looks aligned because we red-teamed it", which is more like the default IMO.
The same way "secure these model weights from China" isn't a great meme for these interests IMO.
bec-hawk on Ilya Sutskever and Jan Leike resign from OpenAI [updated]
What do you mean by pseudo-equity?
david-ring on Using GPT-3 for preventing conflict during messaging — a pitch for an app
I love this and would love to collaborate. I've been thinking of it as more of a training system, social connection, and inner exploration app. I do think one issue with blindly translating into NVC is that we don't develop the ability to process our own feelings that come up.
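As a concrete illustration of the "translate into NVC" step under discussion, here is a minimal sketch of the kind of rewrite call such an app might make. The model name, prompt wording, and parameters are my own illustrative assumptions, not from the original post:

    # A sketch of an NVC-rewrite call via the OpenAI completions HTTP API.
    # Model, prompt, and parameters are illustrative assumptions.
    import os
    import requests

    def rewrite_as_nvc(message: str) -> str:
        """Ask the model to restate a message in NVC style."""
        prompt = (
            "Rewrite the following message using nonviolent communication "
            "(observation, feeling, need, request), keeping the sender's "
            f"intent:\n\n{message}\n\nRewritten message:"
        )
        response = requests.post(
            "https://api.openai.com/v1/completions",
            headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
            json={"model": "gpt-3.5-turbo-instruct", "prompt": prompt,
                  "max_tokens": 200, "temperature": 0.7},
            timeout=30,
        )
        response.raise_for_status()
        return response.json()["choices"][0]["text"].strip()

    print(rewrite_as_nvc("You never listen to me and it's driving me crazy."))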