If the clone behaves indistinguishably from the human it is based on, then there is simply nothing more to say. It doesn't matter what is going on inside.
Right, I agree on that. The problem is: "behaves indistinguishably" for how long? You can't guarantee that it won't stop acting that way in the future, which is exactly what deceptive alignment predicts.
It sounds like you're asking why inner alignment is hard (or maybe why it's harder than outer alignment?). I'm pretty new here -- I don't think I can explain that any better than the top posts in the tag.
Re: o1, it's not clear to me that o1 is an instantiation of a creator's highly specific vision. It seems more to me like we tried something, didn't know exactly where it would end up, and it sure is nice that it ended up in a useful place. It wasn't planned in advance exactly what o1 would be good at or bad at, and to what extent -- whereas if you were copying a human, you'd have to be far more careful to consider and copy a lot of details.
I would mostly disagree with the implication here:
IF you can make a machine that constructs human-imitator-AI systems,
THEN AI alignment in the technical sense is mostly trivialized and you just have the usual political human-politics problems plus the problem of preventing anyone else from making superintelligent black box systems.
I would say sure, it seems possible to make a machine that imitates a given human well enough that I couldn't tell them apart -- maybe forever! But just because it's possible in theory doesn't mean we are anywhere close to doing it, knowing how to do it, or knowing how to know how to do it.
Maybe an aside: If we could align an AI model to the values of, like, my sexist uncle, I'd still say it was an aligned AI. I don't agree with all my uncle's values, but he's, like, totally decent. It would be good enough for me to call a model like that "aligned." I don't feel like we need to make saints, or even AI models with values that a large number of current or future humans would agree with, to be safe.
- even if we only take people with bipolar disorder: how the hell can they go on so few hours a night with their brain being manic but not simply break down?
Just wanted to chime in on this from anecdotal experience:
My last ever (non-iatrogenic) hypomanic episode started unprompted. But I was terrified of falling back into depression again! My solution was to try to avoid the depression by extending my hypomania as long as possible.
How did I do this? By intentionally not sleeping and by drinking more coffee (essentially doing the opposite of whatever the internet said stabilized hypomanic patients). I had a strong intuition that this would work. (I also had a strong intuition that the depression afterwards would be worse, but, cognitively impaired as I was by the episode, I figured I'd cross that bridge when I came to it, even though my depression was life-threatening.)
It worked! It was my longest and most intense (most euphoric and erratic, but least productive) hypomanic episode, and I don't think this is fully explained by it being later in the progression of my illness.
Did I "not simply break down"? I wouldn't say that's the case, even after what was, iirc, less than a week of hypomania and ~3 hours of sleep per night.
- I would say that the urge to extend my episode was already an obvious thinking error from the hypomania.
- I had the worst depression I had ever experienced immediately afterwards, and I would be willing to bet that, within-subject, longer hypomanic episodes in bipolar II patients are followed by more severe (more symptomatic/disabling, not necessarily longer) depressive episodes.
Generally, I would say that bipolar I patients with months-long mania are also "breaking down." Mania is severely disruptive. Manic patients are constantly making thinking mistakes (inappropriate risks resulting in long-term disability/losses/hospitalizations, delusions, hallucinations). They're also not happy all the time -- a lot of mania and hypomania presents with severe anger and irritability! I would consider this a breakdown. I can't say how much of the breaking down is because of the sleep deprivation vs. the other factors of the illness.
(Fortunately, I've been episode-free for 8 years now, save for a couple of days of hypomanic symptoms on the days I was given new anxiety medications that didn't work out.)
Hi Alistair! You might want to look into more strategic ways of planning activism work. It's true that many social movements start becoming visible with protests, but there is a lot of background work involved in a protest, or any activism.
It looks like your goal is to slow down AI development.
First, you'll want a small working group who can help you develop your message, analyses, and tactics. A few of your colleagues who are deeply concerned about AI risk would work. When planning most things, it's helpful to have people who can temper your impulses and give you more ideas.
I see that you want to "Develop clear message, and demands, and best approach to this protest. Clear explanation of ai dangers that anybody can understand." I recommend doing this more than 2 weeks out from launching a campaign, with help from your working group. There are many important talking points you can use around AI risk, but if you just pick one clear phrase for your campaign, it can get more traction.
After you're clear on the one most important message for you to spread right now, you want to know who you need to tell and who can help you tell it. This is the time for a stakeholder analysis. Be clear on:
- Constituencies (who you represent the interests of)
- Allies
- Opponents
- Targets (who can change things)
- Secondary targets (who can influence them)
Then, and only then, you want to think about which tactic is best for influencing your target towards your goal. A protest might not be the best way to convince them, for many reasons that I'll leave to people who know more about AI and the stakeholders. Protests are one tactic, but so are a well-planned email campaign to officials (with a template for your participants that has gone through rounds of revision and feedback), a social media campaign with text and images about the risk, and visual/literary art about a hypothetical future where AI development does not slow down.
After you've selected a tactic (and made a plan for the project that has been reviewed by people who are experienced in AI safety), you can organize your community to help you carry out that tactic as massively as you can.
This process from start to finish might take a few months, but it is worth getting it right the first time. Then you will have fewer issues to fix, and you can build on your momentum and scale up. The more people who help you think through how to effectively influence your targets towards your goal, the better. It is best if you work with some people who are deeply familiar with your target. Networks are everything in organizing. Good luck.
Edit: I figured you might want me to tell you why I'm recommending all these other steps. I'm doing that because I'm seeing you receive feedback (from people more involved in the issue than I am) that this could cause harm. I saw you say "in my view it will highly likely be better than nothing" above. It might be worse than nothing. Hence the planning. It seems you want to act fast because this is an urgent threat, and I get that. But acting fast and making things worse is worse than planning for a few months and making things much better.