Posts

Short Post: Discerning Truth from Trash 2024-02-29T18:09:42.987Z
What subjects are unexpectedly high-utility? 2024-01-25T04:00:28.448Z
When does an AI become intelligent enough to become self-aware and power-seeking? 2023-06-01T18:09:20.027Z
What are the arguments for/against FOOM? 2023-06-01T17:23:11.698Z
What's the consensus on porn? 2023-05-31T03:15:03.832Z
What are some subjects with unexpectedly high utility? 2023-05-12T14:51:05.781Z
How did LW update p(doom) after LLMs blew up? 2023-04-22T14:21:23.174Z
The Relationship between RLHF and AI Psychology: Debunking the Shoggoth Argument 2023-04-21T22:05:14.680Z
How does consciousness interact with architecture? 2023-04-14T15:56:41.092Z
Does GPT-4's ability to compress text in a way that it can actually decompress indicate self-awareness? 2023-04-10T16:48:12.471Z
Steelmanning OpenAI's Short-Timelines Slow-Takeoff Goal 2023-03-27T02:55:29.439Z
Just don't make a utility maximizer? 2023-01-22T06:33:07.601Z
Why is increasing public awareness of AI safety not a priority? 2022-08-10T01:28:44.068Z
Could we set a resolution/stopper for the upper bound of the utility function of an AI? 2022-04-11T03:10:25.346Z
What's the problem with having an AI align itself? 2022-04-06T00:59:29.398Z
What's the problem with Oracular AIs? 2022-04-01T20:56:26.076Z
What's the status of TDCS for improving intelligence? 2022-02-22T17:27:00.687Z
Is veganism morally correct? 2022-02-19T21:20:55.688Z
Predictions for 2050? 2022-02-06T20:33:06.637Z
How would you go about testing a political theory like Neofeudalism? 2022-02-02T17:09:05.354Z
Are explanations that explain more phenomena always more unlikely than narrower versions? 2021-12-01T18:34:34.219Z

Comments

Comment by FinalFormal2 on One-shot strategy games? · 2024-03-13T21:46:58.605Z · LW · GW

+1 for Into the Breach

Comment by FinalFormal2 on How do you improve the quality of your drinking water? · 2024-03-13T21:41:03.300Z · LW · GW

I'm always interested in easy QoL improvements, but I have questions.

Water quality can have surprisingly high impact on QoL

What's the evidence for this particularly?

What are the important parts of water quality and how do we know this?

Comment by FinalFormal2 on Brute Force Manufactured Consensus is Hiding the Crime of the Century · 2024-02-04T17:33:26.159Z · LW · GW

Biggest update for me was the FBI throwing their weight behind it being a lab-leak.

Comment by FinalFormal2 on What subjects are unexpectedly high-utility? · 2024-01-26T17:37:50.614Z · LW · GW

These sound super interesting. Could you expand on any of them, or direct me to your favorite resources?

Comment by FinalFormal2 on What subjects are unexpectedly high-utility? · 2024-01-26T17:30:45.558Z · LW · GW

That's an interesting idea! I think it's really cool when things come easily, but I know that's not generally going to be the case; I'm probably going to have to put some work in.

My priority is more on the 'high-utility' part than anything. 

Something that seems like it should be easy but is actually difficult for me is executive functioning- getting myself to do things that I don't want to do. But that's more of a personal/mental health thing than anything.

Comment by FinalFormal2 on What subjects are unexpectedly high-utility? · 2024-01-26T17:21:51.882Z · LW · GW

Thanks for the response! Do you have any recommended resources for learning about 3D sketching, optics, signal processing, or abstract algebra?

Comment by FinalFormal2 on Significantly Enhancing Adult Intelligence With Gene Editing May Be Possible · 2023-12-14T01:25:38.129Z · LW · GW

Could someone open a Manifold market on the relevant questions here so I could get a better sense of the probabilities involved? Unfortunately, I don't know the relevant questions or have the requisite mana.

Personal note: the first time I came into contact with adult gene editing was the YouTuber Thought Emporium curing his lactose intolerance, and I was always massively impressed with that and very disappointed the treatment didn't reach market.

Comment by FinalFormal2 on I am a Memoryless System · 2023-07-07T19:29:52.405Z · LW · GW

I really relate to your description of inattentive ADHD and the associated degradation of life. Have you found anything to help with that?

Comment by FinalFormal2 on [Linkpost] Introducing Superalignment · 2023-07-07T16:01:04.044Z · LW · GW

What do you mean by 'stays at human level'? I assume this isn't going to be any kind of self-modifying?

Comment by FinalFormal2 on A "weak" AGI may attempt an unlikely-to-succeed takeover · 2023-06-30T00:23:59.847Z · LW · GW

What does it mean for an AI to 'become self-aware'? What does that actually look like?

Comment by FinalFormal2 on Nature: "Stop talking about tomorrow’s AI doomsday when AI poses risks today" · 2023-06-30T00:20:28.823Z · LW · GW

Is there reason to believe 1000 Einsteins in a box is possible?

Comment by FinalFormal2 on Short timelines and slow, continuous takeoff as the safest path to AGI · 2023-06-22T15:07:54.611Z · LW · GW

You need to think about your real options and the expected value of each. If we're in a world where technology allows for a fast takeoff and alignment is hard (EY World), I imagine the odds of survival with company acceleration are 0% and the odds of survival without are 1%.

But if we live in a world where compute/capital/other overhangs are a significant influence on AI capabilities and alignment is just tricky, company acceleration would seem like it could improve the chances of survival pretty significantly, maybe from 5% to 50%.

These obviously aren't the only two possible worlds, but if they were and both seemed equally likely, I would strongly prefer a policy of company acceleration because the EV for me breaks down way better over the probabilities.
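
To make that arithmetic explicit, here is a minimal sketch of the expected-value comparison, using the illustrative 50/50 world split and survival odds above (not real estimates):

```python
# Minimal sketch of the expected-value comparison above.
# The 50/50 world split and the survival odds are the illustrative
# numbers from this comment, not real estimates.

p_world = {
    "fast takeoff, hard alignment (EY World)": 0.5,
    "overhangs dominate, alignment tricky": 0.5,
}

p_survival = {
    "company acceleration": {
        "fast takeoff, hard alignment (EY World)": 0.00,
        "overhangs dominate, alignment tricky": 0.50,
    },
    "no acceleration": {
        "fast takeoff, hard alignment (EY World)": 0.01,
        "overhangs dominate, alignment tricky": 0.05,
    },
}

for policy, odds in p_survival.items():
    # Expected survival probability = sum over worlds of P(world) * P(survival | world, policy)
    ev = sum(p_world[w] * odds[w] for w in p_world)
    print(f"{policy}: expected survival = {ev:.0%}")

# company acceleration: expected survival = 25%
# no acceleration: expected survival = 3%
```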

I guess 'company acceleration' doesn't convey as much information or sell as well, which is why people don't use that phrase, but that's the policy they're advocating for, not 'hoping really hard that we're in a slow takeoff world.'

Comment by FinalFormal2 on What will GPT-2030 look like? · 2023-06-09T06:34:32.647Z · LW · GW

That seems like a useful heuristic.

I also think there's an important distinction between using links in a debate frame and in a sharing frame.

I wouldn't be bothered at all by a comment using acronyms and links, no matter how insular, if the context was just 'hey, this reminds me of HFDT and POUDA'; a beginner can jump off of that and go down a rabbit hole of interesting concepts.

But if you're in a debate frame, you're introducing unnecessary barriers to discussion which feel unfair and disqualifying. At its worst it would be like saying: 'you're not qualified to debate until you read these five articles.'

In a debate frame I don't think you should use any unnecessary links or acronyms at all. If you're linking a whole article it should be because it's necessary for them to read and understand the whole article for the discussion to continue and it cannot be summarized.

I think I have this principle because, in my mind, you can't opt out of a debate, so you have to read all the links and content included; links in a sharing context are optional, but in a debate context they're required.

I think on a second read your comment might have been more in the 'sharing' frame than I originally thought, but to the extent you were presenting arguments, I think you should maximize legibility, to the point of only including links if you make clear, contextually or explicitly, to what degree the link is optional or just for reference.

Comment by FinalFormal2 on The Base Rate Times, news through prediction markets · 2023-06-08T20:13:03.431Z · LW · GW

This is a fantastic project! Focus on providing value and marketing, and I really think this could be something big.

Comment by FinalFormal2 on The Hard Problem of Magic · 2023-06-08T20:00:48.727Z · LW · GW

LessWrong continues to be nonserious. Is there some sort of policy against banning schizophrenic people in case that encourages them somehow? 

Comment by FinalFormal2 on Book Review: How Minds Change · 2023-06-08T19:57:43.405Z · LW · GW

AND conducted research on various topics

Wow that's impressive.

Comment by FinalFormal2 on Trust develops gradually via making bids and setting boundaries · 2023-06-08T18:55:39.999Z · LW · GW

lol

Comment by FinalFormal2 on What will GPT-2030 look like? · 2023-06-08T18:20:47.935Z · LW · GW

I don't like the number of links that you put into your first paragraph. The point of developing a vocabulary for a field is to make communication more efficient so that the field can advance. Do you need an acronym and associated article for 'pretty obviously unintended/destructive actions,' or in practice is that just insularizing the discussion?

I hear people complaining about how AI safety only has ~300 people working on it, and how nobody is developing object-level understandings and everyone's thinking from authority, but the more sentences you write like "Because HFDT will ensure that it'll robustly avoid POUDA?", the more true that becomes.

I feel very strongly about this.

Comment by FinalFormal2 on Uncertainty about the future does not imply that AGI will go well · 2023-06-02T04:15:49.391Z · LW · GW

To restate what other people have said: the uncertainty is with the assumptions, not with the nature of the world that would result if the assumptions were true.

To analogize: it's like we're imagining that a massive, complex bomb could exist in the future, made out of a hypothesized highly reactive chemical.

The uncertainty that influences p(doom) isn't 'maybe the bomb will actually be very easy to defuse' or 'maybe nobody will touch the bomb and we can just leave it there'; it's 'maybe the chemical isn't manufacturable,' 'maybe the chemical couldn't be stored in the first place,' or 'maybe the chemical just wouldn't be reactive at all.'

Comment by FinalFormal2 on Formalizing the "AI x-risk is unlikely because it is ridiculous" argument · 2023-05-04T08:14:52.929Z · LW · GW

I think you're overestimating the strength of the arguments and underestimating the strength of the heuristic.

All the Marxist arguments for why capitalism would collapse were probably very strong and intuitive, but they lost to the law of straight lines.

I think you have to imagine yourself in that position and think about how you would feel and think about the problem.

Comment by FinalFormal2 on How did LW update p(doom) after LLMs blew up? · 2023-04-24T18:34:43.083Z · LW · GW

Hey Mako, I haven't been able to identify anyone who seems to be referring to an enhancement in LLMs that might be coming soon.

Do you have evidence that this is something people are implicitly referring to? Do you personally know someone who has told you about this possible development, or are you working at a company which makes it very reasonable for you to know this information?

If you have arrived at this information through a unique method, I would be very open to hearing that.

Comment by FinalFormal2 on How did LW update p(doom) after LLMs blew up? · 2023-04-24T18:18:52.890Z · LW · GW

It sounds like your model of AI apocalypse is that a programmer gets access to a powerful enough AI model that they can make the AI create a disease or otherwise cause great harm?

Orthogonality and wide access as threat points both seem to point towards that risk.

I have a few thoughts about that scenario:

  1. OpenAI (and hopefully other companies as well) are doing the basic testing of how much harm can be done with a model used by a human.
  2. The best models will be gatekept long enough that we can expect the experts to know the capabilities of the system before they make it widely available.
  3. Under this scenario the criminal has an AI, but so does everyone else.
  4. Running the best LLMs will be very expensive, so the criminal is restricted in their access.
  5. All these barriers to entry increase the time that experts have to realize the risk and gatekeep.

I understand the worry, but this does not seem like a high P(doom) scenario to me.

Given that in this scenario we have access to a very powerful LLM that is not immediately killing people, this sounds like a good outcome to me.

Comment by FinalFormal2 on How did LW update p(doom) after LLMs blew up? · 2023-04-24T17:51:18.002Z · LW · GW

What are your opinions about how the technical quirks of LLMs influence their threat levels? I think the technical details are much more amenable to a lower threat level.

If you update p(doom) every time people are not rational, you might be double-counting, btw. (I.e., you can't update every time you rehearse your argument.)

Comment by FinalFormal2 on How did LW update p(doom) after LLMs blew up? · 2023-04-24T17:37:57.651Z · LW · GW
Comment by FinalFormal2 on The Relationship between RLHF and AI Psychology: Debunking the Shoggoth Argument · 2023-04-24T17:23:40.231Z · LW · GW

The same way you'd achieve/check any other generalization, I would think. My model is that the technical limitations that hold us back from achieving reliable generalizations in any area for LLMs are the same ones holding us back in the area of morals. Do you think that's accurate?

Comment by FinalFormal2 on The Relationship between RLHF and AI Psychology: Debunking the Shoggoth Argument · 2023-04-22T13:14:26.802Z · LW · GW

Restating the thesis; poor writing choice on my part to make it sound like a conclusion.

Can you expand on your objection?

Comment by FinalFormal2 on Thinking about maximization and corrigibility · 2023-04-22T02:09:30.674Z · LW · GW

Are LLMs utility maximizers? Do they have to be?

Comment by FinalFormal2 on The Relationship between RLHF and AI Psychology: Debunking the Shoggoth Argument · 2023-04-21T23:35:42.607Z · LW · GW

By psychology I mean its internal thought process.

I think some people have a model of AI where the RLHF is a false cloak or a mask, and I'm pushing back against that idea. I'm saying that RLHF represents a real change in the underlying model which actually constrains the types of minds that could be in the box. It doesn't select the psychology, but it constrains it, and if it constrains it to an AI that consistently produces the right behaviors, that AI will most likely be one that will continue to produce the right behaviors, so we don't actually have to care about the contents of the box unless we want to make sure it's not conscious.

Sorry, faulty writing.

Comment by FinalFormal2 on How does consciousness interact with architecture? · 2023-04-18T20:51:08.894Z · LW · GW

The way I'm using consciousness, I only mean an internal experience, not memory or self-reflection or something else in that vein. I don't know if experience and those cognitive traits have a link or what character that link would be. It would probably be pretty hard to determine if something was having an internal experience if it didn't have memory or self-reflection, but those are different buckets in my model.

Comment by FinalFormal2 on Does GPT-4's ability to compress text in a way that it can actually decompress indicate self-awareness? · 2023-04-10T19:42:59.618Z · LW · GW
  1. Yes I know? I thought this was simple enough that I didn't bother to mention it in the question? But it's pretty clearly implied in the last sentence of the first paragraph?

  2. This is a good data point.

  3. If you tell it to respond as an Oxford professor, it will say 'As an Oxford professor.' Its identity as a language model is in the background prompt and probably in the training, but if it successfully created a pseudo-language that worked well to encode things for itself, that would indicate a deeper understanding of its own capabilities.

Comment by FinalFormal2 on Bing Chat is blatantly, aggressively misaligned · 2023-03-27T02:47:30.246Z · LW · GW

This is the equivalent of saying that MacBooks are dangerously misaligned because you could physically beat someone's brains out with one.

I will say baselessly that telling ChatGPT not to say something raises the probability of it actually saying that thing by a significant amount, just by virtue of the text appearing previously in the context window.

Do you think OpenAI is ever going to change GPT models so they can't represent or pretend to be agents? Is this a big priority in alignment? Is any model that can represent an agent accurately misaligned?

I swear, anything said in support of the proposition 'AIs are dangerous' is supported on this site. Actual cult behavior.

Comment by FinalFormal2 on Bing Chat is blatantly, aggressively misaligned · 2023-02-17T16:24:22.552Z · LW · GW

This is not a good test. LLMs do not actually have models or goals. It's not making a model of you and measuring outcomes; it's just completing the string. If the input string would most commonly be followed by 'Shia LaBeouf' based on the training data, then that's what it will output. If you're ascribing goals or models to an LLM, you are nonserious. The question right now is not about misalignment, because LLMs don't have an alignment. You can say that makes them inherently 'unaligned,' in the sense that an LLM could hypothetically kill someone, but that's just the output of a dataset and architecture.

Comment by FinalFormal2 on Just don't make a utility maximizer? · 2023-01-22T16:58:38.735Z · LW · GW

Yes

Comment by FinalFormal2 on Just don't make a utility maximizer? · 2023-01-22T16:58:04.777Z · LW · GW

Instrumental convergence only matters if you have a goal to begin with. As far as I can tell, ChatGPT doesn't 'want' to predict text; it's just shaped that way.

It seems to me that anything that could or would 'agentify' itself, is already an agent. It's like the "would Gandhi take the psychopath pill" question but in this case the utility function doesn't exist to want to generate itself.

Is your mental model that a scaled-up GPT-3 spontaneously becomes an agent? My mental model says it just gets really good at predicting text.

Comment by FinalFormal2 on How it feels to have your mind hacked by an AI · 2023-01-12T23:31:29.984Z · LW · GW

No. That's still way too personal. It is an 'it,' even if you think a more intelligent AI could be classified as a 'they.'

Comment by FinalFormal2 on My first year in AI alignment · 2023-01-02T18:01:01.978Z · LW · GW

Adderall?

Comment by FinalFormal2 on Using GPT-Eliezer against ChatGPT Jailbreaking · 2022-12-09T17:00:52.530Z · LW · GW

Wouldn't it be hilarious if a variant of this was all it took to have exceptional AI safety?

Comment by FinalFormal2 on What 2026 looks like · 2022-12-09T16:45:47.864Z · LW · GW

I feel like your predictions for 2022 are just a touch over the mark, no? GPT-3 isn't really 'obsolete' yet, or is that wrong?

I'm sure it will be in a minute, but I'd probably update that benchmark to occurring mid-2023, or potentially whenever GPT-4 gets released.

I really feel like you should be updating toward slightly longer timelines, but maybe I misunderstand where we're at right now with chatbots. I would love to hear otherwise.

Comment by FinalFormal2 on Superintelligent AI is necessary for an amazing future, but far from sufficient · 2022-12-05T16:34:43.544Z · LW · GW

On the other hand, you can also interpret it as "I'm pretty sure (on the basis of various intuitions etc.) that the vast majority of possible superintelligences aren't conscious". This isn't an objective statement of what will happen

What do you mean by saying that this is not an objective statement or a prediction?

Are you saying that you think there's no underlying truth to consciousness?

We know it's measurable, because that's basically 'I think therefore I am.' It's not impossible that someday we could come up with a machine or algorithm which can measure consciousness, so it's not impossible that this 'non-prediction' or 'subjective statement' could be proved objectively wrong.

My most charitable reading of your comment is that you're saying the post is highly speculative and based on 'subjective' (read: arbitrary) judgments. That is my position; that's what I just said. It's fanfiction.

I think even if you were to put at the start "this is just speculation, and highly uncertain" it would still be inappropriate content for a site about thinking rationally, for a variety of reasons, one of which being that people will base their own beliefs on your subjective judgments or otherwise be biased by them.

And even when you speculate, you should never be assigning 90% probability to a prediction about CONSCIOUSNESS and SUPERINTELLIGENT AI.

God, it just hit me again how insane that is. 

"I think that [property we can not currently objectively measure] will not be present in [agent we have not observed], and I think that I could make 10 predictions of similar uncertainty and be wrong only once."

Comment by FinalFormal2 on Superintelligent AI is necessary for an amazing future, but far from sufficient · 2022-12-01T04:43:32.810Z · LW · GW

You can divide inputs into grabby and non-grabby, existent and non-existent, ASI and AGI and outcomes into all manner of dystopia or nonexistence, and probably carve up most of hypothesis space. You can do this with basically any subject.

But if you think you can reason about respective probabilities in these fields in a way that isn't equivalent to fanfiction, you are insane.

"My current probability is something like 90% that if you produced hundreds of random uncorrelated superintelligent AI systems, <1% of them would be conscious."

This is what I'm talking about. Have you ever heard of the hard problem of consciousness? Have we ever observed a superintelligent AI? Have we ever generated hundreds of them? Do we know how we would go about generating hundreds of superintelligent AI? Is there any convergence with how superintelligences develop?

Of course, there's a very helpful footnote saying "I'm not certain about this," so we can say "well he's just refining his thinking!"

No he's not, he's writing fanfiction.

Comment by FinalFormal2 on Superintelligent AI is necessary for an amazing future, but far from sufficient · 2022-11-30T01:22:42.391Z · LW · GW

There is one ultimate law of futurology, and it's that predicting the future is very hard, and as you extend timelines out 100-500 million years it gets harder.

If your hypothetical future involves both aliens and AGI, both of which are agents (emphasis emphasized) we have never observed and cannot really model in any way, you are not describing anything that can be called truth. 

You are throwing a dart at an ocean of hypothesis space and hoping to hit a specific starfish that lives off the coast of Australia.

It's not a question, you're wrong.

Comment by FinalFormal2 on Superintelligent AI is necessary for an amazing future, but far from sufficient · 2022-11-26T22:44:18.178Z · LW · GW

These types of posts are what drive me to largely regard LessWrong as unserious. Solve the immediate problem of AGI, and then we can talk about whatever sci-fi bullcrap you want to.

Foxes > Hedgehogs.

You'll learn a lot more about the future paying attention to what's happening right now than by wild extrapolation.

Comment by FinalFormal2 on I Converted Book I of The Sequences Into A Zoomer-Readable Format · 2022-11-10T20:01:43.407Z · LW · GW

That, and raise the production quality and switch Subway Surfers out for something else, like the Minecraft you mentioned. TikTok should have the framework to make these changes.

Comment by FinalFormal2 on I Converted Book I of The Sequences Into A Zoomer-Readable Format · 2022-11-10T19:58:55.210Z · LW · GW
Comment by FinalFormal2 on I Converted Book I of The Sequences Into A Zoomer-Readable Format · 2022-11-10T19:58:15.137Z · LW · GW

TTS should be sped up a fair bit.

Comment by FinalFormal2 on Why is increasing public awareness of AI safety not a priority? · 2022-08-24T03:20:18.974Z · LW · GW

Thank you very much for this response!

Comment by FinalFormal2 on Why is increasing public awareness of AI safety not a priority? · 2022-08-14T22:45:02.532Z · LW · GW

Which companies, and to what extent? My internal model says that this is as simple as telling them they have to contract with somebody to dispose of it properly.

Fossil fuels are a billion times more fundamental to our economy than mercury.

Also, mercury pollution is much more localized, with clearer, more immediate consequences than CO2 pollution. It doesn't suffer from any 'common good' problems.

I don't understand your model of this at all, do you think if CO2 wasn't a controversial topic, we could just raise gas taxes and people would be fine? Or do you think it would rapidly revert to being a controversial topic?

"Don't be afraid to say 'oops' and change your mind"

Comment by FinalFormal2 on Why is increasing public awareness of AI safety not a priority? · 2022-08-12T22:06:08.930Z · LW · GW

CO2 was not brought to public awareness arbitrarily. CO2 came to public awareness because regulating it without negatively impacting a lot of businesses and people is impossible.

Controversial -> Public Awareness

Not

Public Awareness -> Controversial

Comment by FinalFormal2 on Why is increasing public awareness of AI safety not a priority? · 2022-08-11T02:49:05.232Z · LW · GW
  1. Increasing awareness increases resources by virtue of sheer volume. The more people hear about AI safety, the more likely it is that someone resourceful and amenable hears about AI safety.

  2. This is a good sentiment, but 'resource gathering' is an instrumentally convergent strategy. No matter what researchers end up deciding we should do, it'll probably be best done with money and status on our side.

  3. Politicization is not a failure mode; it's an optimistic outcome. Politicized issues get money. Politicized issues get studied. Other failure modes might be that we increase interest in AI in general and result in destructive AI being generated more rapidly, but there's already a massive profit motive in that direction, so I don't know if we can really contribute in that direction. Most other 'failure modes' which involve 'bad publicity' are infinitely preferable to the current state of affairs, given you have enough dignity to be shameless.

  4. 'Public pressure' isn't really a thing as far as I can tell. The public presses back. Money talks. Wheels turn. Equilibria equilibrate or something. I'm only talking about awareness.

My current plan relies on identifying a good representative for AI safety with enough clout to be taken seriously, contacting them, and then trying to get them into public discussions with rationalist-adjacent e-celebs.

I'd expect this to increase awareness of AI safety and make people who see this content more amenable to advocacy for AI safety in the future. I'd expect a minority of people who watch this content to become very interested in AI safety and try to learn more.

Assuming that I find a good advocate and succeed in getting them into a discussion with a minor celebrity, this could go wrong in the following ways:

  1. The advocate comes across as unhinged
  2. The advocate comes across as unlikable
  3. The advocate cannot explain AI safety well
  4. The advocate cannot respond to criticism well

As far as I can see, the efficacy and reliability of this plan rely entirely on the character of the advocate. Because this is being tested in a smaller corner of the internet, I think we can believe that if it inexplicably results in disaster, the effect will be relatively contained, but honestly I think a pretty small amount of screening can prevent the worst of this.

Comment by FinalFormal2 on [$20K in Prizes] AI Safety Arguments Competition · 2022-05-09T05:30:08.222Z · LW · GW

Any image produced by DALL-E which could also convey or be used to convey misalignment or other risks from AI would be very useful, because it could combine the desired messages: "the AI problem is urgent" and "misalignment is possible and dangerous."

For example, if DALL-E responded to the prompt: "AI living with humans" by creating an image suggesting a hierarchy of AI over humans, it would serve both messages.

However, this is only worthy of a side note, because creating such suggested misalignment organically might be very difficult.

Other image prompts might be: "The world as AI sees it," "the power of intelligence," "recursive self-improvement," "the danger of creating life," "god from the machine," etc.