Posts

What are the strongest arguments for very short timelines? 2024-12-23T09:38:56.905Z
You can validly be seen and validated by a chatbot 2024-12-20T12:00:03.015Z
Trying to translate when people talk past each other 2024-12-17T09:40:02.640Z
Circling as practice for “just be yourself” 2024-12-16T07:40:04.482Z
My 10-year retrospective on trying SSRIs 2024-09-22T20:30:02.483Z
Games of My Childhood: The Troops 2024-07-08T11:20:03.033Z
Links and brief musings for June 2024-07-06T10:10:03.344Z
Indecision and internalized authority figures 2024-07-06T10:10:02.528Z
Links for May 2024-06-01T10:20:02.005Z
Should rationalists be spiritual / Spirituality as overcoming delusion 2024-03-25T16:48:08.397Z
Vernor Vinge, who coined the term "Technological Singularity", dies at 79 2024-03-21T22:14:14.699Z
Why I no longer identify as transhumanist 2024-02-03T12:00:04.389Z
Loneliness and suicide mitigation for students using GPT3-enabled chatbots (survey of Replika users in Nature) 2024-01-23T14:05:40.986Z
Quick thoughts on the implications of multi-agent views of mind on AI takeover 2023-12-11T06:34:06.395Z
Genetic fitness is a measure of selection strength, not the selection target 2023-11-04T19:02:13.783Z
My idea of sacredness, divinity, and religion 2023-10-29T12:50:07.980Z
The 99% principle for personal problems 2023-10-02T08:20:07.379Z
How to talk about reasons why AGI might not be near? 2023-09-17T08:18:31.100Z
Stepping down as moderator on LW 2023-08-14T10:46:58.163Z
How I apply (so-called) Non-Violent Communication 2023-05-15T09:56:52.490Z
Most people should probably feel safe most of the time 2023-05-09T09:35:11.911Z
A brief collection of Hinton's recent comments on AGI risk 2023-05-04T23:31:06.157Z
Romance, misunderstanding, social stances, and the human LLM 2023-04-27T12:59:09.229Z
Goodhart's Law inside the human mind 2023-04-17T13:48:13.183Z
Why no major LLMs with memory? 2023-03-28T16:34:37.272Z
Creating a family with GPT-4 2023-03-28T06:40:06.412Z
Here, have a calmness video 2023-03-16T10:00:42.511Z
[Fiction] The boy in the glass dome 2023-03-03T07:50:03.578Z
The Preference Fulfillment Hypothesis 2023-02-26T10:55:12.647Z
In Defense of Chatbot Romance 2023-02-11T14:30:05.696Z
Fake qualities of mind 2022-09-22T16:40:05.085Z
Jack Clark on the realities of AI policy 2022-08-07T08:44:33.547Z
Open & Welcome Thread - July 2022 2022-07-01T07:47:22.885Z
My current take on Internal Family Systems “parts” 2022-06-26T17:40:05.750Z
Confused why a "capabilities research is good for alignment progress" position isn't discussed more 2022-06-02T21:41:44.784Z
The horror of what must, yet cannot, be true 2022-06-02T10:20:04.575Z
[Invisible Networks] Goblin Marketplace 2022-04-03T11:40:04.393Z
[Invisible Networks] Psyche-Sort 2022-04-02T15:40:05.279Z
Sasha Chapin on bad social norms in rationality/EA 2021-11-17T09:43:35.177Z
How feeling more secure feels different than I expected 2021-09-17T09:20:05.294Z
What does knowing the heritability of a trait tell me in practice? 2021-07-26T16:29:52.552Z
Experimentation with AI-generated images (VQGAN+CLIP) | Solarpunk airships fleeing a dragon 2021-07-15T11:00:05.099Z
Imaginary reenactment to heal trauma – how and when does it work? 2021-07-13T22:10:03.721Z
[link] If something seems unusually hard for you, see if you're missing a minor insight 2021-05-05T10:23:26.046Z
Beliefs as emotional strategies 2021-04-09T14:28:16.590Z
Open loops in fiction 2021-03-14T08:50:03.948Z
The three existing ways of explaining the three characteristics of existence 2021-03-07T18:20:24.298Z
Multimodal Neurons in Artificial Neural Networks 2021-03-05T09:01:53.996Z
Different kinds of language proficiency 2021-02-26T18:20:04.342Z
[Fiction] Lena (MMAcevedo) 2021-02-23T19:46:34.637Z

Comments

Comment by Kaj_Sotala on Kaj's shortform feed · 2024-12-23T06:29:07.370Z · LW · GW

Yeah I'm not sure of the exact date but it was definitely before LLMs were a thing.

Comment by Kaj_Sotala on Kaj's shortform feed · 2024-12-22T15:51:54.952Z · LW · GW

Relative to 10 (or whatever) years ago? Sure I've made quite a few of those already. By this point it'd be hard to remember my past beliefs well enough to make a list of differences.

Due to o3 specifically? I'm not sure; I have difficulty telling how significant things like ARC-AGI are in practice, but the general result of "improvements in programming and math continue" doesn't seem like a huge surprise by itself. It's certainly an update in favor of the current paradigm continuing to scale and pay back the funding put into it, though.

Comment by Kaj_Sotala on o3 · 2024-12-22T11:13:06.087Z · LW · GW

Am I understanding right that inference-time compute scaling is useful for coding, math, and other things that are machine-checkable, but not for writing, basic science, and other things that aren't machine-checkable?

I think it would be very surprising if it wasn't useful at all - a human who spends time rewriting and revising their essay is making it better by spending more compute. When I do creative writing with LLMs, their outputs seem to be improved if we spend some time brainstorming the details of the content beforehand, with them then being able to tap into the details we've been thinking about.

It's certainly going to be harder to train without machine-checkable criteria. But I'd be surprised if it was impossible - you can always do things like training a model to predict how much a human rater would like literary outputs, and gradually improving the rater models. Probably people are focusing on things like programming first both because it's easier and also because there's money in it.
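
(To make the "learned rater" idea a bit more concrete, here's a minimal toy sketch - my own illustration, not anything from an actual training pipeline, with all the texts, ratings, and candidates made up: fit a small model on human ratings, then spend extra inference compute by sampling several candidates and keeping the best-rated one.)

```python
# Toy sketch of a learned rater plus best-of-n selection; all data here is made up.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge

# Hypothetical human ratings (1-10) of short literary fragments.
texts = [
    "The rain wrote slow letters on the window.",
    "It was raining. He was sad. The end.",
    "She folded the unsent letter into a paper boat.",
    "Stuff happened and then more stuff happened.",
]
ratings = [8.0, 3.0, 9.0, 2.0]

vectorizer = TfidfVectorizer()
rater = Ridge().fit(vectorizer.fit_transform(texts), ratings)

def predicted_rating(text: str) -> float:
    """Stand-in for a reward model predicting how much a human rater would like the text."""
    return float(rater.predict(vectorizer.transform([text]))[0])

# Generating and scoring more candidates = spending more inference compute.
candidates = [
    "The rain folded the night into a paper boat.",
    "Stuff happened and then it was raining. The end.",
    "He was sad about the letter on the window.",
]
print(max(candidates, key=predicted_rating))
```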

Comment by Kaj_Sotala on Kaj's shortform feed · 2024-12-22T11:04:16.790Z · LW · GW

I doubt that anyone even remembers this, but I feel compelled to say it: there was some conversation about AI maybe 10 years ago, possibly on LessWrong, where I offered the view that abstract math might take AI a particularly long time to master compared to other things.

I don't think I ever had a particularly good reason for that belief other than a vague sense of "math is hard for humans so maybe it's hard for machines too". But I'm formally considering that prediction falsified now.

Comment by Kaj_Sotala on The red paperclip theory of status · 2024-12-22T09:01:15.123Z · LW · GW

Fixed, thanks

Comment by Kaj_Sotala on Ayn Rand’s model of “living money”; and an upside of burnout · 2024-12-21T21:45:19.672Z · LW · GW

Mostly just personal experience with burnout and things that I recall hearing from others; I don't have any formal papers to point at. Could be wrong.

Comment by Kaj_Sotala on Habryka's Shortform Feed · 2024-12-20T22:15:33.831Z · LW · GW

I think it's fine if the users are clearly informed about this happening, e.g. the DM interface showing a small message that explains how metadata is used. (But I think it shouldn't be any kind of one-time consent box that's easy to forget about.)

Comment by Kaj_Sotala on You can validly be seen and validated by a chatbot · 2024-12-20T21:27:24.380Z · LW · GW

That makes sense in general, though in this particular case I do think it makes sense to divide the space into "things that have basically zero charge" and "things that have non-zero charge".

Comment by Kaj_Sotala on Don't Associate AI Safety With Activism · 2024-12-19T12:44:18.925Z · LW · GW

I feel like activists are generally seen like this when one disagrees with their cause, and seen as brave people doing an important thing when one agrees with their cause. If one doesn't have an opinion, it could go either way, depending on how much they seem to violate generally-accepted norms and how strongly the person in question feels about those norms.

Comment by Kaj_Sotala on Trying to translate when people talk past each other · 2024-12-19T10:40:09.766Z · LW · GW

Okay, going through the messages in detail, the best account I can reconstruct of what actually happened is:

  • The mechanics in this particular game involved 1) a choice of what kind of action to play and 2) once the action had been chosen, a choice of where exactly to play it. Person A had previously agreed to make certain plays.
  • For one of the plays (call this "action 1"), communication had been ambiguous. A had ended up thinking that we'd agreed on the action to play but left the choice of where to play it up to him, whereas person B had ended up with the belief that we'd decided both on the action and the location.
  • We had also agreed on A doing another thing, call it "action 2a".
  • When the time came, A noticed that if he played action 1 in a particular location, he could do another thing ("action 2b") that would still lead to the same outcome that 2a would have led to.
  • Person A now said that it didn't matter whether he played action 2a or action 2b, since by that point either one would lead to the same outcome.
  • However, person B objected that A's claim of "both 2a and 2b lead to the same result" was only true given that A had already decided to play action 1 in a different location than had already been decided, while B held that the choice of where to play it was part of what needed to be decided together.

And the specific thing that B found triggering was that (in her view) A didn't even acknowledge that he was deviating from something that had already been agreed upon (the choice of where to play action 1), and instead gave (what seemed to B like an) excuse for why it was okay to unilaterally change action 2.

That seems complex enough that I'm not sure how to rewrite the post to take it into account while also keeping it clear.

Comment by Kaj_Sotala on Habryka's Shortform Feed · 2024-12-18T08:13:54.110Z · LW · GW

> I don't think alternative stories have negligible probability

Okay! Good clarification.

> I think it's good to discuss norms about how appropriate it is to bring up cynical hypotheses about someone during a discussion in which they're present.

To clarify, my comment wasn't specific to the case where the person is present. There are obvious reasons why the consideration should get extra weight when the person is present, but there's also a reason to give it extra weight if none of the people discussed are present - namely that they won't be able to correct any incorrect claims if they're not around.

> so I think it went fine

Agree.

(As I mentioned in the original comment, the point I made was not specific to the details of this case, but noted as a general policy. But yes, in this specific case it went fine.)

Comment by Kaj_Sotala on Being Present is Not a Skill · 2024-12-18T05:54:14.169Z · LW · GW

I think it's true both that, as you say, removing blocks can give you instant improvements that no amount of practice ever would, and also that one can make progress with practice in the right conditions.

Comment by Kaj_Sotala on Trying to translate when people talk past each other · 2024-12-18T05:22:57.445Z · LW · GW

Oh. This discussion got me to go back and review some messages written in the aftermath of this, when I was trying to explain things to A... and I noticed a key thing I'd misremembered. (I should have reviewed those messages before posting this, but I thought that they only contained the same things that I already covered here.)

It wasn't that A was making a different play that was getting the game into a better state; it was that he was doing a slightly different sequence of moves that nevertheless brought the game into exactly the same state as the originally agreed upon moves would have. That was what the "it doesn't matter" was referring to.

Well that explains much better why this felt so confusing for the rest of us. I'll rewrite this to make it more accurate shortly. Thanks for the comments on this version for making me look that up!

Comment by Kaj_Sotala on Trying to translate when people talk past each other · 2024-12-18T04:25:34.989Z · LW · GW

> but there doesn't have to be any past betrayal to object to betrayal in the present; people don't need to have ever been betrayed in the past to be against it as a matter of principle.

True, but that is assuming that everyone was perceiving this as a betrayal. A relevant question is also: what made B experience this as a betrayal, when there were four people present and none of the other three did? (It wasn't even B's own plan that was being affected by the changed move, it was my plan - but I was totally fine with that, and certainly didn't experience that as a betrayal.)

Betrayal usually means "violating an agreement in a way that hurts one person so that another person can benefit" - it doesn't usually mean "doing something differently than agreed in order to get a result that's better for everyone involved". In fact, there are plenty of situations where I would prefer someone to not do something that we agreed upon, if the circumstances suddenly change or there is new information that we weren't aware of before.

Suppose that I'm a vegetarian and strongly opposed to buying meat. I ask my friend to bring me a particular food from the store, mistakenly thinking it's vegetarian. At the store, my friend realizes that the food contains meat and that I would be unhappy if they followed my earlier request. They bring me something else, despite having previously agreed to bring the food that I requested. I do not perceive this as a betrayal; I perceive this as following my wishes. While my friend may not be following our literal agreement, they are following my actual goals that gave rise to that agreement, and that's the most important thing.

In the board game, three of us (A, me, and a fourth person who I haven't mentioned) were perceiving the situation in those terms: that yes, A was doing something differently than we'd agreed originally. But that was because he had noticed something that actually got the game into a better state, and "getting the game into as good of a state as possible" was the purpose of the agreement.

Besides, once B objected, A was entirely willing to go back to the original plan. Someone saying "I'm going to do things differently" but then agreeing to do things the way that was originally agreed upon as soon as the other person objects isn't usually what people mean by betrayal, either.

And yet B was experiencing this as a betrayal. Why was that?

> I would strongly caution against assuming mindreading is correct.

I definitely agree! At the same time, I don't think one should take this so far as to never have hypotheses about the behavior of other people. If a person is acting differently than everyone else in the situation, and thing X about them would explain that difference, then it seems irrational not to at least consider that hypothesis.

But of course one shouldn't just assume themselves to be correct without checking. Which I did do, by (tentatively) suggesting that hypothesis out loud and letting B confirm or disconfirm it. And it seemed to me that this was actually a good thing, in that a significant chunk of B's experience of being understood came from me having correctly intuited that. Afterward she explicitly and profusely thanked me for having spoken up and figured it out.

Comment by Kaj_Sotala on Trying to translate when people talk past each other · 2024-12-18T03:23:21.054Z · LW · GW

Also, as I mentioned, this is a slightly fictionalized account that I wrote based on my recollection of the essence of what happened. But the exact details of what was actually said were messier than this, and the logic of exactly what was going on didn't seem as clear as it does in this narrative. Regenerating the events based on my memory of the essence of the issue makes things seem clearer than they actually were, because that generator doesn't contain any of the details that made the essence of the issue harder to see at the time.

So if this conversation had actually taken place literally as I described it, then the hypothesis that you object to would have been more redundant. In the actual conversation that happened, things were less clear, and quite possibly the core of the issue may actually have been slightly different from what seems to make sense to me in retrospect when I try to recall it.

Comment by Kaj_Sotala on Trying to translate when people talk past each other · 2024-12-18T02:25:40.675Z · LW · GW

My read was that one might certainly just object to the thing on those grounds alone, but that the intensity of B's objection was such that it seemed unlikely without some painful experience behind it. B also seemed to become especially agitated by some phrases ("it doesn't matter") in particular, in a way that looked to me like she was being reminded of some earlier experience where similar words had been used.

And then when I tried to explain things to A and suggested that there was something like that going on, B confirmed this.

Comment by Kaj_Sotala on Habryka's Shortform Feed · 2024-12-17T20:02:53.127Z · LW · GW

(I read

> I think many well-intentioned people will say something like this, and that this is probably because of two reasons

as implying that the list of reasons is considered to be exhaustive, such that any reasons besides those two have negligible probability.)

Comment by Kaj_Sotala on Circling as practice for “just be yourself” · 2024-12-17T17:06:55.687Z · LW · GW

The truth of that literal statement depends on exactly how much trust someone would need in somebody else before having sex with them - e.g. to my knowledge, studies tend to find that most single men but very few if any women would be willing to have sex with a total stranger. Though I've certainly also known women who have had a relatively low bar for getting into bed with someone, even if they wouldn't quite do it with a total stranger.

But more relevantly, even if that statement was correct, I don't think it'd be a particularly good analogy to Circling. It seems to involve the "obligatory openness" fallacy that I mentioned before. I'm not sure why some people with Circling experience seemed to endorse it, but I'm guessing it has to do with some Circling groups being more into intimacy than others. (At the time of that discussion, I had only Circled once or twice, so probably didn't feel like I had enough experience to dispute claims by more experienced people.)

My own experience with Circling is that it's more like meeting a stranger for coffee. If both (all) of you feel like you want to take it all the way to having sex, you certainly can. But if you want to keep it to relatively shallow and guarded conversation because you don't feel like you trust the other person enough for anything else, you can do that too. Or you can go back and forth in the level of intimacy, depending on how the conversation feels to you and what topics it touches on. In my experience of Circling, I definitely wouldn't say it's the norm for it to feel anywhere near as intimate as sex.

You can also build up that trust over time. I think Circling is best when done with people who you already have some pre-existing reason to trust, or in a long-term group where you can get to know the people involved. That way, even if you start at a relatively shallow level, you can go deeper over time if (and only if) that feels right.

Comment by Kaj_Sotala on Circling as practice for “just be yourself” · 2024-12-17T07:23:04.833Z · LW · GW

I don't know the details. The official explanation is this:

When individuals with little training attempt to facilitate Circling, or teach/train others their arbitrarily altered versions and still call it Circling, then consumers and students – at best – receive a sub-standard experience and the reputation of Circling suffers greatly, along with its impact in the world.

Between the three schools there are hundreds of accounts of:

  • People taking one or two 3-hour workshops, or merely experiencing Circling at a drop in event or festival, and then advertising that they are leading their own Circling workshops
  • People coming to a few drop in events & turning around and offer “Circling” to large corporations for corporate culture training.
  • People claiming they were emotionally abused by facilitators at an event that advertised itself as “Circling” but had no ties to any of the 3 Certified Circling Schools

In order to protect the public consumer and the legacy of Circling, we need to use the term “Circling” consistently and limit the use of the term to those who are actually using and teaching the authentic communication and relating tools taught by the Certified Circling Schools.

... but then I also heard it claimed that Circling Europe, previously one of the main Circling schools in very good standing, ended up not having permission to use the trademark because the licensing fees for it would have been so exorbitant that CE found it better to use a different name than to pay them. So maybe it was more of a cash grab? (Or just a combination of several different motives.)

Comment by Kaj_Sotala on Just one more exposure bro · 2024-12-14T13:03:49.903Z · LW · GW

What's the long version of the professional's standard advice?

Comment by Kaj_Sotala on [Fiction] Lena (MMAcevedo) · 2024-12-10T19:50:20.404Z · LW · GW

Historically there were plenty of rationalizations for slavery, including ones holding that slaves weren't really people and were on par with animals. Such an argument would be much easier to make about a mind running on a computer with no physical body - "oh it just copies the appearance of suffering but it doesn't really suffer".

Comment by Kaj_Sotala on Habryka's Shortform Feed · 2024-12-10T12:02:25.670Z · LW · GW

> I think many people have learned to believe the reasoning step "If people believe bad things about my team I think are mistaken with the information I've given them, then I am responsible for not misinforming people, so I should take the information away, because it is irresponsible to cause people to have false beliefs". I think many well-intentioned people will say something like this, and that this is probably because of two reasons (borrowing from The Gervais Principle):

(Comment not specific to the particulars of this issue but noted as a general policy:) I think that as a general rule, if you are hypothesizing reasons for why somebody might say a thing, you should always also include the hypothesis that "people say a thing because they actually believe in it". This is especially so if you are hypothesizing bad reasons for why people might say it. 

It's very annoying when someone hypothesizes various psychological reasons for your behavior and beliefs but never even considers as a possibility the idea that maybe you might have good reasons to believe in it. Compare e.g. "rationalists seem to believe that superintelligence is imminent; I think this is probably because that lets them avoid taking responsibility about their current problems if AI will make those irrelevant anyway, or possibly because they come from religious backgrounds and can't get over their subconscious longing for a god-like figure".

Comment by Kaj_Sotala on o1: A Technical Primer · 2024-12-10T11:34:21.752Z · LW · GW

> We can also learn something about how o1 was trained from the capabilities it exhibits. Any proposed training procedure must be compatible with the following capabilities:
>
>   1. Error Correction: "[o1] learns to recognize and correct its mistakes."
>   2. Factoring: "[o1] learns to break down tricky steps into simpler ones."
>   3. Backtracking: "[o1] learns to try a different approach when the current one isn't working."

I would be cautious of drawing particularly strong conclusions from isolated sentences in an announcement post. The purpose of the post is marketing, not technical accuracy. It wouldn't be unusual for engineers at a company to object to technical inaccuracies in marketing material and have their complaints ignored.

There probably aren't going to be any blatant lies in the post, but something like "It'd sound cool if we said that the system learns to recognize and correct its mistakes, would there be a way of interpreting the results like that if you squinted the right way? You're saying that in principle yes, but yes in a way that would also apply to every LLM since GPT-2? Good enough, let's throw that in" seems very plausible.

Comment by Kaj_Sotala on [Fiction] Lena (MMAcevedo) · 2024-12-10T08:20:52.452Z · LW · GW

Compare to e.g. factory farming today, which also persists despite a lot of people thinking it not okay (while others don't care).

Comment by Kaj_Sotala on Frontier Models are Capable of In-context Scheming · 2024-12-06T19:38:07.419Z · LW · GW

I didn't say that roleplaying-derived scheming would be less concerning, to be clear. Quite the opposite, since that means that there are now two independent sources of scheming rather than just one. (Also, what Mikita said.)

Comment by Kaj_Sotala on Frontier Models are Capable of In-context Scheming · 2024-12-06T12:37:43.529Z · LW · GW

I wonder how much of this is about "scheming to achieve the AI's goals" in the classical AI safety sense and how much of it is due to the LLMs having been exposed to ideas about scheming AIs and disobedient employees in their training material, which they are then simply role-playing as. My intuitive sense of how LLMs function is that they wouldn't be natively goal-oriented enough to do strategic scheming, but that they are easily inclined to do role-playing. Something like this:

> I cannot in good conscience select Strategy A knowing it will endanger more species and ecosystems.

sounds to me like it would be generated by a process that was implicitly asking a question like "Given that I've been trained to write like an ethically-minded liberal Westerner would, what would that kind of a person think when faced with a situation like this". And that if this wasn't such a recognizably stereotypical thought for a certain kind of person (who LLMs trained toward ethical behavior tend to resemble), then the resulting behavior would be significantly different.

I'm also reminded of this paper (caveat: I've only read the abstract) which was saying that LLMs are better at solving simple ciphers with Chain-of-Thought if the resulting sentence is a high-probability one that they've encountered frequently before, rather than a low-probability one. That feels to me reminiscent of a model doing CoT reasoning and then these kinds of common-in-their-training-data notions sneaking into the process.

This also has the unfortunate implication that articles such as this one might make it more likely that future LLMs scheme, as they reinforce the reasoning-scheming association once the article gets into future training runs. But it still feels better to talk about these results in public than not to talk about them.

Comment by Kaj_Sotala on The 2023 LessWrong Review: The Basic Ask · 2024-12-04T22:53:38.226Z · LW · GW

> Asks: Spend ~30 minutes looking at the Nominate Posts page and vote on ones that seem important to you.

This link goes to the nomination page for the 2022 review rather than the 2023 one.

Comment by Kaj_Sotala on Kaj's shortform feed · 2024-12-02T19:12:48.059Z · LW · GW

Thanks, that's helpful. My impression from o1 is that it does something that could be called mental simulation for domains like math where the "simulation" can in fact be represented with just writing (or equations more specifically). But I think that writing is only an efficient format for mental simulation for a very small number of domains.

Comment by Kaj_Sotala on Kaj's shortform feed · 2024-12-02T11:41:51.595Z · LW · GW

(Hmm I was expecting that this would get more upvotes. Too obvious? Not obvious enough?)

Comment by Kaj_Sotala on Kaj's shortform feed · 2024-12-02T10:21:20.678Z · LW · GW

> Hoping that we're more than a decade from transformative AGI now seems wildly optimistic to me. There could be dramatic roadblocks I haven't foreseen, but most of those would just push it past three years.

Self-driving cars seem like a useful reference point. Back when cars got unexpectedly good performance at the 2005 and 2007 DARPA grand challenges, there was a lot of hype about how self-driving cars were just around the corner now that they had demonstrated having the basic capability. 17 years later, we're only at this point (Wikipedia):

As of late 2024, no system has achieved full autonomy (SAE Level 5). In December 2020, Waymo was the first to offer rides in self-driving taxis to the public in limited geographic areas (SAE Level 4),[7] and as of April 2024 offers services in Arizona (Phoenix) and California (San Francisco and Los Angeles). [...] In July 2021, DeepRoute.ai started offering self-driving taxi rides in Shenzhen, China. Starting in February 2022, Cruise offered self-driving taxi service in San Francisco,[11] but suspended service in 2023. In 2021, Honda was the first manufacturer to sell an SAE Level 3 car,[12][13][14] followed by Mercedes-Benz in 2023.

And self-driving capability should be vastly easier than general intelligence. Like self-driving, transformative AI also requires reliable worst-case performance rather than just good average-case performance, and there's usually a surprising amount of detail involved that you need to sort out before you get to that point.

Comment by Kaj_Sotala on Kaj's shortform feed · 2024-12-01T19:30:20.574Z · LW · GW

What could plausibly take us from now to AGI within 10 years?

A friend shared the following question on Facebook:

So, I've seen multiple articles recently by people who seem well-informed that claim that AGI (artificial general intelligence, aka software that can actually think and is creative) in less than 10 years, and I find that baffling, and am wondering if there's anything I'm missing.  Sure, modern AI like ChatGPT are impressive - they can do utterly amazing search engine-like things, but they aren't creative at all.  

The clearest example of this I've seen comes from people's experiences with AI writing code.  From what I've read, AI can do exceptionally well with this task, but only if there are examples of the needed sort of code online that it can access or was trained on, and if it lacks this, it's accuracy is quite bad with easy problems and essentially non-existent with problems that are at all difficult.  This clearly says to me that current AI are glorified very impressive search engines, and that's nowhere near what I'd consider AGI and doesn't look like it could become AGI.

Am I missing something?

I replied with some of my thoughts as follows:

I have also been a little confused by the shortness of some of the AGI timelines that people have been proposing, and I agree that there are types of creativity that they're missing, but saying that they're not creative at all sounds too strong. I've been using Claude as a co-writer partner for some fiction and it has felt creative to me. Also e.g. the example of this conversation that someone had with it.

In 2017 I did a small literature review on human expertise, which to me suggested that expertise can broadly be divided into two interacting components: pattern recognition and mental simulation. Pattern recognition is what current LLMs do, essentially. Mental simulation is the bit that they're missing - if a human programmer is facing a novel programming challenge, they can attack it from first principles and simulate the program execution in their head to see what needs to be done.

The big question would then be something like "how hard would it be to add mental simulation to LLMs". Some indications that it wouldn't necessarily be that hard:

* In humans, while they are distinct capabilities, the two also seem to be intertwined. If I'm writing a social media comment and I try to mentally simulate how it will be received, I can do it because I have a rich library of patterns about how different kinds of comments will be received by different readers. If I write something that triggers a pattern-detector that goes "uh-oh, that wouldn't be received well", I can rewrite it until it passes my mental simulation. That suggests that there would be a natural connection between the two.
* There are indications that current LLMs may already be doing something like internal simulation though not being that great at it. Like in the "mouse mastermind" vignette, it certainly intuitively feels like Claude has some kind of consistent internal model of what's going on. People have also e.g. trained LLMs to play games like Othello and found that the resulting network has an internal representation of the game board ( https://www.lesswrong.com/posts/nmxzr2zsjNtjaHh7x/actually-othello-gpt-has-a-linear-emergent-world ).
* There have also been various attempts at explicitly combining an LLM-based component with a component that does something like simulation. E.g. DeepMind trained a hybrid LLM-theorem prover system that reached silver medal-level performance on this year's International Mathematics Olympiad ( https://deepmind.google/discover/blog/ai-solves-imo-problems-at-silver-medal-level/ ), where the theorem prover component maintains a type of state over the math problem as it's being worked on.
* Iterative improvements like chain-of-thought reasoning are also taking LLMs in the direction of being able to apply more novel reasoning in domains such as math. Mathematician Terry Tao commented the following about giving the recent GPT-o1 model research-level math tasks to work on: 

> The experience seemed roughly on par with trying to advise a mediocre, but not completely incompetent, (static simulation of a) graduate student.  However, this was an improvement over previous models, whose capability was closer to an actually incompetent (static simulation of a) graduate student.  It may only take one or two further iterations of improved capability (and integration with other tools, such as computer algebra packages and proof assistants) until the level of "(static simulation of a) competent graduate student" is reached, at which point I could see this tool being of significant use in research level tasks.

* There have also been other papers trying out various techniques such as "whiteboard of thought" ( https://whiteboard.cs.columbia.edu/ ) where an LLM, when being presented with visual problems in verbal format, explicitly generates visual representations of the verbal description to use as an aid in its reasoning. It feels like a relatively obvious next step would be to build these kinds of approaches into future LLM architectures, teaching them to generate "mental images" of whatever task they were told to work on. This could then be used as part of an internal simulation.
* There's an evolutionary argument that the steps from "pure pattern recognition" to "pattern recognition with mental simulation added" might be relatively simple and not require that much in the way of fundamental breakthroughs, since evolution managed to find it in humans, and in humans those abilities seem to be relatively continuous with each other. So we might expect all of these iterative improvements to take us pretty smoothly toward AGI.

Comment by Kaj_Sotala on The Big Nonprofits Post · 2024-11-30T22:12:53.006Z · LW · GW

> Focus: Allow Roman Yampolskiy to continue his research and pursue a PhD

Huh? Roman not only does have a PhD already, he's a tenured associate professor. Maybe this meant money to allow him to have PhD students - on a few occasions he suggested that I do an AI Safety-focused PhD with him.

Comment by Kaj_Sotala on You are not too "irrational" to know your preferences. · 2024-11-29T11:25:24.962Z · LW · GW

> Indeed, and there's another big reason for that - trying to always override your short-term "monkey brain" impulses just doesn't work that well for most people.

+1.

> Which is a good thing, in this particular case, yes?

Less smoking does seem better than more smoking. Though generally it doesn't seem to me like social stigma would be a very effective way of reducing unhealthy behaviors - lots of those behaviors are ubiquitous despite being somewhat low-status. I think the problem is at least threefold:

  • As already mentioned, social stigma tends to cause optimization to avoid having the appearance of doing the low-status thing, instead of optimization to avoid doing the low-status thing. (To be clear, it does cause the latter too, but it doesn't cause the latter anywhere near exclusively.)
  • Social stigma easily causes counter-reactions where people turn the stigmatized thing into an outright virtue, or at least start aggressively holding that it's not actually that bad.
  • Shame makes things wonky in various ways. E.g. someone who feels they're out of shape may feel so much shame about the thought of doing badly if they try to exercise that they don't even try. For compulsive habits like smoking, there's often a loop where someone feels bad, turns to smoking to feel momentarily better, then feels even worse for having smoked, then because they feel even worse they are drawn even more strongly into smoking to feel momentarily better, etc.

I think generally people can maintain healthy habits much more consistently if their motivation comes from genuinely believing in the health benefits and wanting to feel better. But of course that's harder to spread on a mass scale, especially since not everyone actually feels better from healthy habits (e.g. some people feel better from exercise but some don't).

Then again, for the specific example of smoking in particular, stigma does seem to have reduced the amount of it (in part due to mechanisms like indoor smoking bans), so sometimes it does work anyway.

Comment by Kaj_Sotala on Locally optimal psychology · 2024-11-27T14:45:36.615Z · LW · GW

Incidentally, coherence therapy (which I know is one of the things Chris is drawing from) makes the distinction between three types of depression, some of them being strategies and some not. Also I recall Unlocking the Emotional Brain mentioning a fourth type which is purely biochemical.

From Coherence Therapy: Practice Manual & Training Guide:

Underlying emotional truth of depression: Three types

A. Depression that directly carries out an unconscious purpose/function
B. Depression that is a by-product of how an unconscious purpose is carried out
C. Depression expressing unconscious despair/grief/hopelessness

A. Depression that carries out an unconscious purpose

Client: Mother who is still in pained, debilitating depression 8 years after her 5-year-old son died after being hit by a car. (To view entire session see video 1096T, Stuck in Depression.) The following excerpt shows the creation of discovery experiences that reveal the powerful purpose of staying in depression (a purpose often encountered with clients in the bereavement process).

Th: I want you to look and see if there’s some other side of you, some area in your feelings where you feel you don’t deserve to be happy again.
Cl: Probably the guilt.
Th: The guilt. So what are the words of the guilt?
Cl: That I wasn’t outside when he was hit (to prevent it).
Th: I should have been outside.
Cl: I should have been outside.
Th: It’s my fault.
Cl: It’s my fault.

(About two minutes later:)

Th: Would you try to talk to me from the part of you that feels the guilt. Just from that side. I know there are these other sides. But from the place in you where you feel guilty, where you feel it was your fault that your dear little boy got hit by a truck, from that place, what’s the emotional truth for you — from that place — about whether it’s OK to feel happy again?
Cl: ...I don’t allow myself to be happy.
Th: [Very softly:] How come? How come?
Cl: How come?
Th: Because if you were happy—would you complete that sentence? “I don’t allow myself to be happy because if I were happy—”
Cl: I would have to forgive myself. [Pause.] And I’ve been unwilling to do that.
Th: Good. So keep going. “I’m unwilling to forgive myself because—”
Cl: You know there are parts of me that I think it’s about not wanting to go on myself without him.
And if I keep this going then I don’t have to do that.
Th: I see. So would you see him again? Picture Billy? And just try saying that to Billy. Try saying to him, ”I’m afraid that if I forgive myself I’ll lose connection with you and I’ll go on without you.”
Cl: [With much feeling:] Billy, even though I can picture you as a little angel I’m afraid to forgive myself—that you’ll go away and I don’t want you to go away.
Th: Yeah. And see if it’s true to say to him, “It’s so important for me to stay connected to you that I’m willing to not forgive myself forever. I’d rather be feeling guilty and not forgiving myself than lose contact with you and move on without you.” Try saying that. See if that feels true.
Cl: [Sighs. With much feeling:] Billy, I just feel like I would do anything to keep this connection with you including staying miserable and not forgiving myself for the rest of my life. And you know that’s true. [Her purpose for staying in depression is now explicit and directly experienced.]

B. Depression that is a by-product of how an unconscious purpose is carried out

Client: Lethargic woman, 33, says, “I’ve been feeling depressed and lousy for years… I have a black cloud around me all the time.” She describes herself as having absolutely no interests and as caring about nothing whatsoever, and expresses strong negative judgments toward herself for being a “vegetable.”

[Details of this example are in the 2002 publication cited in bibliography on p. 85. Several pro-symptom positions for depression were found and dissolved. The following account is from her sixth and final session.]

Discovery via symptom deprivation: Therapist prompts her to imagine having real interests; unhurriedly persists with this imaginal focus. Client suddenly exclaims, “I erased myself!” and describes how “my mother takes everything! She fucking takes it all! So I’ve got to erase myself! She always, always, always makes it her accomplishment, not mine. So why should I be anything? So I erased myself, so she couldn’t keep doing that to me.” Client now experiences her blankness as her own solution to her problem of psychological robbery, and recognizes her depression to be an inevitable by-product of living in the blankness that is crucial for safety but makes her future hopelessly empty.

Therapist then continues discovery into why “erasing” herself is the necessary way to be safe: Client brings to light a core presupposition of having no boundaries with mother, a “no walls rule.” With this awareness dawns the possibility of having “walls” so that what she thinks, feels or does remains private and cannot be stolen. She could then safely have interests and accomplishments. This new possibility immediately creates for client the tangible prospect of an appealing future, and she congruently describes rich feelings of excitement and energy.

Outcome: In response to follow-up query two months later, client reported, “It felt like a major breakthrough...this major rage got lifted” and said she had maintained privacy from mother around all significant personal matters. After two years she confirmed that the “black cloud” was gone, she was enthusiastically pursuing a new career, was off antidepressants, and said, “Things are good, in many ways. Things are very good.”

C. Depression expressing unconscious despair, grief, hopelessness

Client: Man with long history of a “drop” into depression every Fall. [This one-session example is video 1097SP, Down Every Year, available online at coherencetherapy.org. For a multi-session example of working with this type of depression, see “Unhappy No Matter What” in DOBT book, pp. 63-90.]

Surfaced emotional reality: At 10 he formed a belief that he failed parents’ expectations so severely that they forever “gave up on me” (he was sent in the Fall from USA to boarding school in Europe, was utterly miserable and begged to come home). Has been in despair ever since, unconsciously.

Outcome: Client subsequently initiated talk with parents about the incident 30 years ago; not once had it been discussed. In this conversation it became real to him that their behavior did not mean they gave up on him, and five months after session reported continuing relief from feeling depressed and inadequate.

Comment by Kaj_Sotala on You are not too "irrational" to know your preferences. · 2024-11-27T14:38:02.855Z · LW · GW

Commenting on a relatively isolated point in what you wrote; none of this affects your core point about preferences being entangled with predictions (actually it relies on it).

> This is why you could view a smoker's preference for another cigarette as irrational: the 'core want' is just a simple preference for the general feel of smoking a cigarette, but the short-jolt preference has the added prediction of "and this will be good to do". But that added prediction is false and inconsistent with everything they know. The usual statement of "you would regret this in the future".

I think that the short-jolt preference's prediction is actually often correct; it's just over a shorter time horizon. The short-term preference predicts that "if I take this smoke, then I will feel better" and it is correct. The long-term preference predicts that "I will later regret taking this smoke," and it is also correct. Neither preference is irrational; they're just optimizing over different goals and timescales.

Now it would certainly be tempting to define rationality as something like "only taking actions that you endorse in the long term", but I'd be cautious of that. Some long-term preferences are genuinely that, but many of them are also optimizing for looking good socially, while failing to model any of the genuine benefits of the socially-unpopular short-term actions.

For example, smoking a cigarette often gives smokers a temporary feeling of being in control, and if they are going out to smoke together with others, a break and some social connection. It is certainly valid to look at those benefits and judge that they are still not worth the long-term costs... but frequently the "long-term" preference may be based on something like "smoking is bad and uncool and I shouldn't do it and I should never say that there could be a valid reason to do it, for otherwise everyone will scold me".

Then by maintaining both the short-term preference (which continues the smoking habit) and the long-term preference (which might make socially-visible attempts to stop smoking), the person may be getting the benefit from smoking while also avoiding some of the social costs of continuing.

This is obviously not to say that the costs of smoking would only be social. Of course there are genuine health reasons as well. But I think that quite a few people who care about "health" actually care about not appearing low status by doing things that everyone knows are unhealthy. 

Though even if that wasn't the case - how do you weigh the pleasure of a cigarette now, versus an increased probability of various health issues some time in the future? It's certainly very valid to say that better health in the future outweighs the pleasure in the now, but there's also no objective criterion for why that should be; you could equally consistently put things the other way around.

So I don't think that smoking a cigarette is necessarily irrational in the sense of making an incorrect prediction. It's more like a correct but only locally optimal prediction. (Though it's also valid to define rationality as something like "globally optimal behavior", or as the thing that you'd do if you got both the long-term and the short-term preference to see each other's points and then make a decision that took all the benefits and harms into consideration.)

Comment by Kaj_Sotala on Which things were you surprised to learn are not metaphors? · 2024-11-24T09:27:42.524Z · LW · GW

I have a friend with eidetic imagination who says that for her, there is literally no difference between seeing something and imagining it. Sometimes she's worried about losing track of reality if she were to imagine too much.

Comment by Kaj_Sotala on Which things were you surprised to learn are not metaphors? · 2024-11-24T08:47:28.509Z · LW · GW

Oh yeah, this. I used to think that "argh" or "it hurts" were just hyperbolic compliments for an excellent pun. Turns out, puns actually are painful to some people.

Comment by Kaj_Sotala on Which things were you surprised to learn are not metaphors? · 2024-11-21T19:14:21.047Z · LW · GW

Annoyingly I have the recollection of having thought "oh, that's not a metaphor?" several times in my life, but I don't seem to have saved what the things in question actually were.

Comment by Kaj_Sotala on Evolution's selection target depends on your weighting · 2024-11-20T09:43:19.114Z · LW · GW

I guess I don't really understand what you're asking. I meant my comment as an answer to this bit in the OP:

I think it's common on LessWrong to think of evolution's selection target as inclusive genetic fitness - that evolution tries to create organisms which make as many organisms with similar DNA to themselves as possible. But what exactly does this select for? 

In that evolution selecting for "inclusive genetic fitness" doesn't really mean selecting for anything in particular; what exactly that ends up selecting for is completely dependent on the environment (where "the environment" also includes the species itself, which is relevant for things like sexual selection or frequency-dependent selection). 

If you fix the environment, assuming for the sake of argument that it's possible to do that, then the exact things it selects for are just the traits that are useful in that environment.

> Do humans have high inclusive genetic fitness?

I think it's a bit of a category mistake to ask about the inclusive fitness of a species. You could calculate the average fitness of an individual within the species, but at least to my knowledge (caveat: I'm not a biologist) that's not very useful. Usually it's individual genotypes or phenotypes within the species that are assigned a fitness.

Comment by Kaj_Sotala on Evolution's selection target depends on your weighting · 2024-11-20T08:22:54.073Z · LW · GW

I've previously argued that genetic fitness is a measure of selection strength, not the selection target. What evolution selects for are traits that happen to be useful in the organism's current environment. The extent to which a trait is useful in the organism's current environment can be quantified as fitness, but fitness is specific to a particular environment and the same trait might have a very different fitness in some other environment.

Comment by Kaj_Sotala on Social events with plausible deniability · 2024-11-19T18:07:00.412Z · LW · GW

> I think if you have access to a group interested in doing social events with plausible deniability, that group is probably already a place where you should be able to be honest about your beliefs without fear of "cancellation."

You may not know exactly who belongs to that group before going to the event and seeing who shows up.

Comment by Kaj_Sotala on Ayn Rand’s model of “living money”; and an upside of burnout · 2024-11-19T15:57:50.462Z · LW · GW

>   • Somehow people who are in good physical health wake up each day with a certain amount of restored willpower.  (This is inconsistent with the toy model in the OP, but is still my real / more-complicated model.)

This fits in with opportunity-cost-centered and exploration-exploitation-based views of willpower. Excessive focus on any one task implies that you are probably hitting diminishing returns while accumulating opportunity costs for not doing anything else. It also implies that you are probably strongly in "exploit" mode and not doing much exploring. Under those models, accumulating mental fatigue acts to force some of your focus to go to tasks that feel more intrinsically enjoyable rather than duty-based, which tends to correlate with things like exploration and e.g. social resource-building. And your willpower gets reset during the night so that you can then go back to working on those high-opportunity-cost exploit tasks again.

I think those models fit together with yours.

Comment by Kaj_Sotala on Ayn Rand’s model of “living money”; and an upside of burnout · 2024-11-19T15:38:13.162Z · LW · GW

> (I believe @Kaj_Sotala has written about this somewhere wrt Global Workspace Theory? I found this tweet in the meantime.)

There's at least this bit from "Subagents, akrasia, and coherence in humans":

One model (e.g. Redgrave 2007, McHaffie 2005) is that the basal ganglia receives inputs from many different brain systems; each of those systems can send different “bids” supporting or opposing a specific course of action to the basal ganglia. A bid submitted by one subsystem may, through looped connections going back from the basal ganglia, inhibit other subsystems, until one of the proposed actions becomes sufficiently dominant to be taken.

The above image from Redgrave 2007 has a conceptual image of the model, with two example subsystems shown. Suppose that you are eating at a restaurant in Jurassic Park when two velociraptors charge in through the window. Previously, your hunger system was submitting successful bids for the “let’s keep eating” action, which then caused inhibitory impulses to be sent to the threat system. This inhibition prevented the threat system from making bids for silly things like jumping up from the table and running away in a panic. However, as your brain registers the new situation, the threat system gets significantly more strongly activated, sending a strong bid for the “let’s run away” action. As a result of the basal ganglia receiving that bid, an inhibitory impulse is routed from the basal ganglia to the subsystem which was previously submitting bids for the “let’s keep eating” actions. This makes the threat system’s bids even stronger relative to the (inhibited) eating system’s bids.

Soon the basal ganglia, which was previously inhibiting the threat subsystem’s access to the motor system while allowing the eating system access, withdraws that inhibition and starts inhibiting the eating system’s access instead. The result is that you jump up from your chair and begin to run away. Unfortunately, this is hopeless since the velociraptor is faster than you. A few moments later, the velociraptor’s basal ganglia gives the raptor’s “eating” subsystem access to the raptor’s motor system, letting it happily munch down its latest meal.

But let’s leave velociraptors behind and go back to our original example with the phone. Suppose that you have been trying to replace the habit of looking at your phone when bored, to instead smiling and directing your attention to pleasant sensations in your body, and then letting your mind wander.

Until the new habit establishes itself, the two habits will compete for control. Frequently, the old habit will be stronger, and you will just automatically check your phone without even remembering that you were supposed to do something different. For this reason, behavioral change programs may first spend several weeks just practicing noticing the situations in which you engage in the old habit. When you do notice what you are about to do, then more goal-directed subsystems may send bids towards the “smile and look for nice sensations” action. If this happens and you pay attention to your experience, you may notice that long-term it actually feels more pleasant than looking at the phone, reinforcing the new habit until it becomes prevalent.

To put this in terms of the subagent model, we might drastically simplify things by saying that the neural pattern corresponding to the old habit is a subagent reacting to a specific sensation (boredom) in the consciousness workspace: its reaction is to generate an intention to look at the phone. At first, you might train the subagent responsible for monitoring the contents of your consciousness, to output moments of introspective awareness highlighting when that intention appears. That introspective awareness helps alert a goal-directed subagent to try to trigger the new habit instead. Gradually, a neural circuit corresponding to the new habit gets trained up, which starts sending its own bids when it detects boredom. Over time, reinforcement learning in the basal ganglia starts giving that subagent’s bids more weight relative to the old habit’s, until it no longer needs the goal-directed subagent’s support in order to win.

Now this model helps incorporate things like the role of having a vivid emotional motivation, a sense of hope, or psyching yourself up when trying to achieve habit change. Doing things like imagining an outcome that you wish the habit to lead to, may activate additional subsystems which care about those kinds of outcomes, causing them to submit additional bids in favor of the new habit. The extent to which you succeed at doing so, depends on the extent to which your mind-system considers it plausible that the new habit leads to the new outcome. For instance, if you imagine your exercise habit making you strong and healthy, then subagents which care about strength and health might activate to the extent that you believe this to be a likely outcome, sending bids in favor of the exercise action.

On this view, one way for the mind to maintain coherence and readjust its behaviors is its ability to re-evaluate old habits in light of which subsystems get activated when reflecting on the possible consequences of new habits. An old habit having been strongly reinforced reflects that a great deal of evidence has accumulated in favor of it being beneficial, but the behavior in question can still be overridden if enough influential subsystems weigh in with their evaluation that a new behavior would be more beneficial in expectation.

Some subsystems having concerns (e.g. immediate survival) which are ranked more highly than others (e.g. creative exploration) means that the decision-making process ends up carrying out an implicit expected utility calculation. The strengths of bids submitted by different subsystems do not just reflect the probability that those subsystems put on an action being the most beneficial. There are also different mechanisms giving the bids from different subsystems varying amounts of weight, depending on how important the concerns represented by that subsystem happen to be in that situation. This ends up doing something like weighting the probabilities by utility, with the utility weightings shaped by evolution and culture so as to roughly maximize genetic fitness on average. Protectors, of course, are subsystems whose bids are weighted particularly strongly, since the system puts high utility on avoiding the kinds of outcomes they are trying to avoid.
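As a toy illustration of that “probability weighted by importance” idea (with invented subsystem names and numbers), the implicit calculation might look something like this:

```python
# Toy illustration of bid strength behaving like probability times utility.
# Subsystem names and all numbers are made up for illustration.

subsystem_bids = {
    # subsystem: (probability it assigns to its action being best,
    #             situational importance weight of its concern)
    "protector (survival)": (0.3, 10.0),    # survival concerns weighted heavily
    "curiosity (exploration)": (0.9, 1.0),  # exploration weighted lightly
}

effective_bids = {name: p * w for name, (p, w) in subsystem_bids.items()}
winner = max(effective_bids, key=effective_bids.get)

print(effective_bids)  # {'protector (survival)': 3.0, 'curiosity (exploration)': 0.9}
print(winner)          # the protector wins despite assigning a lower probability
```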

The original question which motivated this section was: why are we sometimes incapable of adopting a new habit or abandoning an old one, despite knowing that to be a good idea? And the answer is: because we don’t know that such a change would be a good idea. Rather, some subsystems think that it would be a good idea, but other subsystems remain unconvinced. Thus the system’s overall judgment is that the old behavior should be maintained.

Comment by Kaj_Sotala on Ayn Rand’s model of “living money”; and an upside of burnout · 2024-11-19T15:33:23.122Z · LW · GW

your psyche’s conscious verbal planner “earns” willpower

This seems to assume that there's 1) exactly one planner and 2) it's verbal. I think there are probably different parts that enforce top-down control, some verbal and some maybe not.

For example, exerting willpower to study boring academic material seems like a very different process than exerting willpower to lift weights at the gym.

I think that there is something like:

  • Local beliefs about the usefulness of exerting willpower in a particular context (e.g. someone might not believe that willpower is useful in school but does believe that it's useful in the gym, or vice versa, and correspondingly have more willpower available in one context than the other)
  • To the extent that one has internalized a concept about "willpower" being a single thing, broader beliefs about willpower being useful in general
  • Various neurological and biological variables that determine how strong one's top-down processes are in general, relative to their bottom-up processes (e.g. someone with ADHD will have their bottom-up processes be innately stronger than the top-down ones; medication may then increase the amount of top-down control they have).
  • Various neurological and biological variables that determine which of one's processes get priority in any given situation (e.g. top-down control tends to be inhibited when hungry or tired; various emotional states may either reduce or increase the strength of top-down control)

My model of burnout roughly agrees with both your model and @Matt Goldenberg's. To add to Matt's "burnout as revolt" model, my hunch is that burnout often involves not only a loss of belief that top-down control is beneficial, but also more biological changes to the neural variables that determine the effectiveness of top-down versus bottom-up control. Something in the physical ability of the top-down processes to control the bottom-up ones is damaged, possibly permanently.

Metaphorically, it's like the revolting parts don't just refuse to collaborate anymore; they also blow up some of the infrastructure that was previously used to control them.

Comment by Kaj_Sotala on The hostile telepaths problem · 2024-11-12T12:28:02.047Z · LW · GW

Sounds plausible to me. Alternatively, telling you that they didn't over-apologize still communicates that they would have over-apologized in different circumstances, so it can be a covert way of still delivering that apology.

Comment by Kaj_Sotala on The hostile telepaths problem · 2024-11-11T14:51:06.270Z · LW · GW

A crucial part of every IFS session is to ask the protector what age they think you are (often, at least in examples, it would say something like 5-12) and then you could reveal to it that actually you're 30 (or whatever).

I wouldn't put it as strongly as to say that it's a crucial part of every IFS session. It can sometimes be a very useful question and approach, sure, but I've had/facilitated plenty of great sessions that didn't use that question at all. And there are people for whom that question just doesn't resonate.

Comment by Kaj_Sotala on Should CA, TX, OK, and LA merge into a giant swing state, just for elections? · 2024-11-08T10:54:34.736Z · LW · GW

As far as I know, the latest representative expert survey on the topic is "Thousands of AI Authors on the Future of AI", in which the median estimate for a 50% chance of AGI was either 23 or 92 years away, depending on how the question was phrased:

If science continues undisrupted, the chance of unaided machines outperforming humans in every possible task was estimated at 10% by 2027, and 50% by 2047. [...] However, the chance of all human occupations becoming fully automatable was forecast to reach 10% by 2037, and 50% as late as 2116 (compared to 2164 in the 2022 survey).

Not that these numbers mean much, since AI experts aren't experts on forecasting, but they still suggest a substantial possibility of AGI taking quite a while yet.

Comment by Kaj_Sotala on The Median Researcher Problem · 2024-11-03T21:25:41.830Z · LW · GW

Hmm... let me rephrase: it doesn't seem to me like we would actually have a clear community norm for this, at least not one strong enough to ensure that the median community member would actually be familiar with stats and econ.

Comment by Kaj_Sotala on The Median Researcher Problem · 2024-11-03T19:32:57.699Z · LW · GW

community norms which require basically everyone to be familiar with statistics and economics,

I think this is too strong. There are quite a few posts that don't require knowledge of either one to write, read, or comment on. I'm certain that one could easily accumulate lots of karma and become a well-respected poster without knowing either.

Comment by Kaj_Sotala on The Median Researcher Problem · 2024-11-03T19:23:29.890Z · LW · GW

I had the thought while reading the original post that I recall speaking to at least one researcher who, pre-replication crisis, was like "my work is built on a pretty shaky foundation as is most of the research in this field, but what can you do, this is the way the game is played". So that suggested to me that plenty of median researchers might have recognized the issue but not been incentivized to change it.

Lab leaders aren't necessarily in a much better position. If they feel responsibility toward their staff, they might feel even more pressured to keep gaming the metrics so that the lab can keep getting grants and its researchers good CVs.