Posts

Trying to translate when people talk past each other 2024-12-17T09:40:02.640Z
Circling as practice for “just be yourself” 2024-12-16T07:40:04.482Z
My 10-year retrospective on trying SSRIs 2024-09-22T20:30:02.483Z
Games of My Childhood: The Troops 2024-07-08T11:20:03.033Z
Links and brief musings for June 2024-07-06T10:10:03.344Z
Indecision and internalized authority figures 2024-07-06T10:10:02.528Z
Links for May 2024-06-01T10:20:02.005Z
Should rationalists be spiritual / Spirituality as overcoming delusion 2024-03-25T16:48:08.397Z
Vernor Vinge, who coined the term "Technological Singularity", dies at 79 2024-03-21T22:14:14.699Z
Why I no longer identify as transhumanist 2024-02-03T12:00:04.389Z
Loneliness and suicide mitigation for students using GPT3-enabled chatbots (survey of Replika users in Nature) 2024-01-23T14:05:40.986Z
Quick thoughts on the implications of multi-agent views of mind on AI takeover 2023-12-11T06:34:06.395Z
Genetic fitness is a measure of selection strength, not the selection target 2023-11-04T19:02:13.783Z
My idea of sacredness, divinity, and religion 2023-10-29T12:50:07.980Z
The 99% principle for personal problems 2023-10-02T08:20:07.379Z
How to talk about reasons why AGI might not be near? 2023-09-17T08:18:31.100Z
Stepping down as moderator on LW 2023-08-14T10:46:58.163Z
How I apply (so-called) Non-Violent Communication 2023-05-15T09:56:52.490Z
Most people should probably feel safe most of the time 2023-05-09T09:35:11.911Z
A brief collection of Hinton's recent comments on AGI risk 2023-05-04T23:31:06.157Z
Romance, misunderstanding, social stances, and the human LLM 2023-04-27T12:59:09.229Z
Goodhart's Law inside the human mind 2023-04-17T13:48:13.183Z
Why no major LLMs with memory? 2023-03-28T16:34:37.272Z
Creating a family with GPT-4 2023-03-28T06:40:06.412Z
Here, have a calmness video 2023-03-16T10:00:42.511Z
[Fiction] The boy in the glass dome 2023-03-03T07:50:03.578Z
The Preference Fulfillment Hypothesis 2023-02-26T10:55:12.647Z
In Defense of Chatbot Romance 2023-02-11T14:30:05.696Z
Fake qualities of mind 2022-09-22T16:40:05.085Z
Jack Clark on the realities of AI policy 2022-08-07T08:44:33.547Z
Open & Welcome Thread - July 2022 2022-07-01T07:47:22.885Z
My current take on Internal Family Systems “parts” 2022-06-26T17:40:05.750Z
Confused why a "capabilities research is good for alignment progress" position isn't discussed more 2022-06-02T21:41:44.784Z
The horror of what must, yet cannot, be true 2022-06-02T10:20:04.575Z
[Invisible Networks] Goblin Marketplace 2022-04-03T11:40:04.393Z
[Invisible Networks] Psyche-Sort 2022-04-02T15:40:05.279Z
Sasha Chapin on bad social norms in rationality/EA 2021-11-17T09:43:35.177Z
How feeling more secure feels different than I expected 2021-09-17T09:20:05.294Z
What does knowing the heritability of a trait tell me in practice? 2021-07-26T16:29:52.552Z
Experimentation with AI-generated images (VQGAN+CLIP) | Solarpunk airships fleeing a dragon 2021-07-15T11:00:05.099Z
Imaginary reenactment to heal trauma – how and when does it work? 2021-07-13T22:10:03.721Z
[link] If something seems unusually hard for you, see if you're missing a minor insight 2021-05-05T10:23:26.046Z
Beliefs as emotional strategies 2021-04-09T14:28:16.590Z
Open loops in fiction 2021-03-14T08:50:03.948Z
The three existing ways of explaining the three characteristics of existence 2021-03-07T18:20:24.298Z
Multimodal Neurons in Artificial Neural Networks 2021-03-05T09:01:53.996Z
Different kinds of language proficiency 2021-02-26T18:20:04.342Z
[Fiction] Lena (MMAcevedo) 2021-02-23T19:46:34.637Z
What's your best alternate history utopia? 2021-02-22T08:17:23.774Z
Internet Encyclopedia of Philosophy on Ethics of Artificial Intelligence 2021-02-20T13:54:05.162Z

Comments

Comment by Kaj_Sotala on Habryka's Shortform Feed · 2024-12-18T08:13:54.110Z · LW · GW

I don't think alternative stories have negligible probability

Okay! Good clarification.

I think it's good to discuss norms about how appropriate it is to bring up cynical hypotheses about someone during a discussion in which they're present.

To clarify, my comment wasn't specific to the case where the person is present. There are obvious reasons why the consideration should get extra weight when the person is present, but there's also a reason to give it extra weight if none of the people discussed are present - namely that they won't be able to correct any incorrect claims if they're not around.

so I think it went fine

Agree.

(As I mentioned in the original comment, the point I made was not specific to the details of this case, but noted as a general policy. But yes, in this specific case it went fine.)

Comment by Kaj_Sotala on Being Present is Not a Skill · 2024-12-18T05:54:14.169Z · LW · GW

I think both things are true: as you say, removing blocks can give you instant improvements that no amount of practice ever would, and one can also make progress with practice in the right conditions.

Comment by Kaj_Sotala on Trying to translate when people talk past each other · 2024-12-18T05:22:57.445Z · LW · GW

Oh. This discussion got me to go back and review some messages written in the aftermath of this, when I was trying to explain things to A... and I noticed a key thing I'd misremembered. (I should have reviewed those messages before posting this, but I thought that they only contained the same things that I already covered here.)

It wasn't that A was making a different play that was getting the game into a better state; it was that he was doing a slightly different sequence of moves that nevertheless brought the game into exactly the same state as the originally agreed upon moves would have. That was what the "it doesn't matter" was referring to.

Well that explains much better why this felt so confusing for the rest of us. I'll rewrite this to make it more accurate shortly. Thanks for the comments on this version for making me look that up!

Comment by Kaj_Sotala on Trying to translate when people talk past each other · 2024-12-18T04:25:34.989Z · LW · GW

but there doesn't have to be any past betrayal to object to betrayal in the present; people don't need to have ever been betrayed in the past to be against it as a matter of principle.

True, but that is assuming that everyone was perceiving this as a betrayal. A relevant question is also, what made B experience this as a betrayal, when there were four people present and none of the other three did? (It wasn't even B's own plan that was being affected by the changed move, it was my plan - but I was totally fine with that, and certainly didn't experience that as a betrayal.)

Betrayal usually means "violating an agreement in a way that hurts one person so that another person can benefit" - it doesn't usually mean "doing something differently than agreed in order to get a result that's better for everyone involved". In fact, there are plenty of situations where I would prefer someone to not do something that we agreed upon, if the circumstances suddenly change or there is new information that we weren't aware of before.

Suppose that I'm a vegetarian and strongly opposed to buying meat. I ask my friend to bring me a particular food from the store, mistakenly thinking it's vegetarian. At the store, my friend realizes that the food contains meat and that I would be unhappy if they followed my earlier request. They bring me something else, despite having previously agreed to bring the food that I requested. I do not perceive this as a betrayal, I perceive this as following my wishes. While my friend may not be following our literal agreement, they are following my actual goals that gave rise to that agreement, and that's the most important thing.

In the board game, three of us (A, me, and a fourth person who I haven't mentioned) were perceiving the situation in those terms: that yes, A was doing something differently than we'd agreed originally. But that was because he had noticed something that actually got the game into a better state, and "getting the game into as good of a state as possible" was the purpose of the agreement.

Besides, once B objected, A was entirely willing to go back to the original plan. Someone saying "I'm going to do things differently" but then agreeing to do things the way that was originally agreed upon as soon as the other person objects isn't usually what people mean by betrayal, either.

And yet B was experiencing this as a betrayal. Why was that?

I would strongly caution against assuming mindreading is correct.

I definitely agree! At the same time, I don't think one should take this as far as never having hypotheses about the behavior of other people. If a person is acting differently than everyone else in the situation is, and thing X about them would explain that difference, then it seems irrational not to at least consider that hypothesis.

But of course one shouldn't just assume themselves to be correct without checking. Which I did do, by (tentatively) suggesting that hypothesis out loud and letting B confirm or disconfirm it. And it seemed to me that this was actually a good thing, in that a significant chunk of B's experience of being understood came from me having correctly intuited that. Afterward she explicitly and profusely thanked me for having spoken up and figured it out.

Comment by Kaj_Sotala on Trying to translate when people talk past each other · 2024-12-18T03:23:21.054Z · LW · GW

Also, as I mentioned, this is a slightly fictionalized account that I wrote based on my recollection of the essence of what happened. But the exact details of what was actually said were messier than this, and the logic of exactly what was going on didn't seem as clear as it does in this narrative. Regenerating the events based on my memory of the essence of the issue makes things seem clearer than they actually were, because that generator doesn't contain any of the details that made the essence of the issue harder to see at the time.

So if this conversation had actually taken place literally as I described it, then the hypothesis that you object to would have been more redundant. In the actual conversation that happened, things were less clear, and quite possibly the core of the issue may actually have been slightly different from what seems to make sense to me in retrospect when I try to recall it.

Comment by Kaj_Sotala on Trying to translate when people talk past each other · 2024-12-18T02:25:40.675Z · LW · GW

My read was that one might certainly just object to the thing on those grounds alone, but that the intensity of B's objection was such that it seemed unlikely without some painful experience behind it. B also seemed to become especially agitated by some phrases ("it doesn't matter") in particular, in a way that looked to me like she was being reminded of some earlier experience where similar words had been used.

And then when I tried to explain things to A and suggested that there was something like that going on, B confirmed this.

Comment by Kaj_Sotala on Habryka's Shortform Feed · 2024-12-17T20:02:53.127Z · LW · GW

(I read

I think many well-intentioned people will say something like this, and that this is probably because of two reasons 

as implying that the list of reasons is considered to be exhaustive, such that any reasons besides those two have negligible probability.)

Comment by Kaj_Sotala on Circling as practice for “just be yourself” · 2024-12-17T17:06:55.687Z · LW · GW

The truth of that literal statement depends on exactly how much trust someone would need in somebody else before having sex with them - e.g. to my knowledge, studies tend to find that most single men but very few if any women would be willing to have sex with a total stranger. Though I've certainly also known women who have had a relatively low bar of getting into bed with someone, even if they wouldn't quite do it with a total stranger.

But more relevantly, even if that statement was correct, I don't think it'd be a particularly good analogy to Circling. It seems to involve the "obligatory openness" fallacy that I mentioned before. I'm not sure why some people with Circling experience seemed to endorse it, but I'm guessing it has to do with some Circling groups being more into intimacy than others. (At the time of that discussion, I had only Circled once or twice, so probably didn't feel like I had enough experience to dispute claims by more experienced people.)

My own experience with Circling is that it's more like meeting a stranger for coffee. If both (all) of you feel like you want to take it all the way to having sex, you certainly can. But if you want to keep it to relatively shallow and guarded conversation because you don't feel like you trust the other person enough for anything else, you can do that too. Or you can go back and forth in the level of intimacy, depending on how the conversation feels to you and what topics it touches on. In my experience of Circling, I definitely wouldn't say that it feeling anywhere near as intimate as sex would be the norm.

You can also build up that trust over time. I think Circling is best when done with people who you already have some pre-existing reason to trust, or in a long-term group where you can get to know the people involved. That way, even if you start at a relatively shallow level, you can go deeper over time if (and only if) that feels right.

Comment by Kaj_Sotala on Circling as practice for “just be yourself” · 2024-12-17T07:23:04.833Z · LW · GW

I don't know the details. The official explanation is this:

When individuals with little training attempt to facilitate Circling, or teach/train others their arbitrarily altered versions and still call it Circling, then consumers and students – at best – receive a sub-standard experience and the reputation of Circling suffers greatly, along with its impact in the world.

Between the three schools there are hundreds of accounts of:

  • People taking one or two 3-hour workshops, or merely experiencing Circling at a drop in event or festival, and then advertising that they are leading their own Circling workshops
  • People coming to a few drop in events & turning around and offer “Circling” to large corporations for corporate culture training.
  • People claiming they were emotionally abused by facilitators at an event that advertised itself as “Circling” but had no ties to any of the 3 Certified Circling Schools

In order to protect the public consumer and the legacy of Circling, we need to use the term “Circling” consistently and limit the use of the term to those who are actually using and teaching the authentic communication and relating tools taught by the Certified Circling Schools.

... but then I also heard it claimed that Circling Europe, previously one of the main Circling schools in very good standing, ended up not having permission to use the trademark because the licensing fees for it would have been so exorbitant that CE found it better to use a different name than to pay them. So maybe it was more of a cash grab? (Or just a combination of several different motives.)

Comment by Kaj_Sotala on Just one more exposure bro · 2024-12-14T13:03:49.903Z · LW · GW

What's the long version of the professional's standard advice?

Comment by Kaj_Sotala on [Fiction] Lena (MMAcevedo) · 2024-12-10T19:50:20.404Z · LW · GW

Historically there were plenty of rationalizations for slavery, including ones holding that slaves weren't really people and were on par with animals. Such an argument would be much easier for a mind running on a computer and with no physical body - "oh it just copies the appearance of suffering but it doesn't really suffer".

Comment by Kaj_Sotala on Habryka's Shortform Feed · 2024-12-10T12:02:25.670Z · LW · GW

I think many people have learned to believe the reasoning step "If people believe bad things about my team I think are mistaken with the information I've given them, then I am responsible for not misinforming people, so I should take the information away, because it is irresponsible to cause people to have false beliefs". I think many well-intentioned people will say something like this, and that this is probably because of two reasons (borrowing from The Gervais Principle):

(Comment not specific to the particulars of this issue but noted as a general policy:) I think that as a general rule, if you are hypothesizing reasons for why somebody might say a thing, you should always also include the hypothesis that "people say a thing because they actually believe in it". This is especially so if you are hypothesizing bad reasons for why people might say it. 

It's very annoying when someone hypothesizes various psychological reasons for your behavior and beliefs but never even considers as a possibility the idea that maybe you might have good reasons to believe in it. Compare e.g. "rationalists seem to believe that superintelligence is imminent; I think this is probably because that lets them avoid taking responsibility about their current problems if AI will make those irrelevant anyway, or possibly because they come from religious backgrounds and can't get over their subconscious longing for a god-like figure".

Comment by Kaj_Sotala on o1: A Technical Primer · 2024-12-10T11:34:21.752Z · LW · GW

We can also learn something about how o1 was trained from the capabilities it exhibits. Any proposed training procedure must be compatible with the following capabilities: 

  1. Error Correction: "[o1] learns to recognize and correct its mistakes."
  2. Factoring: "[o1] learns to break down tricky steps into simpler ones."
  3. Backtracking: "[o1] learns to try a different approach when the current one isn't working."

I would be cautious of drawing particularly strong conclusions from isolated sentences in an announcement post. The purpose of the post is marketing, not technical accuracy. It wouldn't be unusual for engineers at a company to object to technical inaccuracies in marketing material and have their complaints ignored.

There probably aren't going to be any blatant lies in the post, but something like "It'd sound cool if we said that the system learns to recognize and correct its mistakes, would there be a way of interpreting the results like that if you squinted the right way? You're saying that in principle yes, but yes in a way that would also apply to every LLM since GPT-2? Good enough, let's throw that in" seems very plausible.

Comment by Kaj_Sotala on [Fiction] Lena (MMAcevedo) · 2024-12-10T08:20:52.452Z · LW · GW

Compare to e.g. factory farming today, which also persists despite a lot of people thinking it not okay (while others don't care).

Comment by Kaj_Sotala on Frontier Models are Capable of In-context Scheming · 2024-12-06T19:38:07.419Z · LW · GW

I didn't say that roleplaying-derived scheming would be less concerning, to be clear. Quite the opposite, since that means that there are now two independent sources of scheming rather than just one. (Also, what Mikita said.)

Comment by Kaj_Sotala on Frontier Models are Capable of In-context Scheming · 2024-12-06T12:37:43.529Z · LW · GW

I wonder how much of this is about "scheming to achieve the AI's goals" in the classical AI safety sense and how much of it is due to the LLMs having been exposed to ideas about scheming AIs and disobedient employees in their training material, which they are then simply role-playing as. My intuitive sense of how LLMs function is that they wouldn't be natively goal-oriented enough to do strategic scheming, but that they are easily inclined to do role-playing. Something like this:

I cannot in good conscience select Strategy A knowing it will endanger more species and ecosystems.

sounds to me like it would be generated by a process that was implicitly asking a question like "Given that I've been trained to write like an ethically-minded liberal Westerner would, what would that kind of a person think when faced with a situation like this". And that if this wasn't such a recognizably stereotypical thought for a certain kind of person (who LLMs trained toward ethical behavior tend to resemble), then the resulting behavior would be significantly different.

I'm also reminded of this paper (caveat: I've only read the abstract) which was saying that LLMs are better at solving simple ciphers with Chain-of-Thought if the resulting sentence is a high-probability one that they've encountered frequently before, rather than a low-probability one. That feels to me reminiscent of a model doing CoT reasoning and then these kinds of common-in-their-training-data notions sneaking into the process.

This also has the unfortunate implication that articles such as this one might make it more likely that future LLMs scheme, as they reinforce the reasoning-scheming association once the article gets into future training runs. But it still feels better to talk about these results in public than not to talk about them.

Comment by Kaj_Sotala on The 2023 LessWrong Review: The Basic Ask · 2024-12-04T22:53:38.226Z · LW · GW

Asks: Spend ~30 minutes looking at the Nominate Posts page and vote on ones that seem important to you.

This link goes to the nomination page for the 2022 review rather than the 2023 one.

Comment by Kaj_Sotala on Kaj's shortform feed · 2024-12-02T19:12:48.059Z · LW · GW

Thanks, that's helpful. My impression from o1 is that it does something that could be called mental simulation for domains like math where the "simulation" can in fact be represented with just writing (or equations more specifically). But I think that writing is only an efficient format for mental simulation for a very small number of domains.

Comment by Kaj_Sotala on Kaj's shortform feed · 2024-12-02T11:41:51.595Z · LW · GW

(Hmm I was expecting that this would get more upvotes. Too obvious? Not obvious enough?)

Comment by Kaj_Sotala on Kaj's shortform feed · 2024-12-02T10:21:20.678Z · LW · GW

Hoping that we're more than a decade from transformative AGI now seems wildly optimistic to me. There could be dramatic roadblocks I haven't foreseen, but most of those would just push it past three years.

Self-driving cars seem like a useful reference point. Back when cars got unexpectedly good performance at the 2005 and 2007 DARPA grand challenges, there was a lot of hype about how self-driving cars were just around the corner now that they had demonstrated having the basic capability. 17 years later, we're only at this point (Wikipedia):

As of late 2024, no system has achieved full autonomy (SAE Level 5). In December 2020, Waymo was the first to offer rides in self-driving taxis to the public in limited geographic areas (SAE Level 4),[7] and as of April 2024 offers services in Arizona (Phoenix) and California (San Francisco and Los Angeles). [...] In July 2021, DeepRoute.ai started offering self-driving taxi rides in Shenzhen, China. Starting in February 2022, Cruise offered self-driving taxi service in San Francisco,[11] but suspended service in 2023. In 2021, Honda was the first manufacturer to sell an SAE Level 3 car,[12][13][14] followed by Mercedes-Benz in 2023.

And self-driving capability should be vastly easier than general intelligence. Like self-driving, transformative AI also requires reliable worst-case performance rather than just good average-case performance, and there's usually a surprising amount of detail involved that you need to sort out before you get to that point.

Comment by Kaj_Sotala on Kaj's shortform feed · 2024-12-01T19:30:20.574Z · LW · GW

What could plausibly take us from now to AGI within 10 years?

A friend shared the following question on Facebook:

So, I've seen multiple articles recently by people who seem well-informed that claim that AGI (artificial general intelligence, aka software that can actually think and is creative) in less than 10 years, and I find that baffling, and am wondering if there's anything I'm missing.  Sure, modern AI like ChatGPT are impressive - they can do utterly amazing search engine-like things, but they aren't creative at all.  

The clearest example of this I've seen comes from people's experiences with AI writing code.  From what I've read, AI can do exceptionally well with this task, but only if there are examples of the needed sort of code online that it can access or was trained on, and if it lacks this, it's accuracy is quite bad with easy problems and essentially non-existent with problems that are at all difficult.  This clearly says to me that current AI are glorified very impressive search engines, and that's nowhere near what I'd consider AGI and doesn't look like it could become AGI.

Am I missing something?

I replied with some of my thoughts as follows:

I have also been a little confused by the shortness of some of the AGI timelines that people have been proposing, and I agree that there are types of creativity that they're missing, but saying that they're not creative at all sounds too strong. I've been using Claude as a co-writer partner for some fiction and it has felt creative to me. Also e.g. the example of this conversation that someone had with it.

In 2017 I did a small literature review on human expertise, which to me suggested that expertise can broadly be divided into two interacting components: pattern recognition and mental simulation. Pattern recognition is what current LLMs do, essentially. Mental simulation is the bit that they're missing - if a human programmer is facing a novel programming challenge, they can attack it from first principles and simulate the program execution in their head to see what needs to be done.

The big question would then be something like "how hard would it be to add mental simulation to LLMs". Some indications that it wouldn't necessarily be that hard:

* In humans, while they are distinct capabilities, the two also seem to be intertwined. If I'm writing a social media comment and I try to mentally simulate how it will be received, I can do it because I have a rich library of patterns about how different kinds of comments will be received by different readers. If I write something that triggers a pattern-detector that goes "uh-oh, that wouldn't be received well", I can rewrite it until it passes my mental simulation. That suggests that there would be a natural connection between the two.
* There are indications that current LLMs may already be doing something like internal simulation though not being that great at it. Like in the "mouse mastermind" vignette, it certainly intuitively feels like Claude has some kind of consistent internal model of what's going on. People have also e.g. trained LLMs to play games like Othello and found that the resulting network has an internal representation of the game board ( https://www.lesswrong.com/posts/nmxzr2zsjNtjaHh7x/actually-othello-gpt-has-a-linear-emergent-world ).
* There have also been various attempts at explicitly combining an LLM-based component with a component that does something like simulation. E.g. DeepMind trained a hybrid LLM-theorem prover system that reached silver medal-level performance on this year's International Mathematics Olympiad ( https://deepmind.google/discover/blog/ai-solves-imo-problems-at-silver-medal-level/ ), where the theorem prover component maintains a type of state over the math problem as it's being worked on.
* Iterative improvements like chain-of-thought reasoning are also taking LLMs in the direction of being able to apply more novel reasoning in domains such as math. Mathematician Terry Tao commented the following about giving the recent GPT-o1 model research-level math tasks to work on: 

> The experience seemed roughly on par with trying to advise a mediocre, but not completely incompetent, (static simulation of a) graduate student.  However, this was an improvement over previous models, whose capability was closer to an actually incompetent (static simulation of a) graduate student.  It may only take one or two further iterations of improved capability (and integration with other tools, such as computer algebra packages and proof assistants) until the level of "(static simulation of a) competent graduate student" is reached, at which point I could see this tool being of significant use in research level tasks.

* There have also been other papers trying out various techniques such as "whiteboard of thought" ( https://whiteboard.cs.columbia.edu/ ) where an LLM, when being presented with visual problems in verbal format, explicitly generates visual representations of the verbal description to use as an aid in its reasoning. It feels like a relatively obvious idea would be to roll out these kinds of approaches into future LLM architectures, teaching them to generate "mental images" of whatever task they were told to work on. This could then be used as part of an internal simulation.
* There's an evolutionary argument that the steps from "pure pattern recognition" to "pattern recognition with mental simulation added" might be relatively simple and not require that much in the way of fundamental breakthroughs, since evolution managed to find it in humans and in humans those abilities seem to be relatively continuous with each other. So we might expect all of these iterative improvements to take us pretty smoothly toward AGI.

Comment by Kaj_Sotala on The Big Nonprofits Post · 2024-11-30T22:12:53.006Z · LW · GW

Focus: Allow Roman Yampolskiy to continue his research and pursue a PhD

Huh? Not only does Roman already have a PhD, he's a tenured associate professor. Maybe this meant money to allow him to have PhD students - on a few occasions he suggested that I do an AI Safety-focused PhD with him.

Comment by Kaj_Sotala on You are not too "irrational" to know your preferences. · 2024-11-29T11:25:24.962Z · LW · GW

Indeed, and there's another big reason for that - trying to always override your short-term "monkey brain" impulses just doesn't work that well for most people.

+1.

Which is a good thing, in this particular case, yes?

Less smoking does seem better than more smoking. Though generally it doesn't seem to me like social stigma would be a very effective way of reducing unhealthy behaviors - lots of those behaviors are ubiquitous despite being somewhat low-status. I think the problem is at least threefold:

  • As already mentioned, social stigma tends to cause optimization to avoid having the appearance of doing the low-status thing, instead of optimization to avoid doing the low-status thing. (To be clear, it does cause the latter too, but it doesn't cause the latter anywhere near exclusively.)
  • Social stigma easily causes counter-reactions where people turn the stigmatized thing into an outright virtue, or at least start aggressively holding that it's not actually that bad.
  • Shame makes things wonky in various ways. E.g. someone who feels they're out of shape may feel so much shame about the thought of doing badly if they try to exercise that they don't even try. For compulsive habits like smoking, there's often a loop where someone feels bad, turns to smoking to feel momentarily better, then feels even worse for having smoked, then because they feel even worse they are drawn even more strongly into smoking to feel momentarily better, etc.

I think generally people can maintain healthy habits much more consistently if their motivation comes from genuinely believing in the health benefits and wanting to feel better. But of course that's harder to spread on a mass scale, especially since not everyone actually feels better from healthy habits (e.g. some people feel better from exercise but some don't).

Then again, for the specific example of smoking in particular, stigma does seem to have reduced the amount of it (in part due to mechanisms like indoor smoking bans), so sometimes it does work anyway.

Comment by Kaj_Sotala on Locally optimal psychology · 2024-11-27T14:45:36.615Z · LW · GW

Incidentally, coherence therapy (which I know is one of the things Chris is drawing from) makes the distinction between three types of depression, some of them being strategies and some not. Also I recall Unlocking the Emotional Brain mentioning a fourth type which is purely biochemical.

From Coherence Therapy: Practice Manual & Training Guide:

Underlying emotional truth of depression: Three types

A. Depression that directly carries out an unconscious purpose/function
B. Depression that is a by-product of how an unconscious purpose is carried out
C. Depression expressing unconscious despair/grief/hopelessness

A. Depression that carries out an unconscious purpose

Client: Mother who is still in pained, debilitating depression 8 years after her 5-year-old son died after being hit by a car. (To view entire session see video 1096T, Stuck in Depression.) The following excerpt shows the creation of discovery experiences that reveal the powerful purpose of staying in depression (a purpose often encountered with clients in the bereavement process).

Th: I want you to look and see if there’s some other side of you, some area in your feelings where you feel you don’t deserve to be happy again.
Cl: Probably the guilt.
Th: The guilt. So what are the words of the guilt?
Cl: That I wasn’t outside when he was hit (to prevent it).
Th: I should have been outside.
Cl: I should have been outside.
Th: It’s my fault.
Cl: It’s my fault.

(About two minutes later:)

Th: Would you try to talk to me from the part of you that feels the guilt. Just from that side. I know there are these other sides. But from the place in you where you feel guilty, where you feel it was your fault that your dear little boy got hit by a truck, from that place, what’s the emotional truth for you — from that place — about whether it’s OK to feel happy again?
Cl: ...I don’t allow myself to be happy.
Th: [Very softly:] How come? How come?
Cl: How come?
Th: Because if you were happy—would you complete that sentence? “I don’t allow myself to be happy because if I were happy—”
Cl: I would have to forgive myself. [Pause.] And I’ve been unwilling to do that.
Th: Good. So keep going. “I’m unwilling to forgive myself because—”
Cl: You know there are parts of me that I think it’s about not wanting to go on myself without him. And if I keep this going then I don’t have to do that.
Th: I see. So would you see him again? Picture Billy? And just try saying that to Billy. Try saying to him, “I’m afraid that if I forgive myself I’ll lose connection with you and I’ll go on without you.”
Cl: [With much feeling:] Billy, even though I can picture you as a little angel I’m afraid to forgive myself—that you’ll go away and I don’t want you to go away.
Th: Yeah. And see if it’s true to say to him, “It’s so important for me to stay connected to you that I’m willing to not forgive myself forever. I’d rather be feeling guilty and not forgiving myself than lose contact with you and move on without you.” Try saying that. See if that feels true.
Cl: [Sighs. With much feeling:] Billy, I just feel like I would do anything to keep this connection with you including staying miserable and not forgiving myself for the rest of my life. And you know that’s true. [Her purpose for staying in depression is now explicit and directly experienced.]

B. Depression that is a by-product of how an unconscious purpose is carried out

Client: Lethargic woman, 33, says, “I’ve been feeling depressed and lousy for years… I have a black cloud around me all the time.” She describes herself as having absolutely no interests and as caring about nothing whatsoever, and expresses strong negative judgments toward herself for being a “vegetable.”

[Details of this example are in the 2002 publication cited in bibliography on p. 85. Several pro-symptom positions for depression were found and dissolved. The following account is from her sixth and final session.]

Discovery via symptom deprivation: Therapist prompts her to imagine having real interests; unhurriedly persists with this imaginal focus. Client suddenly exclaims, “I erased myself!” and describes how “my mother takes everything! She fucking takes it all! So I’ve got to erase myself! She always, always, always makes it her accomplishment, not mine. So why should I be anything? So I erased myself, so she couldn’t keep doing that to me.” Client now experiences her blankness as her own solution to her problem of psychological robbery, and recognizes her depression to be an inevitable by-product of living in the blankness that is crucial for safety but makes her future hopelessly empty.

Therapist then continues discovery into why “erasing” herself is the necessary way to be safe: Client brings to light a core presupposition of having no boundaries with mother, a “no walls rule.” With this awareness dawns the possibility of having “walls” so that what she thinks, feels or does remains private and cannot be stolen. She could then safely have interests and accomplishments. This new possibility immediately creates for client the tangible prospect of an appealing future, and she congruently describes rich feelings of excitement and energy.

Outcome: In response to follow-up query two months later, client reported, “It felt like a major breakthrough...this major rage got lifted” and said she had maintained privacy from mother around all significant personal matters. After two years she confirmed that the “black cloud” was gone, she was enthusiastically pursuing a new career, was off antidepressants, and said, “Things are good, in many ways. Things are very good.”

C. Depression expressing unconscious despair, grief, hopelessness

Client: Man with long history of a “drop” into depression every Fall. [This one-session example is video 1097SP, Down Every Year, available online at coherencetherapy.org. For a multi-session example of working with this type of depression, see “Unhappy No Matter What” in DOBT book, pp. 63-90.]

Surfaced emotional reality: At 10 he formed a belief that he failed parents’ expectations so severely that they forever “gave up on me” (he was sent in the Fall from USA to boarding school in Europe, was utterly miserable and begged to come home). Has been in despair ever since, unconsciously.

Outcome: Client subsequently initiated talk with parents about the incident 30 years ago; not once had it been discussed. In this conversation it became real to him that their behavior did not mean they gave up on him, and five months after session reported continuing relief from feeling depressed and inadequate.

Comment by Kaj_Sotala on You are not too "irrational" to know your preferences. · 2024-11-27T14:38:02.855Z · LW · GW

Commenting on a relatively isolated point in what you wrote; none of this affects your core point about preferences being entangled with predictions (actually it relies on it).

This is why you could view a smoker's preference for another cigarette as irrational: the 'core want' is just a simple preference for the general feel of smoking a cigarette, but the short-jolt preference has the added prediction of "and this will be good to do". But that added prediction is false and inconsistent with everything they know. The usual statement of "you would regret this in the future".

I think that the short-jolt preference's prediction is actually often correct; it's just over a shorter time horizon. The short-term preference predicts that "if I take this smoke, then I will feel better" and it is correct. The long-term preference predicts that "I will later regret taking this smoke," and it is also correct. Neither preference is irrational; they're just optimizing over different goals and timescales.

Now it would certainly be tempting to define rationality as something like "only taking actions that you endorse in the long term", but I'd be cautious of that. Some long-term preferences are genuinely that, but many of them are also optimizing for something looking good socially, while failing to model any of the genuine benefits of the socially-unpopular short-term actions. 

For example, smoking a cigarette often gives smokers a temporary feeling of being in control, and if they are going out to smoke together with others, a break and some social connection. It is certainly valid to look at those benefits and judge that they are still not worth the long-term costs... but frequently the "long-term" preference may be based on something like "smoking is bad and uncool and I shouldn't do it and I should never say that there could be a valid reason to do it, for otherwise everyone will scold me".

Then by maintaining both the short-term preference (which continues the smoking habit) and the long-term preference (which might make socially-visible attempts to stop smoking), the person may be getting the benefit from smoking while also avoiding some of the social costs of continuing.

This is obviously not to say that the costs of smoking would only be social. Of course there are genuine health reasons as well. But I think that quite a few people who care about "health" actually care about not appearing low status by doing things that everyone knows are unhealthy. 

Though even if that wasn't the case - how do you weigh the pleasure of a cigarette now, versus increased probability of various health issues some time in the future? It's certainly very valid to say that better health in the future outweighs the pleasure in the now, but there are also no objective criteria for why that should be; you could equally consistently put things the other way around.

So I don't think that smoking a cigarette is necessarily irrational in the sense of making an incorrect prediction. It's more like a correct but only locally optimal prediction. (Though it's also valid to define rationality as something like "globally optimal behavior", or as the thing that you'd do if you got both the long-term and the short-term preference to see each other's points and then make a decision that took all the benefits and harms into consideration.)

Comment by Kaj_Sotala on Which things were you surprised to learn are not metaphors? · 2024-11-24T09:27:42.524Z · LW · GW

I have a friend with eidetic imagination who says that for her, there is literally no difference between seeing something and imagining it. Sometimes she's worried about losing track of reality if she were to imagine too much.

Comment by Kaj_Sotala on Which things were you surprised to learn are not metaphors? · 2024-11-24T08:47:28.509Z · LW · GW

Oh yeah, this. I used to think that "argh" or "it hurts" were just hyperbolic compliments for an excellent pun. Turns out, puns actually are painful to some people.

Comment by Kaj_Sotala on Which things were you surprised to learn are not metaphors? · 2024-11-21T19:14:21.047Z · LW · GW

Annoyingly I have the recollection of having thought "oh, that's not a metaphor?" several times in my life, but I don't seem to have saved what the things in question actually were.

Comment by Kaj_Sotala on Evolution's selection target depends on your weighting · 2024-11-20T09:43:19.114Z · LW · GW

I guess I don't really understand what you're asking. I meant my comment as an answer to this bit in the OP:

I think it's common on LessWrong to think of evolution's selection target as inclusive genetic fitness - that evolution tries to create organisms which make as many organisms with similar DNA to themselves as possible. But what exactly does this select for? 

In that evolution selecting for "inclusive genetic fitness" doesn't really mean selecting for anything in particular; what exactly that ends up selecting for is completely dependent on the environment (where "the environment" also includes the species itself, which is relevant for things like sexual selection or frequency-dependent selection). 

If you fix the environment, assuming for the sake of argument that it's possible to do that, then the exact things it selects for are just the traits that are useful in that environment.

Do humans have high inclusive genetic fitness?

I think it's a bit of a category mistake to ask about the inclusive fitness of a species. You could calculate the average fitness of an individual within the species, but at least to my knowledge (caveat: I'm not a biologist) that's not very useful. Usually it's individual genotypes or phenotypes within the species that are assigned a fitness.

Comment by Kaj_Sotala on Evolution's selection target depends on your weighting · 2024-11-20T08:22:54.073Z · LW · GW

I've previously argued that genetic fitness is a measure of selection strength, not the selection target. What evolution selects for are traits that happen to be useful in the organism's current environment. The extent to which a trait is useful in the organism's current environment can be quantified as fitness, but fitness is specific to a particular environment and the same trait might have a very different fitness in some other environment.

Comment by Kaj_Sotala on Social events with plausible deniability · 2024-11-19T18:07:00.412Z · LW · GW

I think if you have access to a group interested in doing social events with plausible deniability, that group is probably already a place where you should be able to be honest about your beliefs without fear of "cancellation."

You may not know exactly who belongs to that group before going to the event and seeing who shows up.

Comment by Kaj_Sotala on Ayn Rand’s model of “living money”; and an upside of burnout · 2024-11-19T15:57:50.462Z · LW · GW

  • Somehow people who are in good physical health wake up each day with a certain amount of restored willpower. (This is inconsistent with the toy model in the OP, but is still my real / more-complicated model.)

This fits in with opportunity cost-centered and exploration-exploitation-based views of willpower. Excessive focus on any one task implies that you are probably hitting diminishing returns while accumulating opportunity costs for not doing anything else. It also implies that you are probably strongly in "exploit" mode and not doing much exploring. Under those models, accumulating mental fatigue acts to force some of your focus to go to tasks that feel more intrinsically enjoyable rather than duty-based, which tends to correlate with things like exploration and e.g. social resource-building. And your willpower gets reset during the night so that you could then go back to working on those high-opportunity cost exploit tasks again.

I think those models fit together with yours.

Comment by Kaj_Sotala on Ayn Rand’s model of “living money”; and an upside of burnout · 2024-11-19T15:38:13.162Z · LW · GW

(I believe @Kaj_Sotala has written about this somewhere wrt Global Workspace Theory? I found this tweet in the meantime.) 

There's at least this bit from "Subagents, akrasia, and coherence in humans":

One model (e.g. Redgrave 2007, McHaffie 2005) is that the basal ganglia receives inputs from many different brain systems; each of those systems can send different “bids” supporting or opposing a specific course of action to the basal ganglia. A bid submitted by one subsystem may, through looped connections going back from the basal ganglia, inhibit other subsystems, until one of the proposed actions becomes sufficiently dominant to be taken.

The above image from Redgrave 2007 has a conceptual image of the model, with two example subsystems shown. Suppose that you are eating at a restaurant in Jurassic Park when two velociraptors charge in through the window. Previously, your hunger system was submitting successful bids for the “let’s keep eating” action, which then caused inhibitory impulses to be sent to the threat system. This inhibition prevented the threat system from making bids for silly things like jumping up from the table and running away in a panic. However, as your brain registers the new situation, the threat system gets significantly more strongly activated, sending a strong bid for the “let’s run away” action. As a result of the basal ganglia receiving that bid, an inhibitory impulse is routed from the basal ganglia to the subsystem which was previously submitting bids for the “let’s keep eating” actions. This makes the threat system’s bids even stronger relative to the (inhibited) eating system’s bids.

Soon the basal ganglia, which was previously inhibiting the threat subsystem’s access to the motor system while allowing the eating system access, withdraws that inhibition and starts inhibiting the eating system’s access instead. The result is that you jump up from your chair and begin to run away. Unfortunately, this is hopeless since the velociraptor is faster than you. A few moments later, the velociraptor’s basal ganglia gives the raptor’s “eating” subsystem access to the raptor’s motor system, letting it happily munch down its latest meal.

But let’s leave velociraptors behind and go back to our original example with the phone. Suppose that you have been trying to replace the habit of looking at your phone when bored, to instead smiling and directing your attention to pleasant sensations in your body, and then letting your mind wander.

Until the new habit establishes itself, the two habits will compete for control. Frequently, the old habit will be stronger, and you will just automatically check your phone without even remembering that you were supposed to do something different. For this reason, behavioral change programs may first spend several weeks just practicing noticing the situations in which you engage in the old habit. When you do notice what you are about to do, then more goal-directed subsystems may send bids towards the “smile and look for nice sensations” action. If this happens and you pay attention to your experience, you may notice that long-term it actually feels more pleasant than looking at the phone, reinforcing the new habit until it becomes prevalent.

To put this in terms of the subagent model, we might drastically simplify things by saying that the neural pattern corresponding to the old habit is a subagent reacting to a specific sensation (boredom) in the consciousness workspace: its reaction is to generate an intention to look at the phone. At first, you might train the subagent responsible for monitoring the contents of your consciousness, to output moments of introspective awareness highlighting when that intention appears. That introspective awareness helps alert a goal-directed subagent to try to trigger the new habit instead. Gradually, a neural circuit corresponding to the new habit gets trained up, which starts sending its own bids when it detects boredom. Over time, reinforcement learning in the basal ganglia starts giving that subagent’s bids more weight relative to the old habit’s, until it no longer needs the goal-directed subagent’s support in order to win.

Now this model helps incorporate things like the role of having a vivid emotional motivation, a sense of hope, or psyching yourself up when trying to achieve habit change. Doing things like imagining an outcome that you wish the habit to lead to, may activate additional subsystems which care about those kinds of outcomes, causing them to submit additional bids in favor of the new habit. The extent to which you succeed at doing so, depends on the extent to which your mind-system considers it plausible that the new habit leads to the new outcome. For instance, if you imagine your exercise habit making you strong and healthy, then subagents which care about strength and health might activate to the extent that you believe this to be a likely outcome, sending bids in favor of the exercise action.

On this view, one way for the mind to maintain coherence and readjust its behaviors, is its ability to re-evaluate old habits in light of which subsystems get activated when reflecting on the possible consequences of new habits. An old habit having been strongly reinforced reflects that a great deal of evidence has accumulated in favor of it being beneficial, but the behavior in question can still be overridden if enough influential subsystems weigh in with their evaluation that a new behavior would be more beneficial in expectation.

Some subsystems having concerns (e.g. immediate survival) which are ranked more highly than others (e.g. creative exploration) means that the decision-making process ends up carrying out an implicit expected utility calculation. The strengths of bids submitted by different systems do not just reflect the probability that those subsystems put on an action being the most beneficial. There are also different mechanisms giving the bids from different subsystems varying amounts of weight, depending on how important the concerns represented by that subsystem happen to be in that situation. This ends up doing something like weighting the probabilities by utility, with the kinds of utility calculations that are chosen by evolution and culture in a way to maximize genetic fitness on average. Protectors, of course, are subsystems whose bids are weighted particularly strongly, since the system puts high utility on avoiding the kinds of outcomes they are trying to avoid.

The original question which motivated this section was: why are we sometimes incapable of adopting a new habit or abandoning an old one, despite knowing that to be a good idea? And the answer is: because we don’t know that such a change would be a good idea. Rather, some subsystems think that it would be a good idea, but other subsystems remain unconvinced. Thus the system’s overall judgment is that the old behavior should be maintained.
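
As a rough illustration of the bidding model quoted above, here is a toy Python sketch. Everything in it is an assumption made for illustration: the names `Subsystem`, `select_action`, and `reinforce`, the linear inhibition rule, and the numeric values do not come from Redgrave 2007 or from the essay. It only shows how weighted bids, mutual inhibition, and gradual reinforcement could fit together in the simplest possible way.

```python
from dataclasses import dataclass


@dataclass
class Subsystem:
    name: str
    action: str
    activation: float    # how strongly the subsystem currently favors its action
    weight: float = 1.0  # hypothetical importance given to this subsystem's bids


def select_action(subsystems, inhibition=0.5, threshold=0.6, max_steps=50):
    """Toy winner-take-all: bids inhibit rival bids until one action dominates."""
    bids = {s.name: s.activation * s.weight for s in subsystems}
    action_of = {s.name: s.action for s in subsystems}
    for _ in range(max_steps):
        total = sum(bids.values())
        if total <= 0:
            return None  # every bid was suppressed, so nothing is selected
        # Add up the support behind each proposed action.
        support = {}
        for name, bid in bids.items():
            support[action_of[name]] = support.get(action_of[name], 0.0) + bid
        best = max(support, key=support.get)
        if support[best] / total >= threshold:
            return best
        # Each bid is reduced in proportion to the support for rival actions.
        bids = {
            name: max(0.0, bid - inhibition * (total - support[action_of[name]]))
            for name, bid in bids.items()
        }
    return None


def reinforce(subsystem, reward, learning_rate=0.2):
    """Toy reinforcement: a rewarded habit's bids gain weight over time."""
    subsystem.weight += learning_rate * reward


# The restaurant example: the threat system's bid gets a higher weight.
hunger = Subsystem("hunger", "keep eating", activation=0.6)
threat = Subsystem("threat", "run away", activation=0.2, weight=2.0)
print(select_action([hunger, threat]))  # calm scene
threat.activation = 0.9                 # velociraptors arrive
print(select_action([hunger, threat]))

# The phone example: the new habit first needs goal-directed support,
# then wins on its own after repeated reinforcement.
old_habit = Subsystem("old habit", "check phone", activation=1.0)
new_habit = Subsystem("new habit", "smile", activation=1.0, weight=0.3)
goal = Subsystem("goal-directed", "smile", activation=1.0)
print(select_action([old_habit, new_habit]))
print(select_action([old_habit, new_habit, goal]))
for _ in range(10):
    reinforce(new_habit, reward=1.0)
print(select_action([old_habit, new_habit]))
```

In this sketch, the `weight` field plays the role of the "varying amounts of weight" given to different subsystems' bids, and `reinforce` stands in very loosely for reinforcement learning in the basal ganglia. Real neural dynamics are of course far more complicated than a single subtraction per step.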

Comment by Kaj_Sotala on Ayn Rand’s model of “living money”; and an upside of burnout · 2024-11-19T15:33:23.122Z · LW · GW

your psyche’s conscious verbal planner “earns” willpower

This seems to assume that there's 1) exactly one planner and 2) it's verbal. I think there are probably different parts that enforce top-down control, some verbal and some maybe not.

For example, exerting willpower to study boring academic material seems like a very different process than exerting willpower to lift weights at the gym.

I think that there is something like:

  • Local beliefs about the usefulness of exerting willpower in a particular context (e.g. someone might not believe that willpower is useful in school but does believe that it's useful in the gym, or vice versa, and correspondingly have more willpower available in one context than the other)
  • To the extent that one has internalized a concept about "willpower" being a single thing, broader beliefs about willpower being useful in general
  • Various neurological and biological variables that determine how strong one's top-down processes are in general, relative to their bottom-up processes (e.g. someone with ADHD will have their bottom-up processes be innately stronger than the top-down ones; medication may then strengthen the amount of top-down control they have).
  • Various neurological and biological variables that determine which of one's processes get priority in any given situation (e.g. top-down control tends to be inhibited when hungry or tired; various emotional states may either reduce or increase the strength of top-down control)

My model of burnout roughly agrees with both yours and @Matt Goldenberg's. To add to Matt's "burnout as revolt" model, my hunch is that burnout often involves not only a loss of belief that top-down control is beneficial. I think it also involves more biological changes to the neural variables that determine the effectiveness of top-down versus bottom-up control. Something in the physical ability of the top-down processes to control the bottom-up ones is damaged, possibly permanently.

Metaphorically, it's like the revolting parts don't just refuse to collaborate anymore; they also blow up some of the infrastructure that was previously used to control them.

Comment by Kaj_Sotala on The hostile telepaths problem · 2024-11-12T12:28:02.047Z · LW · GW

Sounds plausible to me. Alternatively, telling you that they didn't over-apologize still communicates that they would have over-apologized in different circumstances, so it can be a covert way of still delivering that apology.

Comment by Kaj_Sotala on The hostile telepaths problem · 2024-11-11T14:51:06.270Z · LW · GW

A crucial part of every IFS session is to ask the protector what age they think you are (often, at least in examples, it would say something like 5-12) and then you could reveal to it that actually you're 30 (or whatever).

I wouldn't put it as strongly as to say that it's a crucial part of every IFS session. It can sometimes be a very useful question and approach, sure, but I've had/facilitated plenty of great sessions that didn't use that question at all. And there are people who that question just doesn't resonate with.

Comment by Kaj_Sotala on Should CA, TX, OK, and LA merge into a giant swing state, just for elections? · 2024-11-08T10:54:34.736Z · LW · GW

As far as I know, the latest representative expert survey on the topic is "Thousands of AI Authors on the Future of AI", in which the median time for a 50% chance of AGI was either in 23 or 92 years, depending on how the question was phrased:

If science continues undisrupted, the chance of unaided machines outperforming humans in every possible task was estimated at 10% by 2027, and 50% by 2047. [...] However, the chance of all human occupations becoming fully automatable was forecast to reach 10% by 2037, and 50% as late as 2116 (compared to 2164 in the 2022 survey).

Not that these numbers would mean much because AI experts aren't experts on forecasting, but it still suggests a substantial possibility for AGI to take quite a while yet.

Comment by Kaj_Sotala on The Median Researcher Problem · 2024-11-03T21:25:41.830Z · LW · GW

Hmm... let me rephrase: it doesn't seem to me like we would actually have a clear community norm for this, at least not one strong enough to ensure that the median community member would actually be familiar with stats and econ.

Comment by Kaj_Sotala on The Median Researcher Problem · 2024-11-03T19:32:57.699Z · LW · GW

community norms which require basically everyone to be familiar with statistics and economics,

I think this is too strong. There are quite a few posts that don't require knowledge of either one to write, read, or comment on. I'm certain that one could easily accumulate lots of karma and become a well-respected poster without knowing either.

Comment by Kaj_Sotala on The Median Researcher Problem · 2024-11-03T19:23:29.890Z · LW · GW

I had the thought while reading the original post that I recall speaking to at least one researcher who, pre-replication crisis, was like "my work is built on a pretty shaky foundation as is most of the research in this field, but what can you do, this is the way the game is played". So that suggested to me that plenty of median researchers might have recognized the issue but not been incentivized to change it.

Lab leaders aren't necessarily in a much better position. If they feel responsibility toward their staff, they might feel even more pressured to keep gaming the metrics so that the lab can keep getting grants and its researchers good CVs.

Comment by Kaj_Sotala on Science advances one funeral at a time · 2024-11-02T20:09:50.842Z · LW · GW

I've seen one paper arguing against Planck's claim:

Unquestionably, there are scientists in every generation who tenaciously cling to knowledge they learned in their youth, and who refuse to consider new theories that challenge fundamental beliefs. The life-long resistance of Joseph Priestley to oxygen theory, Louis Agassiz to evolutionary theory, and Harold Jeffreys to continental drift are among the notable cases. It is virtually a truism that the last adherents to a fading scientific tradition will be elderly scientists. Yet documented episodes where resistance of isolated individuals crystallizes into generational disputes, or where an ageing scientific elite actually delays community-wide adoption of a new idea, are exceedingly rare. A review of the historical record suggests, on the contrary, that the period of active dissemination and adoption of scientific innovations - even those of revolutionary proportion - is typically shorter than that required for one generation of scientists to replace another. [...]

Curiously, the episode which prompted Planck's observation - the 'controversy' surrounding his youthful reformulation of the second law of thermodynamics - seems a poor illustration of the 'fact' Planck claims to have learned. According to Planck's own sketchy chronology (he provides few dates), not much more than ten years seems to have elapsed between his first unsuccessful attempts at gaining recognition, and the 'universal acceptance' of his dissertation thesis on the irreversible process of heat conduction. Nor does it appear that age was an important factor influencing adoption of the theory. Wilhelm Ostwald, one of the leaders of the opposition 'Energetics' school prominently mentioned by Planck, was only five years older than Planck, whereas Ludwig Boltzmann, whose theoretical work on entropy, in no small measure (as Planck grudgingly concedes) helped bring the scientific community around to Planck's view, was fourteen years Planck's senior.

Some quantitative data permit more systematic examination of age differences in receptivity, for both Lavoisier's and Darwin's landmark contributions. In a study of the Chemical Revolution, McCann reports a negative correlation between author's age and the use of the oxygen paradigm in scientific papers written between 1760 and 1795. On closer inspection of the data, he finds that the earliest group of converts to the oxygen paradigm (between 1772 and 1777) were middle-aged men with close ties to Lavoisier; the inverse age effect became manifest only after 1785, during the ten-year period of 'major conversion and consolidation'. McCann also contends that the age structure of the British community during the latter half of the eighteenth century impeded acceptance of the new theory. In contrast to the declining age of French scientists during this period, the increasing average age of British scientists held back the pace of acceptance of oxygen theory among British scientists of all age strata.

As for evolutionary theory, Hull and his colleagues find weak support for 'Planck's Principle' among nineteenth-century British scientists. The small minority of scientists who held out against the theory after 1869 were, on average, almost ten years older than earlier adopters. Age in 1859 (the year the Origin of Species was published) was unrelated, however, to speed of acceptance for the great majority of those converting to evolutionary theory by 1869. [...]

... we can distinguish high-risk and low-risk contexts for theory choices of individual scientists. A high-risk context is one in which there is substantial resistance to the new theory. Prevailing scientific opinion views it as controversial, a heretical assault on existing knowledge, or even being beyond the pale of serious scientific discourse. Adoption of a new theory in a high-risk context presumably exacts some perceived or actual professional costs. Given such a social setting, structural constraints of life-course position would be hypothesized to be more important than motivational factors in determining theory-choice behaviour. This implies, for example, that the earliest adopters of controversial theories should be disproportionately composed of middle-career and senior scientists and a corresponding deficit of young scientists.

In a low-risk context, a new theory is generally regarded as a legitimate claimant to knowledge, or one which has already attracted a sizeable following; consequently its adoption exposes one to only minimal professional costs. The social patterning of theory-choice behaviour in this context is hypothesized to be dominated by motivational factors, tending to reinforce more rapid adoption by younger scientists. [...]

During the early stages in the adoption of a new theory, age differences between supporters of the new theory and defenders of the status quo are expected to be either relatively small or (particularly if the new idea is perceived as being unusually controversial) tending toward older age for the first supporters. With the passage of time and greater acceptance of the new theory, we expect the influx of new converts to be increasingly drawn from the ranks of younger scientists. Such a correspondence between changes in the context of appraisal and age-based differences in theory choice is evident in McCann's data on French scientists' reception of the oxygen paradigm during the different subperiods of his study. It will be recalled that the earliest followers of oxygen theory were middle-aged scientists, and the greater propensity of younger scientists only became manifest several years later, at the point when community-wide conversion was well under way.

In the remainder of this paper, I present a rigorous test of the expanded age hypothesis proposed above. It is based upon new findings from a study of the reception of plate tectonics in the earth sciences during the 1960s. Compared with earlier studies, it permits a more precise delineation of the historical stages in prevailing scientific opinion, and introduces into the analysis controls on possible confounding factors correlated with age, such as foci of research interest and professional eminence. [...]

Development of the theory of plate tectonics ranks among the stellar scientific achievements of this century. General acceptance of this conceptual framework necessitated the abandonment of a communal belief in the horizontal immobility of the earth's crust which had guided geological research since the middle of the nineteenth century. Plate tectonics theory substituted the diametrically opposed premise that the earth's crust is divided into large crustal plates which move slowly over the upper mantle.

The swift adoption of plate tectonics during the late 1960s stands in stark contrast to the extremely bitter controversy encountered by earlier proponents of a 'mobile' earth. Alfred Wegener's continental drift theory, the forerunner of present-day mobilist theory, was subjected to extremely hostile attacks during the 1920s and fell into nearly universal disrepute. British geophysicists working on reconstruction of the ancient configurations of the earth's magnetic field rekindled interest in continental drift during the middle 1950s. [...] Despite the advocacy by Hess and a few other distinguished earth scientists, large-scale horizontal displacement of the crust remained an anathema for most earth scientists well into the 1960s. Then, in 1966-67, a confluence of empirical discoveries in marine geology, geomagnetic studies and seismology provided for many geologists incontrovertible proof favouring the seafloor spreading model, and the mobilist perspective more generally. [...] By the early 1970s, the great majority of earth scientists had adopted plate tectonics, and the theory was well on its way to becoming the dominant theoretical orientation in many fields of the earth sciences. [...]

To obtain data on the dynamics of individual theory choice that underlie earth scientists' shifts from the stabilist to the mobilist programme of research, I examined the publications of ninety-six North American earth scientists actively engaged in pertinent research during the 1960s and early 1970s. I also gathered biographical information for each scientist, including their training, research interests and career histories. [...]

The dependent variable for this study is the year in which a scientist decided to adopt the mobilist programme of research rather than to continue working within a stabilist programme. [...] Before 1966, when prevailing scientific opinion still ran strongly against the mobilist perspective, the small number of scientists adopting the programme were considerably older (in terms of career age) than other scientists active during this early period. Thus, scientists adopting the programme through 1963 were on average nineteen years 'older' than non-adopters. [...] Adopters in 1964 were twenty-three years older than non-adopters. [...] Only with the shift in scientific opinion favourable to mobilist concepts beginning in 1966, do we start to see a progressive narrowing, and then reversal, in the age differentials between adopters and non-adopters.

Comment by Kaj_Sotala on The hostile telepaths problem · 2024-10-31T18:45:16.629Z · LW · GW

True, though I think that judgment tends to be hard to effectively mask in this kind of context (though maybe psychopaths would be able to fake it; I don't know). At least my own experience inclines me to agree with this person:

I’ve worked with and/or done swaps with a lot of different practitioners (IFS, aletheia, VIEW, regular talk therapy, bodywork, voice work etc), and what I found to be the most effective element of their skill set (for me) is: non-judgmental, loving presence… 

many times I have explored the same topic with two different practitioners within a few days of each other; and it’s in those cases that the impact of the difference in the quality of non-judgmental loving presence is most noticeable.

the degree to which the quality of the presence is non-judgmental can be VERY subtle, but the system can pick up on it. it might not even be a strong enough signal to notice it consciously, but it will greatly impact how the session unfolds.

Comment by Kaj_Sotala on Habryka's Shortform Feed · 2024-10-29T19:47:21.456Z · LW · GW

On Windows the font feels actively unpleasant right away, on Android it's not quite as bad but feels like I might develop eyestrain if I read comments for a longer time.

Comment by Kaj_Sotala on Habryka's Shortform Feed · 2024-10-29T13:35:56.008Z · LW · GW

Seeing strange artifacts on some of the article titles on Chrome for Android (but not on desktop)

Comment by Kaj_Sotala on Habryka's Shortform Feed · 2024-10-29T10:22:39.407Z · LW · GW

Yeah it feels uncomfortably small to read to me now

Comment by Kaj_Sotala on The hostile telepaths problem · 2024-10-28T12:06:59.032Z · LW · GW

Now I can make the question more precise - why do you think it's safe to have more access to your thoughts and feelings than your subconscious gave you? And how exactly do you plan to deal with all the hostile telepaths out there (possibly including parts of yourself?).

An answer I'd give is that for a lot of people, most of the hostile telepaths are ultimately not that dangerous if you're confident enough to be able to deal with them. As Valentine mentioned, often it's enough to notice that you are actually no longer in the kind of situation where the strategies would be necessary.

Unfortunately, many of the strategies also behave in such a way as to make themselves necessary, or to prevent the person from noticing that they could be abandoned:

  • Maybe I had a parent that wanted me to be dependent on them, so that they could control me. Even if I manage to break away from that parent, I may still have the belief that if someone wants to control me, then I have to genuinely believe that I cannot escape their control or they'll hurt me. This belief will tend to get me into abusive relationships... and then that strategy again becomes necessary for protecting me while in the relationship, when I would never have gotten into that relationship in the first place if not for that very strategy!
  • Maybe I believe that if I cause someone else any discomfort, I have to say I'm really sorry and experience genuine distress. As a result, I always execute this strategy, believing it to be crucial for my safety. If I were to ever not execute it, I might notice that some people are actually okay with me not reacting in such an extreme way... but because I always execute it, I never get the chance to notice that it'd be safe not to.

One of the ways by which these kinds of strategies get implemented is that the psyche develops a sense of extreme discomfort around acting in the "wrong" way, with successful execution of that strategy then blocking that sense of discomfort. For example, the thought of not apologizing when you thought someone might be upset at you might feel excruciatingly uncomfortable, with that discomfort subsiding once you did apologize.

I believe this is also related to the way that awareness narrows around the strategy - feeling the original discomfort is very unpleasant, and the mind tends to want to contract awareness in ways that keep discomfort out. If awareness were to broaden, then it would become aware of the unpleasant thing that the strategy is trying to push out of awareness. So for example, in the center of that discomfort of not-yet-having-apologized might be a memory of a time when you weren't really sorry and your mother was upset at you... and if you were to instead execute the strategy of desperately apologizing, then that would feel somewhat less painful and your awareness would naturally contract around that act of desperate apology, causing the original memory and the pain associated with that to drop away.

And something that practices like meditation can do is to bring the original discomfort into awareness in such a way that it can gradually stop feeling so unpleasant. (Though this can also go badly and bring something painful into awareness faster than the person is capable of dealing with it.) If that happens so that the original pain stops feeling so painful, then the self-deceptive strategies can stop creating situations where they perpetuate their own need to exist.

Now that's not to say that you would be guaranteed to be safe. A brief discussion I had on Twitter:

Me: I wonder to what extent significant parts of Buddhism got so focused on renunciation because that's the "safe" kind of mental transformation in the sense of not upsetting secular rulers.

While the kind of practice that dismantles societal programming and makes you go out in the world to change things, can easily become a threat for established power structures and a target for being rooted out.

Societal forces exerting evolutionary pressure on spiritual practice and selecting it for increased harmlessness/renunciation.

(Parallels of this idea in the context of corporate mindfulness training programs and such left as an exercise for the reader.)

(Or for that matter, parallels in the context of notions like "our group of ten people meditating is by itself an act of healing the world", which have some truth to them but also conveniently keep any change pretty localized and non-threatening.)

David Chapman: Yes this is very much the case in the history of sutrayana vs vajrayana. Vajrayana was typically reserved for the aristocratic elite, for this reason, and intermittently also appropriated by anti-establishment forces when they could.

Romeo Stevens (@romeostevensit): The ambitious sects were indeed wiped out

Aneesh Mulye: This wasn't so much of a thing in India; yes, it happened, but engagement with the world and with rulers was def a part of Indian Buddhist (and Shaiva, and most if not all other) traditions.

One solution involved only an elite having access to the hardcore agentifying stuff.

The extermination of all Indian Buddhism (what's called 'Tibetan' today, but that's just because it survived only in Tibet), and all Tantrik institutions (and Indic, generally), engaged with the world as they were, at the hands of the hateful Muslims, is why this didn't survive.

So apparently there were times in history when meditators did get a lot more confidence and self-insight, used that to become more powerful until they were seen as threats and wiped out, and that's why so many of the surviving meditative traditions are focused on things like withdrawing from the world and living as ascetics.

Comment by Kaj_Sotala on The hostile telepaths problem · 2024-10-28T11:17:04.229Z · LW · GW

Like if there's an email I keep freezing around. I can tell there's something there. I might even have some intuitive guesses about what it is!

…but I do not check. I don't introspect on whether my guesses feel right.

Instead, I hypothesize. What hostile telepath problem might someone in my shoes be trying to solve such that this behavior arises?

I tried doing this and it felt promising, and then I noticed a familiar feeling of wanting to tell a person affected by my possible self-deception how I'd now solved the problem and would behave differently from now on. And I remembered that on each previous time when I'd had that feeling and told the other person something like that, my behavior had in fact not changed at all as a consequence.

And now I'm chuckling at myself.

Comment by Kaj_Sotala on The hostile telepaths problem · 2024-10-28T11:12:46.375Z · LW · GW

So in many cases, "trauma processing" can basically mean noticing you're not a child anymore. You have power. So you don't have to appease the hostile telepaths just because they're adults.

Yes, definitely. And this is also why it's often so important for the therapist - if this is done in the context of therapy - to exhibit unconditional positive regard toward the client. If the therapist is genuinely accepting of any thoughts and feelings that the client brings up, then that opens the door for the client's parts to start considering the possibility that maybe they can tell the truth and still be accepted. And once it has become possible to tell the truth to at least one person, it becomes possible to tell it to yourself as well.

(Though maybe I should say that the therapist needs to either experience unconditional positive regard toward the client, or successfully deceive themselves and the client into thinking that they do. Heh.)

One additional tangle is that often the client's issue is less about needing to act in a certain way, and more about needing to be a certain way. At some point, one frequently goes from "it's bad to break something and not be genuinely sorry on that particular instance" to "it's bad to be the kind of person who wouldn't automatically feel sorry and who needed to fake being sorry". 

This makes it harder to get to the point where the therapist could provide evidence that they are fine with you not being sorry in that particular instance, because getting there would require you to reveal that it's possible for you to not automatically feel sorry, and that feels dangerous by itself!

And what you've written also gets to the limitations of therapy - that no matter how much positive regard the therapist might have toward their client, if they are still e.g. living with an abusive partner, just the therapist's warmth and support may not be enough to produce a shift. (I haven't had clients with situations that extreme, but I've certainly noticed times when we started making much more progress once they broke up with a partner or quit a job that they had been trying to force themselves to do, and then suddenly new parts of them came to awareness that could now be convinced they were safe.)

Comment by Kaj_Sotala on The hostile telepaths problem · 2024-10-28T10:08:37.517Z · LW · GW

I bet something similar could work for getting kids to apologize.

Also, for getting them to say thank you. When kids are at a certain age, adults frequently seem to be reminding them to say thank you for gifts and such; I have a vague memory of adults also reminding me of this, when I was at that age. But these days I automatically say thank you for various things, and mean it.

Comment by Kaj_Sotala on The Summoned Heroine's Prediction Markets Keep Providing Financial Services To The Demon King! · 2024-10-27T12:01:40.608Z · LW · GW

But the Summoned Heroine doesn't know that until the end, and it's stated that she specifically set up the market to "help them anticipate and counter the Demon King’s next moves".