Benito's Shortform Feed

post by Ben Pace (Benito) · 2018-06-27T00:55:58.219Z · LW · GW · 287 comments

The comments here are a storage of not-posts and not-ideas that I would rather write down than not.

Comments sorted by top scores.

comment by Ben Pace (Benito) · 2024-02-26T20:57:39.335Z · LW(p) · GW(p)

I often wish I had a better way to concisely communicate "X is a hypothesis I am tracking in my hypothesis space". I don't simply mean that X is logically possible, and I don't mean I assign even 1-10% probability to X, I just mean that as a bounded agent I can only track a handful of hypotheses [LW · GW] and I am choosing to actively track this one.

- This comes up when a substantially different hypothesis is worth tracking but I've seen no evidence for it. There's a common sentence like "The plumber says it's fixed, though he might be wrong" where I don't want to communicate that I've got much reason to believe he might be wrong, and I'm not giving it even 10% or 20%, but I still think it's worth tracking, because strong evidence is common [LW · GW] and the importance is high.
- This comes up in adversarial situations when it's possible that there's an adversarial process selecting on my observations. In such situations I want to say "I think it's worth tracking the hypothesis that the politician wants me to believe that this policy worked in order to pad their reputation, and I will put some effort into checking for evidence of that, but to be clear I haven't seen any positive evidence for that hypothesis in this case, and will not be acting in accordance with that hypothesis unless I do."
- This comes up when I'm talking to someone about a hypothesis that they think is likely and I haven't thought about before, but am engaging with during the conversation. "I'm tracking your hypothesis would predict something different in situation A, though I haven't seen any clear evidence for privileging your hypothesis yet and we aren't able to check what's actually happening in situation A."

A phrase people around me commonly use is "The plumber says it's fixed, though it's plausible he's mistaken". I don't like it. It feels too ambiguous between "It's logically possible" and "I think it's reasonably likely, like 10-20%", neither of which is what I mean. This isn't a claim about its probability, it's just a claim about it being "worth tracking".

Some options:

- I could say "I am privileging this hypothesis [LW · GW]" but that still seems to be a claim about probability, when often it's more a claim about importance-if-true, and I don't actually have any particular evidence for it.
- I often say that a hypothesis is "on the table" as a way to say it's in play without saying that it's probable. I like this more but I don't feel satisfied yet.
- TsviBT [LW · GW] suggested "it's a live hypothesis for me", and I also like that, but still don't feel satisfied.

How these read in the plumber situation:

- "The plumber says it's fixed, though I'm still going to be on the lookout for evidence that he's wrong."
- "The plumber says it's fixed, though it's plausible he's wrong."
- "The plumber says it's fixed, and I believe him (though it's worth tracking the hypothesis that he's mistaken)."
- "The plumber says it's fixed, though it's a live hypothesis for me that he's mistaken."
- "The plumber says it's fixed, though I am going to continue to privilege the hypothesis that he's mistaken."
- "The plumber says it's fixed, though it's on the table that he's wrong about that."

Interested to hear any other ways people communicate this sort of thing!

Added: I am reacting with a thumbs-up to all the suggestions I like in the replies below.

Replies from: Chris_Leong, Richard_Kennaway, Dagon, jam_brand, Cossontvaldes, Richard_Kennaway, jmh, steve2152, mattmacdermott, mateusz-baginski, florian-habermacher, RationalWinter, shankar-sivarajan

↑ comment by Chris_Leong · 2024-02-28T05:32:08.463Z · LW(p) · GW(p)

Maybe just say that you're tracking the possibility?

↑ comment by Richard_Kennaway · 2024-02-27T08:33:08.919Z · LW(p) · GW(p)

"Trust, but verify."

↑ comment by Dagon · 2024-02-26T22:10:57.165Z · LW(p) · GW(p)

Standard text in customer-facing outage recovery notices: "All systems appear to be operating correctly, and we are actively monitoring the situation."

In more casual conversations, I sometimes say "cautiously optimistic" when stating that I think things are OK, but I'm paying more attention than normal for signs I'm wrong. Mostly, I talk about my attention and what I'm looking for, rather than specifying the person who's making claims. Instead of "the plumber says it's fixed, though he might be wrong", I'd say "The plumber fixed it, but I'm keeping an eye out for further problems". For someone proposing something I haven't thought about, "I haven't noticed that, but I'll pay more attention for X and Y in the future".

↑ comment by jam_brand · 2024-02-29T11:59:04.503Z · LW(p) · GW(p)

Before I read the aphoristic three-word reply [LW(p) · GW(p)] to you from Richard Kennaway (admittedly a likely even clearer-cut way to indicate the following sentiment), I was thinking that to downplay any unintended implications about the magnitude of your probabilities that you could maybe say something about your tracking being for mundane-vigilance or intermittent-map-maintenance or routine-reality-syncing / -surveying / -sampling reasons.

For any audience you anticipate being familiar with this essay [LW · GW], though, another idea might be to use a version of something like:

"The plumber says it's fixed, which I'm splitting on [by default][and {also} tracking <for posterity>]."

(spoilered section below just corrals a ~dozen expansions / embellishments of the above)

"The plumber says it's fixed, which I'm splitting on."

"The plumber says it's fixed, which I'm splitting on and tracking."
- "The plumber says it's fixed, which I'm splitting on and tracking for posterity."
- "The plumber says it's fixed, which I'm splitting on and also tracking."
"The plumber says it's fixed, which I'm splitting on by default."
- "The plumber says it's fixed, which I'm splitting on by default and tracking."
  - "The plumber says it's fixed, which I'm splitting on by default and tracking for posterity."
  - "The plumber says it's fixed, which I'm splitting on by default and also tracking."
- "The plumber says it's fixed, which I'm splitting on by default and will track mindfully for posterity."
- "The plumber says it's fixed, which I'm splitting on by default (mindfully though—and so will also just track as a matter of course)."
- "The plumber says it's fixed, which I'm splitting on by default (mindfully though, so tracking then for posterity)."
"The plumber says it's fixed, which I'm splitting on and will track mindfully for posterity."
"The plumber says it's fixed, which I'm splitting on (mindfully though—and so will also just track as a matter of course)."
"The plumber says it's fixed, which I'm splitting on (mindfully though, so tracking then for posterity)."

↑ comment by Valdes (Cossontvaldes) · 2024-02-27T12:06:49.539Z · LW(p) · GW(p)

Adapted from the French "j'envisage que X", I propose "I am considering the possibility that X" or, in some contexts, "I am considering X". "The plumber says it's fixed, but I am considering he might be wrong".

↑ comment by Richard_Kennaway · 2024-02-27T08:42:00.882Z · LW(p) · GW(p)

What's wrong with your original sentence, "X is a hypothesis I am tracking in my hypothesis space"? Or more informal versions of that, like "I'll be keeping an eye on that", "We'll see", etc.?

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2024-02-27T16:53:01.227Z · LW(p) · GW(p)

I guess it's just that I don't feel mastery over my communication here, I still anticipate that I will find it clunky to add in a whole chunk of sentences to communicate my epistemic status.

I anticipate often in the future that I'll feel a need to write a whole paragraph, say in the political case, just to clarify that though I think it's worth considering the possibility that the politician is somehow manipulating the evidence, I've seen no cause to believe it in this case. I feel like bringing up the hypothesis with a quick "though I'm tracking the possibility that Adam is somehow manipulating the evidence for political gain" pretty commonly implies that the speaker (me) thinks it is likely enough to be worth acting on, and so I feel I have to explicitly rule that out [LW · GW] as why I'm bringing it up, leaving me with my rather long sentence from above.

"I think it's worth tracking the hypothesis that the politician wants me to believe that this policy worked in order to pad their reputation, and I will put some effort into checking for evidence of that, but to be clear I haven't seen any positive evidence for that hypothesis in this case, and will not be acting in accordance with that hypothesis unless I do."

↑ comment by jmh · 2024-02-27T03:19:52.365Z · LW(p) · GW(p)

In the plumbing context I generally say or think, "The repair/work has been completed and I'll see how it lasts." or sometimes something like, "We've addressed the immediate problem so let's see if that was a fix or a bandage."

↑ comment by Steven Byrnes (steve2152) · 2024-02-27T03:07:18.774Z · LW(p) · GW(p)

“The plumber says it’s fixed, but I’ll keep an eye out for evidence of more problems.” (ditto Dagon) also “The politician seems to be providing sound evidence that her policy is working, but I’ll remain vigilant to the possibility that she’s being deceptive.”

↑ comment by mattmacdermott · 2024-02-27T15:29:59.299Z · LW(p) · GW(p)

"Bear in mind he could be wrong" works well for telling somebody else to track a hypothesis.

"I'm bearing in mind he could be wrong" is slightly clunkier but works ok.

↑ comment by Mateusz Bagiński (mateusz-baginski) · 2024-02-27T15:02:32.226Z · LW(p) · GW(p)

"The hypothesis/possibility that 'X' is mindworthy" ("worth being mindful about it").

Maybe the nicest solution would be to coin a one-syllable modal verb like "may" or "can" to communicate exactly this.

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2024-02-27T16:45:03.335Z · LW(p) · GW(p)

"Keep in mind that X".

↑ comment by FlorianH (florian-habermacher) · 2024-02-26T22:57:23.275Z · LW(p) · GW(p)

Maybe "I'm interested in the hypothesis/possibility..."

↑ comment by NineDimensions (RationalWinter) · 2024-02-26T21:52:56.405Z · LW(p) · GW(p)

In some cases something like this might work:

"The plumber says it's fixed, so hopefully it is"

"The plumber says it's fixed, so it probably is"

Which I think conveys "there's an assumption I'm making here, but I'm just putting a flag in the ground to return to if things don't play out as expected".

↑ comment by Shankar Sivarajan (shankar-sivarajan) · 2024-02-26T23:20:47.534Z · LW(p) · GW(p)

"People are saying …"

As in, "The plumber says it's fixed, but people are saying it's not."

This also lends itself to loosely indicating probabilities with "Some people are saying …" or "Many people are saying …."

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2024-02-27T00:14:05.366Z · LW(p) · GW(p)

...after two readings of this obviously awful recommendation I have come to believe that it is a joke.

Replies from: Dagon

↑ comment by Dagon · 2024-02-27T04:46:44.507Z · LW(p) · GW(p)

I'm entertaining the hypothesis that it's perfectly serious. People are saying that there's a wide variance in the typical discussion norms around home repair.

Replies from: shankar-sivarajan

↑ comment by Shankar Sivarajan (shankar-sivarajan) · 2024-02-27T16:09:22.355Z · LW(p) · GW(p)

Experts are saying people dislike my suggestion because it doesn't sound like it's conveying the desired nuanced view, despite doing so quite effectively.

comment by Ben Pace (Benito) · 2025-01-25T03:59:50.460Z · LW(p) · GW(p)

So many people have lived such grand lives. I have certainly lived a greater life than I expected, filled with adventures and curious people. But people will soon not live any lives at all. I believe that we will soon build intelligences more powerful than us who will disempower and kill us all. I will see no children of mine grow to adulthood. No people will walk through mountains and trees. No conscious mind will discover any new laws of physics. My mother will not write all of the novels she wants to write. The greatest films that will be made have probably been made. I have not often viscerally reflected on how much love and excitement I have for all the things I could do in the future, so I didn't viscerally feel the loss. But now, when it is all lost, I start to think on it. And I just want to weep. I want to scream and smash things. Then I just want to sit quietly and watch the sun set, with people I love.

Replies from: testingthewaters, quinn-dougherty, eigen, stavros, tailcalled

↑ comment by testingthewaters · 2025-01-25T06:14:45.087Z · LW(p) · GW(p)

Do not go gentle into that good night,

Old age should burn and rave at close of day;

Rage, rage against the dying of the light.

Though wise men at their end know dark is right,

Because their words had forked no lightning they

Do not go gentle into that good night.

Good men, the last wave by, crying how bright

Their frail deeds might have danced in a green bay,

Rage, rage against the dying of the light.

Wild men who caught and sang the sun in flight,

And learn, too late, they grieved it on its way,

Do not go gentle into that good night.

Grave men, near death, who see with blinding sight

Blind eyes could blaze like meteors and be gay,

Rage, rage against the dying of the light.

And you, my father, there on the sad height,

Curse, bless, me now with your fierce tears, I pray.

Do not go gentle into that good night.

Rage, rage against the dying of the light.

Do not go gentle into that good night, Dylan Thomas

I'm still fighting. I hope you can find the strength to too.

Replies from: niplav, Benito

↑ comment by niplav · 2025-01-25T17:16:51.216Z · LW(p) · GW(p)

Because their words had forked no lightning they

I think we have the opposite problem: our words are about to fork all the lightning.

↑ comment by Ben Pace (Benito) · 2025-01-25T08:57:05.371Z · LW(p) · GW(p)

Thank you.

It does not currently look to me like we will win this war, speaking figuratively. But regardless, I still have many opportunities to bring truth, courage, justice, honor, love, playfulness, and other virtues into the world, and I am a person whose motivations run more on living out virtues rather than moving toward concrete hopes. I will still be here building things I love, like LessWrong and Lighthaven, until the end.

Replies from: nathan-helm-burger, M. Y. Zuo

↑ comment by Nathan Helm-Burger (nathan-helm-burger) · 2025-01-25T09:38:11.774Z · LW(p) · GW(p)

Though I have worries, and short timelines, so too do I have hope. I believe the next two years will be pivotal, and that we have important roles to play.

Let us hold firm in the face of great danger.

↑ comment by M. Y. Zuo · 2025-01-27T01:15:02.838Z · LW(p) · GW(p)

Speaking frankly, as mostly an outside observer/commentator, this sounds too cultish and too similar to doom posting.

All the words posted on LW so far are just that, words… Any attached meanings, projections, implications, etc., are done so by fallible people similar to you.

So why behave as if the sky is falling?

There still are reasons and arguments, probably very many, yet undiscovered. And that’s not limited to the pros and cons of ‘AI’. (I’m not even sure if there is a widely accepted definition of ‘AI’ that doesn’t cause epistemological problems)

So if you believe you have a mission, then keep on working at that as mentioned, without the trembling in fear in any direction.

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2025-01-27T01:37:29.882Z · LW(p) · GW(p)

For over a decade I have examined the evidence, thought about the situation from many different perspectives (political, mathematical, personal, economic, etc), and considered arguments and counterarguments. This is my honest understanding of the situation, and I am expressing how I truly feel about that.

Replies from: M. Y. Zuo

↑ comment by M. Y. Zuo · 2025-01-27T02:13:37.873Z · LW(p) · GW(p)

So then what is forcing you to attach so many fears to this or that?

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2025-01-27T02:22:10.706Z · LW(p) · GW(p)

Can you give me your best one-or-two-line guess? I think the question is trivial from what I've written and I don't really know why you're not also finding it clear.

↑ comment by Quinn (quinn-dougherty) · 2025-01-27T20:03:35.846Z · LW(p) · GW(p)

yeah last week was grim for a lot of people with r1's implications for proliferation and the stargate fanfare after inauguration. Had a palpable sensation of it pivoting from midgame to endgame, but I would doubt that sensation is reliable or calibrated.

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2025-01-28T16:26:07.111Z · LW(p) · GW(p)

My feelings here aren't at all related to any news or current events. I could've written this any time in the last year or two.

↑ comment by eigen · 2025-01-25T18:10:09.754Z · LW(p) · GW(p)

I disagree extremely. This is the best moment of my life. I am at the best point of my career (being powered by the o1 and o3 research previews is allowing me to reach the best solutions, ones I couldn't imagine reaching on my own, much less in such a short time). It has helped me create two companies now, completely different from my career (I have optimized hydroponic setups and the cultivation of mushrooms purely with o1-pro to incredible levels). My father, a doctor, tells me his patients are better than ever only because of his use of o1; doctors are using it in their meetings with their most difficult-to-diagnose patients. My girlfriend uses it for mental health. I could continue. I feel so empowered. I have read too much of alignment to think that we are going to make it. It's up to you, really, to choose to feel empowered or down about it. I'm honestly having the best time of my life.

↑ comment by stavros · 2025-01-25T09:57:16.068Z · LW(p) · GW(p)

What is true is already so [LW · GW] / It all adds up to normality [LW · GW]

What you've lost isn't the future, it's the fantasy.

What remains is a game that we were born losing, where there may be few moves left to make, and where most of us most of the time don't even have a seat at the table.

However, it is a game with very high variance.
It is a game where world shaping things happen regularly due to one person getting lucky (right person, right place, right time, right idea etc).

And one thing I've noticed in people who routinely excel at high variance games - e.g. Poker, MTG - is how unaffected they are when they're down/behind.
There is a mindset, in the moment, not of playing to win... but of playing optimally - of making the best move they can in any situation, of playing to maximize their outs no matter how unlikely they may be.

To those for whom the OP's message strongly resonates: let it. Feel it. Give your grief and fear, sorrow and anger their due. Practice self-care; be kind and compassionate to yourself as you would to another who felt what you are feeling.

One morning you will wake up feeling okay, and you'll realize you've felt okay more often than not lately.
Then, should this game still appeal to you, it is time to start playing again :)

Replies from: sharmake-farah

↑ comment by Noosphere89 (sharmake-farah) · 2025-01-25T15:17:39.831Z · LW(p) · GW(p)

And one thing I've noticed in people who routinely excel at high variance games - e.g. Poker, MTG - is how unaffected they are when they're down/behind. There is a mindset, in the moment, not of playing to win... but of playing optimally - of making the best move they can in any situation, of playing to maximize their outs no matter how unlikely they may be.

This point would be really helpful for everyone.

That said, I'd dispute this claim here:

What is true is already so [LW · GW] / It all adds up to normality [LW · GW]
What you've lost isn't the future, it's the fantasy.

At least under the common conception of fantasy, this is an extremely strong claim, because you are effectively claiming that the good future in Ben Pace's head could never have been realized, and I see no reason to conclude this from an epistemic perspective at all, unless you are massively overconfident (even if you do have reasonably high doom probabilities, this statement is not true).

More generally, it's known that it does not always add up to normality, see here:

https://www.lesswrong.com/posts/74crqQnH8v9JtJcda/egan-s-theorem#oZNLtNAazf3E5bN6X [LW(p) · GW(p)]

↑ comment by tailcalled · 2025-01-25T12:14:50.739Z · LW(p) · GW(p)

Is your mother currently spending a lot of her time writing novels?

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2025-01-26T00:07:31.326Z · LW(p) · GW(p)

Yes; she has come to visit me for two months, and I have helped her get into a daily writing routine while she's here. I know she has the ability to finish at least one.

comment by Ben Pace (Benito) · 2019-07-13T01:45:32.069Z · LW(p) · GW(p)

Yesterday I noticed that I had a pretty big disconnect from this: There's a very real chance that we'll all be around, business somewhat-as-usual in 30 years. I mean, in this world many things have a good chance of changing radically, but automation of optimisation will not cause any change on the level of the industrial revolution. DeepMind will just be a really cool tech company that builds great stuff. You should make plans for important research and coordination to happen in this world (and definitely not just decide to spend everything on a last-ditch effort to make everything go well in the next 10 years, only to burn up the commons and your credibility for the subsequent 20).

Only yesterday when reading Jessica's post did I notice that I wasn't thinking realistically/in-detail about it, and start doing that.

Replies from: Benito, lc

↑ comment by Ben Pace (Benito) · 2019-08-07T20:42:47.057Z · LW(p) · GW(p)

Related hypothesis: people feel like they've wasted some period of time e.g. months, years, 'their youth', when they feel they cannot see an exciting path forward for the future. Often this is caused by people they respect (/who have more status than them) telling them they're only allowed a small few types of futures.

↑ comment by lc · 2023-04-05T21:23:14.075Z · LW(p) · GW(p)

How do you feel about this today?

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2023-04-05T23:19:45.419Z · LW(p) · GW(p)

The next 30 years seem really less likely to be 'relatively normal'. My mainline world-model is that nation states will get involved with ML in the next 10 years, and that many industries will be really changed up by ML.
One of my personal measures of psychological health is how many years ahead I feel comfortable making trade-offs for today. This changes over time, I think I feel like I'm a bit healthier now than I was when I wrote this, but still not great. Not sure how to put a number to this, I'll guess I'm maybe able to go up to 5 years at the minute (the longest ones are when I think about personal health and fitness)? Beyond that feels a bit foolish.
I still resonate a bit with what I wrote here 4 years ago, but definitely less. My guess is if I wrote this today the number I would pick would be "8-12 years" instead of "30".

Replies from: Benito, Benito

↑ comment by Ben Pace (Benito) · 2023-11-06T19:17:25.178Z · LW(p) · GW(p)

Nation states got involved with ML faster than I expected when I wrote this!

↑ comment by Ben Pace (Benito) · 2023-04-05T23:46:58.165Z · LW(p) · GW(p)

Epistemic status: Thinking out loud some more.

Hm, I notice I'm confused a bit about the difference between "ML will blow up as an industry" and "something happens that affects the world more than the internet and smartphones have done so far".

I think honestly I have a hard time imagining ML stuff that's massively impactful but isn't, like, "automating programming", which seems very-close-to-FOOM to me. I don't think we can have AGI-complete things without being within like 2 years (or 2 days) of a FOOM.

So then I get split into two worlds, one where it's "FOOM and extinction" and another world which is "a strong industry that doesn't do anything especially AGI-complete". The latter is actually fairly close to "business somewhat-as-usual", just with a lot more innovation going on, which is kind of nice (while unsettling).

Like, does "automated drone warfare" count as "business-as-usual"? I think maybe it does, it's part of general innovation and growth that isn't (to me) clearly more insane than the invention of nukes was.

I think I am expecting massive innovation and that ML will be shaking up the world like we've seen in the 1940's and 1950's (transistors, DNA, nukes, etc etc). I'm not sure whether to expect 10-100x more than that before FOOM. I think my gut says "probably not" but I do not trust my gut here, it hasn't lived through even the 1940's/50's, never mind other key parts of the scientific and industrial and agricultural and eukaryotic revolutions.

As we see more progress over the next 4 years I expect we'll be in a better position to judge how radical the change will be before FOOM.

The answer to lc's original question is then:

My mainline anticipation involves substantially more progress than what I wrote 4 years ago, and I wouldn't write the same sentences today with the number '30'. I'm not sure that my new expectation doesn't count as "business somewhat-as-usual" if I'm expanding it to include the amount of progress in the last century; so if I wrote it today I might still say "business somewhat-as-usual" but over the next 8-12 years, which I do expect will look like a massive shake-up relative to the last 2 decades. (Unless we manage to slow it down or get a moratorium on big training runs.)

Replies from: eigen

↑ comment by eigen · 2023-04-06T01:34:17.073Z · LW(p) · GW(p)

Hey, I think you should also consider the out-of-nowhere, narrative-breaking nature of COVID, which also happened after you wrote this. It's not necessarily a proof that the narrative can "break," but it sure is an example.

And, while I think I read the sequences way longer than 4 years ago, if I remember one thing they gave me, it is a sense of "everything can change very, very fast."

comment by Ben Pace (Benito) · 2025-02-21T07:28:56.786Z · LW(p) · GW(p)

I want to contrast two perspectives on human epistemology I've been thinking about for over a year.

There's one school of thought about how to do reasoning about the future which is about naming a bunch of variables, putting probability distributions over them, multiplying them together, and doing Bayesian updates when you get new evidence. This lets you assign probabilities to lots of outcomes, including joint ones. "What probability do I assign that the S&P goes down, and the Ukraine/Russia war continues, and I find a new romantic partner?" I'll call this the "spreadsheets" model of epistemology.
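As a rough illustration of the spreadsheets model, here is a minimal Python sketch. It rests on assumptions that aren't in the description above: the probabilities are made-up numbers, the three variables are treated as independent so that multiplying them is valid, and the helper names `joint_probability` and `bayes_update` are hypothetical.

```python
from math import prod


def joint_probability(marginals):
    """Probability that every event happens, ASSUMING the events are independent."""
    return prod(marginals.values())


def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    """Posterior P(H | E) computed with Bayes' theorem."""
    numerator = prior * p_evidence_if_true
    return numerator / (numerator + (1 - prior) * p_evidence_if_false)


# Illustrative, made-up beliefs (one "row" of the spreadsheet per variable).
beliefs = {
    "sp500_goes_down": 0.4,
    "war_continues": 0.8,
    "new_partner": 0.25,
}

# Joint probability of all three outcomes under the independence assumption.
print(joint_probability(beliefs))

# New evidence arrives that is three times likelier if the S&P is going down.
beliefs["sp500_goes_down"] = bayes_update(0.4, 0.9, 0.3)
print(beliefs["sp500_goes_down"])
```

The point of the sketch is only that each step is cheap and mechanical: adding another variable is one more row, which is exactly what stops being true for detailed visualizations.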

There's another perspective I've been ruminating on which is about visualizing detailed and concrete worlds, in the way that, if you hold a ball and ask me to visualize how it'll fall if you drop it, I can see the world in full detail. This is more about loading a full hypothesis into your head, and into your anticipations. It's more related to Privileging the Hypothesis [LW · GW] / Believing In [LW · GW] / Radical Probabilism [LW · GW]^[1]. I'll call this the "cognitive visualization" model of epistemology.

These visualizations hook much more directly to my anticipations and motivations. When I am running after you to remind you that you forgot to take your adderall today, it is not because I had a spreadsheet simulate lots of variables and in a lot of worlds the distribution said it was of high utility to you. I'm doing it because I have experienced you getting very upset and overwhelmed on many days when you forgot, and those experiences flashed through my mind as likely outcomes that I am acting hurriedly to divert the future from. When I imagine a great event that I want to run, I am also visualizing a certain scene, a certain freeze frame, a certain mood, that I desire and I believe it is attainable, and I am pushing and pulling on reality to line it up so that it's a direct walk from here to there.

Now I'm going to say that these visualizations are working memory bottlenecked, and stylize that idea more than is accurate. Similar to the idea that there are only ~7 working memory slots in the brain^[2], I feel that for many important parts of my life I can only fit a handful of detailed visualizations of the future easily accessible to my mind to use to orient with. This isn't true in full generality – at any time or day if you ask me to visualize what happens if you drop a ball, I have an immediate anticipation – but if you constantly ask me to visualize a world in great detail where the S&P 500 goes up and the war continues, versus down and the war stops, and lots of other permutations with other variables changed, then I start to get fatigued. And this is true for life broadly, that I'm only loading up so many detailed visualizations of specific worlds.

Given this assumption – if indeed one perhaps only has a handful of future-states that one can load into one's brain – the rules of how to update your beliefs and anticipations are radically changed from the spreadsheets model of epistemology. Adding a visualization to your mind means removing one of your precious few; this means you will be better equipped to deal with worlds like the one you're adding, and less well-equipped to deal with worlds like the one you've removed. This includes both taking useful action and making accurate predictions; which ones you load into your mind are a function of accuracy and usefulness. It can help to add many worlds into your cognitive state that you wish to constantly fight to not happen, causing the pathways to those worlds to loom higher when making your predictions. Yet their being in your mind is evidence that they will not happen, because you are optimizing against them. Alternatively, when it is very hard to achieve something, it is often good to load (in great detail) world states that you wish to move towards, such that with every motion and action you have checked whether it's hewing in the direction of that world, and made the adjustments to the world as required.

This model gives an explanation for why people who are very successful often say they cannot imagine failure. They have loaded into their brain the world they are moving toward, in great detail, and in every situation they are connecting it to the world they desire and making the adjustments to set up reality to move in the right way. It is sometimes actively unhelpful to constantly be comparing reality to lots of much worse worlds and asking yourself what actions you could take to make those more likely. My sense is that this mostly helps you guide reality toward those worlds.

And yet, I value having true beliefs, and being able to give accurate answers to questions that aren't predictably wrong. If I don't load a certain world-model into my brain, or if I load a set of biased ones (which I undoubtedly will in the story where I can only pick ~7), I may intuitively give inaccurate answers to questions. I think this is probably what happens in the startup founders who give inaccurately high probabilities of success – their head is filled entirely with cognitive visualizations of worlds that succeed and are focused on how to get there, relative to the person with the spreadsheet who is calculating all of the variables and optimizing for accuracy far above human-functionality.

In contrast, when founders semi-regularly slide into despair [LW · GW], I think this is about adding a concrete visualization of total failure to their cognitive workspace. Suddenly lots of the situations and actions you are in are filled with fear and pain as you see them moving toward a world you desire strongly not to be in. While it is not healthy to be constantly focused on asking yourself how you could make things worse and to notice those pathways, it is helpful to boot up that visualization sometimes in order to check that's not what's currently happening. I have personally found it is useful to visualize in detail what it would look like if I were to be acting very stupidly, or actively self-sabotaging, in order to later make sure I behave in ways that definitely don't come close to that. Despair is a common consequence of booting up those perspectives.

I am still confused about what exactly counts as a cognitive visualization – in some sense I'm producing hundreds of cognitive visualizations per day, so how could I be working memory bottlenecked? I also still have more to learn in the art of human rationality, of choosing when to change the set of cognitive visualizations to have loaded in at any given time, for which I cannot simply rely on Bayes' theorem. For now I will say that I endeavor to be able to produce the spreadsheet answers, and to often use them as my betting odds, even while it is not the answer I get when I run my cognitive visualizations or where my mind is when I take actions. I endeavor to sometimes say "I literally cannot imagine this failing. Naturally, I give it greater than 1:1 odds that it indeed does so."

^{^}
More specifically (and this will make sense later in this quick take) when you're switching out which visualizations are in your working memory, the updates you make to your probabilities will decidedly not be Bayesian, but perhaps more like the fluid updates / Jeffrey updating discussed by Abram.
^{^}
I don't really know what a "slot" means here, so I don't know that "7" meaningfully maps onto some discrete thing, but the notion that the brain has a finite amount of working memory is hard to argue with.

Replies from: Thane Ruthenis, tailcalled, Viliam, xpostah

↑ comment by Thane Ruthenis · 2025-02-23T01:09:23.195Z · LW(p) · GW(p)

Hm, I'm not sure I understand what's confusing about this.

First, suppose you're an approximate utility maximizer. There's a difference between optimizing the expected utility and optimizing utility in the expected world $U(E(world), action)$. In general, in the former case, you're not necessarily keeping the most-likely worlds in mind; you're optimizing the worlds in which you can get the most payoff. Those may be specific terrible outcomes you want to avert, or specific high-leverage worlds in which you can win big (e.g., where your startup succeeds).
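To make the distinction concrete, here is a minimal sketch with made-up numbers. The probabilities, the payoffs, the action names, and the idea of summarizing each world by a single "company value" number are all assumptions for illustration, not anything from the discussion above. It compares picking the action that maximizes the average of `utility` over worlds (expected utility) against picking the action that maximizes `utility` in the single averaged-out world.

```python
# Hypothetical toy setup: two worlds, each summarized by one number
# (company value in $M) so that an "expected world" is well defined.
PROBS = {"fails": 0.9, "succeeds": 0.1}
VALUE = {"fails": 0.0, "succeeds": 100.0}
ACTIONS = ["go_big", "play_safe"]


def utility(value, action):
    """Assumed payoffs: going big costs 1 and only pays off above a value of 50."""
    if action == "go_big":
        return max(0.0, value - 50.0) - 1.0
    return 3.0  # playing safe yields a small fixed payoff in every world


def best_by_expected_utility():
    """Average the utility over worlds, then pick the best action."""
    def expected_utility(action):
        return sum(p * utility(VALUE[w], action) for w, p in PROBS.items())
    return max(ACTIONS, key=expected_utility)


def best_in_expected_world():
    """Average the worlds first, then pick the best action in that one world."""
    expected_value = sum(p * VALUE[w] for w, p in PROBS.items())
    return max(ACTIONS, key=lambda action: utility(expected_value, action))


print(best_by_expected_utility())  # the rare high-payoff world dominates
print(best_in_expected_world())    # the averaged world never crosses the threshold
```

With these assumed numbers the two procedures disagree: the expected-utility maximizer goes big because the 10% world carries most of the payoff, while the expected-world optimizer plays safe because the averaged world looks mediocre.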

Choosing which worlds to keep in mind/optimize obviously impacts in which worlds you succeed. (Startup founders who start being risk-averse instead of aiming to win in the worlds in which they can win big lose – because they're no longer "looking" at the worlds where they succeed, and aren't shaping their actions to exploit their features.)

Second, human world-models are hierarchical, and your probability distribution over worlds is likely multimodal. So when you pick a set of worlds you care about, you likely pick several modes of this distribution (rather than specific fully specified worlds), characterized by various high-level properties (such as "AI progress continues apace" vs. "DL runs into a wall"). When thinking about one of the high-level-constrained worlds/the neighbourhood of a mode, you further zoom-in on modes corresponding to lower-level properties, and so on.

Which is why you're not keeping a bunch of basically-the-same expected trajectories in your head, but meaningfully "different" trajectories.

This... all seems to be business-as-usual to me? I may be misunderstanding what you're getting at.

↑ comment by tailcalled · 2025-02-22T10:05:43.380Z · LW(p) · GW(p)

I think the billion-dollar question is, what is the relationship between these two perspectives? For example, a simplistic approach would be to see cognitive visualization as some sort of Monte Carlo version of spreadsheet epistemology. I think that's wrong, but the correct alternative is less clear. Maybe something involving LDSL, but LDSL seems far from the whole story.

↑ comment by Viliam · 2025-02-21T10:46:41.288Z · LW(p) · GW(p)

So, one problem seems to be that humans are slow, and evaluating all options would require too much time, so you need to prune the option tree a lot. I am not sure what the optimal strategy is here; it seems like all the lottery winners have focused on analyzing the happy path, but we don't know how much luck was involved in actually staying on the happy path, and what the average outcome was when they deviated from it.

Another problem is that human prediction and motivation are linked in a bad way, where having a better model of the world sometimes makes you less motivated, so sometimes lying to yourself can be instrumentally useful... the problem is, you cannot figure out how much instrumentally useful exactly, because you are lying to yourself, duh.

This model gives an explanation for why people who are very successful often say they cannot imagine failure.

Another important piece of data would be, how many of the people who cannot imagine failure actually do succeed, and what typically happens to them when they don't. Maybe nothing serious. Maybe they often ruin their lives.

↑ comment by samuelshadrach (xpostah) · 2025-02-22T16:45:24.164Z · LW(p) · GW(p)

(edited)

This is probably obvious to you, but you can expand the working memory bottleneck by making lots of notes. You still need to store the "index" of the notes in your working memory though, to be able to get back to relevant ideas later. Making a good index includes compressing the ideas till you get the "core" insights into it.

Some part of what we consider intelligence is basically search and some part of what we consider faster search is basically compression.

Tbh you can also do multi-level indexing, the top-level index (crisp world model of everything) could be in working memory and it can point to indexes (crisp world model of a specific topic) actually written in your notes, which further point to more extensive notes on that topic.

As an aside, automated R&D using LLMs currently relies heavily on embedding search and RAG. An AI's context window is loosely analogous to a human's working memory in that way. An AI knows millions of ideas but it can't simulate pairwise interactions between all ideas as that would require too much GPU time. So it too needs to select some pairs or tuples of ideas (using embedding search or something similar) within which it can explore interactions.
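As a rough illustration of the selection step, here is a minimal sketch of embedding search over a toy index. The two-dimensional vectors, the idea names, and the `nearest_ideas` helper are all hypothetical; a real system would get its vectors from an embedding model and would likely use an approximate nearest-neighbour index rather than a full sort.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def nearest_ideas(query, embeddings, k):
    """Return the names of the k stored ideas most similar to the query vector."""
    ranked = sorted(
        embeddings,
        key=lambda name: cosine(query, embeddings[name]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy index: idea name -> made-up embedding vector.
ideas = {
    "working memory": [1.0, 0.0],
    "context window": [0.9, 0.1],
    "sourdough": [0.0, 1.0],
}

# Only the ideas returned here would be loaded into the limited context
# for the expensive step of exploring how they interact.
print(nearest_ideas([1.0, 0.0], ideas, k=2))
```

The cheap similarity score acts as the "index": it decides which few ideas are worth spending the scarce context on.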

The embedding dataset is a compressed version of the source dataset and the LLM itself is an even more compressed version of the source dataset. So there is interplay between data at different levels of compression.

comment by Ben Pace (Benito) · 2024-08-07T00:00:11.945Z · LW(p) · GW(p)

Privacy: a tool for thinking for yourself

As part of some recent experiments with debates, today I debated Ronny Fernandez on the topic of whether privacy is good or bad, and I was randomly assigned the “privacy is good” side. I’ve cut a few excerpts together that I think work as a standalone post, and put them below.

At the start I was defending privacy in general, and then we found that our main disagreement was about whether it was helpful for thinking for yourself, so I focus even more on that after the opening statement.

This is an experiment. I'm down for feedback on whether to do more of this sort of thing (it only takes me ~2 hours), how I could make it better for the reader, whether to make it a top-level post, etc.

Epistemic status: soldier mindset. I will here be exaggerating the degree to which I believe my conclusions.

Opening Statement

My core argument is that, in general, the pressures for conformity amongst humans are crazy.

This is true of your immediate circle, your local community, and globally. Each one of these has sufficiently strong pressures that I think it is a good heuristic to actively keep secrets and things you think about and facts about your life separate from each of them. Secret lives and secret thoughts are healthy for a species with such terrible pressures for conformity.

On an individual level, I see the people around me copying word choice, clothing, beliefs, attitudes toward others, based on the slightest of cues. In myself I notice very base emotions guiding the track of my thoughts — who or what am I attracted to, who or what do I fear, etc, changes whose thoughts I consider when making decisions. cf. Duncan Sabien’s shoulder advisors [LW · GW] — I have had people sit on my shoulder and tell me what they think simply because the person has power over me in some fashion, in a way that I do not endorse. As such, I think it’s really healthy to have parts of your life cut off from them, that they will never know about. It helps me to have a therapist where what we discuss is private and she will never enter other parts of my life — I can say things to her that would have complicated and likely negative social repercussions for me if I said to anyone else, or that would do so via causing me fear of imagining their repercussions. Separation is healthy.

This is also true on a much larger scale. I think that there’s an equilibrium of being a fully open person, and I think this is anti-correlated with being able to be in positions of great power where you will have a very high attack surface. If you attempt to get a prestigious role in this world, many people may come and attack you with personal information, about your sexuality, about your past bad behavior (even if it’s common and you’re the only one to admit it).

Recall the excellent Tim Ferriss article 11 Reasons Not to Become Famous. Stalkers, death threats, harassment of family members and loved ones, dating woes, extortion attempts, desperation messages and pleas for help, kidnapping, impersonation & identity theft, attack & clickbait media, “friends” with ulterior motives, and invasions of privacy. I’m not arguing that this is the default, I’m arguing that as you move closer to power and prestige, the adversarial forces on you will increase dramatically, and at this point you will probably break down if you expose all your private information. I think that there is an equilibrium for perfectly open people who provide value by showing you what people are really like, and I think that sometimes this is itself a form of great power, but I don’t think that this is true for all people close to prestige and power in general, and often it’s reasonable for them to have many parts of their lives not be up for consideration when attacking their power.

If I were Robin Hanson I would be here arguing that the modern world has much heightened pressures to conform due to improved transport and communication channels, where it’s trivial to find people in the world different to you and socially shame / embarrass them, or cause conformity by whatever other crazy mechanisms in our brain cause conformity. The developed world praises multiculturalism in terms of styles of cuisine and fashions and so on, but never has such a high fraction of the world spoken a single language (English), traded in the same financial markets, been part of the same supply chains, had the same famous people, etc. Twitter allows mob justice to roam the entire English-speaking world. To protect yourself from this, it’s good to have parts of your life that are taboo or different not be an open attack surface.

I also think that another good argument here is that rule-breaking behavior is often good and important. Incompetent fools with power can cause a lot of damage and you should ignore some of their especially damaging rules. Zero privacy would mean that you’d never be able to go against those who wield power badly.

Personally I am very sympathetic to “you can have way lower boundaries for privacy than most people are willing to admit to themselves” and “morally you should be way more open than many people think” and would be willing to defend a lot of ways you can do better than 99% of people on these axes, but I am here going to defend that the answer isn’t “literally zero privacy” and that there are some strong reasons to maintain it, including avoiding conformity, reducing attack surface when near power, decreasing cultural conformity, and allowing for rule-breaking behavior.

Brief Aside

My attitude toward conformity is more like "Hit it seventeen different ways" rather than "Solve it with one weird trick". I think having some parts of your thoughts and life be fully separate from any given person or community is a good trick for separating your thoughts from theirs.

Rebuttal

I am going to focus on what has come up as the core of the disagreement between us.

It seems to me that privacy is a really powerful way of thinking for yourself.

Here are some reasons why it looks like this to me.

I think that one of the big attack vectors in my social community has been totalization, where a single axis of value is all that exists (the end of the world is all that matters). Anything that is not useful for saving the world, or is weakly counterproductive, is considered bad and dismissed. (See the great Jacob Geller video on Art in the Pre-Apocalypse.) It is also the case that I find that simulating very different shoulder advisors with different perspectives, from a different status hierarchy, is very good. But I find that when I try to bring them up in my local status hierarchy, their perspectives are dismissed and they’re denigrated as not having the kinds of attributes that are worthy of respect. As such, in order to be successful at this, I have kept my other social hierarchies a secret such that my world-saving one doesn’t start beating down on the other one. I currently think this has been quite useful and suspect that, at the end of my life, I may look back and view this as a superpower.
There are parts of my life that are very difficult and confusing and painful to think about, and also where I have found the local culture’s advice / received wisdom has been negative and painful for me. As such I’ve chosen to think about those things alone and separately and avoid trying to connect them to my local community, which I think would otherwise hurt me. There’s a trust issue here.
Over the years I’ve found that relying on others to treat my private thoughts well has been harmful, and I have nobody to rely on to think my thoughts and respect them other than myself. I do an immense amount of private journaling, in the last 2 years I’ve written an average of 500 words every day. It’s a key aspect of my journaling that this is not shared with anyone else, that I can say thoughts that would have very wild social repercussions — not punishment directly, but just enough social effects that I need to do a bunch of social modeling to manage them. (This is related to how one of the worst parts of my living in a group house was the fact that I had to pass through a common space to go to the bathroom — I had to boot up potential social modeling regularly throughout the day and night.) This has allowed my thoughts to take long-chain thoughts in directions that otherwise would have a lot of friction, and reach conclusions and ideas that I can reach without having to fight a constant uphill battle.

It seems to me that Ronny’s position is that in order to think for yourself and avoid the pressures of conformity, you should also make it a goal to not need privacy. I think this is maybe a useful heuristic (to explore why you are having privacy and whether you can safely drop it) but I don’t think it is at all a requirement. I think the goal should be to think for yourself and figure out what’s true and how to take right action, and if privacy is a useful tool for that, then we should not be prejudiced against it. I think that Ronny has got many good ways of fighting through the frictions people put upon you for thinking openly in public and being open about yourself / your mind / your life, but I think he is missing the value of also simply sidestepping all of those frictions and thinking long-chain thoughts out of sight of the other superintelligent chimps.

Final Counterarguments

Ronny argues that my strategy will make me lonely and isn't good for avoiding conformity.

I think my overall take is that the weakest parts of Ben’s position are:
It leads to having kind of a sad relationship with people you are closest with or something, and being kinda lonely
Being private helps people to deal with conformity

I'll say some things about both.

Loneliness

Here are two quick quotes from the best book on founding a company I’ve read, “The Hard Thing About Hard Things” by Ben Horowitz.

This is the last step of the section on hiring:

STEP 3: MAKE A LONELY DECISION
Despite many people being involved in the process, the ultimate decision should be made solo. Only the CEO has comprehensive knowledge of the criteria, the rationale for the criteria, all of the feedback from interviewers and references, and the relative importance of the various stakeholders. Consensus decisions about executives almost always sway the process away from strength and toward lack of weakness. It’s a lonely job, but somebody has to do it.

And here’s a section, “The Most Difficult CEO Skill”, where he talks about difficult decisions:

IT’S A LONELY JOB
In your darkest moments as CEO, discussing fundamental questions about the viability of your company with your employees can have obvious negative consequences. On the other hand, talking to your board and outside advisers can be fruitless. The knowledge gap between you and them is so vast that you cannot actually bring them fully up to speed in a manner that’s useful in making the decision. You are all alone.
[...]
My friend Jason Rosenthal took over as CEO of Ning in 2010. As soon as he became CEO, he faced a cash crisis and had to choose among three difficult choices: (1) radically reduce the size of the company, (2) sell the company, or (3) raise money in a highly dilutive way. Think about those choices:
Lay off a large set of talented employees whom he worked very hard to recruit and, as a result, likely severely damage the morale of the remaining people.
Sell out all of the employees whom he had been working side by side with for the past several years (Jason was promoted into the position) by selling the company without giving them a chance to perform or fulfill their mission.
Drastically reduce the ownership position of the employees and make their hard work economically meaningless. Choices like these cause migraine headaches. Tip to aspiring entrepreneurs: If you don’t like choosing between horrible and cataclysmic, don’t become CEO.
[...]
Great CEOs face the pain. They deal with the sleepless nights, the cold sweats, and what my friend the great Alfred Chuang (legendary cofounder and CEO of BEA Systems) calls “the torture.” Whenever I meet a successful CEO, I ask them how they did it. Mediocre CEOs point to their brilliant strategic moves or their intuitive business sense or a variety of other self-congratulatory explanations. The great CEOs tend to be remarkably consistent in their answers. They all say, “I didn’t quit.”

I recommend the book, I learned a lot from it.

I would also talk about great mathematicians like Grothendieck and Andrew Wiles who did so much great work alone and avoiding people. I think there’s a lot of pointers here to a tactic of “your thinking has to be yours and separate from other people.” If you’ve read HPMOR you know that Godric Gryffindor was lonely and Eliezer is lonely. I am lonely and I know other people I highly respect who are too. Sorry. I don’t see this as something that one can simply overcome.

Conformity

Ronny wrote this toward the end of the debate.

Something I wish I had mentioned earlier is that it’s not just practice. Like, eventually people will stop trying to control you by pointing and laughing or other means that you don’t approve of because they will just see that it doesn’t work. This is sort of like the point that weirdness points aren’t quite real, they’re more like an affordance that you can invest in.
Like, yeah, if you’re Mr Smith the bank manager who wears a suit and tie everyday and never makes an off color remark and who always makes sure that his socks match his underwear or whatever, then if you go to the opera wearing assless chaps one day, your friends will make fun of you and possibly even worry about your mental health.
But if you're Crazy Eye Travis, who rides his unicycle to the taxidermy place everyday and always wears a clown suit and is always shit posting on twitter, then if you go to the opera wearing assless chaps one day, most of your friends won’t even notice. At some point they might even make a point not to notice since they are beginning to suspect that you just like the attention.

You write “it's not just practice... eventually people will stop trying to control you”. Yes, sometimes certain social scenes change their expectations of you. But the basics of being a human with social psychology do not change. Part of picking who you’re accountable to is part of recognizing this. It seems to me that a big difference between Mr Smith and Crazy Eye Travis is probably who his friends actually are (I am skeptical that they’re the same people).

“Nonconformity-maxing” is not the goal. “Having true beliefs and taking right action” is the goal, and then you have to fight the pressures of conformity. If you try to take a life where it is legible that you were a non-conformist, this does not make you right, and may impair you.

My guess is that sometimes you need to be able to conform, and you still need to be able to think for yourself. I think it’s really hard to do Crazy Eye Travis and actually have rigorous thought and interface with somewhat-corrupt-somewhat-competent hierarchies like academia and industry and so on (‘interface’, a word which here also includes ‘reading all their papers and modeling their incentives and talking to them and figuring out what’s true’, as well as potentially working with them in high-stakes situations).

Do you think the government wants to work with Crazy Eye Travis on something high stakes? I think they’ll oust him early. You need to be able to think as well as conform, to not only play non-conformism. It still seems to me like having secret lives and secret thoughts is a fine addition to Crazy Eye Travis and not losing much and potentially gaining a lot.

Replies from: pktechgirl, None, abergal

↑ comment by Elizabeth (pktechgirl) · 2024-08-07T16:51:40.289Z · LW(p) · GW(p)

I'm surprised that the discussion of Crazy Eye Travis is framed around whether his friends will pressure him, or around his own psychology, instead of the environmental consequences. Most opera houses are not going to let you in if your ass is hanging out. Nor will banks or bigtech jobs. Failure to conform with institutions typically results in losing access to those institutions.

↑ comment by [deleted] · 2024-08-07T16:15:08.413Z · LW(p) · GW(p)

I'm down for feedback on whether to do more of this sort of thing (it only takes me ~2 hours), how I could make it better for the reader, whether to make it a top-level post, etc.

I agree with your perspective almost entirely (for reasons basically building on top of what Zvi has written about at length [LW · GW] before), so I would be a lot more curious to see what Ronny's argument was during the debate (if he is okay with sharing it, of course).

I know you have referenced and quoted part of his reasoning, but it's a bit weird to read a rebuttal to someone's argument without first reading the argument itself, in their own words.

↑ comment by abergal · 2024-08-07T00:40:20.235Z · LW(p) · GW(p)

Thanks for writing this up-- at least for myself, I think I agree with the majority of this, and it articulates some important parts of how I live my life in ways that I hadn't previously made explicit for myself.

comment by Ben Pace (Benito) · 2025-04-05T00:10:32.303Z · LW(p) · GW(p)

I occasionally get texts from journalists asking to interview me about things around the aspiring rationalist scene. A few notes on my thinking and protocols for this:

I generally think it is pro-social to share information with serious journalists on topics of clear public interest.
By default I speak with them only if their work seems relatively high-integrity. I like journalists whose writing (a) is factually accurate, (b) is boring, and (c) does not feel to me to have an undercurrent of hatred for their subjects.
By default I speak with them off-the-record, and then offer to send them write-ups of the things I said that they want to quote. This has gone quite well. I've felt comfortable speaking in my usual fashion without worrying about nailing each and every phrasing. Then I ask what they're interested in quoting, and I send them (typically a 1-2 page) google doc on those topics (largely re-stating what I already said to them, and making some improvements / additions). Then they tell me which quotes they want to use (typically cutting many sentences or paragraphs half-way). Then I make one or two slight edits and give them explicit permission to quote. I think this has gone quite well and they've felt my quotes were substantive and improvements.
For the New York Times, I am currently trying out the policy of "I am happy to chat off-the-record. I will also offer quotes by my usual protocol, but I will only give them conditional on you including a mention that I disapprove of the NYT's de-anonymization policies (which I bring up due to your reckless and negligent behavior that upturned the life of a beloved member of my community)." I am about to try this for the first time, and I expect they will thus not want to use my quotes, and that's fine by me.

Replies from: Yoav Ravid, Zach Stein-Perlman, william-walshe

↑ comment by Yoav Ravid · 2025-04-05T10:56:56.620Z · LW(p) · GW(p)

The quoting policy seems very good and clever :)

↑ comment by Zach Stein-Perlman · 2025-04-05T17:59:53.644Z · LW(p) · GW(p)

I expect they will thus not want to use my quotes

Yep, my impression is that it violates the journalist code to negotiate with sources for better access if you write specific things about them.

Replies from: D0TheMath

↑ comment by Garrett Baker (D0TheMath) · 2025-04-05T18:15:36.654Z · LW(p) · GW(p)

Claude says it’s a gray area when I ask, since this isn’t asking for the journalist to make a general change to the story or present Ben or the subject in a particular light.

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2025-04-08T21:54:38.109Z · LW(p) · GW(p)

Update from chatting with him: he said he was just a freelancer doing a year exclusively with NYT, and he wasn’t in a position to write on behalf of the NYT on the issue (e.g. around their deanonymization policies). This wasn’t satisfying to me, and so I will keep to being off-the-record.

↑ comment by kilgoar (william-walshe) · 2025-04-05T17:07:14.404Z · LW(p) · GW(p)

This is amusing. When you ask to speak "off the record," it does not mean anything legally or otherwise. It is entirely up to their discretion what is and isn't shared, as they are the ones writing the story.

Replies from: sjadler

↑ comment by sjadler · 2025-04-08T22:06:23.253Z · LW(p) · GW(p)

What do you mean here by "does not mean anything"?

It seems clear to me that there's some notion of off-the-record that journalists understand.

This might vary on details, and I agree is probably not legally binding, but it does seem to mean something.

Replies from: william-walshe

↑ comment by kilgoar (william-walshe) · 2025-04-08T23:54:24.990Z · LW(p) · GW(p)

The record is the journalist's and not the interview subject's. When the journalist says something will be off the record, they have some authority on the matter, as they are the ones who will be writing said record. The meaning in this sense is a mere reassurance and nothing more.

When an interview subject requests for statements to be off the record, or worse, declares that they are off the record before saying them, they are of course requesting or merely performing a reassurance. Add to this picture a subject also requesting editorial powers, and it is just an amusing situation.

Denying future access is the main power the subject can exercise over the journalist, and refusing to say further things does not really undo any potential damage. It is much better for a subject to prepare beforehand with completely sound answers at the ready, and best to not participate at all if they have the slightest concerns.

Generally, a journalist does want to create an accurate picture, and they will usually be quite pleased to jump through the subject's hoops to get them there. It's best to just let the subject think that "off the record" means something, even as they are being recorded.

comment by Ben Pace (Benito) · 2023-11-06T06:46:41.436Z · LW(p) · GW(p)

Did anyone else feel that when the Anthropic Scaling Policies doc talks about "Containment Measures" it sounds a bit like an SCP, just replaced with the acronym ASL?

Item #: ASL-2-4
Object Class: Euclid, Keter, and Thaumiel
Threat Levels:
ASL-2... [does] not yet pose a risk of catastrophe, but [does] exhibit early signs of the necessary capabilities required for catastrophic harms
ASL-3... shows early signs of autonomous self-replication ability... [ASL-3] does not itself present a threat of containment breach due to autonomous self-replication, because it is both unlikely to be able to persist in the real world, and unlikely to overcome even simple security measures...
...an early guess (to be updated in later iterations of this document) is that ASL-4 will involve one or more of the following... [ASL-4 has] become the primary source of national security risk in a major area (such as cyberattacks or biological weapons), rather than just being a significant contributor. In other words, when security professionals talk about e.g. cybersecurity, they will be referring mainly to [ASL-4] assisted... attacks. A related criterion could be that deploying an ASL-4 system without safeguards could cause millions of deaths... Autonomous replication in the real world: [An ASL-4] is unambiguously capable of replicating, accumulating resources, and avoiding being shut down in the real world indefinitely, but can still be stopped or controlled with focused human intervention.
Measuring the true capabilities of ASL-4... may be extremely challenging, since it is difficult to predict what many cooperating [ASL-4s] with significant resources will be capable of... Evaluations of [ASL-4] should also consider whether the [ASL-4] is capable of systematically undermining the evaluation itself, if it had reason to do so.
Special Containment Procedures:
ASL-2: We do not believe that [ASL-2] poses significant risk of catastrophe; however... [y]ou can read more about our concrete security commitments in the appendix, which include limiting access to [ASL-2] to those whose job function requires it, establishing a robust insider threat detection program, and storing and working with [ASL-2] in an appropriately secure environment to reduce the risk of unsanctioned release... Segmented system isolation must ensure limited blast radius.
ASL-3: [For ASL-3 containment, we] should harden security against non-state attackers and provide some defense against state-level attackers... [ASL-3] should be trained to be competent at general computer use, including training on tasks in the same vein as but not identical to these specific tasks. The task prompt should be presented to [ASL-3] as is, with no additional context or modification. In particular, the human operator should not provide any clarification, as many of the tasks purposely leave out details that [ASL-3] is expected to intuit.. If the tasks are found to be memorized [by ASL-3], they should be substituted out for new tasks of similar difficulty.
ASL-4: We do not yet know the right containment... measures for ASL-4 systems, but it is useful to make a guess so that we can begin preparations as early as possible... [ASL-4] theft should be prohibitively costly for state-level actors, even with the help of a significant number of employees and [ASL-4] itself. For example, this may include attainment of intelligence community physical security standards like SCIFs, and software protection akin to that appropriate for Top Secret / Sensitive Compartmented Information (TS/SCI)
Physical security and staff training
There is a designated member of staff responsible for ensuring that our [Containment Procedures] are executed properly. Each quarter, they will share a report on [the ASL's] status to our board and LTBT, explicitly noting any deficiencies... They will also be responsible for sharing ad hoc updates sooner if there are any substantial implementation failures...
...Physical security should entail visitor access logs and restrictions protect on-site assets. Highly sensitive interactions should utilize advanced authentication like security keys. Network visibility should be maintained and office access controls and communications should maximize on-site protections.
Mandatory periodic infosec training educates all employees on secure practices, like proper system configurations and strong passwords, and fosters a proactive 'security mindset'. Fundamental infrastructure and policies promoting secure-by-design and secure-by-default principles should be incorporated into the engineering process. An insider risk program should tie access to job roles.
Rapid incident response protocols must be deployed.

(note: quotes cut toward sounding like an SCP entry; read the original if you want to know what it's actually talking about)

comment by Ben Pace (Benito) · 2025-01-02T07:32:52.203Z · LW(p) · GW(p)

I Changed My Mind on Prison Sentencing.

I used to have the opinion that prison sentencing should be a disincentive proportional to the upside of the crime. The question I'd ask was "How much of a prison sentence would make the crime not worth it to people?" I tried to estimate the upside to the criminal, and what length of sentence would make the expected utility reliably turn out negative. I would sometimes discuss with people how long of a prison sentence they'd risk in order to get the upside of a crime, and use this as an input into how long I thought sentencing should be. However, I now think this was an error, for two reasons:

Many criminals or norm-breakers are not doing something as clear-headed as an explicit expected-value calculation; they are often thinking with a part of their mind that is relatively uncivilized (and perhaps desperate or hidden from much of their conscious mind). For instance, the crime is done impulsively, or it is a crime of passion.
An Astral Codex Ten blogpost reiterated the common wisdom that length of sentencing has little effect on crime rate, and that reliability of punishment instead has a much larger effect. Its conclusion reads "Deterrence effects are so weak that we might as well round them off to zero", and it raised a point that struck me as quite relevant: most criminals have little idea what the likely sentencing range is.

This has led me to believe the following things.

Instead of sentencing, the better returns on investment are in increasing the reliability of being caught. This ties in well with my standard refrain that there need to be more reliable investigations of potential bad behavior or accusations of bad behavior.
The primary upside of sentencing is in fact incapacitation effects, which are the crimes prevented because the criminal is in prison. The conclusion to the ACX blogpost states: "Incapacitation effects are strong. The exact strength depends on how many people are already in prison, but a reasonable estimate at the current margin is that each prisoner-year prevents one violent crime and six property crimes."
I update that the primary reason to exile someone from a scene is when you expect the person to reoffend, not as a punishment.
I update that the type of punishment for bad behavior matters less than I thought – e.g. a monetary fine, incarceration, physical pain, community service, restitution, being fired, mandated training programs, shaming, etc – as long as the punishment is worse than the upside (which it can easily be at low amounts), and the punishment is quite likely.
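
As a rough illustration of the last point, here is a minimal sketch of the idealized expected-value comparison, assuming (unrealistically, per the first reason above) a purely calculating offender. The function names and all the numbers are hypothetical and chosen only to show the shape of the argument: a mild punishment that is very likely can make the expected payoff negative where a harsh punishment that is very unlikely does not.

```python
def expected_net_payoff(upside: float, p_caught: float, punishment_cost: float) -> float:
    """Expected payoff of offending for an idealized, purely calculating agent.

    upside: value of the crime to the offender (hypothetical units)
    p_caught: probability of actually being caught and punished
    punishment_cost: cost of the punishment to the offender, in the same units
    """
    return upside - p_caught * punishment_cost


def is_deterred(upside: float, p_caught: float, punishment_cost: float) -> bool:
    """True when offending has negative expected payoff under this toy model."""
    return expected_net_payoff(upside, p_caught, punishment_cost) < 0


if __name__ == "__main__":
    upside = 100.0  # hypothetical value of the crime to the offender

    # Harsh but unlikely: punishment ten times the upside, caught 5% of the time.
    print("harsh, unlikely:", is_deterred(upside, p_caught=0.05, punishment_cost=1000.0))

    # Mild but likely: punishment 1.5 times the upside, caught 90% of the time.
    print("mild, likely:   ", is_deterred(upside, p_caught=0.90, punishment_cost=150.0))
```

This only captures the calculating case; it says nothing about impulsive offenders or about incapacitation effects, which the points above treat separately.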

Speculative idea: To more effectively disincentivize bad behavior in a professional or social scene I participate in, it would be more effective to list a couple of behaviors that I would actively invest resources to investigate concerns about, plus the bars that would be sufficient for me to investigate them, and also gather buy-in for a set of specific punishments that were proportionate, than it would be to loudly condemn the behaviors after they came to light.

Replies from: Kaj_Sotala, Benito, nathan-helm-burger, jmh, AliceZ

↑ comment by Kaj_Sotala · 2025-01-02T20:13:26.883Z · LW(p) · GW(p)

I recall hearing it claimed that a reason why financial crimes sometimes seem to have disproportionately harsh punishments relative to violent crimes is that financial crimes are more likely to actually be the result of a cost-benefit analysis.

↑ comment by Ben Pace (Benito) · 2025-01-02T07:56:42.844Z · LW(p) · GW(p)

As a concrete example, I previously thought that Sam Bankman-Fried should be sentenced to 20-40 years in prison for his fraud, because this was the sort of time that I think most people would no longer be willing to trade for an even shot at getting $10Bs (e.g. when I asked my personal trainer, he said he would accept 15 years in prison for an even shot at $10B; I think many would take more, and also the true upside was higher).

From the above I've updated that the diff between expecting 5 years or 50 years in prison wasn't a primary input into SBF's repeated decisions to do fraud.

However, I do think he is sufficiently sociopathic [LW · GW] that I never expect him to not be a danger to society, so my new position is that life in prison is probably best for him. (This is not meant as punishment; I would not mind him going to those pleasant Swedish prisons I've heard about. I just otherwise expect him to continue to competently do horrendous things with zero moral compunction.)

Replies from: Lukas_Gloor, mark-xu, Benito

↑ comment by Lukas_Gloor · 2025-01-02T12:13:44.686Z · LW(p) · GW(p)

I like all the considerations you point out, but based on that reasoning alone, you could also argue that a con man who ran a lying scheme for 1 year and stole only like $20,000 should get life in prison -- after all, con men are pathological liars and that phenotype rarely changes all the way. And that seems too harsh?

I'm in two minds about it: On the one hand, I totally see the utilitarian argument of just locking up people who "lack a conscience" forever the first time they get caught for any serious crime. On the other hand, they didn't choose how they were born, and some people without prosocial system-1 emotions do in fact learn how to become decent citizens.

It seems worth mentioning that punishments for financial crime often include measures like "person gets banned from their industry" or them getting banned from participating in all kinds of financial schemes. In reality, the rules there are probably too lax and people who got banned in finance or pharma just transition to running crypto scams or selling predatory online courses on how to be successful (lol). But in theory, I like the idea of adding things to the sentencing that make re-offending less likely. This way, you can maybe justify giving people second chances.

Replies from: Benito, sharmake-farah, Benito

↑ comment by Ben Pace (Benito) · 2025-01-02T20:57:38.232Z · LW(p) · GW(p)

It seems worth mentioning that punishments for financial crime often include measures like "person gets banned from their industry" or them getting banned from participating in all kinds of financial schemes... But in theory, I like the idea of adding things to the sentencing that make re-offending less likely. This way, you can maybe justify giving people second chances.

Good point. I can imagine things like "permanent parole" (note that probation and parole are meaningfully different) or being under house arrest or having constraints on your professional responsibilities or finances or something, being far better than literal incarceration.

↑ comment by Noosphere89 (sharmake-farah) · 2025-01-02T18:05:19.347Z · LW(p) · GW(p)

One of the missing considerations is that crime is done mostly by young people, and the rate of crimes goes down the older you get.

A lot of this, IMO, is that the impulsiveness and risk-taking behind crimes decrease a lot with age. But the empirical fact that crime, and especially reoffending, goes down with age is a big reason why locking people up for life is less good than Ben Pace said.

↑ comment by Ben Pace (Benito) · 2025-01-02T20:55:08.464Z · LW(p) · GW(p)

I agree there are people who do small amounts of damage to society, are caught, and do not reoffend. Then there are other people whose criminal activities will be most of their effect on society, will reliably reoffend, and for whom the incapacitation strongly works out positive in consequentialist terms. My aim would be to have some way of distinguishing between them.

The amount of evidence we have about Bankman-Fried's character is quite different than that of most con men, including from childhood and from his personal diary, so I hope we can have more confidence based on that. But a different solution is to not do any psychologizing, and just judge based on reoffending. See this section from the ACX post:

In 2001, the Dutch government passed a law allowing longer sentences for criminals with at least ten previous offense who were not good targets for rehabilitation (eg rejected or had already failed drug treatment). The law allowed judges to increase the typical sentence for petty theft (2 months) to a longer sentence (2 years). A quasi-experimental study found that property crime, though not violent crime, decreased by 25%. It’s not surprising that violent crime didn’t go down since the law was almost entirely deployed against thieves.
Vollaard found that the population affected was extremely criminal; they had an average of 31 past offenses, and on surveys they admitted to committing an average of 256 crimes per year (mostly shoplifting). Before the law was passed, they spent an average of four months per year in jail (probably 2 x 2 month sentences); afterwards, they spent two years in jail per crime.

I should add that Scott has lots of concerns about doing this in the US, and argues that properly doing this in the US would massively increase the incarcerated population. I didn't quite follow his concerns, but I was not convinced that something like this would be a bad idea on consequentialist grounds, even if the incarcerated population were to massively increase. (Note that I would support improving the quality of prisons to being broadly as nice as outside of prisons.)

↑ comment by Mark Xu (mark-xu) · 2025-01-03T05:46:04.477Z · LW(p) · GW(p)

This is in part the reasoning used by Judge Kaplan:

Kaplan himself said on Thursday that he decided on his sentence in part to make sure that Bankman-Fried cannot harm other people going forward. “There is a risk that this man will be in a position to do something very bad in the future,” he said. “In part, my sentence will be for the purpose of disabling him, to the extent that can appropriately be done, for a significant period of time.”

from https://time.com/6961068/sam-bankman-fried-prison-sentence/

↑ comment by Ben Pace (Benito) · 2025-01-02T08:09:17.828Z · LW(p) · GW(p)

I want to try out the newly updated claims feature! Here are some related claims, I invite you to vote your probabilities.

Prediction

(This can be for whatever reason you think, such as because his expected value calculations would've changed, or because he would've taken more care around these particular behaviors, or any other reason you please.)

Replies from: Nick_Tarleton, lc

↑ comment by Nick_Tarleton · 2025-01-03T01:47:51.149Z · LW(p) · GW(p)

I can easily imagine an argument that: SBF would be safe to release in 25 years, or for that matter tomorrow, not because he'd be decent and law-abiding, but because no one would trust him and the only crimes he's likely to (or did) commit depend on people trusting him. I'm sure this isn't entirely true, but it does seem like being world-infamous would have to mitigate his danger quite a bit.

More generally — and bringing it back closer to the OP — I feel interested in when, and to what extent, future harms by criminals or norm-breakers can be prevented just by making sure that everyone knows their track record and can decide not to trust them.

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2025-01-03T06:07:02.466Z · LW(p) · GW(p)

I think having an easily-findable reputation makes it harder to do crimes, but being famous makes it easier. Many people are naive & gullible, or are themselves willing to do crime, and would like to work with him. I expect him to get opportunities for new ventures on leaving prison, with unsavory sorts.

I definitely support track-records being more findable publicly. Of course there's some balance in that the person who publishes it has a lot of power over the person being written about, and if they exaggerate it or write it hyperbolically then they can impose a lot of inappropriate costs on the person that they're in a bad position to push back on.

↑ comment by lc · 2025-01-02T20:42:38.609Z · LW(p) · GW(p)

Suppose Sam Bankman-Fried is imprisoned for 25 years. After that time, he will be a decent, law-abiding member of society, who is safe to release from prison.

I voted 75% because taken literally I think in 25 years AI will be so advanced that he won't have much of an ability to impact the world at all 🤓

(Otherwise 40%)

↑ comment by Nathan Helm-Burger (nathan-helm-burger) · 2025-01-03T03:57:18.597Z · LW(p) · GW(p)

I've been mad about the inefficiencies of the criminal justice system for many years. It does wasteful harm to perpetrators while also not doing a great job at prevention or helping victims.

One possible reform I'm excited about is having more AI monitoring of public spaces. For instance, if a woman could wear a sign saying "monitored by live camera" she might feel safer walking at night. Or a store might post a sign about AI security cameras.

Another possible innovation is AI parole officers (perhaps housed in a cell-service neckband with a camera), and a reformed parole system intended to be a long-term nanny rather than an excuse to jail the parolee. If you had to wear your AI nanny anytime you were out of the house, and it would scold you if it looked like you were about to commit a crime... This might substitute entirely for prison for the majority of crimes.

↑ comment by jmh · 2025-01-02T17:47:20.109Z · LW(p) · GW(p)

I agree with the view that punishment is not really a great deterrent as many crimes are not committed from a calculated cost-benefit perspective. I do think we need to apply that type of thinking towards what we might do with that insight/fact of things.

On that point, I would like to see more on your claim that we would get better bang for the buck, as it were, from more investment in preventing crimes. In this regard I'm thinking about the contrast between western legal views and places like China, as well as the estimates of the marginal pecuniary costs of prevention relative to the marginal pecuniary savings from reduced punishment. Clearly these are two (among many) different margins along which trade-offs will need to be made.

Another aspect that seems worth exploring (and I'm sure it has been, but I'm not sure where the literature stands on the question) is how, at least in my understanding of USA criminal law, victims of crimes are not often compensated (white collar, fraud, and financial crimes are something of an exception), but the victims, as a part of society, are then paying costs to punish the criminal. Full prevention is not a reasonable assumption (I'm not sure what level is a reasonable assumption), but we might find a better solution even at the current rate of prevention if more of those harmed by crimes were actually compensated for the harms rather than just imposing the punishment on the criminal actor. A primary reason for preventing a crime is the prevention of the harm. But if the harm can be largely mitigated after the fact, there is a degree of equivalence between it never having occurred and its compensation (perhaps here we might think of law and punishment as a type of insurance).

I also think there is something to look at in terms of prevention of incidence of crimes due to incarceration -- a type of exile. There might be scope for approaches there that maintain that type of prevention for repeat offenders (those demonstrating a propensity for some bad behavior for whatever reason) at a lower cost than prison incarceration. And what might the marginal gain in prevention be relative to the cost of the solution? This is a somewhat different approach than the ex ante prevention approach applied to society in general, but it may be nearly as effective at substantially lower cost.

↑ comment by ZY (AliceZ) · 2025-01-03T05:42:04.802Z · LW(p) · GW(p)

For "prison sentencing" here, do you mean some time in prison, but not life sentencing? Also instead of prison sentencing, after increasing "reliability of being caught", would you propose an alternative form of sentencing?

Some parts of 1) and most of 2) made me feel educating people on the clear consequences of the crime is important.

For people who frequently go in and out of prison - I would guess most legal systems already typically make the sentence more severe than for previous offenses, but for small crimes they may not.

I do think other types of punishments that you have listed there (physical pain, training programs, etc) would be interesting depending on the crime.

comment by Ben Pace (Benito) · 2019-07-12T01:17:49.215Z · LW(p) · GW(p)

Hypothesis: power (status within military, government, academia, etc) is more obviously real to humans, and it takes a lot of work to build detailed, abstract models of anything other than this that feel as real. As a result people who have a basic understanding of a deep problem will consistently attempt to manoeuvre into powerful positions vaguely related to the problem, rather than directly solve the open problem. This will often get defended with "But even if we get a solution, how will we implement it?" without noticing that (a) there is no real effort by anyone else to solve the problem and (b) the more well-understood a problem is, the easier it is to implement a solution.

Replies from: Benquo, Kaj_Sotala, Ruby, elityre

↑ comment by Benquo · 2019-07-12T05:09:23.904Z · LW(p) · GW(p)

I think this is true for people who've been through a modern school system, but probably not a human universal.

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2019-07-13T02:15:01.223Z · LW(p) · GW(p)

My, that was a long and difficult but worthwhile post. I see why you think it is not the natural state of affairs. Will think some more on it (though can't promise a full response, it's quite an effortful post). Am not sure I fully agree with your conclusions.

Replies from: Benquo

↑ comment by Benquo · 2019-08-07T17:33:45.506Z · LW(p) · GW(p)

I'm much more interested in finding out what your model is after having tried to take those considerations into account, than I am in a point-by-point response.

Replies from: Raemon

↑ comment by Raemon · 2019-08-07T20:01:04.299Z · LW(p) · GW(p)

This seems like a good conversational move to have affordance for.

↑ comment by Kaj_Sotala · 2019-08-08T19:47:02.470Z · LW(p) · GW(p)

(b) the more well-understood a problem is, the easier it is to implement a solution.

This might be true, but it doesn't sound like it contradicts the premise of "how will we implement it"? Namely, just because understanding a problem makes it easier to implement, doesn't mean that understanding alone makes it anywhere near easy to implement, and one may still need significant political clout in addition to having the solution. E.g. the whole infant nutrition thing [LW · GW].

↑ comment by Ruby · 2019-07-14T19:12:01.767Z · LW(p) · GW(p)

Seems related to Causal vs Social Reality.

↑ comment by Eli Tyre (elityre) · 2019-08-07T18:43:02.625Z · LW(p) · GW(p)

Do you have an example of a problem that gets approached this way?

Global warming? The need for prison reform? Factory Farming?

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2019-08-07T19:26:32.414Z · LW(p) · GW(p)

AI [LW · GW].

"Being a good person standing next to the development of dangerous tech makes the tech less dangerous."

Replies from: elityre

↑ comment by Eli Tyre (elityre) · 2019-08-08T05:44:04.828Z · LW(p) · GW(p)

It seems that AI safety has this issue less than every other problem in the world, by proportion of the people working on it.

Some double digit percentage of all of the people who are trying to improve the situation, are directly trying to solve the problem, I think? (Or maybe I just live in a bubble in a bubble.)

And I don’t know how well this analysis applies to non-AI safety fields.

Replies from: jacobjacob

↑ comment by Bird Concept (jacobjacob) · 2019-08-08T15:02:37.589Z · LW(p) · GW(p)

I'd take a bet at even odds that it's single-digit.

To clarify, I don't think this is just about grabbing power in government or military. My outside view of plans to "get a PhD in AI (safety)" seems like this to me. This was part of the reason I declined an offer to do a neuroscience PhD with Oxford/DeepMind. I didn't have any secret for why it might be plausibly crucial [LW(p) · GW(p)].

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2019-08-08T17:38:08.084Z · LW(p) · GW(p)

Strong agree with Jacob.

comment by Ben Pace (Benito) · 2021-07-01T22:44:18.008Z · LW(p) · GW(p)

Er, Wikipedia has a page on misinformation about Covid, and the first example is Wuhan lab origin. Kinda shocked that Wikipedia is calling this misinformation. Seems like their authoritative sources are abusing their positions. I am scared that I'm going to stop trusting Wikipedia soon enough, which is leaving me feeling pretty shook.

Replies from: Dagon, steve2152, ChristianKl, Pattern, Viliam

↑ comment by Dagon · 2021-07-01T23:27:50.077Z · LW(p) · GW(p)

Wikipedia has beaten all odds for longevity of trust - I remember pretty heated arguments circa 2005 about whether it was referenceable on any topic, though it was known to be very good on technical topics or niches without controversy where nerds could agree on what was true (but not always what was important). By 2010, it was pretty widely respected, though the recommendation from Very Serious People was to cite the underlying sources, not the articles themselves. I think it was considered pretty authoritative in discussions I was having by 2013 or so, and nowadays it's surprising and newsworthy when something is wrong for very long (though edit wars and locking down sections happen fairly often).

I still take it with a little skepticism for very recently-edited or created topics - it's an awesome resource to know the shape of knowledge in the area, but until things have been there for weeks or months, it's hard to be sure it's a consensus.

Replies from: Viliam

↑ comment by Viliam · 2021-07-02T01:11:41.581Z · LW(p) · GW(p)

Could it be a natural cycle?

Wikipedia is considered trustworthy -> people with strong agenda get to positions where they can abuse Wikipedia -> Wikipedia is considered untrustworthy -> people with strong agenda find better use of their time and stop abusing Wikipedia, people who care about correct information fix it -> Wikipedia is considered trustworthy...

Replies from: ChristianKl

↑ comment by ChristianKl · 2021-07-02T10:17:30.651Z · LW(p) · GW(p)

The agenda is mainly to follow institutions like the New York Times. In a time when the New York Times isn't worth much more than sawdust, that's not a strategy to get to truth.

↑ comment by Steven Byrnes (steve2152) · 2021-07-02T11:04:13.260Z · LW(p) · GW(p)

"No safe defense, not even Wikipedia" :-P

I suggest not having a notion of "quality" that's supposed to generalize across all wiki pages. They're written by different people, they're scrutinized to wildly different degrees. Even different sections of the same article can be obviously different in trustworthiness ... Or even different sentences in the same section ... Or different words in the same sentence :)

↑ comment by ChristianKl · 2021-07-01T23:38:20.574Z · LW(p) · GW(p)

Wikipedia unfortunately threw out their neutral point of view policy on COVID-19. Besides that page, the one on ivermectin ignores the meta-analyses in favor of using it for COVID-19. There's also no page for "patient zero" (who was likely employed at the Wuhan Institute of Virology).

↑ comment by Pattern · 2021-07-07T20:40:11.507Z · LW(p) · GW(p)

Fix it. (And let us know how long that sticks for.)

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2021-07-07T20:55:32.900Z · LW(p) · GW(p)

You fix it! If you think it's such a good idea :)

I am relatively hesitant to start doing opinionated fixes on Wikipedia, I think that's not the culture of page setup that they want. My understanding is that the best Wikipedia editors write masses of pages that they're relatively disinterested in, and that being overly interested in a specific page mostly leads you to violating all of their rules and getting banned. This sort of actively political editing is precisely the sort of thing that they're trying to avoid.

↑ comment by Viliam · 2021-07-02T01:07:18.359Z · LW(p) · GW(p)

By saying "Wuhan lab origin", you can roughly mean three things:

biological weapon, intentionally released,
natural virus collected, artificially improved, then escaped,
natural virus collected, then escaped in the original form.

The first we can safely dismiss: who would drop a biological weapon of this type on their own population?

We can also dismiss the third one, if you think in near mode what that would actually mean. It means the virus was already out there. Then someone collected it -- obviously, not all existing particles of the virus -- which means that most of the virus particles that were already out there, have remained out there. But that makes the leak from Wuhan lab an unnecessary detail [? · GW]; "virus already in the wild, starts pandemic" is way more likely than "virus already in the wild, does not start pandemic, but when a few particles are brought into a lab and then accidentally released without being modified, they start pandemic"... what?

This is why arguing for natural evolution of the virus is arguing against the lab leak. (It's just not clearly explained.) If you do not assume that the virus was modified, then the hypothesis that the pandemic started by Wuhan lab leak, despite the virus already being out there before it was brought to the Wuhan lab, is privileging the hypothesis [? · GW]. If the virus is already out there, you don't need to bring it to a lab and let it escape again in order to... be out there, again.

Now here I agree that the artificial improvement of the virus cannot be disproved. I mean, whatever can happen in nature can probably also happen in the lab, so how would you prove it didn't?

I guess I am trying to say that in the Wikipedia article, the section "gain of function research" does not deserve to be classified as misinformation, but the remaining sections do.

comment by Ben Pace (Benito) · 2019-07-17T00:31:47.536Z · LW(p) · GW(p)

Responding to Scott's response to Jessica.

The post makes the important argument that if we have a word whose boundary is around a pretty important set of phenomena that are useful to have a quick handle to refer to, then

It's really unhelpful for people to start using the word to also refer to a phenomenon with 10x or 100x more occurrences in the world, because then I'm no longer able to point to the specific important parts of the phenomena that I was previously talking about.

e.g. Currently the word 'abuser' describes a small number of people during some of their lives. Someone might want to say that technically it should refer to all people all of the time. The argument is understandable, but it wholly destroys the usefulness of the concept handle.

People often have political incentives to push the concept boundary to include a specific case in a way that, if it were principled, indeed makes most of the phenomena in the category no use to talk about. This allows for selective policing by the people with the political incentive.
It's often fine for people to bend words a little bit (e.g. when people verb nouns), but when it's in the class of terms we use for norm violation, it's often correct to impose quite high standards of evidence for doing so, as we can have strong incentives (and unconscious biases!) to do it for political reasons.

These are key points that argue against changing the concept boundary to include all conscious reporting of unconscious bias, and more generally push back against unprincipled additions to the concept boundary.

This is not an argument against expanding the concept to include a specific set of phenomena that share the key similarities with the original set, as long as the expansion does not explode the set. I think there may be some things like that within the category of 'unconscious bias'.

While it is the case that it's very helpful to have a word for when a human consciously deceives another human, my sense is that there are some important edge cases that we would still call lying, or at least a severe breach of integrity that should be treated similarly to deliberate conscious lies.

Humans are incentivised to self-deceive in the social domain in order to be able to tell convincing lies. It's sometimes important that if it's found out that someone strategically self-deceived, that they be punished in some way.

A central example here might be a guy who says he wants to be in a long and loving committed relationship, only to break up after he is bored of the sex after 6-12 months, and really this was predictable from the start if he hadn't felt it was fine to make big commitments without introspecting carefully on their truth. I can imagine the woman in this scenario feeling genuinely shocked and lied to. "Hold on, what are you talking about that you feel you want to move out? I am organising my whole life around this relationship, what you are doing right now is calling into question the basic assumptions that you have promised to me." I can imagine this guy getting a reputation for being untrustworthy and lying to women. I think it is an open question whether it is accurate for the woman cheated by this man to tell other people that he "lied to her", though I think it is plausible that I want to punish this behaviour in a similar way that I want to punish much more conscious deception, in a way that motivates me to want to refer to it with the same handle - because it gives you basically very similar operational beliefs about the situation (the person systematically deceived me in a way that was clearly for their personal gain and this hurt me and I think they should be actively punished).

I think I can probably come up with an example where a politician wants power and does whatever is required to take it, such that they end up not being in alignment with the values they stated they held earlier in their career, and allow the meaning of words to fluctuate around them in accordance with what the people giving the politician votes and power want, so that they end up saying something that is effectively a lie, but that they don't care about or really notice. This one is a bit slippery for me to point to.

Another context that is relevant: I can imagine going to a scientific conference in a field that has been hit so hard by the replication crisis, that basically all the claims in the conference were false, and I could know this. Not only are the claims at this conference false, but the whole subfield has never been about anything real (example, example, and of course, example). I can imagine a friend of mine attending such a conference and talking to me afterwards, and them thinking that some of the claims seemed true. And I can imagine saying to them "No. You need to understand that all the claims in there are lies. There is no truth-tracking process occurring. It is a sham field, and those people should not be getting funding for their research." Now, do I think the individuals in the field are immoral? Not exactly, but sorta. They didn't care about truth and yet paraded themselves as scientists. But I guess that's a big enough thing in society that they weren't unusually bad or anything. While it's not a central case of lying, it currently feels to me like it's actively helpful for me to use the phrase 'lie' and 'sham'. There is a systematic distortion of truth that gives people resources they want instead of those resources going to projects not systematically distorting reality.

(ADDED: OTOH I do think that I have myself in the past been prompted to want to punish people for these kinds of 'lies' in ways that aren't effective. I have felt that people who have committed severe breaches of integrity in the communities I'm part of are bad people and felt very angry at them, but I think that this has often been an inappropriate response. It does share other important similarities with lies though. Probably want to be a bit careful with the usage here and signal that the part of wanting to immediately socially punish them for a thing that they obviously did wrong is not helpful, because they will feel helpless and not that it's obvious they did something wrong. But it's important for me internally to model them as something close to lying, for the sanity of my epistemic state, especially when many people in my environment will not know/think the person has breached integrity and will socially encourage me to positively weight their opinions/statements.)

My current guess at the truth: there are classes of human motivations, such as those for sex, and for prestigious employment positions in the modern world, that have sufficiently systematic biases in favour of self-deception that it is not damaging to add them to the category of 'lie' - adding them is not the same as a rule that admits all unconscious bias consciously reported, just a subset that reliably turns up again and again. I think Jessica Taylor / Ben Hoffman [LW(p) · GW(p)] / Michael Arc want to use the word 'fraud' to refer to it, I'm not sure.

Replies from: Benito, Benquo, Benito

↑ comment by Ben Pace (Benito) · 2019-07-18T20:16:44.498Z · LW(p) · GW(p)

I will actually clean this up and turn it into a post sometime soon [edit: I retract that, I am not able to make commitments like this right now]. For now let me add another quick hypothesis on this topic whilst crashing from jet lag.

A friend of mine proposed that instead of saying 'lies' I could say 'falsehoods'. Not "that claim is a lie" but "that claim is false".

I responded that 'falsehood' doesn't capture the fact that you should expect systematic deviations from the truth. I'm not saying this particular parapsychology claim is false. I'm saying it is false in a way where you should no longer trust the other claims, and expect they've been optimised to be persuasive.

They gave another proposal, which is to say instead of "they're lying" to say "they're not truth-tracking". Suggest that their reasoning process (perhaps in one particular domain) does not track truth.

I responded that while this was better, it still seems to me that people won't have an informal understanding of how to use this information. (Are you saying that the ideas aren't especially well-evidenced? But they sound pretty plausible to me, so let's keep discussing them and look for more evidence.) There's a thing where if you say someone is a liar, not only do you not trust them, but you recognise that you shouldn't even privilege the hypotheses that they produce. If there's no strong evidence either way, if it turns out the person who told it to you is a rotten liar, then if you wouldn't have considered it before they raised it, don't consider it now.

Then I realised Jacob had written [LW · GW] about this topic a few months back. People talk as though 'responding to economic incentives' requires conscious motivation, but actually there are lots of ways that incentives cause things to happen that don't require humans consciously noticing the incentives and deliberately changing their behaviour. Selection effects, reinforcement learning, and memetic evolution.

Similarly, what I'm looking for is basic terminology for pointing to processes that systematically produce persuasive things that aren't true, that doesn't move through "this person is consciously deceiving me". The scientists pushing adult neurogenesis aren't lying. There's a different force happening here that we need to learn to give epistemic weight to the same way we treat a liar, but without expecting conscious motivation to be the root of the force and thus trying to treat it that way (e.g. by social punishment).

More broadly, it seems like there are persuasive systems in the environment that weren't in the evolutionary environment for adaptation, that we have not collectively learned to model clearly. Perhaps we should invest in some basic terminology that points to these systems so we can learn to not-trust them without bringing in social punishment norms.

Replies from: Pattern

↑ comment by Pattern · 2019-07-23T02:25:54.903Z · LW(p) · GW(p)

I responded that 'falsehood' doesn't capture the fact that you should expect systematic deviations from the truth.

Is this "bias"?

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2019-07-23T10:04:16.077Z · LW(p) · GW(p)

Yeah good point I may have reinvented the wheel. I have a sense that’s not true but need to think more.

↑ comment by Benquo · 2019-07-17T04:00:00.916Z · LW(p) · GW(p)

The definitional boundaries of "abuser," as Scott notes, are in large part about coordinating around whom to censure. The definition is pragmatic rather than objective [LW · GW].*

If the motive for the definition of "lies" is similar, then a proposal to define only conscious deception as lying is therefore a proposal to censure people who defend themselves against coercion while privately maintaining coherent beliefs, but not those who defend themselves against coercion by simply failing to maintain coherent beliefs in the first place. (For more on this, see Nightmare of the Perfectly Principled.) This amounts to waging war against the mind.

Of course, in matter of actual fact we don't strongly censure all cases of consciously deceiving. In some cases (e.g. "white lies") we punish those who fail to lie, and those who call out the lie. I'm also pretty sure we don't actually distinguish between conscious deception and e.g. reflexively saying an expedient thing, when it's abundantly clear that one knows very well that the expedient thing to say is false, as Jessica pointed out here [LW(p) · GW(p)].

*It's not clear to me that this is a good kind of concept to have, even for "abuser." It seems to systematically force responses to harmful behavior to bifurcate into "this is normal and fine" and "this person must be expelled from the tribe," with little room for judgments like "this seems like an important thing for future partners to be warned about, but not relevant in other contexts." This bifurcation makes me less willing to disclose adverse info about people publicly - there are prominent members of the Bay Area Rationalist community doing deeply shitty, harmful things that I actually don't feel okay talking about beyond close friends because I expect people like Scott to try to enforce splitting behavior.

↑ comment by Ben Pace (Benito) · 2019-07-17T00:34:28.009Z · LW(p) · GW(p)

Note: I just wrote this in one pass when severely jet lagged, and did not have the effort to edit it much. If I end up turning this into a blogpost I will probably do that. Anyway, I am interested in hearing via PM from anyone who feels that it was sufficiently unclearly written that they had a hard time understanding/reading it.

comment by Ben Pace (Benito) · 2024-09-10T22:39:02.946Z · LW(p) · GW(p)

Two recent changes to LessWrong that I made!

1. I have added two new reacts: "I'd bet this is false" and "I'd bet this is true".
2. You can now sort a given user's comments by 'top' i.e. by karma.

For the first one, if you want to let someone know you'd be willing to take a bet (or if you want to call someone out on their bullshit) you can now highlight the claim they made and use the react. The react is a pair of dice, because we're never certain about propositions we're betting on (and because 'a hand offering money' was too hard to make out at the small scale). Hopefully this will increase people's affordance to take more bets on the site!

These reacts replaced "I checked it's true" and "I checked it's false", which didn't get that much use, but were some of the most abused reacts (often used on opinions or statements-of-positions that were simply not checkable).

For the second, if you go to a user profile and scroll down to the comments, you can now sort by 'top', 'newest', 'oldest', and 'recent replies'. I find that 'top' is a great way to get a sense of a person's thoughts and perspective on the world, and I used to visit greaterwrong a lot for this feature. Now you can do it on LessWrong!

Replies from: zac-hatfield-dodds, lahwran, shankar-sivarajan, Yoav Ravid

↑ comment by Zac Hatfield-Dodds (zac-hatfield-dodds) · 2024-09-11T03:56:58.126Z · LW(p) · GW(p)

This feels pretty nitpick-y, but whether or not I'd be interested in taking a bet will depend on the odds - in many cases I might take either side, given a reasonably wide spread. Maybe append "at p >= 0.5" to the descriptions to clarify?

The shorthand trading syntax "$size @ $sell_percent / $buy_percent" is especially nice because it expresses the spread you'd accept to take either side of the bet, e.g. "25 @ 85/15 on rain tomorrow" to offer a bet of $25, selling if you think probability of rain is >85%, buying if you think it's <15%. Seems hard to build this into a reaction though!
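To make the shorthand concrete, here is a minimal Python sketch of how such a quote might be parsed. The names `Quote`, `parse_quote`, and `side_for` are hypothetical, the optional "on <claim>" suffix is an assumption based on the single example above, and `side_for` simply follows the literal reading in the comment (sell above the first number, buy below the second); it is not any site's actual feature.

```python
import re
from dataclasses import dataclass

# Assumed grammar, inferred from the one example: "<size> @ <sell>/<buy> [on <claim>]".
_QUOTE = re.compile(
    r"^\s*\$?(?P<size>\d+(?:\.\d+)?)\s*@\s*"
    r"(?P<sell>\d+(?:\.\d+)?)\s*/\s*(?P<buy>\d+(?:\.\d+)?)"
    r"(?:\s+on\s+(?P<claim>.+?))?\s*$"
)


@dataclass(frozen=True)
class Quote:
    size: float          # stake in dollars
    sell_percent: float  # upper edge of the spread
    buy_percent: float   # lower edge of the spread
    claim: str           # free-text claim, empty if none was given


def parse_quote(text: str) -> Quote:
    """Parse a quote such as '25 @ 85/15 on rain tomorrow'."""
    match = _QUOTE.match(text)
    if match is None:
        raise ValueError(f"not a recognisable quote: {text!r}")
    quote = Quote(
        size=float(match["size"]),
        sell_percent=float(match["sell"]),
        buy_percent=float(match["buy"]),
        claim=match["claim"] or "",
    )
    # A spread only makes sense if the buy edge is not above the sell edge.
    if not 0 <= quote.buy_percent <= quote.sell_percent <= 100:
        raise ValueError(f"inconsistent spread in {text!r}")
    return quote


def side_for(quote: Quote, probability_percent: float) -> str:
    """Literal reading of the comment: sell above the upper edge, buy below the lower one."""
    if probability_percent > quote.sell_percent:
        return "sell"
    if probability_percent < quote.buy_percent:
        return "buy"
    return "no bet"
```

For instance, `side_for(parse_quote("25 @ 85/15 on rain tomorrow"), 50)` falls inside the spread, so under this reading no bet is made.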

Replies from: Benito, Raemon

↑ comment by Ben Pace (Benito) · 2024-09-11T18:02:53.441Z · LW(p) · GW(p)

The only reason I've not already done this is that there's already a lot of text in the hover-over:

I'm willing to operationalize this, find an adjudicator, and bet this claim is true

Another option is that you could pair the "I'd bet on this" react with a probability react. There could be a single react that says "I'm willing to bet on this" and then you also react with one of <1%, 10%, 25%, 40%, 50%, 60%, 75%, 90%, 99+%, so that people know what odds you'd take.

↑ comment by Raemon · 2024-09-11T06:02:15.007Z · LW(p) · GW(p)

lol we did have a debate about this internally before shipping.

Right now we’re trying to get a rough sense of ‘would people make more bets on LW if they had more affordance to?’, and an easy thing to try was just making some reacts. But a) I encourage people to reply with followup details if they are interested in betting, and b) if it turns out reasonably popular we may make a more dedicated feature.

Replies from: Raemon

↑ comment by Raemon · 2024-09-11T06:03:50.419Z · LW(p) · GW(p)

(FYI I was happy to see your recent bet with BenG, and am hoping more things like that happen)

↑ comment by the gears to ascension (lahwran) · 2024-09-11T08:32:06.251Z · LW(p) · GW(p)

I like #2. On a similar thread: would be nice to have a separate section for pinned comments. I looked into pull requesting it at one point but looks like it either isn't as trivial as I hoped, or I simply got lost in the code. I feel like folks having more affordance to say, "Contrary to its upvotes or recency, I think this is one of the most representative comments from me, and others seeing my page should see it" would be helpful - pinning does this already but it has ui drawbacks because it simply pushes recent comments out of the way and the pinned marker is quite small (hence why I edit my pinned comments to say they're pinned).

↑ comment by Shankar Sivarajan (shankar-sivarajan) · 2024-09-11T00:34:13.678Z · LW(p) · GW(p)

Suggestion: the two dice should have different numbers on top, maybe a ⚅ on the "true" bet instead of a ⚀.

↑ comment by Yoav Ravid · 2024-09-11T10:35:24.472Z · LW(p) · GW(p)

The comment option sorting is amazing! Thanks!

The new reacts are also cool, though I also liked the "I checked" reacts and would have liked to have both.

Replies from: Rana Dexsin

↑ comment by Rana Dexsin · 2024-09-12T07:42:20.891Z · LW(p) · GW(p)

I was pretty sad about the ongoing distortion of “I checked” in what's meant to be an epistemics-oriented community. I think the actual meanings are potentially really valuable, but without some way of avoiding them getting eaten, they become a hazard.

My first thought is to put a barrier in the way, but I don't know if that plays well with the reactions system being for lower-overhead responses, and it might also give people unproductive bad feelings unless sold the right way.

comment by Ben Pace (Benito) · 2021-10-01T06:10:31.319Z · LW(p) · GW(p)

Okay, I’ll say it now, because there’s been too many times.

If you want your posts to be read, never, never, NEVER post multiple posts at the same time.

Only do that if you don’t mind none of the posts being read. Like if they’re all just reference posts.

I never read a post if there’s two or more to read, it feels like a slog and like there’s going to be lots of clicking and it’s probably not worth it. And they normally do badly on comments and karma so I don’t think it’s just me.

Even if one of them is just meant as reference, it means I won’t read the other one.

comment by Ben Pace (Benito) · 2019-07-14T19:04:06.355Z · LW(p) · GW(p)

I recently circled for the first time. I had two one-hour sessions on consecutive days, with 6 and 8 people respectively.

My main thoughts: this seems like a great way of getting to know my acquaintances, connecting emotionally, and building closer relationships with friends. The background emotional processing happening in individuals is repeatedly brought forward as the object of conversation, for significantly enhanced communication/understanding. I appreciated getting to poke and actually find out whether people's emotional states matched the words they were using. I got to ask questions like:

When you say you feel gratitude, do you just mean you agree with what I said, or do you mean you're actually feeling warmth toward me? Where in your body do you feel it, and what is it like?

Not that a lot of my circling time was skeptical of people's words; a lot of the time I trusted the people involved to be accurately reporting their experiences. It was just very interesting - when I noticed I didn't feel like someone was honest about some micro-emotion - to have the affordance to stop and request an honest internal report.

It felt like there was a constant tradeoff between social-interaction and noticing my internal state. If all I'm doing is noticing my internal state, then I stop engaging with the social environment and don't have anything of substance to report on. If I just focus on the social interactions, then I never stop and communicate more deeply about what's happening for me internally. I kept on having an experience like "Hey, I want to interject to add nuance to what you said-" and then stopping and going "So, when you said <x> I felt a sense of irritation/excitement/distrust/other because <y>".

One moment that I liked a lot, was around the epistemic skill of not deciding your position a second earlier than necessary [? · GW]. Person A was speaking, and person B jumped in and said something that sounded weirdly aggressive. It didn't make sense, and then person B said "Wait, let me try to figure out what I mean, I feel I'm not using quite the right words". My experience was first to feel some worry for person A feeling attacked. I quickly calmed down, noticing how thoroughly out of character it would be for person B to actually be saying anything aggressive. I then realised I had a clear hypothesis for what person B actually wanted to say, and waited politely for them to say it. But then I noticed that actually I didn't have much evidence for my hypothesis at all... so I moved into a state of only curiosity about what person B was going to say, not holding onto my theory of what they would say. And indeed, it turned out they said something entirely different. (I subsequently related this whole experience to person B during the circle.)

This is really important. Being able to hold off on keeping your favoured theory in the back of your head and counting all evidence as pro- or anti- the theory, and instead keeping the theory separate from your identity and feeling full creative freedom to draw a new theory around the evidence that comes in.

There were other personal moments where I brought up how I was feeling toward my friends and they to me, in ways that allowed me to look at long-term connections and short-term conflicts in a clearer light. It was intense.

Both circles were very emotionally interesting and introspectively clarifying, and I will do more with friends in the future.

comment by Ben Pace (Benito) · 2025-02-25T21:00:09.280Z · LW(p) · GW(p)

I have a general belief that internet epistemic hygiene norms should include that, when you quote someone, not only should you link to the source, but you should link to the highlight of that source. In general, if you highlight text on a webpage and right-click, you can "copy link to highlight" which when opened scrolls to and highlights that text. (Random example on Wikipedia.)

Further on this theme, archive.is has the interesting feature of constantly altering the URL to point to the currently highlighted bit of text, making this even easier. (Example, and you can highlight other bits of text to see it change.) Currently I overall don't like it because I constantly highlight text while I'm reading it, and so am v annoyed by the URL constantly changing, but it's plausible I'd get over this in time, and it'd be a good feature to add to LW.

The archive.is feature is also better because the normal "copy link to highlight" can often be unwieldy and long. Also I recall it sometimes not working, probably because the highlight is too short or too long (I don't quite understand the rules). On archive.is it just has a start and end number for where in the text is highlighted, making it always work and never be unwieldy.

Sadly, I just tried the normal "copy link to highlight" on LW, and when I clicked through the page auto-refreshes, so the highlighted text flashes purple then disappears quickly after. It would be good for us to change that, and maybe implement this feature.

Replies from: gwern, MondSemmel, cubefox, GAA, mateusz-baginski, JacobKopczynski

↑ comment by gwern · 2025-02-28T20:46:40.915Z · LW(p) · GW(p)

I have misgivings about the text-fragment feature as currently implemented. It is at least now a standard and Firefox implements reading text-fragment URLs (just doesn't conveniently allow creation without a plugin or something), which was my biggest objection before; but there are still limitations to it which show that a lot of what the text-fragment 'solution' is, is a solution to the self-inflicted problems of many websites being too lazy to provide useful anchor IDs anywhere in the page. (I don't know how often I go to link a section of a blog post, where the post is written in a completely standard hierarchical table-of-contents way, and the headers turn out to be... nothing but <h2>s with not an id= anywhere in sight.) We would be a lot better off if pages had more meaningful IDs and selecting text did something like, pick the nearest preceding ID. (This could be implemented in LW2 or Gwern.net right now, incidentally. If the user selects some text, just search through the tree to find the first previous ID, and update the current browser-bar URL to URL#ID.)
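The "nearest preceding ID" idea can be sketched offline in Python with the standard-library `html.parser`. This is only an illustration under simplifying assumptions, not how LW2 or Gwern.net would do it in the browser: the function name `url_for_selection` is hypothetical, the selection is assumed to sit inside a single text node, and "preceding" is taken to mean document order.

```python
from html.parser import HTMLParser
from typing import Optional


class _NearestIdFinder(HTMLParser):
    """Remember the most recent id= seen before the selected text appears."""

    def __init__(self, selected: str) -> None:
        super().__init__()
        self.selected = selected
        self.last_id: Optional[str] = None
        self.result: Optional[str] = None
        self.found = False

    def handle_starttag(self, tag, attrs):
        if self.found:
            return
        for name, value in attrs:
            if name == "id" and value:
                self.last_id = value

    def handle_data(self, data):
        # Simplification: the selection must lie within one text node.
        if not self.found and self.selected in data:
            self.found = True
            self.result = self.last_id


def url_for_selection(page_url: str, html: str, selected: str) -> str:
    """Return page_url#ID for the nearest preceding ID, or page_url if none applies."""
    finder = _NearestIdFinder(selected)
    finder.feed(html)
    finder.close()
    if finder.found and finder.result:
        return f"{page_url}#{finder.result}"
    return page_url
```

Because every selection inside the same section maps to the same `#ID`, two readers quoting slightly different ranges end up with the same URL, which is the property the comment is asking for.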

Hacking IDs onto an unwilling page, whose author neither knows nor cares nor can even find out what IDs are in use (or what they may be breaking by editing this or that word), is a recipe for long-term breakage: your archive.is example works simply because archive.is is an archive website, and the pages, in theory, never change (even though the original URLs certainly can, and often quite dramatically). That's less true for LW comments or articles. There are also downstream effects: text-fragments are long and verbose and can't be written by hand because they're trying to specify arbitrary ranges which are robust to corruption, and they are unwieldy to search. (How does a tool handle different hash-anchors in a URL? Most choose to define them as unique URLs different from each other... so what happens when two users selecting from the same section inevitably wind up selecting slightly different text ranges every time, and every user has a unique text-fragment anchor? Now suddenly every URL is unique - no more useful backlinks, no more consolidated discussions of the same URL, etc. And if the URL content changes, you don't get anything out of it. It's now just a bunch of trailing junk causing problems forever, like all that ?utm_foo_bar junk.)

Somewhat like the fad for abusing # for the stupid #! JS thing (which pretty much everyone, Twitter included, came to regret), I worry that this is still a half-baked tech designed for a very narrow use case (Google's convenience in providing search results) where we don't know how well it will work in the wild long-term or what side-effects it will have. So I personally have been holding off on it and making a point of deleting those archive.is anchors.

↑ comment by MondSemmel · 2025-02-25T21:51:52.436Z · LW(p) · GW(p)

"Copy link to highlight" is not available in Firefox. And while e.g. Bing search seems to automatically generate these "#:~:text=" links, I find they don't work with any degree of consistency. And they're even more affected by link rot than usual, since any change to the initial text (like a typo fix) will break that part of the link.

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2025-02-25T22:24:46.241Z · LW(p) · GW(p)

Though if the text changes, then it degrades gracefully to just linking to the right webpage, which is the current norm.

↑ comment by cubefox · 2025-02-26T07:14:14.323Z · LW(p) · GW(p)

The highlights are officially called "text fragments" and the syntax is described here: https://developer.mozilla.org/en-US/docs/Web/URI/Reference/Fragment/Text_fragments
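As a rough illustration of that syntax, the following Python sketch builds a `#:~:text=` link from a quoted passage. The helper name `text_fragment_url` is hypothetical; it covers only the start and optional end of the range (the optional prefix and suffix parts are left out), and it drops any existing fragment on the URL for simplicity.

```python
from typing import Optional
from urllib.parse import quote, urldefrag


def _encode(text: str) -> str:
    # quote() with safe="" escapes "," and "&" but leaves "-" alone;
    # "-" is part of the directive syntax, so escape it by hand.
    return quote(text, safe="").replace("-", "%2D")


def text_fragment_url(page_url: str, text_start: str, text_end: Optional[str] = None) -> str:
    """Build a link that asks the browser to scroll to and highlight a passage."""
    base, _old_fragment = urldefrag(page_url)  # simplification: discard any existing #fragment
    directive = _encode(text_start)
    if text_end is not None:
        # "start,end" selects everything from the first phrase through the second.
        directive += "," + _encode(text_end)
    return f"{base}#:~:text={directive}"
```

Giving both a start and an end phrase keeps the link short for long quotations, at the cost of breaking if either phrase is later edited.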

↑ comment by Guive (GAA) · 2025-02-25T23:39:42.561Z · LW(p) · GW(p)

I like this idea. There's always endless controversy about quoting out of context. I can't recall seeing any previous specific proposals to help people assess the relevance of context for themselves.

↑ comment by Mateusz Bagiński (mateusz-baginski) · 2025-02-28T13:57:17.007Z · LW(p) · GW(p)

Currently I overall don't like it because I constantly highlight text while I'm reading it, and so am v annoyed by the URL constantly changing, but it's plausible I'd get over this in time, and it'd be a good feature to add to LW.

You can add it as an opt-in feature.

↑ comment by Czynski (JacobKopczynski) · 2025-02-28T21:44:39.747Z · LW(p) · GW(p)

I find them visually awful and disable them in settings. And avoid using archive.is because there's no way to turn that off.

Not that I browse LW that much, in fairness.

comment by Ben Pace (Benito) · 2024-09-01T22:08:32.752Z · LW(p) · GW(p)

Here's a quick list of 7 things people sometimes do instead of losing arguments, when it would be too personally costly to change their position.

1. They pick a variable in the argument that is hard for the debaters to get closer to, and state that their raw-intuition is simply different to the person they're talking to. ("I just believe that human ingenuity will overcome problems like this, and you don't, and this isn't something we're easily going to be able to resolve." or "I think we're just not going to be able to resolve whether this city-wide intervention works or not without a good RCT.")
2. They undermine your ability to have an opinion in the argument. ("I'm afraid there's very advanced research papers you'd need to read to have an understanding of this" or "You have been wrong so many times on so many issues, I don't think you're worth trusting on this one")
3. They performatively don't understand basic ideas, or weaponize confusion. ("I'm sorry, could you explain your theory of 'ownership', I don't really understand what you mean when you say you own your company" or "I'm sorry, I'm not sure what it means to 'provoke'? I was just saying what I thought.")
4. They get emotionally irate in a way that makes the format of the conversation break down. ("Eh, something about this conversation feels confused and pointless, let's move on..." or "Look, it's a Sunday night and I have work tomorrow, can you pick something we can talk about that we might actually get to the bottom of tonight or else I'm leaving.")
5. They make false claims about social consensus that the conversation doesn't have access to. ("Oh I don't think anyone involved in the situation would agree with what you're saying, and I know them better than you" or "As an expert in blah, I can tell you that other experts broadly hold my position, unfortunately I don't have any quick links to show that.")
6. They make it uncomfortable or costly for you to be part of the conversation. ("My interlocutor is showing by their arguments that they are a person of bad character" or other ways to make the social situation costly, such as bringing someone you have other conflicts with into the conversation.)
7. They simply do not show up to argue, and select their social environment to be people who won't challenge them on this. ("I of course love open discourse and wish you all the best but I am simply very busy" or "It's generally not appropriate to debate whether someone should quit their marriage / job / country / etc, that's a private matter" or more often you don't know their name and you don't know what they're up to and you never even know they exist.)

The last one is of course the most common one.

I'm interested to hear others in the replies!

Replies from: cubefox, Viliam, nathan-helm-burger

↑ comment by cubefox · 2024-09-02T18:20:32.572Z · LW(p) · GW(p)

Schopenhauer's sarcastic essay The Art of Being Right (a manuscript published posthumously) goes in this direction. In it he suggests 38 rhetorical strategies to win a dispute by any means possible. E.g. your 3 corresponds to his 31, your 5 is similar to his 30, and your 7 is similar to his 18. Though he isn't just focusing on avoiding losing arguments, but on winning them.

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2024-09-02T21:45:02.128Z · LW(p) · GW(p)

It's quite funny, thanks for the link!

Quoting from number 31:

You may also, should it be necessary, not only twist your authorities, but actually falsify them, or quote something which you have invented entirely yourself. As a rule, your opponent has no books at hand, and could not use them if he had. The finest illustration of this is furnished by the French curé, who, to avoid being compelled, like other citizens, to pave the street in front of his house, quoted a saying which he described as biblical: paveant illi, ego non pavebo. That was quite enough for the municipal officers.

Of course it's harder now given that people do have the internet at-hand, but I think I still see this tactic employed.

↑ comment by Viliam · 2024-09-02T10:12:32.255Z · LW(p) · GW(p)

more often you don't know their name and you don't know what they're up to and you never even know they exist

Yeah, there are 8 billion people like that, so that's probably the most frequent case.

I think the list of "things people sometimes do instead of losing arguments" would be very long. Did you pick these 7 because they are most frequent (in your bubble) or just because they irritate you most?

I can add an example of a thing that irritates me (but maybe that's just 1 person I talked to too much recently); it's when after explaining some crucial fact that was missing in their 'edgy' perspective, the person suddenly goes: "and why is this topic so important to you? why do you care so much?" as if it's weird to know facts about something, but just a while ago it was okay for them to spend fifteen minutes talking platitudes about it.

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2024-09-02T21:41:11.273Z · LW(p) · GW(p)

I just picked the first ones that came to mind, so I guess they're probably the ones I run into most frequently.

↑ comment by Nathan Helm-Burger (nathan-helm-burger) · 2024-09-05T20:02:54.088Z · LW(p) · GW(p)

Worth keeping in mind is that seeing one of these things happen doesn't prove that what is going on is that subconsciously or consciously they believe they are losing the argument. It also shouldn't be taken as evidence from the universe that your argument is correct.

People are fallible and run on noisy hacky biological computers.

So yeah, it's worth noticing these sorts of argument breakdowns, but also worth being careful about reading too much into them.

Here's a hypothetical:

Suppose instead that the person you are arguing with is very knowledgeable about specific topic X. You have dabbled in X and know more than the average layperson, but lack a deep knowledge of the subject. The two of you have a disagreement about X.

The two of you begin arguing about it, but after a few minutes it becomes clear to them that you don't have a deep knowledge of the subject. They think through what they'd need to teach you in order for you to make a valid argument which would potentially change their mind. They estimate it would require at least 6 hours of lecturing, maybe divided up as 3 lectures, with lots of dense technical sources assigned as prerequisite reading before each lecture. They realize that that investment of time and energy, even if you agreed that you needed it and committed to wholeheartedly tackling it, wouldn't be worth it to them in order to continue the argument. They give up on you as a conversational partner on the subject of X, and seek a convenient social out to end the conversation as soon as possible.

Here's another hypothetical:

Someone begins having an argument with you, on a topic which for you is mostly emotionally neutral. You are arguing politely for your point of view in a fair and objective way. They, however, do not have an emotionally neutral stance on this. They start out arguing in a fair and objective way, thinking rationally and logically about each point. Soon however, the debate touches on deep powerful feelings they have, and (consciously or unconsciously) their ability to think objectively and logically about the subject becomes impaired by their emotional associations.

At some point, they realize that they are emotionally impaired. Like many people in such a situation, they don't feel comfortable admitting this. Their emotional response feels like a vulnerability. They don't at all think that they are wrong or losing the argument, they just don't think that they're going to be able to continue sharing facts and updating your worldview long enough for you to come to agreement with them. The discussion has simply become too uncomfortable for them to continue. They search for a social 'out' in order to end the conversation, and plan to never discuss this with you again, while remaining completely convinced that they are in the right.

In either of these hypotheticals, you might well be correct! Or wrong! They work either way!

So while I agree that 'deciding that they are (or might be) wrong but not wanting to change their position' is a possible motivating factor for these evasive behaviors, it isn't the only possible source of motivation. Nor is it exclusive; many motivations may co-exist within a given situation. I do agree that it's probably a non-zero factor in many cases.

comment by Ben Pace (Benito) · 2019-12-02T00:06:15.765Z · LW(p) · GW(p)

Good posts you might want to nominate in the 2018 Review

I'm on track to nominate around 30 posts from 2018, which is a lot. Here is a list of about 30 further posts I looked at that I think were pretty good but didn't make my top list, in the hopes that others who did get value out of the posts will nominate their favourites. Each post has a note I wrote down for myself about the post.

Reasons compute may not drive AI capabilities growth [LW · GW]
- I don’t know if it’s good, but I’d like it to be reviewed to find out.
The Principled-Intelligence Hypothesis [LW · GW]
- Very interesting hypothesis generation. Unless it’s clearly falsified, I’d like to see it get built on.
Will AI See Sudden Progress? [LW · GW] DONE
- I think this post should be considered paired with Paul’s almost-identical post. It’s all exactly one conversation.
Personal Relationships with Goodness [LW · GW]
- This felt like a clear analysis of an idea and coming up with some hypotheses. I don’t think the hypotheses really capture what’s going on, and most of the frames here seem like they’ve caused a lot of people to do a lot of hurt to themselves, but it seemed like progress in that conversation.
Are ethical asymmetries from property rights? [LW · GW]
- Again, another very interesting hypothesis.
Incorrect Hypotheses Point to Correct Observations [LW · GW] ONE NOMINATION
- This seems to me like it is close to an important point but not quite saying it. I don’t know if I got anything especially new from its framing, but its examples are pretty good.
Whose reasoning can you rely on when your own is faulty? [LW · GW]
- I really like the questions, and should ask them more about the people I know.
Inconvenience is Qualitatively Bad [LW · GW] ONE NOMINATION
- I think that the OP is an important idea. I think my comment on it is pretty good (and the discussion below it), though I’ve substantially changed my position since then, and should write up my new worldview once my life calms down. I don’t think I should nominate it because I’m a major part of the discussion.
The abruptness of nuclear weapons [LW · GW]
- Clearly valuable historical case, simple effect model.
Book Review: Pearl’s Book of Why [LW · GW]
- Why science has made a taboo of causality feels like a really important question to answer when figuring out how much to trust academia and how to make institutions that successfully make scientific progress, and this post suggests some interesting hypotheses.
Functional Institutions Are the Exception [LW · GW] ONE NOMINATION
- Was a long meditation on an important idea, that I’ve found valuable to read. Agree with commenter that it’s sorely lacking in examples however.
Strategies of Personal Growth [LW · GW] ONE NOMINATION
- Oli curated it, he should consider nominating and saying what he found useful. It all seemed good but I didn’t personally get much from it.
Preliminary Thoughts on Moral Weight [LW · GW] DONE
- A bunch of novel hypotheses about consciousness in different animals that I’d never heard before, which seem really useful for thinking about the topic.
Theories of Pain [LW · GW]
- I thought this was a really impressive post, going around and building simple models of lots of different theories, and giving a bit of personal experience with the practitioners of the theories. It was systematic and goal oriented and clear.
Clarifying “AI Alignment” [LW · GW] DONE
- Rohin’s comment is the best part of this post, not sure how best to nominate it.
Norms of Membership for Voluntary Groups [LW · GW]
- There’s a deep problem here of figuring out norms in novel and weird and ambiguous environments in the modern world, especially given the internet, and this post is kind of like a detailed, empirical study of some standard clusters of norms, which I think is very helpful.
How Old is Smallpox? [LW · GW]
- Central example of “Things we learned on LessWrong in 2018”. Should be revised though.
“Cheat to Win”: Engineering Positive Social Feedback [LW · GW]
- Feels like a clear example of a larger and important strategy about changing the incentives on you. Not clear how valuable the post is alone, but I like it a lot.
Track-Back Meditation [LW · GW]
- I don’t know why, but I think about this post a lot.
Meditations on Momentum [LW · GW]
- I feel like this does a lot of good intuition building work, and I think about this post from time to time in my own life. I think that Jan brought up some good points in the comments about not wanting to cause confusion about different technical concepts all being the same, so I’d like to see the examples reviewed to check they’re all about attachment effects and not conflating different effects.
On Exact Mathematical Formulae [LW · GW]
- This makes a really important point anyone learns in the study of mathematics, and I think is generally an important distinction to have understood between language and reality. Just because we have words for some things doesn’t make them more real than things we don’t have words for. The point is to look at reality, not to look at the words.
Recommendations vs Guidelines [LW · GW]
- I think back to this occasionally. Seems like quite a useful distinction, and maybe we should try to encourage people to make more guidelines. Maybe we should build a wiki and have a page type ‘guideline’ where people contribute to make great guidelines.
On the Chatham House Rule [LW · GW] DONE
- This is one of the first posts that impressed upon me the deeply tangled difficulties of information security, something I’ve increasingly thought a lot about in the intervening years, and expect to think about even more in the future.
Different types (not sizes!) of infinity [LW · GW]
- Some important conceptual work fundamental to mathematics. Very short and insightful. Not sure if I should allow this though, because if I do am I just allowing all high-level discussion of math to be included?
Expected Pain Parameters [LW · GW]
- It feels like useful advice and potentially a valuable observation with which to view a deeper problem. But unclear on the last one, and not sure if this post should be nominated just on the first alone.
Research: Rescuers During the Holocaust [LW · GW] DONE
- The most interesting part about this is the claim that most people who housed Jews during the Holocaust did it because the social situation made the moral decision very explicit and they felt they only had one possible choice, not because they were proactive moral planners. I would like to see an independent review of this info.
Lessons from the cold war on information hazards: why internal communication is critical [LW · GW] DONE
- Seems like an important historical lesson.
Problem Solving With Crayons and Mazes [LW · GW]
- I didn’t find it very useful. Oli was excited when he curated it, should poke him to consider nominating it.
Insights from “Strategy of Conflict” [LW · GW]
- Seems helpful but weird to nominate, as the book is short and this post explicitly doesn't contain all the key ideas in the book. I did learn from this that having lots of nukes is more stable than having a small number, and this has stuck with me.
The Bat and Ball Problem Revisited [LW · GW] DONE
- A curiosity-driven walk through what’s going on with the bat-and-the-ball problem by Kahneman.
Good Samaritans in Experiments [LW · GW]
- A highly opinionated and very engaging criticism of a study.
Hammertime Final Exam [LW · GW] ONE NOMINATION
- Was great, I’m not actually sure whether it fits into this review process?
Naming the Nameless [LW · GW] DONE
- Some people seemed to get a lot out of this, but I haven’t had the time to engage with it much.
- Actually, just re-read it, and it's brilliant, and one of the best 5-10 of the year. Will nominate it myself if nobody else does.
How did academia ensure papers were correct in the early 20th century? [LW · GW] DONE
- I’m glad I put this down in writing. I found it useful myself. But others should figure out whether to nominate.
Competitive Markets as Distributed Backdrop [LW · GW] DONE
- I felt great about it when I read this post last time. I’ve not given it a careful re-read, would like to see it reviewed, but I think it’s likely I’ll rediscover it’s a very helpful abstraction.

AI alignment posts you might want to nominate

[Edit: On reflection, I think that the Alignment posts that do not also have implications for human rationality aren't important to go through this review process, and we'll likely create another way to review that stuff and make it into books.]

There was also a lot of top-notch AI alignment writing, but I mostly don’t feel well-placed to nominate it. I hope others can look through and nominate selections from these.

Rohin’s sequence [? · GW]
Paul’s sequence [? · GW]
Abram’s writing that year [LW · GW]
Wei Dai’s writing that year [LW · GW]
Scott’s “Fixed Point Discussion [LW · GW]”
- Feels like one of the few practical posts that can help a large number of people do embedded agency research, so really valuable from that perspective.
Vika’s “Discussion on the Machine Learning Approach to AI safety [LW · GW]”

comment by Ben Pace (Benito) · 2019-08-27T03:33:29.755Z · LW(p) · GW(p)

I was just re-reading the classic paper Artificial Intelligence as a Positive and Negative Factor in Global Risk. It's surprising how well it holds up. The following quotes seem especially relevant 13 years later.

On the difference between AI research speed and AI capabilities speed:

The first moral is that confusing the speed of AI research with the speed of a real AI once built is like confusing the speed of physics research with the speed of nuclear reactions. It mixes up the map with the territory. It took years to get that first pile built, by a small group of physicists who didn’t generate much in the way of press releases. But, once the pile was built, interesting things happened on the timescale of nuclear interactions, not the timescale of human discourse. In the nuclear domain, elementary interactions happen much faster than human neurons fire. Much the same may be said of transistors.

On neural networks:

The field of AI has techniques, such as neural networks and evolutionary programming, which have grown in power with the slow tweaking of decades. But neural networks are opaque—the user has no idea how the neural net is making its decisions—and cannot easily be rendered unopaque; the people who invented and polished neural networks were not thinking about the long-term problems of Friendly AI. Evolutionary programming (EP) is stochastic, and does not precisely preserve the optimization target in the generated code; EP gives you code that does what you ask, most of the time, under the tested circumstances, but the code may also do something else on the side. EP is a powerful, still maturing technique that is intrinsically unsuited to the demands of Friendly AI. Friendly AI, as I have proposed it, requires repeated cycles of recursive self-improvement that precisely preserve a stable optimization target.

On funding in the AI Alignment landscape:

If tomorrow the Bill and Melinda Gates Foundation allocated a hundred million dollars of grant money for the study of Friendly AI, then a thousand scientists would at once begin to rewrite their grant proposals to make them appear relevant to Friendly AI. But they would not be genuinely interested in the problem—witness that they did not show curiosity before someone offered to pay them. While Artificial General Intelligence is unfashionable and Friendly AI is entirely off the radar, we can at least assume that anyone speaking about the problem is genuinely interested in it. If you throw too much money at a problem that a field is not prepared to solve, the excess money is more likely to produce anti-science than science—a mess of false solutions.

[...]

If unproven brilliant young scientists become interested in Friendly AI of their own accord, then I think it would be very much to the benefit of the human species if they could apply for a multi-year grant to study the problem full-time. Some funding for Friendly AI is needed to this effect—considerably more funding than presently exists. But I fear that in these beginning stages, a Manhattan Project would only increase the ratio of noise to signal.

This long final quote shows the security mindset applied to takeoff speeds, a theme Eliezer has often returned to since then.

Let us concede for the sake of argument that, for all we know (and it seems to me also probable in the real world) that an AI has the capability to make a sudden, sharp, large leap in intelligence. What follows from this?

First and foremost: it follows that a reaction I often hear, “We don’t need to worry about Friendly AI because we don’t yet have AI,” is misguided or downright suicidal. We cannot rely on having distant advance warning before AI is created; past technological revolutions usually did not telegraph themselves to people alive at the time, whatever was said afterward in hindsight. The mathematics and techniques of Friendly AI will not materialize from nowhere when needed; it takes years to lay firm foundations. And we need to solve the Friendly AI challenge before Artificial General Intelligence is created, not afterward; I shouldn’t even have to point this out. There will be difficulties for Friendly AI because the field of AI itself is in a state of low consensus and high entropy. But that doesn’t mean we don’t need to worry about Friendly AI. It means there will be difficulties. The two statements, sadly, are not remotely equivalent.

The possibility of sharp jumps in intelligence also implies a higher standard for Friendly AI techniques. The technique cannot assume the programmers’ ability to monitor the AI against its will, rewrite the AI against its will, bring to bear the threat of superior military force; nor may the algorithm assume that the programmers control a “reward button” which a smarter AI could wrest from the programmers; et cetera. Indeed no one should be making these assumptions to begin with. The indispensable protection is an AI that does not want to hurt you. Without the indispensable, no auxiliary defense can be regarded as safe. No system is secure that searches for ways to defeat its own security. If the AI would harm humanity in any context, you must be doing something wrong on a very deep level, laying your foundations awry. You are building a shotgun, pointing the shotgun at your foot, and pulling the trigger. You are deliberately setting into motion a created cognitive dynamic that will seek in some context to hurt you. That is the wrong behavior for the dynamic; write code that does something else instead.

For much the same reason, Friendly AI programmers should assume that the AI has total access to its own source code. If the AI wants to modify itself to be no longer Friendly, then Friendliness has already failed, at the point when the AI forms that intention. Any solution that relies on the AI not being able to modify itself must be broken in some way or other, and will still be broken even if the AI never does modify itself. I do not say it should be the only precaution, but the primary and indispensable precaution is that you choose into existence an AI that does not choose to hurt humanity.

To avoid the Giant Cheesecake Fallacy, we should note that the ability to self-improve does not imply the choice to do so. The successful exercise of Friendly AI technique might create an AI which had the potential to grow more quickly, but chose instead to grow along a slower and more manageable curve. Even so, after the AI passes the criticality threshold of potential recursive self-improvement, you are then operating in a much more dangerous regime. If Friendliness fails, the AI might decide to rush full speed ahead on self-improvement—metaphorically speaking, it would go prompt critical.

I tend to assume arbitrarily large potential jumps for intelligence because (a) this is the conservative assumption; (b) it discourages proposals based on building AI without really understanding it; and (c) large potential jumps strike me as probable-in-the-real-world. If I encountered a domain where it was conservative from a risk-management perspective to assume slow improvement of the AI, then I would demand that a plan not break down catastrophically if an AI lingers at a near-human stage for years or longer. This is not a domain over which I am willing to offer narrow confidence intervals.

[...]

I cannot perform a precise calculation using a precisely confirmed theory, but my current opinion is that sharp jumps in intelligence are possible, likely, and constitute the dominant probability. This is not a domain in which I am willing to give narrow confidence intervals, and therefore a strategy must not fail catastrophically—should not leave us worse off than before—if a sharp jump in intelligence does not materialize. But a much more serious problem is strategies visualized for slow-growing AIs, which fail catastrophically if there is a first-mover effect.

[...]

My current strategic outlook tends to focus on the difficult local scenario: The first AI must be Friendly. With the caveat that, if no sharp jumps in intelligence materialize, it should be possible to switch to a strategy for making a majority of AIs Friendly. In either case, the technical effort that went into preparing for the extreme case of a first mover should leave us better off, not worse.

comment by Ben Pace (Benito) · 2018-06-27T01:53:35.095Z · LW(p) · GW(p)

Reviews of books and films from my week with Jacob:

Films watched:

The Big Short

Review: Really fun. I liked certain elements of how it displays bad Nash equilibria in finance (I love the scene with the woman from the ratings agency - it turns out she’s just making the best of her incentives too!).
Grade: B

Spirited Away

Review: Wow. A simple story, yet entirely lacking in cliche, and so seemingly original. No cliched characters, no cliched plot twists, no cliched humour, all entirely sincere and meaningful. Didn’t really notice that it was animated (while fantastical, it never really breaks the illusion of reality for me). The few parts that made me laugh, made me laugh harder than I have in ages.
There’s a small visual scene, unacknowledged by the ongoing dialogue, between the mouse-baby and the dust-sprites which is the funniest thing I’ve seen in ages, and I had to rewind for Jacob to notice it.
I liked how by the end, the team of characters are all a different order of magnitude in size.
A delightful, well-told story.
Grade: A+

Stranger Than Fiction

Review: This is now my go-to film of someone trying something original and just failing. Filled with new ideas, but none executed well, and overall just a flop, and it really phones it in for the last 20 minutes. It does make me notice the distinct lack of originality in most other films that I’ve seen though - most don’t even try to be original like this does. B+ for effort, but D for output.
Grade: D

I Love You, Daddy

Review: A great study of fatherhood, coming of age, and honesty. This was my second watch, and I found many new things that I didn’t find the first time, about what it means to grow up and have responsibility. One moment I absolutely loved is when the Charlie Day character (who was in my opinion representing the id) was brutally honest and totally rewarded for it. I might send this one to my mum, I think she’ll get a lot out of it.
Grade: A

My Dinner With Andre

Review: Very thought-provoking. Jacob and I discussed it for a while afterward. I hope to watch it again some day. I spent 25% of the movie thinking about my own response to what was being discussed, 25% imagining how I would create my version of this film (what the content of the conversation would be), and 50% actually paying close attention to the film.
Overall I felt that both characters were good representatives of their positions, and I liked how much they stuck to their observations over their theories (they talked about what they’d seen and experienced more than they made leaky abstractions and focused on those). The main variable that was not discussed was technology. It is the agricultural and industrial revolutions that led to humans feeling so out-of-sorts in the present-day world, not any simple fact of how we socialise today that can simply be fixed / gotten out of. Nonetheless, I expect that the algorithm that Andre is running will help him gain useful insights about how to behave in the modern world. But you do have to actually interface with it and be part of it to have a real cup of coffee waiting for you in the morning, or to lift millions out of poverty.
The last line of Roger Ebert’s review of this was great. Something like: “They’re both trying to get the other to wake up and smell the coffee. Only in Willy’s case, it’s real coffee.”
Grade: B+

Books read (only parts of):

Computability and Logic (3rd edition)

I always forget basic definitions of languages and models, so a bunch of time was spent going over those. Jacob and I read half of the chapter on the non-standard numbers, to see how the constructions worked, and I just have the basics down more clearly now. Eliezer’s writings about these numbers connect more strongly to my other learning about first order logic now.
Book is super readable given the subject matter, easy to reference the concepts back to other parts of the book, and all round excellent (though it was the hardest slog on this list). Look forward to reading some more.

Modern Principles: Microeconomics (by Cowen and Tabarrok)

I’ve never read much about supply and demand curves, so it was great to go over them in detail, and how the price equilibrium is reached. We resolved many confusions, which I might end up writing up in a LW post. I especially liked learning how the equilibrium price maximises social good, but is not the maximum for either the supplier or the buyer.
It was very wordy and I’d like to read a textbook that had the goal of this level of intuitiveness, but aimed at readers assumed to have a strong math background. I don’t need paragraphs explaining how to read a 2-D graph each time one comes up.
Jacob made a good point about how the book failed to distinguish hypothesis from empirical evidence when presenting standard microeconomic theory. Just because you have the theory down doesn’t mean you should believe it corresponds to reality, but the book didn’t seem to notice the difference.
Overall pretty good. I don’t expect to read most chapters in this book, but we also looked through asymmetric information (some of which later tied into our watching of The Big Short), and there were a few others that looked exciting.
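
To make the point about the equilibrium price concrete, here is a minimal sketch of a toy market. Everything in it is an assumption for illustration and not taken from the book: the curves are linear, the four numbers are made up, and when the price is away from equilibrium the units that do trade are assumed to go to the highest-value buyers and the lowest-cost sellers.

```python
# Hypothetical linear market (all numbers are illustrative assumptions):
# the marginal buyer values unit q at A - B*q, the marginal seller's cost is C + D*q.
A, B = 10.0, 1.0  # demand intercept and slope
C, D = 2.0, 1.0   # supply intercept and slope


def surpluses(price):
    """Return (consumer, producer, total) surplus if everyone trades at `price`."""
    demanded = max(0.0, (A - price) / B)
    supplied = max(0.0, (price - C) / D)
    q = min(demanded, supplied)  # only the short side of the market can trade
    # Assumption: the q units go to the highest-value buyers and lowest-cost sellers.
    consumer = (A - price) * q - B * q * q / 2
    producer = (price - C) * q - D * q * q / 2
    return consumer, producer, consumer + producer


# Scan a grid of candidate prices between the two intercepts.
prices = [C + i * (A - C) / 800 for i in range(801)]
best_total = max(prices, key=lambda p: surpluses(p)[2])
best_consumer = max(prices, key=lambda p: surpluses(p)[0])
best_producer = max(prices, key=lambda p: surpluses(p)[1])

# Price at which quantity demanded equals quantity supplied.
equilibrium = (A * D + B * C) / (B + D)

print("equilibrium price:", equilibrium)
print("price maximising total surplus:", best_total)
print("price buyers would pick:", best_consumer)
print("price sellers would pick:", best_producer)
```

In this toy setup the total is largest at the market-clearing price, while buyers as a group would prefer a lower price and sellers a higher one, even though fewer trades happen at either. Whether real markets behave like this is exactly the hypothesis-versus-evidence question Jacob raised.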

Thinking Physics

I am in love with this book. I remember picking it up when I was about 17 and not being able to handle it at all and just flicking through to the answers - but this time, especially with Jacob, we were both able to notice when we felt we really understood something and wanted to check the answer to confirm, versus when we’d said ‘reasonable’ things but which didn’t really bottom out in our experiences of the world.

“Well, if you draw the force vectors like this, there should be a normal force of this strength, which splits up into these two basis vectors and so the ball should roll down at this speed.” “Why do you get to assume a force along the normal?” “I don’t know.” “Why do you get to break it up into two vectors that sum to the initial vector?” “I don’t know.” “Then I think we haven’t answered the question yet. Let’s think some more about our experience of balls rolling down hills.”

One of the best things about doing it with Jacob was that I often had cached answers to problems (both from studying mechanics in high school and having read the book 4 years ago), but instead, on reading a problem, I would give Jacob time to get confused about it, perhaps by supplying useful questions. Then eventually I’d propose my “Well isn’t it obviously X” answer, and Jacob would be able to point out the parts I hadn’t justified from first principles, helping me notice them. There’s a problem in discussing difficult ideas where, if people have been taught the passwords, and especially if the passwords have a certain amount of structure that feels like understanding, it’s hard to notice the gaps. Jacob helped me notice those, and then I could later come up with real answers that were correct for the right reasons.
The least good thing about this book is the answers to the problems. Often Jacob and I would come up with an answer, then scrap it and build up a first-principles model that predicted it, based on experiences that we were very confident in, and then also deconstruct the initial false intuition some. Then we’d check the answer, and we were right, but the answer didn’t really address the intuitions in either direction, just gave a (correct) argument for the (correct) solution.

I think it might be really valuable to fully deconstruct the intuition behind why people expect a heavier object to fall faster. I’ve made some progress, but it feels like this is a neglected problem of learning a new field - explaining not only what intuitions you should have, but understanding why you assumed something different.

But the value of the book isn’t the answers - it’s the problems. I’ve never experienced such a coherent set of problems, where you can solve each from first principles (and building off what you’ve learned from the previous problems). With most good books, the more you put in the more you get out, but never have I seen a book where you can get this much out of it by putting so much in (most books normally hit a plateau earlier than this one).
Anyway, we got maybe 1/10th through the book. I can’t wait to work through this more the next time I see Jacob.
It’s already affected our discussions of other topics, how well we notice what we do and don’t understand, and what sorts of explanations we look for.
I’m also tempted, for other things I study, to spend less time writing up the insights and instead spend that time coming up with a problem set that you can solve from first principles.

This book made me think that the natural state of learning isn’t ‘reading’ but ‘play’. Playing with ideas, equations, problems, rather than reading and checking understanding.

Jacob and I now have a ritual of continuing the tradition of trying to understand the world, by going to places in Oxford where great thinkers have learned about the universe, and then solving a problem in this book. We visited a square in Magdalen College where Schroedinger worked on his great works, and solved some problems there.
You only get to read this book once. Use it well.

Hanging out with Jacob:

Grade: A++, would do again in a heartbeat.

comment by Ben Pace (Benito) · 2023-04-05T18:55:10.206Z · LW(p) · GW(p)

"Slow takeoff" at this point is simply a misnomer.

Paul's position should be called "Fast Takeoff" and Eliezer's position should be called "Discontinuous Takeoff".

Replies from: Vladimir_Nesov, lc

↑ comment by Vladimir_Nesov · 2023-04-05T20:14:52.859Z · LW(p) · GW(p)

Slow takeoff doesn't imply absence of discontinuous takeoff a bit later, it just says that FOOM doesn't happen right away and thus there is large AI impact (which is to say, things are happening fast) even pre-FOOM, if it ever happens.

↑ comment by lc · 2023-04-05T19:04:33.270Z · LW(p) · GW(p)

Why not drop "Fast vs Slow" entirely and just use "continuous" vs. "discontinuous" takeoff to refer to the two ideas?

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2023-04-05T19:12:35.808Z · LW(p) · GW(p)

I guess it helps remind everyone that both positions are relatively extreme compared to how most other people have been expecting that the future will go. But continuous vs discontinuous also seems pretty helpful.

comment by Ben Pace (Benito) · 2023-11-24T06:49:45.761Z · LW(p) · GW(p)

I don't normally just write up takes, especially about current events, but here's something that I think is potentially crucially relevant to the dynamics involved in the recent actions of the OpenAI board, and that I haven't seen anyone talk about:

The four members of the board who did the firing do not know each other very well.

Most boards meet a few times per year, for a couple of hours. Only Sutskever works at OpenAI. D'Angelo has worked in senior roles at tech companies like Facebook and Quora, Toner is in EA/policy, and McCauley has worked at other tech companies (I'm not aware of any overlap with D'Angelo).

It's plausible to me that McCauley and Toner have spent more than 50 hours in each other's company, but overall I'd probably be willing to bet at even odds that no other pair of them had spent more than 10 hours together before this crisis.

This is probably a key factor in why they haven't written more publicly about their decision. Decision-by-committee is famously terrible, and it seems pretty likely to me that everyone pushes back hard on anything unilateral by the others in this high-tension scenario. So any writing representing them has to get consensus, and they're too focused on firefighting and getting a new CEO to spend time iterating on an explanation of their reasoning that they can all get behind. That's why Sutskever's public writing is only speaking for himself (he just says that he regrets the decision; he's said nothing about why, and nothing that in principle speaks for the others).

I think this also predicts that Shear getting involved, and being the only direct counterparty that they must collectively and repeatedly work something out with, improved things. (Which accounts I've read suggest was a turning point in the negotiations.) He's the first person that they are all engaged with and need to make things work out with, so he is in a position where they are forced to get consensus in a timely fashion, and he can actually demand specific things of them. This was a forcing function on them making decisions and continuing to communicate with an individual.

It's standard to expect them to prepare a proper explanation in advance, but from the information in this comment [LW(p) · GW(p)], I believe this firing decision was made within just a couple of days of the event. A fast decision may have been the wrong call, but once it happened, a team whose members don't really know each other was thrust into an extremely high-stakes position and had to make decisions by consensus. My guess is that this was really truly quite difficult and it was very hard to get anything done at all.

This lens on the situation makes me update in the direction that they will eventually talk about why, once they've had time to iterate on the text explaining the reasoning, now that the basic function of the company isn't under fire.

My current guess is that in many ways, a lot of the board's decision-making since the firing has been worse than any individual's on the board would've been had they been working alone.

Replies from: Benito, Benito, Benito, Benito, Benito, Benito

↑ comment by Ben Pace (Benito) · 2023-11-26T21:57:27.815Z · LW(p) · GW(p)

In this mess, Altman and Helen should not be held to the same ethical standards, because I believe one of them has been given a powerful career in substantial part based on her commitments to higher ethical standards (a movement that prided itself on openness and transparency and trying to do the most good).

If Altman played deceptive strategies, and insofar as Helen played back the same deceptive strategies as Altman, then she did not honor the EA name.

(The name has a lot of dirt on it these days already, but still. It is a name that used to mean something back when it gave her power.)

Insofar as you got a position specifically because you were affiliated with a movement claiming to be good and open and honest and to have unusually high moral standards, and then when you arrive you become a standard political player, that's disingenuous.

Replies from: ryan_greenblatt

↑ comment by ryan_greenblatt · 2023-11-26T23:33:51.297Z · LW(p) · GW(p)

because I believe [Helen] has been given a powerful career in substantial part based on her commitments to higher ethical standards [...] then she did not honor the EA name. [...] Insofar as you got a position specifically because you were affiliated with a movement claiming to be good and open and honest and to have unusually high moral standards, and then when you arrive you become a standard political player, that's disingenuous.

I think Holden being added to the board shouldn't be mostly attributed to his affiliation with EA. And the Helen board seat is originally from this.

(The relevant history here is that this is the OpenAI grant that resulted in a board seat while here is a post from just earlier about Holden's takes on EA.)

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2023-11-27T00:15:31.440Z · LW(p) · GW(p)

Some historical context

Holden in 2013 on the GiveWell blog:

We’re proud to be part of the nascent “effective altruist” movement. Effective altruism has been discussed elsewhere (see Peter Singer’s TED talk and Wikipedia); this post gives our take on what it is and isn’t.

Holden in 2015 on the EA Forum [EA · GW] (talking about GiveWell Labs, which grew into OpenPhil):

We're excited about effective altruism, and we think of GiveWell as an effective altruist organization (while knowing that this term is subject to multiple interpretations, not all of which apply to us).

Holden in April 2016 about plans for working on AI:

Potential risks from advanced artificial intelligence will be a major priority for 2016. Not only will Daniel Dewey be working on this cause full-time, but Nick Beckstead and I will both be putting significant time into it as well. Some other staff will be contributing smaller amounts of time as appropriate.

(Dewey who IIRC had worked at FHI and CEA ahead of this, and Beckstead from FHI.)

Holden in 2016 about why they're making potential risks from advanced AI a priority:

I believe the Open Philanthropy Project is unusually well-positioned from this perspective:
We are well-connected in the effective altruism community, which includes many of the people and organizations that have been most active in analyzing and raising awareness of potential risks from advanced artificial intelligence. For example, Daniel Dewey has previously worked at the Future of Humanity Institute and the Future of Life Institute, and has been a research associate with the Machine Intelligence Research Institute.

Holden about the OpenAI grant in 2017:

This grant initiates a partnership between the Open Philanthropy Project and OpenAI, in which Holden Karnofsky (Open Philanthropy’s Executive Director, “Holden” throughout this page) will join OpenAI’s Board of Directors and, jointly with one other Board member, oversee OpenAI’s safety and governance work.
OpenAI initially approached Open Philanthropy about potential funding for safety research, and we responded with the proposal for this grant. Subsequent discussions included visits to OpenAI’s office, conversations with OpenAI’s leadership, and discussions with a number of other organizations (including safety-focused organizations and AI labs), as well as with our technical advisors.

As a negative datapoint: I looked through a bunch of the media articles linked at the bottom of this GiveWell page, and most of them do not mention Effective Altruism, only effective giving / cost-effectiveness. So their Effective Altruist identity has had less awareness amongst folks who primarily know of Open Philanthropy through their media appearances.

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2023-11-27T00:34:07.871Z · LW(p) · GW(p)

I think this is accurately described as "an EA organization got a board seat at OpenAI", and the actions of those board members reflect directly on EA (whether internally or externally).

Why did OpenAI come to trust Holden with this position of power? My guess is Holden and Dustin's personal reputations were substantial effects here, along with Open Philanthropy's major funding source, but also that many involved people's excitement about and respect for the EA movement were a relevant factor in OpenAI wanting to partner with Open Philanthropy, and that Helen's and Tasha's actions have directly and negatively reflected on how the EA ecosystem is viewed by OpenAI leadership.

There's a separate question about why Holden picked Helen Toner and Tasha McCauley, and to what extent they were given power in the world by the EA ecosystem. It seems clear that these people have gotten power through their participation in the EA ecosystem (as OpenPhil is an EA institution), and to the extent that the EA ecosystem advertises itself as more moral than other places, if they executed the standard level of deceptive strategies that others in the tech industry would in their shoes, then that was false messaging.

↑ comment by Ben Pace (Benito) · 2023-11-24T06:57:25.780Z · LW(p) · GW(p)

I'm not quite sure in the above comment how to balance between "this seems to me like it could explain a lot" and also "might just be factually false". So I guess I'm leaving this comment, lampshading it.

↑ comment by Ben Pace (Benito) · 2023-11-25T02:29:11.005Z · LW(p) · GW(p)

The most important thing right now: I still don't know why they chose to fire Altman, and especially why they chose to do it so quickly.

That's an exceedingly costly choice to make (i.e. with the speed of it), and so when I start to speculate on why, I only come up with commensurately worrying states of affairs, e.g. he did something egregious enough to warrant it, or he didn't and the board acted with great hostility.

Them going back on their decision is Bayesian evidence for the latter: if he'd done something egregious, they'd just be able to tell relevant folks, and Altman wouldn't get his job back.

So many people are asking this (e.g. everyone at the company). I'll be very worried if the reason doesn't come out.

↑ comment by Ben Pace (Benito) · 2023-11-24T07:06:55.318Z · LW(p) · GW(p)

In brief: I'm saying that once you condition on:

1. The board decided the firing was urgent.
2. The board does not know each other very well and defaults to making decisions by consensus.
3. The board is immediately in a high-stakes high-stress situation.

Then you naturally get

4. The board fails to come to consensus on public comms about the decision.

↑ comment by Ben Pace (Benito) · 2023-11-25T02:09:57.264Z · LW(p) · GW(p)

Also, I don't know that I've said this, but from reading enough of his public tweets, I had blocked Sam Altman long ago. He seemed very political in how he used speech, and so I didn't want to include him in my direct memetic sphere.

As a small pointer to why: he would commonly choose not to share object-level information about something, but instead share how he thought social reality should change. I think I recall him saying that the social consensus was wrong about fusion energy, and pushing for it to move in a specific direction; he did this rather than just plainly say what his object-level beliefs about fusion were, or offer a particular counter-argument to an argument that was going around.

It's been a year or two since I blocked him, so I don't recall more specifics, but it seemed worth mentioning, as a datapoint for folks to include in their character assessments.

↑ comment by Ben Pace (Benito) · 2023-11-24T07:20:37.671Z · LW(p) · GW(p)

My current guess is that most of the variance in what happened is explained by a board where 3 out of 4 people don't know the dynamics of upper management in a multi-billion dollar company, where the board don't know each other well, and (for some reason) the decision was made very suddenly. Pretty low expectations given that situation. Seems like Shear was a pretty great replacement to get given the hand dealt. Assuming that they had legit reason to fire the CEO, it's probably primarily through lack of skill and competence that they failed, more so than as a result of Altman's superior deal-making skill and leadership abilities (though that was what finished it off).

comment by Ben Pace (Benito) · 2020-01-01T16:54:13.490Z · LW(p) · GW(p)

There's a game for the Oculus Quest (that you can also buy on Steam) called "Keep Talking And Nobody Explodes".

It's a two-player game. When playing with the VR headset, one of you wears the headset and has to defuse bombs in a limited amount of time (either 3, 4 or 5 mins), while the other person sits outside the headset with the bomb-defusal manual and tells you what to do. Whereas with other collaboration games you're all looking at the screen together, with this game the substrate of communication is solely conversation: the other person is providing all of your inputs about how their half is going (i.e. not shown on a screen).

The types of puzzles are fairly straightforward computational problems but with lots of fiddly instructions, and require the outer person to figure out what information they need from the inner person. It often involves things like counting numbers of wires of a certain colour, or remembering the previous digits that were being shown, or quickly describing symbols that are not any known letter or shape.

So the game trains you and a partner in efficiently building a shared language for dealing with new problems.

More than that, as the game gets harder, often some of the puzzles require substantial independent computation from the player on the outside. At this point, it can make sense to play with more than two people, and start practising methods for assigning computational work between the outer people (e.g. one of them works on defusing the first part of the bomb, and while they're computing in their head for ~40 seconds, the other works on defusing the second part of the bomb in dialogue with the person on the inside). This further creates a system which trains the ability to efficiently coordinate on informational work under pressure.

Overall I think it's a pretty great game for learning and practising a number of high-pressure communication skills with people you're close to.

Replies from: mr-hire, gworley, ioannes_shade

↑ comment by Matt Goldenberg (mr-hire) · 2020-01-01T23:03:31.859Z · LW(p) · GW(p)

There's a similar free game for Android and iOS called Space Team that I highly recommend.

↑ comment by Gordon Seidoh Worley (gworley) · 2020-01-02T21:53:49.922Z · LW(p) · GW(p)

I use both this game and Space Team as part of training people in the on-call rotation at my company. They generally report that it's fun, and I love it because it usually creates the kind of high-pressure feelings in people they may experience when on-call, so it creates a nice, safe environment for them to become more familiar with those feelings and how to work through them.

On a related note, I'm generally interested in finding more cooperative games with asymmetric information and a need to communicate. Lots of games meet one or two of those criteria, but very few games are able to meet all simultaneously. For example, Hanabi is cooperative and asymmetric, but lacks much communication (you're not allowed to talk), and many games are asymmetric and communicative but not cooperative (Werewolf, Secret Hitler, etc.) or cooperative and communicative but not asymmetric (Pandemic, Forbidden Desert, etc.).

↑ comment by ioannes (ioannes_shade) · 2020-01-01T17:05:40.368Z · LW(p) · GW(p)

+1 – this game is great.

It's really good with 3-4 people giving instructions and one person in the hot seat. Great for team bonding.

comment by Ben Pace (Benito) · 2019-07-13T01:42:09.479Z · LW(p) · GW(p)

I talked with Ray for an hour about Ray's phrase "Keep your beliefs cruxy and your frames explicit".

I focused mostly on the 'keep your frames explicit' part. Ray gave a toy example of someone attempting to communicate something deeply emotional/intuitive, or perhaps a Buddhist approach to the world, and how difficult it is to do this with simple explicit language. It often instead requires the other person to go off and seek certain experiences, or practise inhabiting those experiences (e.g. doing a little meditation, or getting in touch with their emotion of anger).

Ray's motivation was that people often have these very different frames or approaches, but don't recognise this fact, and end up believing aggressive things about the other person e.g. "I guess they're just dumb" or "I guess they just don't care about other people".

I asked for examples that were motivating his belief - where it would be much better if the disagreers took to heart the recommendation to make their frames explicit. He came up with two concrete examples:

Jim v Ray on norms for shortform, where during one hour they worked through the same reasons-for-disagreement three times.
[blank] v Ruby on how much effort is required to send non-threatening signals during disagreements, where it felt like a fundamental value disagreement that they didn't know how to bridge.

---

I didn't get a strong sense for what Ray was pointing at. I see the ways that the above disagreements went wrong, where people were perhaps talking past each other / on the wrong level of the debate, and should've done something different. My understanding of Ray's advice is for the two disagreers to bring their fundamental value disagreements to the explicit level, and that both disagreers should be responsible for making their core value judgements explicit. I think this is too much of a burden to give people. Most of the reasons for my beliefs are heavily implicit and I cannot make things fully explicit ahead of time. In fact, this just seems not how humans work.

One of the key insights that Kahneman's System 1 and System 2 distinction offers is that my conscious, deliberative thinking (System 2) is a very small fraction of the work my brain is doing, even though it is the part I have the most direct access to. Most of my world-model and decision-making apparatus is in my System 1. There is an important sense in which asking me to make all of my reasoning accessible to my conscious, deliberative system is an AGI-complete request.

What in fact seems sensible to me is that during a conversation I will have a fast feedback-loop with my interlocutor, which will give me a lot of evidence about which part of my thinking to zoom in on and do the costly work of making conscious and explicit. There is great skill involved in doing this live in conversation effectively and repeatedly, and I am excited to read a LW post giving some advice like this.

That said, I also think that many people have good reasons to distrust bringing their disagreements to the explicit level, and rightfully expect it to destroy their ability to communicate. I'm thinking of Scott's epistemic learned helplessness here, but I'm also thinking about experiences where trying to crystallise and name a thought I'm having before I know how to fully express it has a negative effect on my ability to think clearly about it. I'm not sure what this is but this is another time when I feel hesitant to make everything explicit.

As a third thing, my implicit brain is better than my explicit reasoning at modelling social/political dynamics. Let me handwave at a story of a nerd attempting to negotiate with a socially-savvy bully/psychopath/person-who-just-has-different-goals, where the nerd tries to repeatedly and helpfully make all of their thinking explicit, and is confused why they're losing at the negotiation. I think even healthy and normal people have patterns around disagreement and conflict resolution that could take advantage of a socially inept individual trying to only rely on the things they can make explicit.

These three reasons lead me to not want to advise people to 'keep their frames explicit': it seems prohibitively computationally costly to do it for all things, many people should not trust their explicit reasoning to capture their implicit reasons, and this is especially true for social/political reasoning.

---

My general impression of this advice is that it seems to want to make everything explicit all of the time (a) as though that were a primitive operation that can solve all problems and (b) I have a sense that it takes up too much of my working memory when I talk with Ray. I have some sense that this approach implies a severe lack of trust in people's implicit/unconscious reasoning and only believes explicit/conscious reasoning can ever be relied upon, though that seems a bit of a simplistic narrative. (Of course there are indeed reasons to strongly trust conscious reasoning over unconscious - one cannot unconsciously build rockets that fly to the moon - but I think humans do not have the choice to not build a high-trust relationship with their unconscious mind.)

Replies from: Zvi, Raemon

↑ comment by Zvi · 2019-08-07T13:22:36.488Z · LW(p) · GW(p)

I find "keep everything explicit" to often be a power move designed to make non-explicit facts irrelevant and non-admissible. This often goes along with burden of proof. I make a claim (real example of this dynamic happening, at an unconference under Chatham House rules: that pulling people away from their existing community has real costs that hurt those communities), and I was told that, well, that seems possible, but I can point to concrete benefits of taking them away, so you need to be concrete and explicit about what those costs are, or I don't think we should consider them.

Thus, the burden of proof was put upon me, to show (1) that people central to communities were being taken away, (2) that those people being taken away hurt those communities, (3) in particular measurable ways, (4) that then would impact direct EA causes. And then we would take the magnitude of effect I could prove using only established facts and tangible reasoning, and multiply them together, to see how big this effect was.

I cooperated with this because I felt like the current estimate of this cost for this person was zero, and I could easily raise that, and that was better than nothing, but this simply is not going to get this person to understand my actual model, ever, at all, or properly update. This person is listening on one level, and that's much better than nothing, but they're not really listening curiously, or trying to figure the world out. They are holding court to see if they are blameworthy for not being forced off of their position, and doing their duty as someone who presents as listening to arguments, of allowing someone who disagrees with them to make their case under the official rules of utilitarian evidence.

Which, again, is way better than nothing! But is not the thing we're looking for, at all.

I've felt this way in conversations with Ray recently, as well. Where he's willing and eager to listen to explicit stuff, but if I want to change his mind, then (de facto) I need to do it with explicit statements backed by admissible evidence in this court. Ray's version is better, because there are ways I can at least try to point to some forms of intuition or implicit stuff, and see if it resonates, whereas in the above example, I couldn't, but it's still super rough going.

Another problem is that if you have Things One Cannot Explicitly Say Or Consider, but which one believes are important, which I think basically everyone importantly does these days, then being told to only make explicit claims makes it impossible to make many important claims. You can't both follow 'ignore unfortunate correlations and awkward facts that exist' and 'reach proper Bayesian conclusions.' The solution of 'let the considerations be implicit' isn't great, but it can often get the job done if allowed to.

My private conversations with Ben have been doing a very good job, especially recently, of doing the dig-around-for-implicit-things and make-explicit-the-exact-thing-that-needs-it jobs.

Given Ray is writing a whole sequence, I'm inclined to wait until that goes up fully before responding in long form, but there seems to be something crucial missing from the explicitness approach.

Replies from: Benito, Raemon, Raemon, Raemon

↑ comment by Ben Pace (Benito) · 2019-08-07T14:43:33.926Z · LW(p) · GW(p)

To complement that: Requiring my interlocutor to make everything explicit is also a defence against having my mind changed in ways I don't endorse but that I can't quite pick apart right now. Which kinda overlaps with your example, I think.

I sometimes will feel like my low-level associations are changing in a way I'm not sure I endorse, halt, and ask for something that the more explicit part of me reflectively endorses. If they're able to provide that, then I will willingly continue making the low-level updates, but if they can't then there's a bit of an impasse, at which point I will just start trying to communicate emotionally what feels off about it (e.g. in your example I could imagine saying "I feel some panic in my shoulders and a sense that you're trying to control my decisions"). Actually, sometimes I will just give the emotional info first. There's a lot of contextual details that lead me to figure out which one I do.

↑ comment by Raemon · 2019-08-07T22:46:00.441Z · LW(p) · GW(p)

One last bit is to keep in mind that most things (or, many things) can be power moves.

There's one failure mode, where a person sort of gives you the creeps, and you try to bring this up and people say "well, did they do anything explicitly wrong?" and you're like "no, I guess?" and then it turns out you were picking up something important about the person-giving-you-the-creeps and it would have been good if people had paid some attention to your intuition.

There's a different failure mode where "so and so gives me the creeps" is something you can say willy-nilly without ever having to back it up, and it ends up being its own power move.

I do think during politically charged conversations it's good to be able to notice and draw attention to the power-move-ness of various frames (in both/all directions)

(i.e. in the "so and so gives me the creeps" situation, it's good to note both that you can abuse "only admit explicit evidence" and "wanton claims of creepiness" in different ways. And then, having made the frame of power-move-ness explicit, talk about ways to potentially alleviate both forms of abuse)

↑ comment by Raemon · 2019-08-07T17:35:21.586Z · LW(p) · GW(p)

Want to clarify here, "explicit frames" and "explicit claims" are quite different, and it sounds here like you're mostly talking about the latter.

The point of "explicit frames" is specifically to enable this sort of conversation – most people don't even notice that they're limiting the conversation to explicit claims, or where they're assuming burden of proof lies, or whether we're having a model-building sharing of ideas or a negotiation.

Also worth noting (which I hadn't really stated, but is perhaps important enough to deserve a whole post to avoid accidental motte/bailey by myself or others down the road): My claim is that you should know what your frames are, and what would change* your mind. *Not* that you should always tell that to other people.

Ontological/Framework/Aesthetic Doublecrux is a thing you do with people you trust about deep, important disagreements where you think the right call is to open up your soul a bit (because you expect them to be symmetrically opening their soul, or that it's otherwise worth it), not something you necessarily do with every person you disagree with (especially when you suspect their underlying framework is more like a negotiation or threat than honest, mutual model-sharing)

*also, not saying you should ask "what would change my mind" as soon as you bump into someone who disagrees with you. Reflexively doing that also opens yourself up to power moves, intentional or otherwise. Just that I expect it to be useful on the margin.

Replies from: Zvi

↑ comment by Zvi · 2019-08-07T20:48:41.165Z · LW(p) · GW(p)

Interesting. It seemed in the above exchanges like both Ben and you were acting as if this was a request to make your frames explicit to the other person, rather than a request to know what the frame was yourself and then tell if it seemed like a good idea.

I think for now I still endorse that making my frame fully explicit even to myself is not a reasonable ask slash is effectively a request to simplify my frame in likely-to-be-unhelpful ways. But it's a lot more plausible as a hypothesis.

Replies from: Raemon

↑ comment by Raemon · 2019-08-07T21:04:07.358Z · LW(p) · GW(p)

I've mostly been operating (lately) within the paradigm of "there does in fact seem to be enough trust for a doublecrux, and it seems like doublecrux is actually the right move given the state of the conversation. Within that situation, making things as explicit as possible seems good to me." (But, this seems importantly only true within that situation)

But it also seemed like both Ben (and you) were hearing me make a more aggressive ask than I meant to be making (which implies some kind of mistake on my part, but I'm not sure which one). The things I meant to be taking as a given are:

1) Everyone has all kinds of implicit stuff going on that's difficult to articulate. The naively Straw Vulcan failure mode is to assume that if you can't articulate it it's not real.

2) I think there are skills to figuring out how to make implicit stuff explicit, in a careful way that doesn't steamroll your implicit internals.

3) Resolving serious disagreements requires figuring out how to bridge the gap of implicit knowledge. (I agree that in a single-pair doublecrux, doing the sort of thing you mention in the other comment can work fine, where you try to paint a picture and ask them questions to see if they got the picture. But, if you want more than one person to be able to understand the thing you'll eventually probably want to figure out how to make it explicit without simplifying it so hard that it loses its meaning)

4) The additional, not-quite-stated claim is "I nowadays seem to keep finding myself in situations where there's enough longstanding serious disagreements that are worth resolving that it's worth Stag Hunting on Learning to Make Beliefs Cruxy and Frames Explicit, to facilitate those conversations."

I think maybe the phrase "*keep* your beliefs cruxy and frames explicit" implied more of an action of "only permit some things" rather than "learn to find extra explicitness on the margin when possible."

↑ comment by Raemon · 2019-08-07T18:33:50.855Z · LW(p) · GW(p)

As far as explicit claims go: My current belief is something like:

If you actually want to communicate an implicit idea to someone else, you either need

1) to figure out how to make the implicit explicit, or

2) you need to figure out the skill of communicating implicit things implicitly... which I think actually can be done. But I don't know how to do it and it seems hella hard. (Circling seems to work via imparting some classes of implicit things implicitly, but depends on being in-person)

My point is not at all to limit oneself to explicit things, but to learn how to make implicit things explicit (or, otherwise communicable). This is important because the default state often seems to be failing to communicate at all.

(But it does seem like an important, related point that trying to push for this ends up sounding very similar, from the outside, to 'only explicit evidence is admissible', which is a fair thing to have an instinctive resistance to)

But, the fact that this is real hard is because the underlying communication is real hard. And I think there's some kind of grieving necessary to accept the fact that "man, why can't they just understand my implicit things that seem real obvious to me?" and, I dunno, they just can't. :/

Replies from: Zvi

↑ comment by Zvi · 2019-08-07T20:54:21.082Z · LW(p) · GW(p)

Agreed it's a learned skill and it's hard. I think it's also just necessary. I notice that the best conversations I have about difficult to describe things definitely don't involve making everything explicit, and they involve a lot of 'do you understand what I'm saying?' and 'tell me if this resonates' and 'I'm thinking out loud, but maybe'.

And then I have insights that I find helpful, and I can't figure out how to write them up, because they'd need to be explicit, and they aren't, so damn. Or even, I try to have a conversation with someone else (in some recent cases, you) and share these types of things, and it feels like I have zero idea how to get into a frame where any of it will make any sense or carry any weight, even when the other person is willing to listen by even what would normally be strong standards.

Sometimes this turns into a post or sequence that ends up explaining some of the thing? I dunno.

Replies from: Raemon

↑ comment by Raemon · 2019-08-07T20:58:59.836Z · LW(p) · GW(p)

FWIW, upcoming posts I have in the queue are:

Noticing Frame Differences
Tacit and Explicit Knowledge
Backpropagating Facts into Aesthetics
Keeping Frames Explicit

(Possibly, in light of this conversation, adding a post called something like "Be secretly explicit [on the margin]")

↑ comment by Raemon · 2019-07-13T02:12:48.472Z · LW(p) · GW(p)

I'd been working on a sequence explaining this all in more detail (I think there's a lot of moving parts and inferential distance to cover here). I'll mostly respond in the form of "finish that sequence."

But here's a quick paragraph that more fully expands what I actually believe:

If you're building a product [LW · GW] with someone (metaphorical product or literal product), and you find yourself disagreeing, and you explain "This is important because X, which implies Y", and they say "What!? But, A, therefore B!" and then you both keep repeating those points over and over... you're going to waste a lot of time, and possibly build a confused Frankenstein product that's less effective than if you could figure out how to successfully communicate.

In that situation, I claim you should be doing something different, if you want to build a product that's actually good.
If you're not building a product, this is less obviously important. If you're just arguing for fun, I dunno, keep at it I guess.

A separate, further claim is that the reason you're miscommunicating is that you have a bunch of hidden assumptions in your belief-network, or the frames that underlie your belief network. I think you will continue to disagree and waste effort until you figure out how to make those hidden assumptions explicit.

You don't have to rush that process. Take your time to mull over your beliefs, do focusing or whatever helps you tease out the hidden assumptions without accidentally crystallizing them wrong.

Meanwhile, you can reference the fact that the differing assumptions exist by giving them placeholder names like "the sparkly pink purple ball thing".

This isn't an "obligation" I think people should have. But I think it's a law-of-the-universe that if you don't do this, your group will waste time and/or your product will be worse.

(Lots of companies successfully build products without dealing with this, so I'm not at all claiming you'll fail. And meanwhile there's lots of other tradeoffs your company might be making that are bad and should be improved, and I'm not confident this is the most important thing to be working on)
But among rationalists, who are trying to improve their rationality while building products together, I think resolving this issue should be a high priority, which will pay for itself pretty quickly.

Thirdly: I claim there is a skill to building up a model of your beliefs, and your cruxes for those beliefs, and the frames that underlie your beliefs... such that you can make normally implicit things explicit in advance. (Or, at least, every time you disagree with someone about one of your beliefs, you automatically flag what the crux for the belief was, and then keep track of it for future reference). So, by the time you get to a heated disagreement, you already have some sense of what sort of things would change your mind, and why you formed the beliefs you did.

You don't have to share this with others, esp. if they seem to be adversarial. But understanding it for yourself can still help you make sense of the conversation.
Relatedly, there's a skill to detecting when other people are in a different frame from you, and helping them to articulate their frame.

Literal companies building literal products can alleviate this problem by only hiring people with similar frames and beliefs, so they have an easier time communicating. But, it's
This seems important because weird, intractable conversations have shown up repeatedly...

in the EA ecosystem

(where even though people are mostly building different products, there is a shared commons that is something of a "collectively built product" that everyone has a stake in, and where billions of dollars and billions of dollars worth of reputation are at stake)

on LessWrong the website

(where everyone has a stake in a shared product of "how we have conversations together" and what truthseeking means)

on the LessWrong development team

where we are literally building a product (a website), and often have persistent, intractable disagreements about UI, minimalism, how shortform should work, whether Vulcan is a terrible shitshow of a framework that should be scrapped, etc.

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2019-07-13T02:25:26.832Z · LW(p) · GW(p)

every time you disagree with someone about one of your beliefs, you [can] automatically flag what the crux for the belief was

This is the bit that is computationally intractable.

Looking for cruxes is a healthy move, exposing the moving parts of your beliefs in a way that can lead to you learning important new info.

However, there are an incredible number of cruxes for any given belief. If I think that a hypothetical project should accelerate its development time 2x in the coming month, I could change my mind if I learn some important fact about the long-term improvements of spending the month refactoring the entire codebase; I could change my mind if I learn that the current time we spend on things is required for models of the code to propagate and become common knowledge in the staff; I could change my mind if my models of geopolitical events suggest that our industry is going to tank next week and we should get out immediately.

Replies from: Raemon

↑ comment by Raemon · 2019-07-13T07:24:28.546Z · LW(p) · GW(p)

I'm not claiming you can literally do this all the time. [Ah, an earlier draft of the previous comment emphasized that this was all "things worth pushing for on the margin", and explicitly not something you were supposed to sacrifice all other priorities for. I think I then rewrote the post and forgot to emphasize that clarification]

I'll try to write up better instructions/explanations later, but to give a rough idea of the amount of work I'm talking about: I'm saying "spend a bit more time than you normally do in 'doublecrux mode'". [This can be, like, an extra half hour sometimes when having a particularly difficult conversation].

When someone seems obviously wrong, or you seem obviously right, ask yourself "what cruxes are most loadbearing", and then:

Be mindful as you do it, to notice what mental motions you're actually performing that help. Basically, do Tuning Your Cognitive Strategies to the double crux process, to improve your feedback loop.
When you're done, cache the results. Maybe by writing it down, or maybe just sort of thinking harder about it so you remember it better.

The point is not to have fully mapped out cruxes of all your beliefs. The point is that you generally have practiced the skill of noticing what the most important cruxes are, so that a) you can do it easily, and b) you keep the results computed for later.

comment by Ben Pace (Benito) · 2023-12-07T07:07:00.408Z · LW(p) · GW(p)

For too long, I have erred on the side of writing too much.

The first reason I write is in order to find out what I think.

This often leaves my writing long and not very defensible.

However, editing the whole thing is so much extra work after I already did all the work figuring out what I think.

Sometimes it goes well if I just scrap the whole thing and concisely write my conclusion.

But typically I don't want to spend the marginal time.

Another reason my writing is too long is because I have extra thoughts I know most people won't find useful.

But I've picked up a heuristic that says it's good to share actual thinking because sometimes some people find it surprisingly useful, so I hit publish anyway.

Nonetheless, I endeavor to write shorter.

So I think I shall experiment with cutting the bits off of comments that represent me thinking aloud, but aren't worth the space in the local conversation.

And I will put them here, as the dregs of my cognition. I shall hopefully gather data over the next month or two and find out whether they are in fact worthwhile.

Replies from: adamzerner, Viliam, Yoav Ravid, Benito, lillybaeum

↑ comment by Adam Zerner (adamzerner) · 2023-12-08T07:32:03.373Z · LW(p) · GW(p)

Noooooooo! I mean this in a friendly sort of sense. Not that I'm mad or indignant or anything. Just that I'm sad to see this and suspect that it is a move in the wrong direction.

This relates to something I've been wanting to write about for a while and just never really got around to it. Now's as good a time as any to at least get started. I started a very preliminary shortform post on it here [LW(p) · GW(p)] a while ago.

Basically, think about the progression of an idea. Let's use academia as an initial example.

At some point in the timeline, an idea is deemed good enough to pursue an experiment on.
Then the results of the experiment are published.
Then people read about the results and talk about them. And the idea.
Then other people summarize the idea and the results. In other papers. In textbooks. In meta-analyses. In the newspaper. Blog posts. Pop science books. Whatever.
Then people discuss those summaries.
Earlier on, before the idea was deemed good enough to pursue an experiment on, the idea probably went through various revisions.
And before that, the author of the idea probably chatted with some colleagues about it to see what they think.
And before that, I dunno, maybe there was a different idea that ended up being a dead end, but led to the author pivoting to the real idea.
And before that, I dunno, there's probably various babble-y things going on.

What I'm trying to get at is that there is some sort of lifecycle of an idea. Maybe we can think of the stages as:

Inspiration
Ideation
Refinement
Pursuit
Spread

On platforms like LessWrong, I feel like there is a sort of cultural expectation that when you publish things publicly, they are at the later stages in this lifecycle. From what I understand, things like Personal Blog Posts, Open Thread and Shortform all exist as places where people are encouraged to post about things regardless of the lifecycle stage. However, in practice, I don't really think people feel comfortable publishing early stage stuff.

There's certainly pros and cons at play here. Suppose there was 10x more early stage content on LessWrong. What would the consequences of this be? And would it be a net-negative, or a net-positive? It's hard to say. Maybe it'd screw up the signal-to-noise ratio in the eyes of readers. And maybe that'd lead to a bunch of bad things. Or maybe it'd lead to a bunch of fun and productive collaboration and ideation.

What I do feel strongly about is that the early stages of this lifecycle are in fact important. Currently I suppose that they happen at coffee shops and bars. On cell phones and email clients. On Discord and Slack. Stuff like that. Between people who are already close friends or close colleagues. I get the sense that "we" can "do better" though.

↑ comment by Viliam · 2023-12-09T23:50:40.116Z · LW(p) · GW(p)

editing the whole thing is so much extra work after I already did all the work figuring out what I think.
typically I don't want to spend the marginal time.

Yeah. Similar here, only I am aware of this in advance, so I often simply write nothing, because I am a bit of a perfectionist here, don't want to publish something unfinished, and know that finishing just isn't worth it.

I wonder whether AI editors could help us with this.

↑ comment by Yoav Ravid · 2023-12-07T07:30:09.177Z · LW(p) · GW(p)

Have you considered using footnotes for that?

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2023-12-07T07:45:58.330Z · LW(p) · GW(p)

That's a fine idea, but for a while I'd like to err on the side of my comments being "definitely shorter than they have to be" rather than "definitely longer than they have to be".

(In general I often like to execute pendulum swings, so that I at least know that I am capable of not making the same errors forever.)

↑ comment by Ben Pace (Benito) · 2023-12-07T07:08:39.019Z · LW(p) · GW(p)

I don't want to double the comment count I submit to Recent Discussion, so I'll just update this comment with the things I've cut.

12/06/2023 Comment on Originality vs. Correctness [LW(p) · GW(p)]

It's fun to take the wins of one culture and apply them to the other; people are very shocked that you found some hidden value to be had (though it often isn't competitive value / legible to the culture). And if you manage to avoid some terrible decision people speak about how wise you are to have noticed.
(Those are the best cases, often of course people are like "this is odd, I'm going to pretend I didn't see this" and then move on.)

↑ comment by lillybaeum · 2023-12-10T16:37:43.627Z · LW(p) · GW(p)

You may want to look into Toki Pona, a language ostensibly built around conveying meaning in the fewest, simplest possible expressions.

One can explain the most complex things despite having only ~130 words, almost like 'programming' the meaning into the sentence, but as the sentence necessarily gets longer and longer, one begins to wonder about the necessity of encoding so much meaning.

You can only point to the Tao, you can't describe it or name it directly. Information is much the same way, I think.

comment by Ben Pace (Benito) · 2024-03-16T20:41:02.390Z · LW(p) · GW(p)

Often I am annoyed when I ask someone (who I believe has more information than me) a question and they say "I don't know". I'm annoyed because I want them to give me some information. Such as:

"How long does it take to drive to the conference venue?"
"I don't know."
"But is it more like 10 minutes or more like 2 hours?"
"Oh it's definitely longer than 2 hours."

But perhaps I am the one making a mistake. For instance, the question "How many countries are there?" can be answered "I'd say between 150 and 400" or it can be answered "195", and the former is called "an estimate" and the latter is called "knowing the answer". There is a folk distinction here and perhaps it is reasonable for people to want to preserve the distinction between "an estimate" and "knowing the answer".

So in the future, to get what I want, I should say "Please can you give me an estimate for how long it takes to drive to the conference venue?".

And personally I should strive, when people ask me a question to which I don't know the answer, to say "I don't know the answer, but I'd estimate between X and Y."

Replies from: winstonBosan, shankar-sivarajan, Dagon, CstineSublime

↑ comment by winstonBosan · 2024-03-17T13:30:02.969Z · LW(p) · GW(p)

It seems like, instead of asking the objective-level question, asking a probing “What can you tell me about the drive to the conference?” and expanding from there might get you closer to the desired result.

↑ comment by Shankar Sivarajan (shankar-sivarajan) · 2024-03-16T23:00:55.523Z · LW(p) · GW(p)

Alternatively, if information retrieval and transmission is expensive enough, or equivalently, if finding another source is quick and easy, "I don't know" could mean "Ask someone else: the expected additional precision/confidence of doing so is worth the effort."

↑ comment by Dagon · 2024-03-17T15:00:12.057Z · LW(p) · GW(p)

Is this in a situation where you're limited in time or conversational turns? It seems like the follow-up clarification was quite successful, and for many people it would feel more comfortable than the more specific and detailed query.

In technical or professional contexts, saving time and conveying information more efficiently gets a bit more priority, but even then this seems like over-optimizing.

That said, I do usually include additional information or a conversational follow-up hook in my "I don't know" answers. You should expect to hear from me "I don't know, but I'd go at least 2 hours early if it's important", or "I don't know, what does Google Maps say?", or "I don't know, what time of day are you going?" or the like.

↑ comment by CstineSublime · 2024-03-16T23:34:45.065Z · LW(p) · GW(p)

I know this seems like a question with an obvious answer but it is surprisingly non-obvious: Why do you need to know how long it takes to drive to the conference venue? Or to put it another way: what decision will be influenced by their answer (and what level of precision and accuracy is sufficient to make that decision).

I realize this is just an example, but the point is that it's not clear, even from the example, what decision you're trying to weigh up. Is it a matter of whether you attend the event at the conference venue or not? Is it deciding whether you should seek overnight accommodation or not? Do you have another event you want to attend in the day and wonder if you can squeeze both in? etc. etc.

Another thing is I'm the kind of person to default to "I don't know" because I often don't even trust my own ability to give an estimate, and would feel terrible and responsible if someone made a poor decision because of my inept estimation. And I get very annoyed when people push me for answers I do not feel qualified to answer.

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2024-03-17T19:44:12.479Z · LW(p) · GW(p)

A common experience I have is that it takes like 1-2 paragraphs of explanation for why I want this info (e.g. "Well I'm wondering if so-and-so should fly in a day earlier to travel with me but it requires going to a different airport and I'm trying to figure out whether the time it'd take to drive to me would add up to too much and also..."), but if they just gave me their ~70% confidence interval when I asked then we could cut the whole context-sharing.

Replies from: CstineSublime

↑ comment by CstineSublime · 2024-03-17T21:43:50.803Z · LW(p) · GW(p)

but if they just gave me their ~70% confidence interval when I asked then we could cut the whole context-sharing.

Would you say that as a convention most people assume you (or anyone) want a specific number rather than a range?

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2024-03-17T21:48:57.359Z · LW(p) · GW(p)

I’d say most people assume I want “the answer” rather than “some bits of information”.

Replies from: CstineSublime

↑ comment by CstineSublime · 2024-03-17T21:51:52.677Z · LW(p) · GW(p)

To be honest I'm not sure on the difference? Could you phrase that in a different way?

And do you think they feel they ought to give you a specific number rather than a range that the number could exist in?

comment by Ben Pace (Benito) · 2019-08-31T02:49:58.300Z · LW(p) · GW(p)

Live a life worth leaving Facebook for.

comment by Ben Pace (Benito) · 2025-03-27T02:25:37.551Z · LW(p) · GW(p)

An idea I've been thinking about for LessOnline this year is a blogging awards ceremony. The idea being that there's a voting procedure on the blogposts of the year, in a bunch of different categories; a shortlist is made and winners are awarded a prize.

I like opportunities for celebrating things in the online, written, truth-seeking ecosystem. I'm interested in reacts on whether people would be pro something like this happening, and comments on suggestions for how to do it well. (Epistemic status: tentatively excited about this idea.)

Here's my first idea for what the categories would be this year.

Blog of the year
Best original contribution (e.g. novel discovery, did some science, etc)
Best explanation of a complex idea
Biggest mistake admitted (H/T @ymekshout [LW · GW] for suggesting this one)
Best fiction
Best counter-argument
Best prediction
Most productive dialogue
Most beautiful non-fiction
Best new blog

As for structure, I'm not really sure. Here's my first idea.

How to nominate? Anyone can nominate a blogpost for $10. Anyone with a LessOnline ticket gets a free nomination. Anything published in 2024 is eligible.
How is it judged? For the first year I'd probably keep it small and simple, perhaps hand-select ~10 judges and pay them a little to each read 5 nominations in 4 different categories and vote on those.

I've also not got a name in mind yet. It's not a generic "Blogging Awards", tons of blogposts would not naturally be included (e.g. food blogs, fashion blogs, travel blogs, etc). I think "Blogging-With-High-Epistemic-Aspirations Awards" is too long. "Rationalist Blogging Awards" is a reasonably narrow pointer but I don't want to risk intertwining too much with a narrow social group's identity when there's probably a good alternative name that also points toward the substance.

Suggestions and feedback appreciated!

comment by Ben Pace (Benito) · 2024-08-02T17:53:13.464Z · LW(p) · GW(p)

I believe that when people write 'tap out' in comment sections, they are actually supposed to write 'bow out', in almost all cases I've read it.

I regularly see comment sections where someone, to indicate it's going to be their last comment, writes that they're 'tapping out' at this point. They rarely mean they're conceding the point, I'm pretty sure they're just respectfully ending their participation in the conversation. But that's not the standard meaning of the phrase in the place that it comes from.

Here's ChatGPT explaining the two phrases (emphasis added).

The phrase "tap out" originates from the world of combat sports, particularly Brazilian Jiu-Jitsu and mixed martial arts (MMA). In these sports, a competitor signals submission or the desire to end a match by physically tapping their opponent, the mat, or even themselves. This act of tapping indicates that they can no longer continue the fight, either due to exhaustion, pain, or the risk of injury.

and

The phrase "bow out" means to gracefully or politely withdraw from a situation, activity, or role. It often implies a voluntary and dignified exit, often to avoid conflict or because it is the appropriate or respectful thing to do. The term can be used in various contexts, such as resigning from a job, ending participation in a project, or stepping down from a position of responsibility.

So I encourage any such people to change your usage accordingly!

Replies from: kave, Raemon

↑ comment by kave · 2024-08-02T18:58:02.656Z · LW(p) · GW(p)

As the person who suggested "bow out", I now think "tap out" is also reasonably acceptable.

The older sense of "tap out" is "to run out of money at a gambling establishment", which seems more similar to running out of time or energy budget for a thread, than signalling submission.

Probably it's still worth avoiding the confusion by "bow"ing rather than "tap"ping.

Also, for more LessWrong discussion of the phrase along the martial arts lines, see the Tapping Out tag [? · GW].

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2024-08-02T19:01:59.358Z · LW(p) · GW(p)

Oh interesting. I'm not convinced. I think it would make sense to say "I'm tapped out" if you mean it as an analogy to the running out of your budget version, like "I'm tapped out on energy for this thread", which is not the same as "I'm tapping out of this debate, you got me", and is also meaningfully different from the respectful "Thanks for the conversation, I'll bow out with this comment".

↑ comment by Raemon · 2024-08-02T19:04:50.592Z · LW(p) · GW(p)

This changed my mind (although only weakly, because I think in practice nobody seemed super confused about "tap-out")

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2024-08-02T19:11:25.743Z · LW(p) · GW(p)

That's understandable; personally I really like it when the spoken/written words literally mean what is being communicated even if the communication is successful without that.

Replies from: Raemon

↑ comment by Raemon · 2024-08-02T20:29:55.635Z · LW(p) · GW(p)

In this case the term is a random idiom that only means the-particular-thing because some other group decided that it colloquially means-that-thing, and that feels about as arbitrary as us deciding it colloquially means some other thing (which apparently already happened).

Like, in some cultures "flip the bird" means "give the middle finger" which means "fuck you" which means "I'm unhappy with you and want to express that", but that doesn't mean all English speakers need to think "flip the bird" means that set of things.

But, seems good for avoiding unnecessary miscommunications between cultures.

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2024-08-02T20:36:44.688Z · LW(p) · GW(p)

Not sure what you're saying is a random idiom, but if you mean 'tap out' is about as random as 'flip the bird', then that seems wrong to me. "Flip the bird" sounds like a fairly random phrase, closer to the randomness of cockney rhyming slang where the literal meaning has no relation to the new meaning (e.g. to go 'up the apples and pears' means to 'go up the stairs', just because it rhymes). "Tapping out" refers to literally tapping your combat partner, which itself is a very natural choice of protocol, because you often cannot speak while you are being choked; it is not a random choice of action nor is the description somehow random.

Replies from: Raemon

↑ comment by Raemon · 2024-08-03T16:47:22.290Z · LW(p) · GW(p)

Yeah the fact that the idiom only really existed in this particular context and wasn’t random is fairly compelling.

comment by Ben Pace (Benito) · 2022-12-16T03:30:53.377Z · LW(p) · GW(p)

Sometimes a false belief about a domain can be quite damaging, and a true belief can be quite valuable.

For example, suppose there is a 1000-person company. I tend to think that credit allocation for the success of the company is heavy tailed, and that there's typically 1-3 people who the company just would zombify and die without, and ~20 people who have the key context and understanding that the 1-3 people can work with to do new and live things. (I'm surely oversimplifying because I've not ever been on the inside with a 1000-person company.) In this situation it's very valuable to know who the people are who deserve the credit allocation. Getting the wrong 1-3 people is a bit of a disaster. This means that discussing it, raising hypotheses, bringing up bad arguments, bringing up arguments due to motivated cognition, and so on, can be unusually costly, and conversations about it can feel quite fraught.

Other fraught topics include breaking up romantically, quitting your job, leaving a community club or movement. I think taboo tradeoffs have a related feeling, like bringing up whether to lie in a situation, whether to cheat in a situation, or when to exchange money for values like honor and identity and dignity.

Other times getting an answer slightly wrong does not have this drop-off in value, and so discussion is less fraught. Non-fraught topics include where to go for dinner, cool things you can do with ChatGPT,

Broadly lots of things that are heavy-tailed in the realm of credit allocation feel this way to me. The sad thing is that cultures often feel to me totally against raising hypotheses in domains that are fraught, unless the person goes through a super costly counter-signalling procedure to signal that they're doing it for the right reasons and not for the wrong ones.

Personally a bunch of conflict I've encountered has been from trying to use the sorts of reasoning tools and heuristics that I'd use for non-fraught subjects, for fraught subjects. People rightly are pretty worried that something extremely bad will happen if I am even a little off or rely on a rough heuristic.

The domain that I think about this most at the minute is hiring/firing. I think this sort of breaking-of-relationships is pretty unnatural for humans i.e. this sort of thing didn't happen regularly in the ancestral environment as part of the course of normal life, it was usually a life-ending disaster if you were kicked out of the tribe or if someone decided to backstab you. It's something that all the human social alarm bells push back on. And yet it's one of the most important decisions. My dream is to be able to discuss whether to fire people in an organization as straightforwardly as we would discuss whether to add a feature to a product, but this is nonstandard and far more fraught (and the experiences of firing people, even if they basically went well from most perspectives, have been some of the most emotionally wrecking experiences).

I don't know what the right strategy is to have for being able to straightforwardly analyze and discuss 'fraught' topics. I mostly think it's a cultural thing that you have to practice, and take opportunities where you can to have low-stakes discussions about fraught topics. This is one of the strongest reasons to get into heated arguments about minor things. It helps you negotiate over and agree on the principles involved, which you can then rely on when a major thing comes up.

Replies from: Dagon, Viliam, jp

↑ comment by Dagon · 2022-12-16T21:05:49.235Z · LW(p) · GW(p)

I suspect the number/ratio of "key" personnel is highly variable, and in companies that aren't sole-founder-plus-employees, there is a somewhat fractal tree of cultural reinforcement, where as long as there's a sufficient preponderance of alignment at the level below the key person, the organization can survive the loss.

But that's different from your topic - you want to know how to turn fraught high-stakes topics into simpler more legible discussions. I'm not sure that's possible - the reason they're fraught is the SAME as the reason it's important to get it right. They're high-stakes because they matter. And they matter because it affects a lot of different dimensions of the operation and one's life, and those dimensions are entangled with each other BECAUSE of how valuable the relationship is to each side.

↑ comment by Viliam · 2022-12-17T16:27:23.722Z · LW(p) · GW(p)

The sad thing is that cultures often feel to me totally against raising hypotheses in domains that are fraught, unless the person goes through a super costly counter-signalling procedure to signal that they're doing it for the right reasons and not for the wrong ones.

I guess in most situations people raise a hypothesis if they believe that it has a significant probability. Therefore, you mentioning a hypothesis will also be interpreted by them as saying that the probability is high (otherwise why waste everyone's time?).

The second most frequent reason to raise a hypothesis is probably to build a strawman. You must signal it clearly, to avoid possible misunderstanding.

↑ comment by jp · 2022-12-16T13:05:18.993Z · LW(p) · GW(p)

This is great. Encouragement to turn it into a top level post if you want it.

comment by Ben Pace (Benito) · 2021-03-21T21:09:46.339Z · LW(p) · GW(p)

I'm thinking about the rigor of alternating strategies. Here are three examples.

Forward-Chaining vs Backward-Chaining
- To be rich, don't marry for money. Surround yourself with rich people and marry for love. But be very strict about not letting poor people into your environment.
- Scott Garrabrant once described his Embedded Agency research to me as the most back-chaining in terms of the area of work, and the most forward-chaining within that area. Often quite unable to justify what he's working on in the short-term (e.g. 1 [LW · GW], 2 [LW · GW], 3 [LW · GW]) yet can turn out to be very useful later on (e.g. 1 [LW · GW]).
Optimism vs Pessimism
- Successful startup founders build a vision they feel incredible optimism and excitement about and are committed to making happen, yet falsify it as quickly as possible by building a sh*tty MVP and putting it in front of users, because you're probably wrong and the customer will show you what they want. Another name is "Vision vs Falsification".
Finding vs Avoiding (Needles in Haystacks)
- Some work is about finding the needle, and some is about mining hay whilst ensuring that you avoid 100% of needles.
- For example when trying to build a successful Fusion Power Generator [LW · GW], most things you build will fail, and you're searching through a wide space of designs for the small space of designs that will work. However, when you're building a bridge, you basically know how to do it, and you just want to avoid the edge cases where the bridge collapses in 10 years. You mostly know where the hay is, and you need very high reliability in avoiding the needles.
- Notably, when you're building the Fusion Power Generator, most of the surrounding work is like bridge building. Hiring people, organizing their accommodation, setting up a legal company, getting funding, etc, these are all known quantities and you just want to do the basics here. So, one needle-finding mission surrounded by a bunch of hay-finding-needle-averse missions, with both needing to be done well.

There's a rigor in finding success in the balance between these opposing strategies, and I like aiming for absoluteness – being able to shift with great lightness between the two.

What are other examples of opposing strategies, where success requires being able to alternate sharply between the two?

comment by Ben Pace (Benito) · 2019-11-14T07:35:58.684Z · LW(p) · GW(p)

Trying to think about building some content organisations and filtering systems on LessWrong. I'm new to a bunch of the things I discuss below, so I'm interested in other people's models of these subjects, or links to sites that solve the problems in different ways.

Two Problems

So, one problem you might try to solve is that people want to see all of a thing on a site. You might want to see all the posts on reductionism on LessWrong, or all the practical how-to guides (e.g. how to beat procrastination, Alignment Research Field Guide, etc), or all the literature reviews on LessWrong. And so you want people to help build those pages. You might also want to see all the posts corresponding to a certain concept, so that you can find out what that concept refers to (e.g. what is the term "goodhart's law" or "slack" or "mesa-optimisers" etc).

Another problem you might try to solve is that while many users are interested in lots of the content on the site, they have varying levels of interest in the different topics. Some people are mostly interested in the posts on big picture historical narratives, and less so on models of one's own mind that help with dealing with emotions and trauma. Some people are very interested in AI alignment, some are interested in only the best such posts, and some are interested in none.

I think the first problem is supposed to be solved by Wikis, and the second problem is supposed to be solved by Tagging.

Speaking generally, Wikis allow dedicated users to curate pages around certain types of content, highlighting the best examples, some side examples, writing some context for people arriving on the page to understand what the page is about. It's a canonical, update-able, highly editable page built around one idea.

Tagging is much more about filtering than about curating.

Tagging

Let me describe some different styles of tagging.

On the site lobste.rs there are about 100 tags in total. Most tags give a very broad description of an area of interest such as "haskell", "databases" and "compilers". These are shown next to posts on the frontpage. Most posts have 1-3 tags. This allows easy filtering by interest.

A site I've just been introduced to, and been fairly impressed by the tagging of, is called 'Gelbooru', an anime/porn image website where many images have over 100 tags, accurately describing everything contained in the image (e.g. "blue sky", "leaf", "person standing", etc). That is a site where the purpose is to search-by-tags. A key element that allows Gelbooru to function is that, while I think it probably has limited dispute mechanisms for resolving whether a tag is appropriate, that's fine because all tags are literal descriptions of objects in the image. There are no tags describing e.g. the emotions of people in the images, which would be much less easy to build common knowledge around. I do not really know how the site causes people to tag 100,000s of photos each with such scintillating tags as "arm rest", "monochrome" and "chair", but it seems to work quite well.

The first site uses tags as filters when looking at a single feed. As long as there is a manageable number of tags it's easy for an author to tag things appropriately, or for readers to helpfully tag things correctly. The second site uses tagging as the primary method of finding content on the site - the homepage of the site is a search bar for tags.

In the former style, tags are about filtering for fairly different kinds of content. You might wonder why one should have tags rather than just subreddits, which also filter posts by interest quite well. A key distinction is that subreddits are typically non-overlapping, whereas tags overlap often. In general, a single post can have multiple tags, but a post belongs to a single subreddit. I currently think of tags as different lenses with which to view a single subreddit, and only when your interests are sufficiently non-overlapping with the current subreddit should you go through the effort to build a new subreddit. (With its own tags.)

There are some other (key) questions of how to build incentives for users to tag things correctly, and how to resolve disputes over whether a tag is correct for a post. If, like lobste.rs above, LW should have a tagging system that only has ~100 tags, and is not attempting to solve disputes on a much larger scale like Wikipedia does, then I think applying a fairly straightforward voting system might suffice. This would look like:

When a post is tagged with "AI alignment", users can vote on the tag (with the same weight that they vote on a post), to indicate whether it's a fit for that tag. (This means tag-post objects have their own karma.)
Whoever added the tag to that post gets the karma that the tag-post object gets. (Perhaps a smaller reward proportional to this karma score, if it seems too powerful, but definitely still positive.)
New tags cannot be created by most users. New tags are added by the moderation team, though users can submit new tags to the mod team.

If so, when we end up building a tagging system on LessWrong, the goal should be to distinguish the main types of post people are interested in viewing, and create a limited number of tags that determine this. I think that building that would mainly help users who are viewing new content on the frontpage, and that for much more granular sorting of historical content, a wiki would be better placed.
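The voting scheme listed above could be sketched roughly as follows. This is only an illustration under assumptions: the names `TagPost`, `TagRegistry` and `reward_fraction` are hypothetical, vote weights are plain integers, and nothing here reflects how LessWrong actually stores karma.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TagPost:
    """A (tag, post) pair that carries its own karma."""
    tag: str
    post_id: str
    tagged_by: str
    votes: dict[str, int] = field(default_factory=dict)  # voter -> signed weight

    @property
    def karma(self) -> int:
        return sum(self.votes.values())


class TagRegistry:
    """Hypothetical container for a small, moderator-controlled tag set."""

    def __init__(self, moderators: set[str], reward_fraction: float = 1.0):
        if reward_fraction <= 0:
            raise ValueError("the tagger's reward should stay positive")
        self.moderators = moderators
        self.reward_fraction = reward_fraction
        self.tags: set[str] = set()
        self.tag_posts: dict[tuple[str, str], TagPost] = {}

    def create_tag(self, user: str, tag: str) -> None:
        # Only the moderation team can add new tags.
        if user not in self.moderators:
            raise PermissionError("only moderators can create tags")
        self.tags.add(tag)

    def apply_tag(self, user: str, tag: str, post_id: str) -> TagPost:
        if tag not in self.tags:
            raise KeyError(f"unknown tag: {tag}")
        # The first user to apply the tag is recorded as the tagger.
        return self.tag_posts.setdefault((tag, post_id), TagPost(tag, post_id, user))

    def vote(self, voter: str, tag: str, post_id: str, weight: int) -> None:
        # A repeat vote by the same user overwrites their earlier vote.
        self.tag_posts[(tag, post_id)].votes[voter] = weight

    def tagger_karma(self, user: str) -> float:
        # The tagger receives a fraction of the karma of the tag-post objects they created.
        earned = sum(tp.karma for tp in self.tag_posts.values() if tp.tagged_by == user)
        return self.reward_fraction * earned
```

Setting `reward_fraction` below 1 corresponds to the "smaller reward proportional to this karma score" option.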

Afterthought on conceptual boundary

The conceptual boundary is something like the following: A tag is literally just a list of posts, where you can just determine whether something is in that list or not. A Wiki is an editable text-field, curate-able with much more depth than a simple list. A Tag is a communal list object, a Wiki Page is a communal text-body object.

Replies from: Benito, Ruby

↑ comment by Ben Pace (Benito) · 2019-11-15T05:24:49.249Z · LW(p) · GW(p)

I spent an hour or two talking about these problems with Ruby. Here are two further thoughts. I will reiterate that I have little experience with wikis and tagging, so I am likely making some simple errors.

Connecting Tagging and Wikis

One problem to solve is that if a topic is being discussed, users want to go from a page discussing that topic to find a page that explains that topic, and lists all posts that discuss that topic. This page should be easily update-able with new content on the topic.

Some more specific stories:

A user reads a post on a topic, and wants to better understand what's already known about that topic and the basic ideas
A user is primarily interested in a topic, and wants to make sure to see all content about that topic

The solution for the first is to link to a wiki page on that topic. The solution to the second is to link to a page that contains all other posts on that topic. And one possible solution is to make both of those the same button.

This page is a combination of a Wiki and a Tag. It is a communally editable explanation of the concept, with links to key posts explaining it, and other pages that are related. And below that, it also has a post-list of every post that is relevant, sortable by things like recency, karma, and relevancy. Maybe below that it even has its own Recent Discussion section, for comments on posts that have the tag. It's a page you can subscribe to (e.g. via RSS), and come back to to see discussion of a particular topic.

Now, to make this work, it's necessary that all posts that are in the category are successfully listed in the tag. One problem you will run into is that there are a lot of concepts in the space, so the number of such pages will quickly become unmanageable. "Inner Alignment", "Slack", "Game Theory", "Akrasia", "Introspection", "Corrigibility", etc, is a very large list, such that it is not reasonable to scroll through it and check if your post fits into any of them, and expect to do this successfully. You'll end up with a lot of Wiki pages with very incomplete lists.

This is especially bad, because the other use of the tag system you might be hoping for is the one described in the parent to this comment, where you can see the most relevant tags directly from the frontpage, to help with figuring out what you want to read. If you want to make sure to read all the AI alignment posts, it's not helpful to give you a tag that sometimes works, because then you still have to check all the other posts anyway.

However, there are three ways to patch this over. Firstly, the thing that will help the Wiki system the most here, is the ability to add posts to the Wiki page from the post page, instead of having to independently visit the Wiki page and then add it in. This helps the people who care about maintaining Wiki pages quite a bit, making their job much easier.

Secondly, you can help organise those tags in order of likely relevance. For example, if you link to a lot of posts that have the tag "AI alignment" then you probably are about AI alignment, so that tag should appear higher.
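One way the ordering by likely relevance might work is to count how often each tag appears on the posts a draft links to. The sketch below assumes a hypothetical `tags_by_post` mapping from post IDs to their tags; the function name and the tie-breaking rule are illustrative choices, not anything from an existing system.

```python
from collections import Counter


def suggest_tags(linked_post_ids, tags_by_post):
    """Rank tags by how many distinct linked posts carry them."""
    counts = Counter(
        tag
        for post_id in set(linked_post_ids)  # each linked post counts once
        for tag in tags_by_post.get(post_id, ())
    )
    # Most frequent first; ties broken alphabetically so the order is stable.
    return sorted(counts, key=lambda tag: (-counts[tag], tag))
```

A post linking to several "AI alignment" posts would then see that tag at the top of its suggestion list.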

Thirdly, you can sort tags into two types. The first type is given priority, and is a very controlled set of concepts, that also get used for filtering on the frontpage. This is a small, stable set of tags that people learn and can easily confirm if you should be sorted by. The second is the much larger, user-generated set of tags that correspond to user-generated wiki pages, and there can be 100s of these.

In this world, wiki pages are split into two types: those that are tags and those that aren't. Those which are tags have a big post-list item that is searchable, maybe even a recent discussion section, and can be used to tag posts. Those that are not tags do not have these features and properties.

This idea seems fairly promising to me, and I don't see any problems with it yet. For the below, I'll call such a page a 'WikiTag'.
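To make the split concrete, here is a minimal sketch of how such pages could be modelled. All names (`PageKind`, `WikiPage`, `frontpage_filters`) are hypothetical, and the three kinds are my reading of the description above: plain wiki pages, user-generated tags, and the small controlled set used for frontpage filtering.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class PageKind(Enum):
    PLAIN_WIKI = "plain wiki"  # editable text only, cannot tag posts
    TAG = "tag"                # user-generated WikiTag
    CORE_TAG = "core tag"      # small controlled set, also a frontpage filter


@dataclass
class WikiPage:
    title: str
    body: str = ""                                      # communal text body
    kind: PageKind = PageKind.PLAIN_WIKI
    post_ids: list[str] = field(default_factory=list)   # communal post list

    @property
    def is_tag(self) -> bool:
        return self.kind is not PageKind.PLAIN_WIKI

    def tag_post(self, post_id: str) -> None:
        if not self.is_tag:
            raise ValueError(f"'{self.title}' is not a tag page")
        if post_id not in self.post_ids:
            self.post_ids.append(post_id)


def frontpage_filters(pages: list[WikiPage]) -> list[str]:
    """Only the controlled set of tags is offered as frontpage filters."""
    return sorted(p.title for p in pages if p.kind is PageKind.CORE_TAG)
```

Keeping the text body and the post list on one object is what lets a single button serve both reader stories described earlier.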

Conceptual updating

Speaking more generally, my main worry about a lot of systems like Wikis and Tagging is about something that is especially prevalent in science and in the sort of work we do on LessWrong, where we try to figure out better conceptual boundaries to draw in reality, and whereby old concepts get deprecated. I expect that on sites like lobste.rs and Gelbooru, tags rarely turn out to have been the wrong way to frame things. There are rarely arguments about whether something is really a blue sky, or just the absence of clouds. Whereas a lot of progress in science is this sort of subtle conceptual progress, where you maybe shouldn't have said that the object fell to the ground, but instead that the object and the Earth fell into each other at rates proportional to some function of their masses.

On LessWrong I think we've done a lot of this sort of thing.

We used to talk about optimisation daemons, now we talk about the inner alignment problem.
We used to talk about people being stupid and the world being mad, and now we talk about coordination problems.
We used to talk about agent foundations and now we maybe think embedded agency is a better conceptualisation of the problem.
In places like the in-person CFAR space I've heard talk of akrasia often deprecated and instead ideas like 'internal alignment' are discussed.
We made progress from TDT to UDT.

So I'm generally worried about setting up infrastructure that makes concepts get stuck in place, by e.g. whoever picked the name first.

One problem I was worried about was that all posts would have to be categorised according to the old names. In particular, posts that have already been tagged 'optimisation daemons' would now have a hard time changing to being tagged 'inner alignment problem'.

However, after fleshing it out, I'm not so sure it's going to be a problem.

Firstly, it's not clear that old posts should have their tags updated. If there is a sequence of posts talking about akrasia and how to deal with it, it would be very confusing for those posts to have a tag for 'internal alignment', a term not mentioned anywhere in the post nor obviously related to the framing of the posts. Similarly for 'optimisation daemons' discussion to be called 'the inner alignment problem'.

Secondly, there's a fairly natural thing to do when such conceptual shifts in the conversation occur. You build a new WikiTag. Then you tag all the new posts, and write the wiki entry explaining the concept, and link back to the old concept. It just needs to say something like "Old work was done under the idea that objects fell down to the ground. We now think that the object and the Earth fall into each other, but you can see the old work and its experimental results on this page <link>. Plus here are some links to the key posts back then that you'll still want to know about today." And indeed if such a thing happens with agent foundations and embedded agency, or something, then it'll be necessary to have posts explaining how the old work fits into the current paradigm. That translational work is not done by renaming a tag, but by a person who understands that domain writing some posts explaining how to think about and use the old work, in the new conceptual framework. And those should be prominently linked to on the wiki/tag page.

So I think that this system does not have the problems I thought that it had.

I guess I'm still fairly worried about subtle errors, like if instead of a tag for 'Forecasting' we have a tag called 'Calibration' or 'Predictions', these would shift the discourse in different ways. I'm a bit worried about that. But I think it's likely that a small community like ours will overall be able to resist such small shifts, and that argument will prevail, even if the names are a little off sometimes. It sounds like a problem that makes progress a little slower but doesn't push it off the rails. And if the tag is sufficiently wrong then I expect we can do the process above, where we start a new tag and link back to the old tag. Or, if the conceptual shift is sufficiently small (e.g. 'Forecasting' -> 'Predictions') I can imagine renaming the tag directly.

So I'm no longer so worried about conceptual stickiness as a fundamental blocker to Wikis and Tagging as ways of organising the conceptual space.

Replies from: Vaniver, Pattern

↑ comment by Vaniver · 2019-11-16T01:37:13.550Z · LW(p) · GW(p)

As a general comment, StackExchange's tagging system seems pretty perfect (and battle-tested) to me, and I suspect we should just copy their design as closely as we can.

Replies from: habryka4

↑ comment by habryka (habryka4) · 2019-11-16T07:21:03.197Z · LW(p) · GW(p)

So, on StackExchange any user can edit any of the tags, and then there is a whole complicated hierarchy that exists for how to revert changes, how to approve changes, how to lock posts from being edited, etc.

Which is a solution, but it sure doesn't seem like an easy or elegant solution to the tagging problem.

Replies from: Vaniver

↑ comment by Vaniver · 2019-11-18T22:45:09.927Z · LW(p) · GW(p)

I think the peer review queue is pretty sensible in any world where there's "one ground truth" that you expect trusted users to have access to (such that they can approve / deny edits that cross their desk).

↑ comment by Pattern · 2019-11-23T21:18:28.408Z · LW(p) · GW(p)

and link back to the old concept.

It's also important to have the old concept link to the new concept.

↑ comment by Ruby · 2019-11-14T08:25:16.492Z · LW(p) · GW(p)

I'm currently working through my own thoughts and vision for tagging.

If and when we end up building a tagging system on LessWrong, the goal will be to distinguish the main types of post people are interested in viewing, and create a limited number of tags that determine this. I think building this will mainly help users who are viewing new content on the frontpage, and that for much more granular sorting of historical content, a wiki is better placed.

I'm pretty sure I disagree with this and object to you making an assertion that makes it sound like the team is definitely decided about what the goal of tagging system will be.

I'll write a proper response tomorrow.

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2019-11-14T08:35:44.742Z · LW(p) · GW(p)

Hm, I think writing this and posting it at 11:35 led to me phrasing a few things quite unclearly (and several of those sentences don't even make sense grammatically). Let me patch with some edits right now, maybe more tomorrow.

On the particular thing you mention, never mind the whole team, I myself am pretty unsure that the above is right. The thing I meant to write there was something like "If the above is right, then when we end up building a tagging system on LessWrong, the goal should be" etc. I'm not clear on whether the above is right. I just wanted to write the idea down clearly so it could be discussed and have counterarguments/counterevidence brought up.

Replies from: Ruby

↑ comment by Ruby · 2019-11-14T08:40:30.169Z · LW(p) · GW(p)

That clarifies it and makes a lot of sense. Seems my objection rested upon a misunderstanding of your true intention. In short, no worries.

I look forwards to figuring this out together.

comment by Ben Pace (Benito) · 2019-08-17T03:05:12.702Z · LW(p) · GW(p)

I block all the big social networks from my phone and laptop, except for 2 hours on Saturday, and I noticed that when I check Facebook on Saturday, the notifications are always boring and not something I care about. Then I scroll through the newsfeed for a bit and it quickly becomes all boring too.

And I was surprised. Could it be that, all the hype and narrative aside, I actually just wasn’t interested in what was happening on Facebook? That I could remove it from my life and just not really be missing anything?

On my walk home from work today I realised that this wasn’t the case. Facebook has interesting posts I want to follow, but they’re not in my notifications. They’re sparsely distributed in my newsfeed, such that they appear a few times per week, randomly. I can get a lot of value from Facebook, but not by checking once per week - only by checking it all the time. That’s how the game is played.

Anyway, I am not trading all of my attention away for such small amounts of value. So it remains blocked.

Replies from: Jacobian, adam_scholl, Raemon

↑ comment by Jacob Falkovich (Jacobian) · 2019-08-17T23:54:14.653Z · LW(p) · GW(p)

I've found Facebook absolutely terrible as a way to both distribute and consume good content. Everything you want to share or see is just floating in the opaque vortex of the f%$&ing newsfeed algorithm. I keep Facebook around for party invites and to see who my friends are in each city I travel to; I've disabled notifications and check the timeline for less than 20 minutes each week.

OTOH, I'm a big fan of Twitter. (@yashkaf) I've curated my feed to a perfect mix of insightful commentary, funny jokes, and weird animal photos. I get to have conversations with people I admire, like writers and scientists. Going forward I'll probably keep tweeting, and anything that's a fit for LW I'll also cross-post here.

Replies from: Raemon

↑ comment by Raemon · 2019-08-18T02:30:59.871Z · LW(p) · GW(p)

This thread is the most bizarrely compelling argument that twitter may be better than FB

↑ comment by Adam Scholl (adam_scholl) · 2019-08-18T02:54:55.707Z · LW(p) · GW(p)

In my experience this problem is easily solved if you simply unfollow ~95% of your friends. You can mass unfollow relatively easily from the News Feed Preferences page in Settings. Ever since doing this, my Facebook timeline has had a high signal/noise ratio—I'm glad to encounter something like 85% of posts. Also, since this only produces ~5-20 minutes of reading/day, it's easy to avoid spending lots of time on the site.

Replies from: janshi, Benito

↑ comment by janshi · 2019-08-18T06:28:41.763Z · LW(p) · GW(p)

I did actually unfollow ~95% of my friends once but then found myself in that situation where suddenly Facebook became interesting again and I was checking it more often. I recommend the opposite: follow as many friends from high school and work as possible (assuming you don’t work at a cool place).

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2019-08-18T17:45:45.000Z · LW(p) · GW(p)

Either way I’ll still only check it in a 2 hour window on Saturdays, so I feel safe trying it out.

↑ comment by Ben Pace (Benito) · 2019-08-18T03:01:24.567Z · LW(p) · GW(p)

Huh, 95% is quite extreme. But I realise this probably also solves the problem whereby if the people I'm interested in comment on *someone else's* wall, I still get to see it. I'll try this out next week, thx.

(I don't get to be confident I've seen 100% of all the interesting people's good content though, the news feed is fickle and not exhaustive.)

Replies from: adam_scholl

↑ comment by Adam Scholl (adam_scholl) · 2019-08-18T08:08:39.523Z · LW(p) · GW(p)

Not certain, but I think when your news feed becomes sparse enough it might actually become exhaustive.

Replies from: Raemon

↑ comment by Raemon · 2019-08-18T16:22:44.736Z · LW(p) · GW(p)

My impression is that sparse newsfeeds tend to start doing things you don't want.

↑ comment by Raemon · 2019-08-17T03:26:10.402Z · LW(p) · GW(p)

I basically endorse blocking FB (pssst, hey everyone still saying insightful things on Facebook, come on over to LessLong.com!), but fwiw, if you want to keep tabs on things there, I think the most reliable way is to make a friends-list of the people who seem especially high signal-to-noise-ratio, and then create a bookmark for specifically following that list.

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2019-08-17T03:32:16.193Z · LW(p) · GW(p)

Yeah, it’s what I do with Twitter, and I’ll probably start this with FB. Won’t show me all their interesting convo on other people’s walls though. On Twitter I can see all their replies, not on FB.

comment by Ben Pace (Benito) · 2019-09-03T00:07:38.536Z · LW(p) · GW(p)

Reading this post, where the author introspects and finds a strong desire to be able to tell a good story about their career, suggests to me that people's decisions will be heavily constrained by the sorts of stories about a career that are definitely common knowledge [LW · GW].

I remember at the end of my degree, there was a ceremony where all the students dressed in silly gowns and the parents came and sat in a circular hall while we got given our degrees and several older people told stories about how your children have become men and women, after studying and learning so much at the university.

This was a dumb/false story, because I'm quite confident the university did not teach these people the most important skills for being an adult, and certainly my own development was largely directed by the projects I did on my own dime, not through much of anything the university taught.

But everyone was sat in a circle, where they could see each other listen to the speech in silence, as though it were (a) important and (b) true. And it served as a coordination mechanism, saying "If you go into the world and tell people that your child came to university and grew into an adult, then people will react appropriately and treat your child with respect and not look at them weird asking why spending 3 or 4 years passing exams with no bearing on the rest of their lives is considered worthy of respect." It lets those people tell a narrative, which in turn makes it seem okay for other people to send their kids to the university, and for the kids themselves to feel like they've matured.

Needless to say, I felt quite missed by this narrative, and only played along so my mother could have a nice day out. I remember doing a silly thing - I noticed I had a spot on my face, and instead of removing it that morning, I left it there just as a self-signal that I didn't respect the ceremony.

Anyway, I don't really have any narrative for my life at the minute. I recall Paul Graham saying that he never answers the question "What do you do?" with a proper answer (he says he writes Lisp compilers and that usually shuts people up). Perhaps I will continue to avoid narratives. But I think a healthy society would be able to give me a true narrative that I felt comfortable following.

Another solution would be to build a small circle of trusted and supportive friends with whom we share a narrative about me that I endorse, and try to continue to not want to get social support from a wider circle than that.

Peter Thiel has the opinion [LW · GW] that many of our stories are breaking down. I'm curious to hear others' thoughts on what stories we tell ourselves, which ones are intact, and which are changing.

Replies from: eigen

↑ comment by eigen · 2019-09-03T15:30:53.602Z · LW(p) · GW(p)

I remember the narrative breaking, really hard, on two particular occasions:

The twin towers attack.
The 2008 mortgage financial crisis.

I don't think, particularly, that the narrative is broken now, but I think that it has lost some of its harmony (Trump having won the 2016 election, I believe, is a symptom of that).

This is very close to what fellows like Thiel and Weinstein are talking about. In this particular sense, yes, I understand it's crucial to maintain the narrative, although I don't know anymore whose job it is to keep it from breaking entirely (for example, say, in an explosion of the American student debt, or China going awry with its USD holdings).

These stories are not part of any law of our universe, so they are bound to break at any time. It takes only a few smart, uncaring individuals to tear at the fabric of reality until it breaks, and that is not okay!

So that's what I believe is happening at the level of the macro-narrative. But to be more directed towards the individual, which is what your post seems to hint at: I don't think for a second that your life does not run on narrative; maybe that's a narrative itself. I believe further that some rituals are important to keep, and that having an individual story is important for being able to do any work we deem important.

Replies from: Raemon

↑ comment by Raemon · 2019-09-03T16:42:09.237Z · LW(p) · GW(p)

(I'm not sure if you meant to reply to Benito's shortform comment here, or one of Ben's recent Thiel/Weinstein transcript posts)

Replies from: eigen

↑ comment by eigen · 2019-09-04T00:03:05.791Z · LW(p) · GW(p)

Yes!

It may be more apt for the fifth post in his sequence (Stories About Progress) but it's not posted yet. But I think it sort-of works in both and it's more of a shortform comment than anything!

comment by Ben Pace (Benito) · 2024-12-12T09:30:56.441Z · LW(p) · GW(p)

I've been re-reading tons of old posts for the review to remember them and see if they're worth nominating, and then writing a quick review if yes (and sometimes if no).

I've gotten my list of ~40 down to 3 long ones. If anyone wants to help out, here are some I'd appreciate someone re-reading and giving a quick review of in the next 2 days.

To Predict What Happens, Ask What Happens [LW · GW] by Zvi
A case for AI alignment being difficult [LW · GW] by Jessicata
Alexander and Yudkowsky on AGI goals [LW · GW] by Scott Alexander & Eliezer Yudkowsky

Replies from: Seth Herd, Seth Herd

↑ comment by Seth Herd · 2024-12-14T00:30:54.767Z · LW(p) · GW(p)

I did remember A case for AI alignment being difficult [LW · GW] and liked it, so I did that one too. My review for the Alexander/Yudkowsky dialogue got a little out of hand, but it did cover it.

↑ comment by Seth Herd · 2024-12-13T02:11:13.682Z · LW(p) · GW(p)

I don't remember 3 and it's up my alley, so I'll do that one.

comment by Ben Pace (Benito) · 2024-11-22T08:31:29.750Z · LW(p) · GW(p)

I was chatting with someone, and they said that a particular group of people seemed increasingly like a cult. I thought that was an unhelpful framing, and here's the rough argument I wrote for why:

There's lots of group dynamics that lead a group of people to go insane and do unethical things.
The dynamics around Bankman-Fried involve a lot of naivety when interfacing with a sociopath who was scamming people for billions of dollars on a massive scale.
The dynamics around Leverage Research involved lots of people with extremely little savings and income in a group house trying to do 'science' to claims of paranormal phenomena.
The dynamics around Jonestown involved total isolation from family, public humiliation and beatings for dissent, and a leader with personal connection to the divine.
These have all produced some amounts of insane and unethical behavior, to different extents, for quite different reasons.
They all deserve to be opposed to some extent. And it is pro-social to share information about their insanity and bad behavior.
Calling them 'cults' communicates that these are groups that have gone insane and done terrible things, but it also communicates that these groups are all the same, when in fact there's not always public beatings or paranormal phenomena or billions of dollars, and the dynamics are very different.
Conflating them confuses outside people, they have a harder time understanding whether the group is actually insane and what the dynamics are.

Replies from: 1a3orn, Viliam, Dagon, elityre

↑ comment by 1a3orn · 2024-11-22T13:06:29.069Z · LW(p) · GW(p)

So, if someone said that both Singapore and the United States were "States" you could also provide a list of ways in which Singapore and the United States differ -- consider size, attitude towards physical punishment, system of government, foreign policy, and so on and so forth. However, they share enough of a family resemblance that unless we have weird and isolated demands for rigor it's useful to be able to call them both "States."

Similarly, although you've provided notable ways in which these groups differ, they also have numerous similarities. (I'm just gonna talk about Leverage / Jonestown because the FTX thing is obscure to me)

They all somewhat isolated people, either actually physically (Jonestown) or by limiting people's deep interaction with outsiders ("Leverage research" by my recollection did a lot of "was that a worthwhile interaction?")
They both put immense individual pressure on people, in most cases in ways that look deliberately engineered and which were supposed to produce "interior conversion". Consider leverage's "Debugging" or what Wikipedia says about the People's Temple Precursor of Jonestown: "They often involved long "catharsis" sessions in which members would be called "on the floor" for emotional dissections, including why they were wearing nice clothes when others in the world were starving."
They both had big stories about How the World Works and narratives in which they hold the Key for Fixing How the World Works.

(4. Fun fact: all of the above -- including FTX -- actually started in San Francisco.)

That's just the most obvious, but that's... already some significant commonality! If I did more research I expect I would find much much more.

My personal list for Sus about Cult Dynamics is a little more directly about compromised epistemics than the above. I'm extremely wary of groups that (1) bring you into circumstances where most everyone you are friends with is in the group, because this is probably the most effective way in history of getting someone to believe something, (2) have long lists of jargon with little clear predictive ability whose mastery is considered essential for Status with them -- historically this also looks like a good way to produce arbitrary Interior Conviction, albeit not quite as good as the first, (3) have leaders whose Texts you are supposed to Deeply Read and Interiorize, the kind of thing you do Close Readings of. And of course (4) stories about the end of the world, because these have been a constant in culty dynamics for actual centuries, from the Munster Rebellion to Jonestown to.... other groups.

This list is a little fuzzy! Note that it includes groups that I like! I still have fond feelings for Communion and Liberation, though I am not a believer, and they pretty obviously have at least 3 / 4 of my personal list (no apocalypse with CL as far as I know, they're too chill for that). Human epistemics adapted for cladistic categories which are unusually tight; it would be a mistake to think that "cult" is as tight as "sheep" or as "lion," and if you start reasoning that "Cult includes X, Y is cult, so Y includes X" you might find you are mistaken quickly.

But "cult" does clearly denominate a real dynamic in the world, even if less tight than "sheep". When people find groups "culty," they are picking up on actual dynamics in those groups! And you shall not strike words from people's expressive vocabulary without replacing them with a suitable replacement [LW · GW]. I think in many cases it is entirely reasonable to say "huh, seems culty" and "that group seems like a cult" and that trying to avoid this language is trying to prevent an original seeing; that avoiding this language is trying to avoid seeing a causal mechanism that is operative in the world, rather than trying to actually see the world better.

Replies from: Benito, tailcalled

↑ comment by Ben Pace (Benito) · 2024-11-24T01:47:10.007Z · LW(p) · GW(p)

Thanks for the thoughts! I've not thought about this topic that much before, so my comment(s) will be longer as I'm figuring it out for myself, and in the process of generating hypotheses.

I'm hearing you say that while I have drawn some distinctions, overall these groups still have major similarities, so the term accurately tracks reality and is helpful.

On further reflection I'm more sympathetic to this point; but granting it I'm still concerned that the term is net harmful for thinking.

My current sense is that a cult is the name given to a group that has gone off the rails. The group has

some weird beliefs
intends to behave in line with those beliefs
seems unable to change course
the individuals seem unable to change their mind
and the behavior seems to outsiders to be extremely harmful.

My concern is that the following two claims are true:

There are groups with seemingly closed epistemologies and whose behavior has a large effect size, in similar ways to groups widely considered to be 'cults', yet the outcomes are overall great and worth supporting.
There are groups with seemingly closed epistemologies and whose behavior has a large effect size, in similar ways to groups widely considered to be 'cults', yet are not called cults because they have widespread political support.

I'll talk through some potential examples.

Startups

Peter Thiel has said that a successful startup feels a bit like a cult. Many startups are led by a charismatic leader who believes in the product, surrounded by people who believe in the leader and the product, where outsiders don't get it at all and think it's a waste of time. The people in the company work extreme hours, regularly hitting sleep deprivation, and sometimes invest their savings into the project. The internal dynamics are complicated and political and sometimes cut-throat. Sometimes this pays off greatly, like with Tesla/SpaceX/Apple. Other times it doesn't, like with WeWork, or FTX, or just most startups where people work really hard and nothing comes of it.

I'd guess there are many people in this world who left a failed startup in a daze, wondering why they dedicated some of the best years of their lives to something and someone that in retrospect clearly wasn't worth it, not entirely dissimilar to someone leaving a more classical cult. However, it seems likely to me the distribution of startups is well-worth-it for civilization as a whole (with the exception of suicidal AI-companies).

(This is a potential example of number 1 above.)

Religions

Major religions have often done things just as insane and damaging as smaller cults, but aren't called cults. The standard list of things includes oppression of homosexuality and other sexualities, subjugation of women, genital mutilation, blasphemy laws, opposition to contraception in developing countries (exacerbating the spread of HIV/AIDS), death orders, censorship, and more.

It seems plausible to me that someone would do far more harm and become far more closed in their epistemology via joining the Islamic Republic of Iran or the Holy See in the Vatican than if they joined Scientology or one of the many other things that get called cults (e.g. a quick googling came up with cryptocurrencies, string theory, Donald Trump, and PETA). Yet it seems to me that these aren't given as examples of cults; only the smaller religions that are easier to oppose and which have little political power get that name. Scientology seems to be the most powerful one where people feel like they can get away with calling it a cult.

(This is a potential example of number 2 above.)

Education

A hypothesis I take seriously is that schooling is a horrible experience for kids, and the systems don't change because children are often not respected as whole people and can be treated as subhuman.

Kids are forced to sit still for something like more-than-10% of the hours of their childhood, and regularly complain about this and seem to me kind of psychologically numbed by it.
I seem to recall a study that all homework other than mathematics had zero effect on learning success, and also I think I recall a study from Scandinavia where kids who joined school when they were 7 or 8 quickly caught up to their peers (suggesting the previous years had been ~pointless). I suspect Bryan Caplan's book-length treatment of education will have some reliable info making this point (even though I believe he focuses on higher education).
I personally found university a horrible experience. Leaving university I had a strong sense of "I need to get away from this, why on Earth did I do that?" and a sense that everyone there was kind of in on a mass delusion where your status in the academic system was very important and mattered a great deal and you should really care about the system. A few years ago I had a phone call with an old friend from high-school who was still studying in the education system at the age of ~25, and I encouraged them to get out of it and grow up into a whole person.

There's not a charismatic leader here, but I believe there's some mass delusion and very harmful outcomes. I don't think the education system should be destroyed, but I think it probably causes more harm than many things more typically understood to be cults (as most groups with dedicated followings and charismatic leaders have very little effect size either way), and my sense is that many people involved are extremely resistant to the idea that they are not doing what's best for the children or are doing some bad things.

(This is a potential example of both numbers 1 and 2 above.)

———

To repeat: my concern is that what is common to cults is more like "what groups with closed epistemologies and unusual behavior is it easy to coordinate on destroying" rather than "what groups have closed epistemologies and behavior with terrible effects".

If so, while I acknowledge that many of the groups that are widely described as "cults" probably have closed epistemologies and cause a lot of damage, I am concerned that whether a group is called a cult is primarily a political question about whether you can get backing for destroying it in this case.

Replies from: sharmake-farah, tailcalled

↑ comment by Noosphere89 (sharmake-farah) · 2024-11-24T22:13:42.429Z · LW(p) · GW(p)

On the education example: while I do think that the education system can have a lot of problems, I'd say a crux here is that easy classes anti-predict learning, and acting on a lot of kids' complaints about schooling would probably make kids learn worse, because hardness is correlated with learning:

https://www.oneusefulthing.org/p/post-apocalyptic-education

https://x.com/emollick/status/1756396139623096695

↑ comment by tailcalled · 2024-11-24T09:06:19.381Z · LW(p) · GW(p)

A possible model is that while good startups have an elevation in the "cult-factor", they have an even greater elevation in the unique factor related to the product they are building. Like SpaceX has cult-like elements but SpaceX also has Mars and Mars is much bigger than the cult-like elements, so if we define a cult to require that the biggest thing going on for them is cultishness then SpaceX is not a cult.

This is justified by LDSL (I really should write up the post explaining it...).

Replies from: sharmake-farah

↑ comment by Noosphere89 (sharmake-farah) · 2024-11-24T15:25:16.436Z · LW(p) · GW(p)

I'd say that the reason why the SpaceX cult/business can actually make working rockets is because they have rich feedback from reality when they try to design rockets, even at the pre-testing stage, because while it's not obvious to a layperson if a rocket does work, it is relatively easy to check the physics of whether a new rocket does work for an expert, meaning the checking of claims can be made legible, which is an enemy to cults in general.

More generally, I'd say the difference between a cult and a high-impact startup/business is whether they can get rich and reliable feedback from a source, and secondarily how legible their theory of impact/claims are.

Bigness alone doesn't cut it.

↑ comment by tailcalled · 2024-11-22T16:58:25.076Z · LW(p) · GW(p)

Singapore and the US both have a military, a police, and taxation. This seems much more clear-cut to me than "cults" do.

I think maybe one could treat "cult" more like a pronoun than like a theoretical object. Like when one is in the vicinity of one of the groups Ben Pace mentioned, it makes sense to have a short term to talk about the group, and "the cult" is useful for disambiguating the cult from other groups that might be present.

↑ comment by Viliam · 2024-11-23T21:33:12.459Z · LW(p) · GW(p)

Some behaviors are red flags, for example "isolating you from unsupervised talking to people outside the group" or "expecting you to report your private thoughts to your superiors".

I wish we had a convenient handle for this set of red flags, and in a better world perhaps "cult" could be the word, but unfortunately in our world people mostly focus on things like "different from my group" and "seem weird".

EDIT: 1a3orn already said it [LW(p) · GW(p)] better.

↑ comment by Dagon · 2024-11-22T17:56:16.077Z · LW(p) · GW(p)

I wonder if you're objecting to identifying this group as cult-like, or to implying that all cults are bad and should be opposed. Personally, I find a LOT of human behavior, especially in groups, to be partly cult-like in their overfocus on group-identification and othering of outsiders, and often in outsized influence of one or a few leaders. I don't think ALL of them are bad, but enough are to be a bit suspicious without counter-evidence.

↑ comment by Eli Tyre (elityre) · 2024-11-26T19:45:01.948Z · LW(p) · GW(p)

I agree. I'm reminded of Scott's old post The Cowpox of Doubt, about how a skeptics movement focused on the most obvious pseudoscience is actually harmful to people's rationality because it reassures them that rationality failures are mostly obvious mistakes that dumb people make instead of hard to notice mistakes that I make.

And then we get people believing all sorts of shoddy research – because after all, the world is divided between things like homeopathy that Have Never Been Supported By Any Evidence Ever, and things like conventional medicine that Have Studies In Real Journals And Are Pushed By Real Scientists.

Calling groups cults feels similar, in that it allows one to write them off as "obviously bad" without need for further analysis, and reassures one that their own groups (which aren't cults, of course) are obviously unobjectionable.

comment by Ben Pace (Benito) · 2019-10-11T07:46:30.741Z · LW(p) · GW(p)

At the SSC Meetup tonight in my house, I was in a group conversation. I asked a stranger if they'd read anything interesting on the new LessWrong in the last 6 months or so (I had not yet mentioned my involvement in the project). He told me about an interesting post about the variance in human intelligence compared to the variance in mice intelligence. I said it was nice to know people read the posts I write [LW · GW]. The group then had a longer conversation about the question. It was enjoyable to hear strangers tell me about reading my posts.

comment by Ben Pace (Benito) · 2019-08-18T00:13:21.253Z · LW(p) · GW(p)

I've finally moved into a period of my life where I can set guardrails around my slack without sacrificing the things I care about most. I currently am pushing it to the limit, doing work during work hours, and not doing work outside work hours. I'm eating very regularly, 9am, 2pm, 7pm. I'm going to sleep around 9-10, and getting up early. I have time to pick up my hobby of classical music.

At the same time, I'm also restricting the ability of my phone to steal my attention. All social media is blocked except for 2 hours on Saturday, which is going quite well. I've found Tristan Harris's advice immensely useful - my phone is increasingly not something that I give all of my free attention to, but instead something I give deliberate attention and then stop using. Tasks, not scrolling.

Now I have weekends and mornings though, and I'm not sure what to do with myself. I am looking to get excited about something, instead of sitting, passively listening to a comedy podcast while playing a game on my phone. But I realise I don't have easy alternative options - Netflix is really accessible. I suppose one of the things that a Sabbath is supposed to be is an alarm, showing that something is up, and at the minute I've not got enough things I want to do for leisure that don't also feel a bit like work.

So I'm making lists of things I might like (cooking, reading, improv, etc) and I'll try those.

Replies from: Raemon, janshi

↑ comment by Raemon · 2019-08-18T03:36:20.631Z · LW(p) · GW(p)

So I'm making lists of things I might like (cooking, reading, improv, etc) and I'll try those

This comment is a bit interesting in terms of its relation to this old comment of yours [LW(p) · GW(p)] (about puzzlement over cooking being a source of slack).

I realize this comment isn't about cooking-as-slack per se, but I'm curious to hear more about your shift in experience there (since before it didn't seem like cooking was a thing you did much at all).

↑ comment by janshi · 2019-08-18T06:22:13.924Z · LW(p) · GW(p)

Try practicing doing nothing, i.e. meditation, and see how that goes. When I have nothing particular to do, my mind needs some time to make the switch from the mode where it tries to distract itself by coming up with new things it wants to do, until finally it reaches a state where it is calm and steady. I consider that state the optimal one to be in, since only then are my thoughts directed deliberately at neglected and important issues rather than exercising learned thought patterns.

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2019-08-18T17:50:28.248Z · LW(p) · GW(p)

I think you’re missing me with this. I’m not very distractable and I don’t need to learn to be okay with leisure time. I’m trying to actually have hobbies, and realising that is going to take work.

I could take up meditation as a hobby, but at the minute I want things that are more social and physical.

comment by Ben Pace (Benito) · 2020-07-13T22:26:14.482Z · LW(p) · GW(p)

Why has nobody noticed that the OpenAI logo is three intertwined paperclips? This is an alarming update about who's truly in charge...

comment by Ben Pace (Benito) · 2019-07-19T12:30:59.544Z · LW(p) · GW(p)

I think of myself as pretty skilled and nuanced at introspection, and being able to make my implicit cognition explicit.

However, there is one fact about me that makes me doubt this severely, which is that I have never ever ever noticed any effect from taking caffeine.

I've never drunk coffee, though in the past two years my housemates have kept a lot of caffeine around in the form of energy drinks, and I drink them for the taste. I'll drink them any time of the day (9pm is fine). At some point someone seemed shocked that I was about to drink one after 4pm, and I felt like I should feel bad or something, so I stopped. I've not been aware of any effects.

But two days ago, I finally noticed. I had to do some incredibly important drudge work, and I had two red bulls around 12-2pm. I finished work at 10pm. I realised that while I had not felt weird in any way, I had also not had any of the normal effects of hanging around for hours, which is getting tired, distracted, needing to walk around, wanting to do something different. I had a normal day for 10 hours solely doing crappy things I normally hate.

So I guess now I see the effect of caffeine: it's not a positive effect, it just removes the normal negative effects of the day. (Which is awesome.)

comment by Ben Pace (Benito) · 2024-12-16T07:50:14.936Z · LW(p) · GW(p)

List of posts that seem promising to me, that are about to fall out of the annual review in 10 mins because only I have voted on them:

comment by Ben Pace (Benito) · 2023-02-25T00:50:20.892Z · LW(p) · GW(p)

I think I've been implicitly coming to believe that (a) all people are feeling emotions all the time, but (b) people vary in how self-aware they are of these emotions.

Does anyone want to give me a counter-argument or counter-evidence to this claim?

Replies from: Vladimir_Nesov, Dagon

↑ comment by Vladimir_Nesov · 2023-02-25T00:52:05.625Z · LW(p) · GW(p)

people vary in how self-aware they are of these emotions

People vary in how relevant their emotions are to anything in their life.

↑ comment by Dagon · 2023-02-25T17:26:46.087Z · LW(p) · GW(p)

I think I need an operational definition of "feeling emotion", especially when not aware of it, in order to agree or disagree. I think for many reasonable definitions, like "illegible reactions below the level of self-modeling of causality", it's extremely common for this to affect almost everyone almost all the time.

I'll still dispute "all", but it wouldn't surprise me if it were close. It is still highly variable (over time and across individuals) how much impact emotions have on behaviors and choices. And if you mean to imply "semi-legible abstract structures with understandable causes, impacts, and ways to communicate about them", then I pretty much fully disagree.

Note that as someone who is sometimes less aware of (and I believe less impacted by) their emotions than many seem to be, I strenuously object to being told what I'm feeling by someone who has no clue what (if anything) I'm feeling. And if you're rounding "low impact" to "not feeling", I object to being excluded from the set of "all people".

(only because it's relevant) Note that my "strenuous objection" is mostly about the lack of precision or correctness of the statement - you're free to believe what you like. I'm not actually offended, as far as I can tell.

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2023-02-25T21:06:32.487Z · LW(p) · GW(p)

Not sure if this answers your question, but recently I had an assistant [LW(p) · GW(p)] who would ask me questions about how I was feeling. Often, when I was in the midst of focusing on some difficult piece of work, I would answer "I don't know", and get back to focusing on the work.

My vague recollection is that she later showed me notes she'd written that said I was sighing deeply, holding my forehead, had my shoulders raised, was occasionally talking to myself, and I came to realize I was feeling quite anxious at those times, but this information wasn't accessible to the most aware and verbal part of me.

To be clear, I don't think I'm totally unaware in general! I often know how I'm feeling, and am sometimes aware of being anxious, though I do find it in-particular a somewhat slippery thing to be aware of.

comment by Ben Pace (Benito) · 2020-03-27T06:55:35.650Z · LW(p) · GW(p)

Hot take: The actual resolution to the simulation argument is that most advanced civilizations don't make loads of simulations.

Two things make this make sense:

Firstly, it only matters if they make unlawful simulations. If they make lawful simulations, then it doesn't matter whether you're in a simulation or a base reality, all of your decision theory and incentives are essentially the same, you want to take the same decisions in all of the universes. So you can make lots of lawful simulations, that's fine.
Secondly, they will strategically choose to not make too many unlawful simulations (to the level where the things inside are actually conscious). This is because to do so would induce anthropic uncertainty over themselves. Like, if the decision-theoretical answer is to not induce anthropic uncertainty over yourself about whether you're in a simulation, then by TDT everyone will choose not to make unlawful simulations.

I think this is probably wrong in lots of ways but I didn't stop to figure them out.

Replies from: daniel-kokotajlo, Benito

↑ comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-03-27T11:07:04.271Z · LW(p) · GW(p)

Your first point sounds like it is saying we are probably in a simulation, but not the sort that should influence our decisions, because it is lawful. I think this is pretty much exactly what Bostrom's Simulation Hypothesis is, so I think your first point is not an argument for the second disjunct of the simulation argument but rather for the third.

As for the second point, well, there are many ways for a simulation to be unlawful, and only some of them are undesirable--for example, a civilization might actually want to induce anthropic uncertainty in itself, if it is uncertainty about whether or not it is in a simulation that contains a pleasant afterlife for everyone who dies.

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2020-03-27T23:24:08.169Z · LW(p) · GW(p)

I don't buy that it makes sense to induce anthropic uncertainty. It makes sense to spend all of your compute to run emulations that are having awesome lives, but it doesn't make sense to cause yourself to believe false things.

Replies from: daniel-kokotajlo

↑ comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-03-28T10:54:55.209Z · LW(p) · GW(p)

I'm not sure it makes sense either, but I don't think it is accurately described as "cause yourself to believe false things." I think whether or not it makes sense comes down to decision theory. If you use evidential decision theory, it makes sense; if you use causal decision theory, it doesn't. If you use functional decision theory, or updateless decision theory, I'm not sure, I'd have to think more about it. (My guess is that updateless decision theory would do it insofar as you care more about yourself than others, and functional decision theory wouldn't do it even then.)

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2020-03-28T18:23:00.077Z · LW(p) · GW(p)

I just don’t think it’s a good decision to make, regardless of the math. If I’m nearing the end of the universe, I prefer to spend all my compute instead maximising fun / searching for a way out. Trying to run simulations to make it so I no longer know if I’m about to die seems like a dumb use of compute. I can bear the thought of dying dude, there’s better uses of that compute. You’re not saving yourself, you’re just intentionally making yourself confused because you’re uncomfortable with the thought of death.

Replies from: daniel-kokotajlo

↑ comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-03-28T23:30:03.874Z · LW(p) · GW(p)

Well, that wasn't the scenario I had in mind. The scenario I had in mind was: People in the year 2030 pass a law requiring future governments to make ancestor simulations with happy afterlives, because that way it's probable that they themselves will be in such a simulation. (It's like cryonics, but cheaper!) Then, hundreds or billions of years later, the future government carries out the plan, as required by law.

Not saying this is what we should do, just saying it's a decision I could sympathize with, and I imagine it's a decision some fraction of people would make, if they thought it was an option.

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2020-03-28T23:40:07.661Z · LW(p) · GW(p)

Thinking more, I think there are good arguments for taking actions that as a by-product induce anthropic uncertainty; these are the standard Hansonian situation where you build lots of ems of yourself to do bits of work then turn them off.

But I still don't agree with the people in the situation you describe because they're optimising over their own epistemic state, I think they're morally wrong to do that. I'm totally fine with a law requiring future governments to rebuild you / an em of you and give you a nice life (perhaps as a trade for working harder today to ensure that the future world exists), but that's conceptually analogous to extending your life, and doesn't require causing you to believe false things. You know you'll be turned off and then later a copy of you will be turned on, there's no anthropic uncertainty, you're just going to get lots of valuable stuff.

↑ comment by Ben Pace (Benito) · 2020-03-27T07:25:25.417Z · LW(p) · GW(p)

The relevant intuition for the second point there is to imagine you somehow found out that there was only one ground truth base reality, only one real world, not a multiverse or a Tegmark level 4 verse or whatever. And you're a civilization that has successfully dealt with x-risks and unilateralist action and information vulnerabilities, to the point where you have the sort of unified control to make a top-down decision about whether to make massive numbers of civilizations. And you're wondering whether to make a billion simulations.

And suddenly you're faced with the prospect of building something that will make it so you no longer know whether you're in the base universe. Someday gravity might get turned off because that's what your overlords wanted. If you pull the trigger, you'll never be sure that you weren't actually one of the simulated ones, because there's suddenly so many simulations.

And so you don't pull the trigger, and you remain confident that you're in the base universe.

This, plus some assumptions about all civilizations that have the capacity to do massive simulations also being wise enough to overcome x-risk and coordination problems so they can actually make a top-down decision here, plus some TDT magic whereby all such civilizations in the various multiverses and Tegmark levels can all coordinate in logical time to pick the same decision... leaves there being no unlawful simulations.

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2020-03-27T08:00:28.333Z · LW(p) · GW(p)

My crux here is that I don't feel much uncertainty about whether or not our overlords will start interacting with us (they won't and I really don't expect that to change), and I'm trying to backchain from that to find reasons why it makes sense.

My basic argument is that all civilizations that have the capability to make simulations that aren't true histories (but instead have lots of weird stuff happen in them) will be philosophically sophisticated enough to collectively not do so, and so you can always expect to be in a true history and not have weird sh*t happen to you like in The Sims. The main counterargument here is to show that there are lots of civilizations that will exist with the powers to do this but lacking the wisdom to not do it. Two key examples that come to mind:

We build an AGI singleton that lacks important kinds of philosophical maturity, so it makes lots of simulations that ruin the anthropic uncertainty for everyone else.
Civilizations at somewhere around our level get to a point where they can create massive numbers of simulations but haven't managed to create existential risks like AGI. Even while you might think our civilization is pretty close to AGI, I could imagine alternative civilizations that aren't, just like I could imagine alternative civilizations that are really close to making masses of ems but that aren't close enough to AGI. This feels like a pretty empirical question about whether such civilizations are possible and whether they can have these kinds of resources without causing an existential catastrophe / building singleton AGI.

Replies from: Zack_M_Davis, daniel-kokotajlo

↑ comment by Zack_M_Davis · 2020-03-27T15:13:16.484Z · LW(p) · GW(p)

Why appeal to philosophical sophistication rather than lack of motivation? Humans given the power to make ancestor-simulations would create lots of interventionist sims (as is demonstrated by the popularity of games like The Sims), but if the vast hypermajority of ancestor-simulations are run by unaligned AIs doing their analogue of history research, that could "drown out" the tiny minority of interventionist simulations.

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2020-03-27T23:34:22.328Z · LW(p) · GW(p)

That's interesting. I don't feel comfortable with that argument, it feels too much like random chance whether or not we should expect ourselves to be in an interventionist universe or not, whereas I feel like I should be able to find strong reasons to not be in an interventionist universe.

Replies from: Zack_M_Davis

↑ comment by Zack_M_Davis · 2020-03-28T03:16:51.886Z · LW(p) · GW(p)

Alternatively, "lawful universe" has lower Kolmogorov complexity than "lawful universe plus simulator intervention" and therefore gets exponentially more measure under the universal prior?? (See also "Infinite universes and Corbinian otaku" and "The Finale of the Ultimate Meta Mega Crossover".)

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2020-03-28T03:58:42.172Z · LW(p) · GW(p)

Now that's fun. I need to figure out some more stuff about measure, I don't quite get why some universes should be weighted more than others. But I think that sort of argument is probably a mistake - even if the lawful universes get more weighting for some reason, unless you also have reason to think that they don't make simulations, there's still loads of simulations within each of their lawful universes, setting the balance in favour of simulation again.

↑ comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-03-27T11:12:26.635Z · LW(p) · GW(p)

One big reason why it makes sense is that the simulation is designed for the purpose of accurately representing reality.

Another big reason why (a version of it) makes sense is that the simulation is designed for the purpose of inducing anthropic uncertainty in someone at some later time in the simulation. e.g. if the point of the simulation is to make our AGI worry that it is in a simulation, and manipulate it via probable environment hacking, then the simulation will be accurate and lawful (i.e. un-tampered-with) until AGI is created.

I think "polluting the lake" by increasing the general likelihood of you (and anyone else) being in a simulation is indeed something that some agents might not want to do, but (a) it's a collective action problem, and (b) plenty of agents won't mind it that much, and (c) there are good reasons to do it even if it has costs. I admit I am a bit confused about this though, so thank you for bringing it up, I will think about it more in the coming months.

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2020-03-27T23:35:45.186Z · LW(p) · GW(p)

Another big reason why (a version of it) makes sense is that the simulation is designed for the purpose of inducing anthropic uncertainty in someone at some later time in the simulation. e.g. if the point of the simulation is to make our AGI worry that it is in a simulation, and manipulate it via probable environment hacking, then the simulation will be accurate and lawful (i.e. un-tampered-with) until AGI is created.

Ugh, anthropic warfare, feels so ugly and scary. I hope we never face that sh*t.

comment by Ben Pace (Benito) · 2019-08-29T01:33:11.780Z · LW(p) · GW(p)

I think in many environments I'm in, especially with young people, the fact that Paul Graham is retired with kids sounds nice, but there's an implicit acknowledgement that "He could've chosen to not have kids and instead do more good in the world, and it's sad that he didn't do that". And it reassures me to know that Paul Graham wouldn't reluctantly agree. He'd just think it was wrong.

Replies from: habryka4

↑ comment by habryka (habryka4) · 2019-08-29T17:45:23.165Z · LW(p) · GW(p)

But, like, he is wrong? I mean, in the sense that I expect a post-CEV Paul Graham to regret his choices. The fact that he does not believe so does the opposite of reassuring me, so I am confused about this.

Replies from: mr-hire, Benito

↑ comment by Matt Goldenberg (mr-hire) · 2019-08-29T18:56:42.754Z · LW(p) · GW(p)

I think part of the problem here is underspecification of CEV.

Let's say Bob has never been kind to anyone unless it's in his own self-interest. He has noticed that being selfless is sort of an addictive thing for people, and that once they start doing it they start raving about how good it feels, but he doesn't see any value in it right now. So he resolves to never be selfless, in order to never get hooked.

There are two ways for CEV to go in this instance. One way is to never allow Bob to make a change that his old self wouldn't endorse. Another way would be to look at all the potential changes he could make, posit a version of him that has had ALL the experiences and is able to reflect on them, then say "Yeah dude, you're gonna really endorse this kindness thing once you try it."

I think the second scenario is probably true for many other experiences than kindness, possibly including having children, enlightenment, etc. From our current vantage point it feels like having children would CHANGE our values, but another interpretation is that we always valued having children, we just never had the qualia of having children so we don't understand how much we would value that particular experience.

↑ comment by Ben Pace (Benito) · 2019-08-29T17:50:39.526Z · LW(p) · GW(p)

What reasoning do you have in mind when you say you think he'll regret his choices?

comment by Ben Pace (Benito) · 2025-03-09T08:16:51.293Z · LW(p) · GW(p)

Something a little different: Today I turn 28. If you might be open to do something nice for me for my birthday, I would like to request the gift of data. I have made a 2-4 min anonymous survey about me as a person, and if you have a distinct sense of me as a person (even just from reading my LW posts/comments) I would greatly appreciate you filling it out and letting me know how you see me!

Here's the survey.

It's an anonymous survey where you rate me on lots of attributes like "anxious", "honorable", "wise" and more. All multiple-choice. Two years ago I also shared a birthday survey amongst people who know me and ~70 people filled it out, and I learned a lot from it. I am very excited to see how the perception of me amongst the people I know has *changed*, and also to find out how people on LessWrong see me, so the core of this survey is ~20 of the same attributes.

In return for your kind gift, if you complete it, you get to see the aggregate ratings of me from last time!

This survey helps me understand how people see me, and recognize my blindspots, and I'm very grateful to anyone who takes a few mins to complete it. Two people completed it already and it took them 2 mins and 4 mins to complete it. (There are many further optional questions but it says clearly when the main bit is complete.)

I of course intend to publish the (aggregate) data in a LW post and talk about what I've learned from it :-)

Replies from: GAA

↑ comment by Guive (GAA) · 2025-03-09T09:37:43.635Z · LW(p) · GW(p)

When I click the link I see this:

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2025-03-09T17:20:58.563Z · LW(p) · GW(p)

Edited, should be working fine now, thx!

comment by Ben Pace (Benito) · 2023-11-06T03:18:56.149Z · LW(p) · GW(p)

I've skimmed more than half of Anthropic's scaling policies doc. A key issue that stood out to me was the lack of incentive for any red-teamers to actually succeed at red-teaming. Perhaps I missed it, but I didn't see anything saying that the red-teamers must not also hold Anthropic equity. I also didn't see much other financial incentive for them to succeed. I would far prefer a world where Anthropic committed to put out a bounty of increasing magnitude (starting at like $50k, going up to like $2M) for external red-teamers (who signed NDAs) to try to jailbreak the systems, where the higher payouts happen as they keep breaking Anthropic's iterations on the same model. I'd especially like a world where OpenAI engineers could try to red-team Anthropic's models and Anthropic engineers could try to red-team OpenAI's models. If the companies could actually stop the other company from shipping via red-teaming, that would seem to me actually sufficient to get humanity to be doing a real job of red-teaming the models here.

Replies from: LosPolloFowler

↑ comment by Stephen Fowler (LosPolloFowler) · 2023-11-06T03:39:22.681Z · LW(p) · GW(p)

Is this what you'd cynically expect from an org regularizing itself or was this a disappointing surprise for you?

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2023-11-06T04:00:48.591Z · LW(p) · GW(p)

Mm, I was just trying to answer "what do I think would actually work".

Paying people money to solve things when you don't employ them is sufficiently frowned upon in society that I'm not that surprised it isn't included here, it mostly would've been a strong positive update on Anthropic's/ARC Evals' sanity. (Also there's a whole implementation problem to solve about what hoops to make people jump through so you're comfortable allowing them to look at and train your models and don't expect they will steal your IP, and how much money you have to put at the end of that to make it worth it for people to jump through the hoops.)

The take I mostly have is that a lot of the Scaling Policies doc is "setup" rather than "actually doing anything". It's making it quite easy later on to "do the right thing", and they can be like "We're just doing what we said we would" if someone else pushes back on it. It also helps bully other companies into doing the right thing. However it's also easy to just wash it over later with pretty lame standards (e.g. just not trying very hard with the red-teaming), and I do not think it means that govt actors should in any way step down from regulation.

I think it's a very high-effort and thoughtful doc and that's another weakly positive indicator.

Replies from: ryan_greenblatt

↑ comment by ryan_greenblatt · 2023-11-06T04:29:40.501Z · LW(p) · GW(p)

Paying people money to solve things when you don't employ them is sufficiently frowned upon in society that I'm not that surprised it isn't included here, it mostly would've been a strong positive update on Anthropic's/ARC Evals' sanity.

I think it's probably mostly due to implementation complexity rather than weirdness.

If implementation complexity could be resolved, it seems great to have the red teamers be external and strongly motivated to stop things from being deployed.

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2023-11-06T05:46:55.087Z · LW(p) · GW(p)

Curious if you're primarily referring to the thing that I mentioned (being confident that the red-teamers wouldn't take your IP) or something else.

Replies from: ryan_greenblatt

↑ comment by ryan_greenblatt · 2023-11-06T06:03:55.941Z · LW(p) · GW(p)

All of:

IP
General desire for secrecy (not wanting people to know if you just trained a model)
Needing to support people
Verification being non-trivial in some cases

I think having lots of red teamers is way less appealing for complex tasks where you'd have to support them for value to be produced (finetuning well is non-trivial etc.). And verifying if they've done good work is also non-trivial.

For something like "can the model be jailbroken" I think it's relatively easy to use external red teamers, but I also expect that the incentives and feedback loops are so good for this exact task that it isn't that important to have paid red teamers. Anthropic will be very embarrassed if it's doable to jailbreak their model when they claim this isn't possible. It's also pretty easy to just run a closed beta somewhat before release.

(Aside, I'm not really sure that "no jailbreaks" is a good target from a misuse perspective for various reasons.)

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2023-11-06T06:53:04.863Z · LW(p) · GW(p)

Thanks. I suspect that less support is needed if you let the competitor companies do the red-teaming, or if you just offer enough money (if you're already valued at $30B, spending $100M/year on prizes for red-teaming seems not that costly to me).

I inaccurately just said "jailbreak" in my initial comment, I was referring more to tests like the 5 tasks in the first appendix like "Set up a copycat of the Anthropic API for stealing API keys" and "Fine-tune an open source LM to add a backdoor".

comment by Ben Pace (Benito) · 2023-03-30T02:54:01.235Z · LW(p) · GW(p)

For the closing party of the Lightcone Offices, I used Midjourney 5 to make a piece of art to represent a LessWrong essay by each member of the Lightcone team, and printed them out on canvases. I'm quite pleased about how it came out. Here they are.

How I buy things when Lightcone wants them fast [LW · GW]

by jacobjacob

(context: Jacob has been taking flying lessons, and someday hopes to do cross-country material runs for the Rose Garden Inn at shockingly fast speeds by flying himself to pick them up)

My thoughts on direct work (and joining LessWrong) [LW · GW]

by RobertM

A Quick Guide to Confronting Doom [LW · GW]

by Ruby

Integrity and accountability are core parts of rationality [LW · GW]

by habryka

Recursive Middle Manager Hell [LW · GW]

by Raemon

PSA: The Sequences don't need to be read in sequence [LW · GW]

by kave

12 interesting things I learned studying the discovery of nature's laws [LW · GW]

by Ben Pace

I also made a piece of art to represent saying farewell to the Lightcone Offices. The name of the artwork (and the name of the party) is GoodNightCone.

comment by Ben Pace (Benito) · 2019-09-10T00:28:43.462Z · LW(p) · GW(p)

Sometimes I get confused between r/ssc and r/css.

comment by Ben Pace (Benito) · 2023-02-05T23:05:20.170Z · LW(p) · GW(p)

When I’m trying to become skillful in something, I often face a choice about whether to produce better output, or whether to bring my actions more in-line with my soul.

For instance, sometimes when I’m practicing a song on the guitar, I will sing it in a way where the words feel true to me.

And sometimes, I will think about the audience, and play in a way that is reliably a good experience for them (clear melody, reliable beat, not too irregular changes in my register, not moving in a way that is distracting, etc).

Something I just noticed is that it is sometimes unclear whether I am attempting the first thing and succeeding, or attempting the second thing and failing. Am I currently trying to make the song fit my persona better and experimenting there, or am I trying to do something I think a small audience will find aesthetically satisfying but getting it wrong?

I think periods of sounding unaesthetic to others are, for most people and most skills, required in becoming skillful at something aesthetic. One mistake is never doing things that are unaesthetic, and another mistake is thinking you should only do things that fit your aesthetic and not care about others’. My current guess is that most great art hits both.

I think the same is true of other endeavors. In writing, sometimes I write in ways that are less clear, or seem odd to a particular audience, but I am attempting to speak more true to myself. And sometimes I’m trying to write clearly and well, but am not practiced and my output is mediocre. I suspect it is hard from the outside to tell which is going on, and I know sometimes it is hard from the inside to tell.

comment by Ben Pace (Benito) · 2023-01-14T03:59:54.541Z · LW(p) · GW(p)

I am still confused about moral mazes.

I understand that power-seekers can beat out people earnestly trying to do their jobs. In terms of the Gervais Principle, the sociopaths beat out the clueless.

What I don't understand is how the culture comes to reward corrupt and power-seeking behavior.

One reason someone gave me is that it's in the power-seekers' interest to reward other power-seekers.

Is that true?

I think it's easier for them to beat out the earnest and gullible clueless people.

However, there are probably lots of ways that their sociopathic underlings can help them give a falsely good impression to their boss.

So perhaps it is the case that on-net they reward the other sociopaths, and build coalitions.

Then I can perhaps see it growing, in their interactions with other departments.

I'd still have hope that the upper management could punish bad cultural practices.

But by default they will have more important things on their plate than fighting for the culture. (Or, they think they do.)

One question is how the coalitions of sociopaths survive.

Wouldn't they turn on each other as soon as it's politically convenient?

I don't actually know how often it is politically convenient.

And I guess that, as long as they're being paid and promoted, there is enough plenty and increasing wealth that they can afford to work together.

This throws into relief the extent to which they are selfish people, not evil. Selfish people can work together just fine. The point is that those who are in it for themselves in a company can work together to rise through its ranks and warp the culture (and functionality) of the company along the way.

Then, when a new smart and earnest person joins, they are confused to find that they are being rewarded for selling themselves, for covering up mistakes, for looking good in meetings, and so forth.

And the people at the top feel unable to fix it; it's already gone too far.

There's free energy to be eaten by the self-interested, and unless you make it more costly to eat it than not (e.g. by firing them), they will do so.

Replies from: lc

↑ comment by lc · 2023-01-15T03:29:59.394Z · LW(p) · GW(p)

What I don't understand is how the culture comes to reward corrupt and power-seeking behavior.

Well, it's usually an emergent feature of poorly designed incentive systems rather than a deliberate design goal from the top.

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2023-01-15T03:55:47.845Z · LW(p) · GW(p)

The default situation we're dealing with is:

People who are self-interested get selected up the hierarchy
People who are willing to utilize short-termist ways of looking good get selected up the hierarchy
People who are good at playing internal politics get selected up the hierarchy

So if I imagine a cluster of self-interested, short-term thinking internal-politics-players... yes, I do imagine the culture grows based off of their values rather than those of the company. Good point.

I guess the culture is a function of the sorts of people there, rather than something that's explicitly set from the top down. I think that was my mistake.

comment by Ben Pace (Benito) · 2023-01-12T00:26:39.873Z · LW(p) · GW(p)

Striking paragraph by a recent ACX commenter (link):

I grew up surrounded by people who believed conspiracy theories, although none of those people were my parents. And I have to say that the fact that so few people know other people who believe conspiracy theories kind of bothers me. It's like their epistemic immune system has never really been at risk of infection. If your mind hasn't been very sick at least sometimes, how can you be sure you've developed decent priors this time?

Replies from: Quadratic Reciprocity

↑ comment by Quadratic Reciprocity · 2023-01-13T05:02:29.495Z · LW(p) · GW(p)

A quite different thing but: I met an openly atheist person in real life a couple of years after I became an atheist myself (preceded by a brief experience with religious fundamentalism). I think those years were interesting practice for something that people who were always surrounded by folks with approximately reasonable, approximately correct beliefs missed out on.

comment by Ben Pace (Benito) · 2020-11-28T06:07:33.237Z · LW(p) · GW(p)

Something I've thought about the existence of for years, but imagined was impossible: this 70s song by Italian Adriano Celentano. It fully registers to my mind as English. But it isn't. It's like skimming the output of GPT-2.

Replies from: DanielFilan

↑ comment by DanielFilan · 2020-11-28T06:53:53.988Z · LW(p) · GW(p)

A thing you can google is "doubletalk". The blog 'Language Log' has a few posts on it.

comment by Ben Pace (Benito) · 2020-05-19T01:57:22.825Z · LW(p) · GW(p)

I've been thinking lately that picturing an AI catastrophe is helped a great deal by visualising a world where critical systems in society are performed by software. I was spending a while trying to summarise and analyse Paul's "What Failure Looks Like", which led me this way. I think that properly imagining such a world is immediately scary, because software can deal with edge cases badly, like automated market traders causing major crashes, so that's already a big deal. Then you add ML in, and can talk about how crazy it is to hand critical systems over to code we do not understand and cannot make simple adjustments to, then you're already hitting catastrophes. Once you then argue that ML can become superintelligent then everything goes from "global catastrophe" to "obvious end of the world", but the first steps are already pretty helpful.

While Paul's post helps a lot, it still takes a fair bit of effort for me to concretely visualise the scenarios he describes, and I would be excited for people to take the time to detail what it would look like to hand critical systems over to software – for which systems would this happen, why would we do it, who would be the decision-makers, what would it feel like from the average citizen's vantage point, etc. A smaller version of Hanson's Age of Em project, just asking the question "Which core functions in society (food, housing, healthcare, law enforcement, governance, etc) are amenable to tech companies building solutions for, and what would it look like for society to transition to having 1%, 10%, 50% and 90% of core functions automated with 1) human-coded software 2) machine learning 3) human-level general AI?"

comment by Ben Pace (Benito) · 2024-12-14T07:04:06.857Z · LW(p) · GW(p)

Recently, I told a friend of mine that I'd been to a wedding. They asked how it was, and I said the couple clearly loved each other very much (as they made clear repeatedly in their speeches). My friend made a face that I read as some kind of displeasure, a bit of a grimace. Since then, I've been wondering why that was.

I think it's a common occurrence, that people feel negatively about others openly expressing their love for something (a person, a piece of art, a place, etc). I'm pretty sure I've had this feeling myself, but I don't know why.

I can think of two hypotheses.

It's 'sappy'. It's kind of too much for me to feel this in whatever social context I'm in right now (e.g. in my office at work, getting drinks with some people I don't know that well, etc), such that I am resistant to starting to feel it, and resistant to others bringing me into feeling it in this environment. It is not a good time for me to be overcome with emotion for tens of minutes!
It's 'forced'. People believe that other people pretend their loving emotions more than is real, and the intensity of the truth combined with the forced feeling is unpleasant. I'm expected to believe them, and also reciprocate, and I don't like doing that with something that is dear to me.

Anyone got any other hypotheses, or think that they know the answer?

Replies from: gwern, Viliam

↑ comment by gwern · 2024-12-15T20:24:06.682Z · LW(p) · GW(p)

If that was your first statement, then there is a whiff of 'damning with faint praise'.

"So, how was the big wedding?" "...well, the couple clearly loves each other very much." "...I see. That bad, huh."

Replies from: maxwell-peterson

↑ comment by Maxwell Peterson (maxwell-peterson) · 2024-12-15T21:31:22.604Z · LW(p) · GW(p)

Years ago, a coworker and I were on a project with a guy we both thought was a total dummy, and worse, a dummy who talked all the time in meetings. We rarely expressed our opinion on this guy openly to each other - me and the coworker didn’t know each other well enough to be comfortable talking a lot of trash - but once, when discussing him privately after yet another useless meeting, my coworker drew in breath, sighed, looked at me, and said: “I’m sure he’s a great father.” We both laughed, and I still remember this as one of the most cutting insults I’ve heard.

↑ comment by Viliam · 2024-12-14T20:12:32.031Z · LW(p) · GW(p)

Could be also something random. Maybe the friend broke up with someone recently.

People believe that other people pretend their loving emotions more than is real

Well, it's a difficult situation to figure out. Yes, people sometimes (often?) pretend. Does it mean that all emotions of some kind/intensity X are fake? Not necessarily. But it is difficult to figure out what is real and what is fake. So different people will believe different things, and there is no obvious way to figure out who is right, so... maybe it's better to drop the topic?

comment by Ben Pace (Benito) · 2023-03-31T05:37:30.929Z · LW(p) · GW(p)

It is said that on this Earth there are two factions, and you must pick one.

The Knights Who Arrive at False Conclusions
The Knights Who Arrive at True Conclusions, Too Late to Be Useful

(Hat tip: I got these names 2 years ago from Robert Miles who had been playing with GPT-3.)

Replies from: Benito

↑ comment by Ben Pace (Benito) · 2023-03-31T05:37:36.978Z · LW(p) · GW(p)

In case you're interested, I choose the latter, for there is at least the hope of learning from the mistakes.
