LessWrong 2.0 Reader


[link] Is "superhuman" AI forecasting BS? Some experiments on the "539" bot from the Centre for AI Safety
titotal (lombertini) · 2024-09-18T13:07:40.754Z · comments (3)

Reward hacking behavior can generalize across tasks
Kei · 2024-05-28T16:33:50.674Z · comments (5)

[link] The Cognitive-Theoretic Model of the Universe: A Partial Summary and Review
jessicata (jessica.liu.taylor) · 2024-03-27T19:59:27.893Z · comments (36)

EU policymakers reach an agreement on the AI Act
tlevin (trevor) · 2023-12-15T06:02:44.668Z · comments (7)

OpenAI: Leaks Confirm the Story
Zvi · 2023-12-12T14:00:04.812Z · comments (9)

The Parable Of The Fallen Pendulum - Part 2
johnswentworth · 2024-03-12T21:41:30.180Z · comments (8)

MATS Summer 2023 Retrospective
utilistrutil · 2023-12-01T23:29:47.958Z · comments (34)

Send us example gnarly bugs
Beth Barnes (beth-barnes) · 2023-12-10T05:23:00.773Z · comments (10)

Bitter lessons about lucid dreaming
avturchin · 2024-10-16T21:27:04.725Z · comments (62)

Creating unrestricted AI Agents with Command R+
Simon Lermen (dalasnoin) · 2024-04-16T14:52:50.917Z · comments (13)

[link] [Linkpost] Practically-A-Book Review: Rootclaim $100,000 Lab Leak Debate
trevor (TrevorWiesinger) · 2024-03-28T16:03:36.452Z · comments (22)

ACX Covid Origins Post convinced readers
ErnestScribbler · 2024-05-01T13:06:20.818Z · comments (7)

JargonBot Beta Test
Raemon · 2024-11-01T01:05:26.552Z · comments (55)

Questions for labs
Zach Stein-Perlman · 2024-04-30T22:15:55.362Z · comments (11)

Attention SAEs Scale to GPT-2 Small
Connor Kissane (ckkissane) · 2024-02-03T06:50:22.583Z · comments (4)

Secondary forces of debt
KatjaGrace · 2024-06-27T21:10:06.131Z · comments (18)

Universal Love Integration Test: Hitler
Raemon · 2024-01-10T23:55:35.526Z · comments (65)

Mid-conditional love
KatjaGrace · 2024-04-17T04:00:08.341Z · comments (21)

Darwinian Traps and Existential Risks
KristianRonn · 2024-08-25T22:37:14.142Z · comments (14)

My 10-year retrospective on trying SSRIs
Kaj_Sotala · 2024-09-22T20:30:02.483Z · comments (10)

[link] Bengio's Alignment Proposal: "Towards a Cautious Scientist AI with Convergent Safety Bounds"
mattmacdermott · 2024-02-29T13:59:34.959Z · comments (19)

Coherence of Caches and Agents
johnswentworth · 2024-04-01T23:04:31.320Z · comments (9)

[question] What could a policy banning AGI look like?
TsviBT · 2024-03-13T14:19:07.783Z · answers+comments (23)

What is malevolence? On the nature, measurement, and distribution of dark traits
David Althaus (wallowinmaya) · 2024-10-23T08:41:33.197Z · comments (15)

On Claude 3.0
Zvi · 2024-03-06T18:50:04.766Z · comments (5)

[link] AI takeoff and nuclear war
owencb · 2024-06-11T19:36:24.710Z · comments (6)

The Packaging and the Payload
Screwtape · 2024-11-12T03:07:37.209Z · comments (1)

Value fragility and AI takeover
Joe Carlsmith (joekc) · 2024-08-05T21:28:07.306Z · comments (5)

Dentistry, Oral Surgeons, and the Inefficiency of Small Markets
GeneSmith · 2024-11-01T17:26:06.466Z · comments (16)

Grief is a fire sale
Nathan Young · 2024-03-04T01:11:06.882Z · comments (1)

Lying Alignment Chart
Zack_M_Davis · 2023-11-29T16:15:28.102Z · comments (17)

[link] Gwern: Why So Few Matt Levines?
kave · 2024-10-29T01:07:27.564Z · comments (10)

[Intuitive self-models] 4. Trance
Steven Byrnes (steve2152) · 2024-10-08T13:30:41.446Z · comments (7)

The Obliqueness Thesis
jessicata (jessica.liu.taylor) · 2024-09-19T00:26:30.677Z · comments (17)

[link] Video lectures on the learning-theoretic agenda
Vanessa Kosoy (vanessa-kosoy) · 2024-10-27T12:01:32.777Z · comments (0)

[Valence series] 3. Valence & Beliefs
Steven Byrnes (steve2152) · 2023-12-11T20:21:30.570Z · comments (11)

My guess at Conjecture's vision: triggering a narrative bifurcation
Alexandre Variengien (alexandre-variengien) · 2024-02-06T19:10:42.690Z · comments (12)

[link] The Offense-Defense Balance Rarely Changes
Maxwell Tabarrok (maxwell-tabarrok) · 2023-12-09T15:21:23.340Z · comments (23)

[link] The problems with the concept of an infohazard as used by the LW community [Linkpost]
Noosphere89 (sharmake-farah) · 2023-12-22T16:13:54.822Z · comments (43)

On the CrowdStrike Incident
Zvi · 2024-07-22T12:40:05.894Z · comments (14)

AISC9 has ended and there will be an AISC10
Linda Linsefors · 2024-04-29T10:53:18.812Z · comments (4)

Why I quit effective altruism, and why Timothy Telleen-Lawton is staying (for now)
Elizabeth (pktechgirl) · 2024-10-22T18:20:01.194Z · comments (78)

Vote on Anthropic Topics to Discuss
Ben Pace (Benito) · 2024-03-06T19:43:47.194Z · comments (55)

Rationality Quotes - Fall 2024
Screwtape · 2024-10-10T18:37:55.013Z · comments (25)

Analogies between scaling labs and misaligned superintelligent AI
scasper · 2024-02-21T19:29:39.033Z · comments (5)

Could randomly choosing people to serve as representatives lead to better government?
John Huang · 2024-10-21T17:10:20.920Z · comments (13)

[link] Claude 3.5 Sonnet
Zach Stein-Perlman · 2024-06-20T18:00:35.443Z · comments (41)

SAE-VIS: Announcement Post
CallumMcDougall (TheMcDouglas) · 2024-03-31T15:30:49.079Z · comments (8)

Interpreting Preference Models w/ Sparse Autoencoders
Logan Riggs (elriggs) · 2024-07-01T21:35:40.603Z · comments (12)

A Simple Toy Coherence Theorem
johnswentworth · 2024-08-02T17:47:50.642Z · comments (19)

Recent comments

meme-marine on "The Solomonoff Prior is Malign" is a special case of a simpler argument

The reason for agnosticism is that it is no more likely for them to be on one side or the other. As a result, you don't know without evidence who is influencing you. I don't really think this class of Pascal's Wager attack is very logical for this reason - an attack is supposed to influence someone's behavior but I think that without special pleading this can't do that. Non-existent beings have no leverage whatsoever and any rational agent would understand this - even humans do. Even religious beliefs aren't completely evidenceless, the type of evidence exhibited just doesn't stand up to scientific scrutiny.

To give an example: What if that AI was in a future simulation performed after the humans had won, and were now trying to counter-capture it? There's no reason to think this is less likely than the aliens hosting the simulation. It has also been pointed out that the Oracle is not actually trying to earnestly communicate its findings but actually to get reward - reinforcement learners in practice do not behave like this, they learn behavior which generates reward. "Devote yourself to a hypothetical god" is not a very good strategy in train-time.

gesild-muka on Which things were you surprised to learn are not metaphors?

"Home is where the heart is."

I thought this meant something like home is where longing is (your metaphorical heart), the place that you yearn for the most. Now I think it may simply mean that home is wherever your physical beating heart is. The message behind it being that you can adapt to feel at home most anywhere.

"Breathtaking."

I thought this was just an expression to explain natural beauty, but as a teen I actually felt the breath leave me when I suddenly saw a sweeping vista of mountains and forest while riding on a bus.

robert-cousineau on Decorated pedestrian tunnels

I'm honestly really skeptical of the cost effectiveness of pedestrian tunnels as a form of transportation. Asking Claude for estimates on tunnel construction costs gets me the following:

A 1-mile pedestrian tunnel would likely cost $15M-$30M for basic construction ($3,000-$6,000 per foot based on utility tunnel costs), plus 30% for ventilation, lighting, and safety systems ($4.5M-$9M), and ongoing maintenance of ~$500K/year.

To put this in perspective: Converting Portland's 400 miles of bike lanes to tunnels would cost $7.8B-$15.6B upfront (1.1-2.3× Portland's entire annual budget) plus $200M/year in maintenance. For that same $15.6B, you could:

Build ~7,800 miles of protected surface bike lanes ($2M/mile)
Fund Portland's bike infrastructure maintenance for 31 years
Give every Portland resident an e-bike and still have $14B left over

Even for a modest 5-mile grid serving 10,000 daily users (optimistic for suburbs), that's $10K-$20K per user in construction costs alone.

Alternative: A comprehensive street-level mural program might cost $100K-$200K per mile, achieving similar visual variety at ~1% of the tunnel cost.
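
As a rough check on the arithmetic above, here is a minimal Python sketch that recomputes the estimates from the quoted per-mile figures. The function and constant names are hypothetical, and every input is simply a number from the comment (Claude's estimates, not verified costs). Integer dollars are used so the results are exact.

```python
# All inputs are the figures quoted in the comment above; they are
# unverified estimates, not real construction data.
BASE_PER_MILE = (15_000_000, 30_000_000)   # basic construction, low/high ($)
SYSTEMS_PERCENT = 30                        # ventilation, lighting, safety
MAINTENANCE_PER_MILE = 500_000              # $ per mile per year
SURFACE_LANE_PER_MILE = 2_000_000           # protected surface bike lane ($)


def tunnel_cost_range(miles):
    """Return (low, high) upfront cost in dollars for `miles` of tunnel."""
    low, high = BASE_PER_MILE
    scale = 100 + SYSTEMS_PERCENT
    return miles * low * scale // 100, miles * high * scale // 100


def annual_maintenance(miles):
    """Return yearly maintenance cost in dollars for `miles` of tunnel."""
    return miles * MAINTENANCE_PER_MILE


def cost_per_user(miles, daily_users):
    """Return (low, high) construction cost per daily user, in dollars."""
    low, high = tunnel_cost_range(miles)
    return low // daily_users, high // daily_users


if __name__ == "__main__":
    low, high = tunnel_cost_range(400)      # Portland's ~400 miles of bike lanes
    print(f"Upfront: ${low:,} to ${high:,}")
    print(f"Maintenance: ${annual_maintenance(400):,} per year")
    print(f"Surface-lane miles for the high estimate: {high // SURFACE_LANE_PER_MILE:,}")
    print("Per user, 5-mile grid, 10,000 users:", cost_per_user(5, 10_000))
```

Changing any one input (for example the 30% systems overhead) shows how sensitive the headline totals are to these assumed figures.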

signer on Are You More Real If You're Really Forgetful?

Yes, except I would object to phrasing this anthropic stuff as "we should expect ourselves to be agents that exist in a universe that abstracts well" instead of "we should value a universe that abstracts well (or other universes that contain many instances of us)" - there are no coherence theorems that force summation of your copies, right? And so it becomes apparent that we can value some other thing.

Also, even if you consider some memories a part of your identity, you can value yourself slightly less after forgetting them, instead of only having a threshold for death.

raemon on "The Solomonoff Prior is Malign" is a special case of a simpler argument

Curated. Like others, I found this a good simpler articulation of the concept. I appreciated the disclaimers around the

One thing I got from this post, which for some reason I hadn't gotten from previous posts, was the notion that "to what degree am I in a simulation?" may be situation-dependent. i.e. moments where I'm involved with historically important things might be more simulationy, other times less so. (Something had felt off about my previous question of "do more 'historically important' people have a more-of-their-measure in simulations?", and the answer is maybe still just "yes", but somehow it feels less magical and weird to ask "how likely is this particular moment to be simulated?")

Something does still feel pretty sus to me about the "during historically significant moments, you might be more likely to see something supernatural-looking afterwards" (esp. if you think it should appear in >50% of your reality-measure-or-whatever).

The "think in terms of expected value" seems practically useful but also... I dunno, even if I was a much more historically significant person, I just really don't expect to see Simulationy Things. The reasoning spelled out in the post didn't feel like it resolved my confusion about this.

(independent of that, I agreed with Richard's critique of some of the phrasing in the post, which seems to not quite internalize the claims David was making)

robert-cousineau on Are You More Real If You're Really Forgetful?

I'll preface this with: what I'm saying is low confidence - I'm not very educated on the topics in question (reality fluid, consciousness, quantum mechanics, etc).

Nevertheless, I don't see how the prison example is applicable. In the prison scenario there's an external truth (which prisoner was picked) that exists independent of memory/consciousness. The memory wipe just makes the prisoner uncertain about this external truth.

But this post is talking about a scenario where your memories/consciousness are the only thing that determines which universes count as 'you'.

There is no external truth about which universe you're really in - your consciousness itself defines (encompasses?) which universes contain you. So, when your memories become more coarse, you're not just becoming uncertain about which universe you're in - you're changing which universes count as containing you, since your consciousness is the only arbiter of this.

thane-ruthenis on Are You More Real If You're Really Forgetful?

Sure. This setup couldn't really be exploited for optimizing the universe. If we assume that the self-selection assumption is a reasonable assumption to make, inducing amnesia doesn't actually improve outcomes across possible worlds. One out of 100 prisoners still dies.

It can't even be considered "re-rolling the dice" on whether the specific prisoner that you are dies. Under the SSA, there's no such thing as a "specific prisoner", "you" are implemented as all 100 prisoners simultaneously, and so regardless of whether you choose to erase your memory or not, 1/100 of your measure is still destroyed. Without SSA, on the other hand, if we consider each prisoner's perspective to be distinct, erasing memory indeed does nothing: it doesn't return your perspective to the common pool of prisoner-perspectives, so if "you" were going to get shot, "you" are still going to get shot.

I'm not super interested in that part, though. What I'm interested in is whether there are in fact 100 clones of me: whether, under the SSA, "microscopically different" prisoners could be meaningfully considered a single "high-level" prisoner.

mako-yass on Are You More Real If You're Really Forgetful?

Huh but some loss of measure would be inevitable, wouldn't it? Given that your outgoing glyph total is going to be bigger than your incoming glyph total, since however many glyphs you summon, some of the non-glyph population are going to whittle and add to the outgoing glyphs.

I'm remembering more. I think a lot of it was about avoiding "arbitrary reinstantiation", this idea that when a person dies, their consciousness continues wherever that same pattern still counts as "alive", and usually those are terrible places. Boltzmann brains for instance. This might be part of the reason I don't care about patternist continuity. Seems like a lost cause. I'll just die normally thank you.

simon on Magic by forgetting

Here are some things one might care about:

1. what happens to your physical body
2. the access to working physical bodies of cognitive algorithms, across all possible universes, that are within some reference class containing the cognitive algorithm implemented by your physical body
3. ... etc, etc...
4. what happens to the physical body selected by the following process:
   a. start with your physical body
   b. go forward to some later time selected by the cognitive algorithm implemented by your physical body, allowing (or causing) the knowledge possessed by the cognitive algorithm implemented by your physical body to change in the interim
   c. at that later time, randomly sample from all the physical bodies, among all universes, that implement cognitive algorithms having the same knowledge as the cognitive algorithm implemented by your physical body at that later time
   d. (optionally) return to step b but with the physical body whose changes of cognitive algorithm are tracked and whose decisions are used being the new physical body selected from step c
   e. stop whenever the cognitive algorithm implemented by the physical body selected in some step decides to stop.

For 1, 2, and I expect for the vast majority of possibilities for 3, your procedure will not work. It will work for 4, which is apparently what you care about.

Terminal values are arbitrary, so that's entirely valid. However, 4 is not something that seems, to me, like a particularly privileged or "rational" thing to care about.

jmh on Hell is wasted on the evil

If I'm reading this correctly, then generally we're seeing a rather flat payoff curve over most "do good opportunities" and the rare max should stand out like a sore thumb when taking a good look. So those really should be things do-gooders will jump on quickly. (Note, that doesn't mean they are done quickly or that additional assistance is not important.)

While not as obvious, it probably also means that a lot of more mundane opportunities are getting ignored. That comes from an insight offered in one of my classes from years back asking why so much clumping (think fad type stuff here) exists when the marginal utility of the consumed good is pretty much equal to all the other goods that could have been consumed. In other words, when the opportunity cost is zero why is everyone doing the same thing?

I suspect we could see something like that in the "do good" space. Therefore, taking the path not followed could be a very good thing.