LessWrong 2.0 Reader

[question] Practical advice for secure virtual communication post easy AI voice-cloning?
hmys (the-cactus) · 2024-08-09T17:32:33.458Z · answers+comments (5)

Does “Ultimate Neartermism” via Eternal Inflation dominate Longtermism in expectation?
Jordan Arel · 2024-08-17T22:28:21.849Z · comments (1)

A Taxonomy Of AI System Evaluations
Maxime Riché (maxime-riche) · 2024-08-19T09:07:45.224Z · comments (0)

Understanding Hidden Computations in Chain-of-Thought Reasoning
rokosbasilisk · 2024-08-24T16:35:03.907Z · comments (1)

[link] Metaculus's 'Minitaculus' Experiments — Collaborate With Us
ChristianWilliams · 2024-08-26T20:44:32.125Z · comments (0)

Budapest Hungary - ACX Meetups Everywhere Fall 2024
Timothy Underwood (timothy-underwood-1) · 2024-08-29T18:37:41.313Z · comments (0)

Halifax Canada - ACX Meetups Everywhere Fall 2024
interstice · 2024-08-29T18:39:12.490Z · comments (0)

[link] Redundant Attention Heads in Large Language Models For In Context Learning
skunnavakkam · 2024-09-01T20:08:48.963Z · comments (0)

A gentle introduction to sparse autoencoders
Nick Jiang (nick-jiang) · 2024-09-02T18:11:47.086Z · comments (0)

[link] Contra Yudkowsky on 2-4-6 Game Difficulty Explanations
Josh Hickman (josh-hickman) · 2024-09-08T16:13:33.187Z · comments (1)

[link] [Linkpost] Interpretable Analysis of Features Found in Open-source Sparse Autoencoder (partial replication)
Fernando Avalos (fernando-avalos) · 2024-09-09T03:33:53.548Z · comments (1)

[link] Optimising under arbitrarily many constraint equations
dkl9 · 2024-09-12T14:59:28.475Z · comments (0)

Increasing the Span of the Set of Ideas
Jeffrey Heninger (jeffrey-heninger) · 2024-09-13T15:52:39.132Z · comments (1)

Forever Leaders
Justice Howard (justice-howard) · 2024-09-14T20:55:39.095Z · comments (9)

[link] SCP Foundation - Anti memetic Division Hub
landscape_kiwi · 2024-09-15T13:40:52.691Z · comments (1)

Thirty random thoughts about AI alignment
Lysandre Terrisse · 2024-09-15T16:24:10.572Z · comments (1)

[question] Can subjunctive dependence emerge from a simplicity prior?
Daniel C (harper-owen) · 2024-09-16T12:39:35.543Z · answers+comments (0)

Food, Prison & Exotic Animals: Sparse Autoencoders Detect 6.5x Performing Youtube Thumbnails
Louka Ewington-Pitsos (louka-ewington-pitsos) · 2024-09-17T03:52:43.269Z · comments (2)

Inquisitive vs. adversarial rationality
gb (ghb) · 2024-09-18T13:50:09.198Z · comments (9)

GPT4o is still sensitive to user-induced bias when writing code
Reed (ThomasReed) · 2024-09-22T21:04:54.717Z · comments (0)

The Existential Dread of Being a Powerful AI System
testingthewaters · 2024-09-26T10:56:32.904Z · comments (1)

Avoiding jailbreaks by discouraging their representation in activation space
Guido Bergman · 2024-09-27T17:49:20.785Z · comments (2)

'Chat with impactful research & evaluations' (Unjournal NotebookLMs)
david reinstein (david-reinstein) · 2024-09-28T00:32:16.845Z · comments (0)

Thoughts on Evo-Bio Math and Mesa-Optimization: Maybe We Need To Think Harder About "Relative" Fitness?
Lorec · 2024-09-28T14:07:42.412Z · comments (6)

Exploring Shard-like Behavior: Empirical Insights into Contextual Decision-Making in RL Agents
Alejandro Aristizabal (alejandro-aristizabal) · 2024-09-29T00:32:42.161Z · comments (0)

Grounding self-reference paradoxes in reality
Fiora from Rosebloom · 2024-09-29T05:50:30.559Z · comments (3)

Retrieval Augmented Genesis
João Ribeiro Medeiros (joao-ribeiro-medeiros) · 2024-10-01T20:18:01.836Z · comments (0)

[question] why won't this alignment plan work?
KvmanThinking (avery-liu) · 2024-10-10T15:44:59.450Z · answers+comments (7)

[question] Is School of Thought related to the Rationality Community?
Shoshannah Tekofsky (DarkSym) · 2024-10-15T12:41:33.224Z · answers+comments (6)

Against Job Boards: Human Capital and the Legibility Trap
vaishnav92 · 2024-10-24T20:50:50.266Z · comments (1)

Introducing Kairos: a new AI safety fieldbuilding organization (the new home for SPAR and FSP)
agucova · 2024-10-25T21:59:08.782Z · comments (0)

[question] What are some good ways to form opinions on controversial subjects in the current and upcoming era?
notfnofn · 2024-10-27T14:33:53.960Z · answers+comments (20)

[link] AI Safety Newsletter #43: White House Issues First National Security Memo on AI Plus, AI and Job Displacement, and AI Takes Over the Nobels
Corin Katzke (corin-katzke) · 2024-10-28T16:03:39.258Z · comments (0)

Goal: Understand Intelligence
Johannes C. Mayer (johannes-c-mayer) · 2024-11-03T21:20:02.900Z · comments (8)

[question] How do we know dreams aren't real?
Logan Zoellner (logan-zoellner) · 2024-08-22T12:41:57.380Z · answers+comments (31)

[link] Could Things Be Very Different?—How Historical Inertia Might Blind Us To Optimal Solutions
James Stephen Brown (james-brown) · 2024-09-11T09:53:07.474Z · comments (0)

New Capabilities, New Risks? - Evaluating Agentic General Assistants using Elements of GAIA & METR Frameworks
Tej Lander (tej-lander) · 2024-09-29T18:58:56.253Z · comments (0)

LDT (and everything else) can be irrational
Christopher King (christopher-king) · 2024-11-06T04:05:36.932Z · comments (1)

[link] Universal basic income isn’t always AGI-proof
Kevin Kohler (KevinKohler) · 2024-09-05T15:39:18.389Z · comments (3)

[link] How long should political (and other) terms be?
ohmurphy · 2024-10-14T21:38:43.050Z · comments (0)

Apply to be a mentor in SPAR!
agucova · 2024-11-05T21:32:45.797Z · comments (0)

[question] A Different Perspective on Rationality - Would This Be Valuable?
Gabriel Brito (gabriel-brito) · 2024-10-26T18:47:46.416Z · answers+comments (2)

Toy Models of Superposition: what about BitNets?
Alejandro Tlaie (alejandro-tlaie-boria) · 2024-08-08T16:29:02.054Z · comments (1)

[link] Join the $10K AutoHack 2024 Tournament
Paul Bricman (paulbricman) · 2024-09-25T11:54:20.112Z · comments (0)

[link] Exposure can’t rule out disasters
Chipmonk · 2024-08-15T17:03:37.259Z · comments (19)

[link] Linkpost: Hypocrisy standoff
Chris_Leong · 2024-09-29T14:27:19.175Z · comments (1)

Democracy beyond majoritarianism
Arturo Macias (arturo-macias) · 2024-09-03T15:10:56.284Z · comments (2)

[question] If the DoJ goes through with the Google breakup, where does DeepMind end up?
O O (o-o) · 2024-10-12T05:06:50.996Z · answers+comments (1)

[question] AMA: International School Student in China
Novice · 2024-10-01T06:00:16.282Z · answers+comments (0)

Differential knowledge interconnection
Roman Leventov · 2024-10-12T12:52:36.267Z · comments (0)


Recent comments

steve2152 on Could orcas be (trained to be) smarter than humans? 

Possibly related: Could we use current AI methods to understand dolphins? + comments

daniel-kokotajlo on Daniel Kokotajlo's Shortform

Thanks, I hadn't considered that. So as per my argument, there's some threshold of density above which it's easier to attack underground; as per your argument, there's some threshold of density where 'indirect fires' of large tunnel-destroying bombs become practical. Unclear which threshold comes first, but I'd guess it's the first.

davekasten on Daniel Kokotajlo's Shortform

I think the thing that you're not considering is that when tunnels are more prevalent and more densely packed, the incentives to use the defensive strategy of "dig a tunnel, then set off a very big bomb in it that collapses many tunnels" get far higher. It wouldn't always be infantry combat, it would often be a subterranean equivalent of indirect fires.

viliam on Review: “The Case Against Reality”

All of the stuff we take for granted as making up the world we inhabit — objects, dimensions, qualities, time, causality — are, says Hoffman, not objectively real.

I expect these words to be followed by a redefinition of what "objectively real" means.

There’s really just no resemblance between the interface and the underlying reality at all.

Well, maybe no resemblance of... uhm... the substrate, for example we could be simulated on a computer, and the stuff our universe is made of could be completely different from the stuff the computer is made of.

But there is also the resemblance of... form, like if we experience something in our universe, it is probably in some way represented in the simulated computer. Like, the things we perceive as the laws of physics are in some way a consequence of something in the simulating software. Maybe not 1:1 directly; it could be the case that something we perceive as a basic law could be just some statistical average of the true underlying law.

What I am trying to say is that if our perceptions were utterly disconnected from the true reality, then we wouldn't be able to make any kinds of predictions, and probably couldn't even live.

tsvibt on Advisors for Smaller Major Donors?

Well, anyone who wants could pay me to advise them about giving to decrease X-risk by creating smarter humans. Funders less constrained by PR would of course be advantaged in that area.

measure on How to put California and Texas on the campaign trail!

I remember there was a movement a while back to have states agree to award their electors to the national proportional vote winner, but I'm not sure what came of that.

anthony-digiovanni on Winning isn't enough

The key claim is: You can’t evaluate which beliefs and decision theory to endorse just by asking “which ones perform the best?” Because the whole question is what it means to systematically perform better, under uncertainty. Every operationalization of “systematically performing better” we’re aware of is either:

Incomplete — like “avoiding dominated strategies”, which leaves a lot unconstrained;
A poorly motivated proxy for the performance we actually care about — like “doing what’s worked in the past”; or
Secretly smuggling in nontrivial non-pragmatic assumptions — like “doing what’s worked in the past, not because that’s what we actually care about, but because past performance predicts future performance”

This is what we meant to convey with this sentence: “On any way of making sense of those words, we end up either calling a very wide range of beliefs and decisions “rational”, or reifying an objective that has nothing to do with our terminal goals without some substantive assumptions.”

(I can't tell from your comment if you agree with all of that. But, if this was all obvious to you, great! But we’ve often had discussions where someone appealed to “which ones perform the best?” in a way that misses these points.)

steve2152 on Against empathy-by-default

Hmm, maybe we should distinguish two things:

(A) I find the feeling of picking up the tofu with the fork to be intrinsically satisfying: it feels satisfying and empowering to feel the tines of the fork slide into the tofu.
(B) I don’t care at all about the feeling of the fork sliding into the tofu; instead I feel motivated to pick up tofu with the fork because I’m hungry and tofu is yummy.

For (A), the analogy to picking up feta is logically sound—this is legitimate evidence that picking up the feta will also feel intrinsically satisfying. And accordingly, my brain, having made the analogy, correctly feels motivated to pick up feta.

For (B), the analogy to picking up feta is irrelevant. The dimension along which I’m analogizing (how the fork slides in) is unrelated to the dimension which constitutes the source of my motivation (tofu being yummy). And accordingly, if I like the taste of tofu but dislike feta, then I will not feel motivated to pick up the feta, not even a little bit, let alone to the point where it’s determining my behavior.

The lesson here (I claim) is that our brain algorithms are sophisticated enough to not just note whether an analogy target has good or bad vibes, but rather whether the analogy target has good or bad vibes for reasons that legitimately transfer back to the real plan under consideration.

So circling back to empathy, if I was a sociopath, then “Ahmed getting punched” might still kinda remind me of “me getting punched”, but the reason I dislike “me getting punched” is because it’s painful, whereas “Ahmed getting punched” is not painful. So even if “me getting punched” momentarily popped into my sociopathic head, I would then immediately say to myself “ah, but that’s not something I need to worry about here”, and whistle a tune and carry on with my day.

Remember, empathy is a major force. People submit to torture and turn their lives upside down over feelings of empathy. If you want to talk about phenomena like “something unpleasant popped into my head momentarily, even if it doesn’t really have anything to do with this situation”, then OK maybe that kind of thing might have a nonzero impact on motivation, but even if it does, it’s gonna be tiny. It’s definitely not up to the task of explaining such a central part of human behavior, right?

sunwillrise on Winning isn't enough

some people say that "winning is about not playing dominated strategies"

I do not believe this statement. As in, I do not currently know of a single person, associated either with LW or with decision-theory academia, who says "not playing dominated strategies is entirely action-guiding." So, as Raemon pointed out, "this post seems like it’s arguing with someone but I’m not sure who."

In general, I tend to mildly disapprove of words like "a widely-used strategy", "we often encounter claims" etc, without any direct citations to the individuals who are purportedly making these mistakes. If it really was that widely-used, surely it would be trivial for the authors to quote a few examples off the top of their head, no? What does it say about them that they didn't?

tailcalled on Scissors Statements for President?

Electoral candidates can only be very bad because the country is very big and strong, which can only be the case because there's a lot of people, land, capital and institutions.

Noticing that two candidates for leading these resources are both bad is kind of useless without some other opinion on what form the resources should enter. A simple option would be that the form of the resources should lessen, e.g. that people should work less. The first step to this is to go away from Keynesianism. But if you take that to its logical conclusion, it implies e/acc replacement of humanity, VHEM, mass suicide, or whatever. It's not surprising that this is unpopular.

So that raises the question: What's some direction that the form of societal resources could be shifted towards that would be less confusing than a scissor statement candidate?

Because without an answer to this question, I'm not sure we even need elaborate theories on scissor statements.