LessWrong 2.0 Reader

I am the Golden Gate Bridge
Zvi · 2024-05-27T14:40:03.216Z · comments (6)

Counting arguments provide no evidence for AI doom
Nora Belrose (nora-belrose) · 2024-02-27T23:03:49.296Z · comments (188)

[link] Ilya Sutskever created a new AGI startup
harfe · 2024-06-19T17:17:17.366Z · comments (35)

[question] How to get nerds fascinated about mysterious chronic illness research?
riceissa · 2024-05-27T22:58:29.707Z · answers+comments (50)

The case for unlearning that removes information from LLM weights
Fabien Roger (Fabien) · 2024-10-14T14:08:04.775Z · comments (14)

[link] Ideological Bayesians
Kevin Dorst · 2024-02-25T14:17:25.070Z · comments (4)

[link] Uncovering Deceptive Tendencies in Language Models: A Simulated Company AI Assistant
Olli Järviniemi (jarviniemi) · 2024-05-06T07:07:05.019Z · comments (13)

[link] Explaining Impact Markets
Saul Munn (saul-munn) · 2024-01-31T09:51:27.587Z · comments (2)

Kids or No kids
Kids or no kids (grosseholz.f@gmail.com) · 2023-11-14T18:37:02.799Z · comments (10)

[link] Almost everyone I’ve met would be well-served thinking more about what to focus on
Henrik Karlsson (henrik-karlsson) · 2024-01-05T21:01:27.861Z · comments (8)

On Claude 3.5 Sonnet
Zvi · 2024-06-24T12:00:05.719Z · comments (14)

[link] Compact Proofs of Model Performance via Mechanistic Interpretability
LawrenceC (LawChan) · 2024-06-24T19:27:21.214Z · comments (3)

[link] MIRI's April 2024 Newsletter
Harlan · 2024-04-12T23:38:20.781Z · comments (0)

Access to powerful AI might make computer security radically easier
Buck · 2024-06-08T06:00:19.310Z · comments (14)

[link] I found >800 orthogonal "write code" steering vectors
Jacob G-W (g-w1) · 2024-07-15T19:06:17.636Z · comments (19)

[link] Things You’re Allowed to Do: University Edition
Saul Munn (saul-munn) · 2024-02-06T00:36:11.690Z · comments (13)

Sparsify: A mechanistic interpretability research agenda
Lee Sharkey (Lee_Sharkey) · 2024-04-03T12:34:12.043Z · comments (22)

[link] the Giga Press was a mistake
bhauth · 2024-08-21T04:51:24.150Z · comments (26)

[link] RAND report finds no effect of current LLMs on viability of bioterrorism attacks
StellaAthena · 2024-01-25T19:17:30.493Z · comments (14)

[link] Sabotage Evaluations for Frontier Models
David Duvenaud (david-duvenaud) · 2024-10-18T22:33:14.320Z · comments (11)

2024 Petrov Day Retrospective
Ben Pace (Benito) · 2024-09-28T21:30:14.952Z · comments (25)

Live Theory Part 0: Taking Intelligence Seriously
Sahil · 2024-06-26T21:37:10.479Z · comments (3)

[link] Against Aschenbrenner: How 'Situational Awareness' constructs a narrative that undermines safety and threatens humanity
GideonF · 2024-07-15T18:37:40.232Z · comments (17)

Apollo Research 1-year update
Marius Hobbhahn (marius-hobbhahn) · 2024-05-29T17:44:32.484Z · comments (0)

Towards a Less Bullshit Model of Semantics
johnswentworth · 2024-06-17T15:51:06.060Z · comments (44)

Notes on Dwarkesh Patel’s Podcast with Demis Hassabis
Zvi · 2024-03-01T16:30:08.687Z · comments (0)

It's time for a self-reproducing machine
Carl Feynman (carl-feynman) · 2024-08-07T21:52:22.819Z · comments (68)

A Solomonoff Inductor Walks Into a Bar: Schelling Points for Communication
johnswentworth · 2024-07-26T00:33:42.000Z · comments (1)

OpenAI: The Board Expands
Zvi · 2024-03-12T14:00:04.110Z · comments (1)

You can, in fact, bamboozle an unaligned AI into sparing your life
David Matolcsi (matolcsid) · 2024-09-29T16:59:43.942Z · comments (171)

Takeoff speeds presentation at Anthropic
Tom Davidson (tom-davidson-1) · 2024-06-04T22:46:35.448Z · comments (0)

SB 1047: Final Takes and Also AB 3211
Zvi · 2024-08-27T22:10:07.647Z · comments (11)

On attunement
Joe Carlsmith (joekc) · 2024-03-25T12:47:34.856Z · comments (8)

[question] Am I confused about the "malign universal prior" argument?
nostalgebraist · 2024-08-27T23:17:22.779Z · answers+comments (33)

Announcing Neuronpedia: Platform for accelerating research into Sparse Autoencoders
Johnny Lin (hijohnnylin) · 2024-03-25T21:17:58.421Z · comments (7)

Defining alignment research
Richard_Ngo (ricraz) · 2024-08-19T20:42:29.279Z · comments (23)

Information vs Assurance
johnswentworth · 2024-10-20T23:16:25.762Z · comments (7)

New page: Integrity
Zach Stein-Perlman · 2024-07-10T15:00:41.050Z · comments (3)

Science advances one funeral at a time
Cameron Berg (cameron-berg) · 2024-11-01T23:06:19.381Z · comments (9)

[link] Anthropic: Three Sketches of ASL-4 Safety Case Components
Zach Stein-Perlman · 2024-11-06T16:00:06.940Z · comments (29)

[link] Finishing The SB-1047 Documentary In 6 Weeks
Michaël Trazzi (mtrazzi) · 2024-10-28T20:17:47.465Z · comments (5)

Quotes from Leopold Aschenbrenner’s Situational Awareness Paper
Zvi · 2024-06-07T11:40:03.981Z · comments (10)

How to train your own "Sleeper Agents"
evhub · 2024-02-07T00:31:42.653Z · comments (11)

Everything Wrong with Roko's Claims about an Engineered Pandemic
WitheringWeights (EZ97) · 2024-02-22T15:59:08.439Z · comments (10)

Circular Reasoning
abramdemski · 2024-08-05T18:10:32.736Z · comments (36)

Meaning & Agency
abramdemski · 2023-12-19T22:27:32.123Z · comments (17)

Just admit that you’ve zoned out
joec · 2024-06-04T02:51:27.594Z · comments (22)

Review: Conor Moreton's "Civilization & Cooperation"
Duncan Sabien (Deactivated) (Duncan_Sabien) · 2024-05-26T19:32:43.131Z · comments (8)

Prediction Markets aren't Magic
SimonM · 2023-12-21T12:54:07.754Z · comments (29)

[link] Introducing METR's Autonomy Evaluation Resources
Megan Kinniment (megan-kinniment) · 2024-03-15T23:16:59.696Z · comments (0)

Recent comments

startattheend on Does the "ancient wisdom" argument have any validity? If a particular teaching or tradition is old, to what extent does this make it more trustworthy?

That sounds about right. And "people sometimes feel that way" is a good explanation for the downvote in my opinion. I was arguing the object-level premises of the post because the "disagree" downvote was factually wrong, and this factual wrongness, I argue, is caused by a faulty understanding of how truth works, and this faulty understanding is most common in the western world and in educated people, and in the ideologies which correlate with western thought and academia.

If you disagree with something which is true, I think the only likely explanations are "Does not understand" and "Has a dislike of", and the bias I pointed out covers both of these possibilities (the former is a "map vs territory" issue and the latter is a "morality vs reality" issue).

I think you figured out what went wrong nicely, but in the end the disagreement remains. I still consider my point likely. If somebody comes along and tells me that they disagreed with it for other reasons, I might even argue that they're lying to themselves, as I'm way too disillusioned to think that a "will to truth" exists. I think social status, moral values and other such things are stronger motivators than people will admit even to themselves.

rob-lucas on Bigger Livers?

One reason is just that eating food is enjoyable. I limit the amount of food I eat to stay within a healthy range, but if I could increase that amount while staying healthy, I could enjoy that excess.

I think there are two aspects to the enjoyment of food. One is related to satiety. I enjoy the feeling of sating my appetite, and failing to sate it leaves me with the negative experience of craving food (negative if I don't satisfy those cravings).

But the other aspect is just the enjoyment of eating each individual bite of food. Not the separate enjoyment of sating my appetite, but just the experience of eating.*

When I was younger and much more physically active I ate very large amounts of food. I miss being able to do that. I'm just as sated now with the much smaller portions I eat, but eating a small breakfast instead of a large one is a different experience.

This probably doesn't justify some sort of risky intervention in increasing liver size. Food is enjoyable, but so are a lot of other things in life. But shifting to a higher protein diet seems like the kind of safe intervention, potentially even also healthier in other respects, that, if it has the side effect of being able to eat a little more food, could improve quality of life with minimal other costs. Potential costs I see are related to the price of protein relative to other sources of nutrition, the cost of additional food (if the point is being able to eat more, you've got to spend money for that excess), and, depending on one's moral views, something related to the source of the protein being added.

*I think Kahneman's remembering vs. experiencing selves adds some confusion here as well. When we remember a meal we don't necessarily remember the enjoyment we got from every bite, but probably put more weight on the feeling of satiety and the peak experience (how good did it taste at its best?). But the experiencing self experiences every bite. How much you want to weight the remembering vs. experiencing self is a philosophical issue, but I just want to note that it comes up here.

lukehmiles on Should CA, TX, OK, and LA merge into a giant swing state, just for elections?

I wonder if anybody has tried to quantify how much it's worth to be a swing voter. I imagine if you are the government contractor up for renewal then it's worth quite a lot, but I wonder how much of the money/benefits the average Joe sees.

I don't know much about swing state benefits except that Milwaukee, Wisconsin got their lead pipes replaced by the fed and the workers were required to be local and they say they were paid quite well https://youtube.com/watch?v=4VpwgG0P8VU

lukehmiles on The hostile telepaths problem

Aw man we used the same word for different things again

lukehmiles on The hostile telepaths problem

Your examples fit the definition quite well. Apparently this is in the dictionary now. https://www.merriam-webster.com/dictionary/gaslighting

norimori1992 on The Witness

Why would we stretch the definition of lawyer in such a way? That's not what the word "lawyer" means, either in the dictionary sense or in the sense of how people use the word. And even if you can come up with a reason to stretch it to include all those professions, what makes you think that's what Eliezer was doing?

richard_kennaway on Bigger Livers?

I'm missing something here. Why would I want a bigger liver? I mean, from this account, liver size is obviously something that the body is controlling. You list various interventions to make it bigger, which predictably have bad effects. But why would I want to change something that my body is already managing perfectly well?

The only reason I could find was this:

Athletes have higher resting metabolic rates than non-athletes; their bodies use more energy, even when they’re not exercising. That means they can eat more without getting fat.

Is that it? Why not just^[1]...not eat more? These are athletes. They eat to sustain themselves in the pursuit of athletic excellence. They can already "just" not eat more. If they couldn't, they would not be athletes.

I agree there are people, notably Eliezer, who can't "just" not eat more without being as unable to function as if they were starving. I can't see a larger liver burning up more energy helping with that.

If anyone's hackles rise at a sentence beginning "Why not just—", you're quite right. No problem can be solved by "just"...whatever it is. If it could, it would not be a problem. ↩︎

nostalgebraist on the case for CoT unfaithfulness is overstated

The "sequential calculation steps" I'm referring to are the ones that CoT adds above and beyond what can be done in a single forward pass. It's the extra sequential computation added by CoT, specifically, that is bottlenecked on the CoT tokens.

There is of course another notion of "sequential calculation steps" involved: the sequential layers of the model. However, I don't think the bolded part of this is true:

replacing the token by a dot reduces the number of serial steps the model can perform (from mn to m+n, if there are m forward passes and n layers)

If a model with N layers has been trained to always produce exactly M "dot" tokens before answering, then the number of serial steps is just N, not M+N.

One way to see this is to note that we don't actually need to run M separate forward passes. We can just pre-fill a context window containing the prompt tokens followed by M dot tokens, and run 1 forward pass on the whole thing.

Having the dots does add computation, but it's only extra parallel computation – there's still only one forward pass, just a "wider" one, with more computation happening in parallel inside each of the individually parallelizable steps (tensor multiplications, activation functions).
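A toy way to check this counting, as a minimal sketch rather than anything from the paper: treat each (position, layer) activation as a node, let layer l at position t depend on layer l-1 at every position up to t, and measure the longest dependency chain. The function name `serial_depth` and the `feeds_back` flag are hypothetical, the whole prompt is collapsed into a single position, and sampling is treated as free.

```python
def serial_depth(num_layers: int, num_extra_tokens: int, feeds_back: bool) -> int:
    """Longest chain of dependent layer applications behind the answer.

    Position 0 stands in for the whole prompt (a simplifying assumption).
    Positions 1..num_extra_tokens are either fixed dot tokens
    (feeds_back=False) or sampled CoT tokens (feeds_back=True).
    """
    positions = num_extra_tokens + 1
    # depth[t][l]: longest dependency chain ending at layer l of position t,
    # where layer 0 is the input embedding.
    depth = [[0] * (num_layers + 1) for _ in range(positions)]
    for t in range(positions):
        if feeds_back and t > 0:
            # A sampled token's embedding depends on the previous
            # position's final-layer output.
            depth[t][0] = depth[t - 1][num_layers]
        # Otherwise the input is known in advance (prompt or dot): depth 0.
        for layer in range(1, num_layers + 1):
            # Causal attention: layer l at position t reads layer l-1
            # at every position s <= t.
            depth[t][layer] = 1 + max(depth[s][layer - 1] for s in range(t + 1))
    return depth[positions - 1][num_layers]


dots_depth = serial_depth(num_layers=4, num_extra_tokens=3, feeds_back=False)
cot_depth = serial_depth(num_layers=4, num_extra_tokens=3, feeds_back=True)
print(dots_depth, cot_depth)
```

Under this accounting, the dot case stays at N however many dots are appended, while the sampled-token case grows with the number of generated tokens, because each one has to wait for the previous position's final layer.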

(If we relax the constraint that the number of dots is fixed, and allow the model to choose it based on the input, that still doesn't add much: note that we could do 1 forward pass on the prompt tokens followed by a very large number of dots, then find the first position where we would have sampled a non-dot from the output distribution, truncate the KV cache to end at that point and sample normally from there.)

If you haven't read the paper I linked in OP, I recommend it – it's pretty illuminating about these distinctions. See e.g. the stuff about CoT making LMs more powerful than $TC^0$ versus dots adding more power within $TC^0$.

elityre on Eli's shortform feed

It's possible no one tried literally "recreate OkC", but I think dating startups are very oversubscribed by founders, relative to interest from VCs

If this is true, it's somewhat cruxy for me.

I'm still disappointed that no one cared enough to solve this problem without VC funding.

elityre on Eli's shortform feed

I only skimmed the retrospective now, but it seems mostly to be detailing problems that stymied their ability to find traction.

Right. But they were not relentlessly focused on solving this problem.

I straight up don't believe that the problems outlined can't be surmounted, especially if you're going for a cashflow business instead of an exit.