LessWrong 2.0 Reader

instruction tuning and autoregressive distribution shift
nostalgebraist · 2024-09-05T16:53:41.497Z · comments (5)

[link] marine cloud brightening
bhauth · 2023-08-09T02:50:56.639Z · comments (14)

Box inversion revisited
Jan_Kulveit · 2023-11-07T11:09:36.557Z · comments (3)

Whiteboard Pen Magazines are Useful
Johannes C. Mayer (johannes-c-mayer) · 2024-07-12T17:15:33.200Z · comments (8)

Intrinsic Drives and Extrinsic Misuse: Two Intertwined Risks of AI
jsteinhardt · 2023-10-31T05:10:02.581Z · comments (0)

Deconfusing Regret
Alex Hollow · 2023-09-15T11:52:03.294Z · comments (32)

[link] Progress Conference 2024: Toward Abundant Futures
jasoncrawford · 2024-06-26T15:39:45.267Z · comments (2)

What's up with all the non-Mormons? Weirdly specific universalities across LLMs
mwatkins · 2024-04-19T13:43:24.568Z · comments (13)

Long-Term Future Fund: May 2023 to March 2024 Payout recommendations
Linch · 2024-06-12T13:46:29.535Z · comments (0)

[link] ARC Evals: Responsible Scaling Policies
Zach Stein-Perlman · 2023-09-28T04:30:37.140Z · comments (9)

[link] "What if we could redesign society from scratch? The promise of charter cities." [Rational Animations video]
Jackson Wagner · 2024-02-18T00:57:50.444Z · comments (7)

Debate, Oracles, and Obfuscated Arguments
Jonah Brown-Cohen (jonah-brown-cohen) · 2024-06-20T23:14:57.340Z · comments (2)

Choosing My Quest (Part 2 of "The Sense Of Physical Necessity")
LoganStrohl (BrienneYudkowsky) · 2024-02-24T21:31:45.377Z · comments (7)

Mech Interp Puzzle 2: Word2Vec Style Embeddings
Neel Nanda (neel-nanda-1) · 2023-07-28T00:50:00.297Z · comments (4)

The Serendipity of Density
jefftk (jkaufman) · 2023-12-17T03:50:04.824Z · comments (4)

List your AI X-Risk cruxes!
Aryeh Englander (alenglander) · 2024-04-28T18:26:19.327Z · comments (7)

2025 Color Trends
sarahconstantin · 2024-10-07T21:20:03.962Z · comments (7)

Manifund Q1 Retro: Learnings from impact certs
Austin Chen (austin-chen) · 2024-05-01T16:48:33.140Z · comments (1)

[question] Does AI governance needs a "Federalist papers" debate?
azsantosk · 2023-10-18T21:08:26.098Z · answers+comments (4)

Planning to build a cryptographic box with perfect secrecy
Lysandre Terrisse · 2023-12-31T09:31:47.941Z · comments (6)

When Are Results from Computational Complexity Not Too Coarse?
Dalcy (Darcy) · 2024-07-03T19:06:44.953Z · comments (7)

Beware unfinished bridges
Adam Zerner (adamzerner) · 2024-05-12T09:29:07.808Z · comments (9)

"Does your paradigm beget new, good, paradigms?"
Raemon · 2024-01-25T18:23:15.497Z · comments (6)

Applying Force to the Wrong End of a Causal Chain
silentbob · 2024-06-22T18:06:32.364Z · comments (0)

[link] Book review: Cuisine and Empire
eukaryote · 2024-01-21T06:15:12.969Z · comments (2)

[question] Implications of China's recession on AGI development?
Eric Neyman (UnexpectedValues) · 2024-09-28T01:12:36.443Z · answers+comments (3)

Extrapolating from Five Words
Gordon Seidoh Worley (gworley) · 2023-11-15T23:21:30.865Z · comments (11)

Quantopian contest, but for food intake and weight
Lucent · 2023-11-08T05:41:35.050Z · comments (9)

[link] Queuing theory: Benefits of operating at 60% capacity
ampdot · 2023-12-01T18:48:01.426Z · comments (4)

[link] Dequantifying first-order theories
jessicata (jessica.liu.taylor) · 2024-04-23T19:04:49.000Z · comments (9)

How to solve deception and still fail.
Charlie Steiner · 2023-10-04T19:56:56.254Z · comments (7)

[link] Forecasting: the way I think about it
Molly (hickman-santini) · 2024-05-09T00:49:01.768Z · comments (4)

Reflexive decision theory is an unsolved problem
Richard_Kennaway · 2023-09-17T14:15:09.222Z · comments (27)

Movie posters
KatjaGrace · 2024-03-06T06:20:03.034Z · comments (0)

[link] The Data Wall is Important
JustisMills · 2024-06-09T22:54:20.070Z · comments (20)

[link] Eight Magic Lamps
Richard_Ngo (ricraz) · 2023-10-14T04:10:02.040Z · comments (0)

Apply to the PIBBSS Summer Research Fellowship
Nora_Ammann · 2024-01-12T04:06:58.328Z · comments (1)

Stitching SAEs of different sizes
Bart Bussmann (Stuckwork) · 2024-07-13T17:19:20.506Z · comments (12)

Medical Roundup #3
Zvi · 2024-07-09T13:10:06.862Z · comments (4)

[link] Linear infra-Bayesian Bandits
Vanessa Kosoy (vanessa-kosoy) · 2024-05-10T06:41:09.206Z · comments (5)

Monthly Roundup #23: October 2024
Zvi · 2024-10-16T13:50:05.869Z · comments (12)

How To Do Patching Fast
Joseph Miller (Josephm) · 2024-05-11T20:13:52.424Z · comments (6)

Anthropic rewrote its RSP
Zach Stein-Perlman · 2024-10-15T14:25:12.518Z · comments (17)

[link] Conflict in Posthuman Literature
Martín Soto (martinsq) · 2024-04-06T22:26:04.051Z · comments (1)

Nitric oxide for covid and other viral infections
Elizabeth (pktechgirl) · 2024-02-07T21:30:03.774Z · comments (6)

Forget Everything (Statistical Mechanics Part 1)
J Bostock (Jemist) · 2024-04-22T13:33:35.446Z · comments (6)

Prepsgiving, A Convergently Instrumental Human Practice
JenniferRM · 2023-11-23T17:24:56.784Z · comments (0)

Interpretability as Compression: Reconsidering SAE Explanations of Neural Activations with MDL-SAEs
Kola Ayonrinde (kola-ayonrinde) · 2024-08-23T18:52:31.019Z · comments (5)

Individually incentivized safe Pareto improvements in open-source bargaining
Nicolas Macé (NicolasMace) · 2024-07-17T18:26:43.619Z · comments (2)

Logical Line-Of-Sight Makes Games Sequential or Loopy
StrivingForLegibility · 2024-01-19T04:05:44.782Z · comments (0)

Recent comments

cubefox on Alexander Gietelink Oldenziel's Shortform

Follow-up question: If sunglasses are so cool, why do relatively few people wear them? Perhaps they aren't that cool after all?

keltan on Interest in Leetcode, but for Rationality?

I imagine a character (Alice) is constantly used as the rational actor in scenarios. We make Alice a likeable character, give her a personality, a series of events and decisions that lead her to the present.

Then, when the user has been around for a sufficient amount of time, Alice starts to slip. She makes mistakes that harm others; perhaps she has disputes with ‘Stupidus’; maybe she just begins to say untrue things.

How long will it take a user to pry themself out of the rose-tinted glasses, and update on Alice?

mako-yass on Advice on Communicating Concisely

"if they don't understand, they will ask"

A lot of people have to write for audiences with narcissism, who never ask, because asking constitutes an admission that there might be something important that they don't understand. They're always looking for any reason, however shallow, to dismiss any view that surprises them too much.
So these writers feel like they have to pre-empt every possible objection, even the stupid ones that don't make any sense.

It's best if you can avoid having to write for audiences like that. But it's difficult to avoid them.

mako-yass on leogao's Shortform

You should be more curious about why, when you aim at a goal, you do not aim for the most effective way.

quila on OpenAI defected, but we can take honest actions

If we stand by while OpenAI violates its charter, it signals that their execs can get away with it. Worse, it signals that we don’t care.

what signals you send to OAI execs seems not relevant.

in the case where they really can't get away with it, e.g. where the state will really arrest them, then sending them signals / influencing their information state is not what causes that outcome.

if your advocacy causes the world to change such that "they can't get away with it" becomes true, this also does not route through influencing their information state.

OpenAI is seen as the industry leader, yet projected to lose $5 billion this year

i don't see why this would lead them to downsize, if "the gap between industry investment in deep learning and actual revenue has ballooned to over $600 billion a year"

teatieandhat on Cipolla's Shortform

I’m not quite sure how to answer your question, but at least I have similar feelings: that my conscientiousness is relatively low; and that many people who do cooler stuff than me appear to be more driven, with clearer goals and a better ability to actually go and pursue them. I have various thoughts on this:

To an extent, it’s just an impression. Many people will struggle to do more than a fraction of what they wanted, and yet because they still do quite a lot and remain very upbeat, you don’t notice that they achieve relatively little compared to what they want, though they certainly notice it. Similarly, many people are working on cool projects and apparently having tons of fun doing it, but if you asked you’d learn that they have no clue about "what they want to do with their lives" or similar super long-term goals.
In fact, I suspect that most people feel at least a little like that at least sometimes, and that we grossly underestimate how likely others are to feel that way.
Yet, some people genuinely are better able to get stuff done and stay relentlessly focused on tasks than others. It can be built from habit, it can come from being really really into the one specific thing you’re working on, etc. If you struggle with that anyway, it might be something to do with mental health: famously ADHD, but also autism, or depression/anxiety can impact conscientiousness, and all these seem somewhat more common among LW readers than in the general population, so I dunno, maybe?
And some people are also better than others at being optimistic, enthusiastic, eager to do cool stuff. I guess there are many causes, and therefore many potential ways of dealing with it, but I personally like the explanation from low self-confidence, fear of failure, etc., making you less willing to try ambitious stuff (notice how you said "it’s like they’re already taking their success for certain", when, yeah that might be the case, but it might also be that they’re aware they can fail, but they think it’s likely they could easily recover from that failure anyway). It’s quite well described (imho) here.
But I’m pretty sure I’m covering only a relatively narrow part of the space of all the things that could be said on that topic, so I hope other people write other replies with completely different takes on the problem :-)

momom2 on Against empathy-by-default

Thanks, it does clarify, both on separating the instantiation of an empathy mechanism in the human brain vs in AI and on considering instantiation separately from the (evolutionary or training) process that leads to it.

shmi on Eliezer Yudkowsky Is Frequently, Confidently, Egregiously Wrong

The argument goes through on the probabilities of each possible world; the limit toward perfection is not singular. Given the 1000:1 reward ratio, for any predictor who is substantially better than chance, one ought to one-box to maximize EV. Anyway, this is an old argument where people rarely manage to convince the other side.
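
The expected-value claim in this comment can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the usual Newcomb payoffs of $1,000,000 in the opaque box and $1,000 in the transparent one (the comment itself only gives the 1000:1 ratio), and it models the predictor as being right with a single fixed probability `accuracy`. The function names are made up for this example.

```python
from fractions import Fraction


def expected_values(accuracy, big=1_000_000, small=1_000):
    """Return (EV of one-boxing, EV of two-boxing).

    `accuracy` is the assumed probability that the predictor guesses
    the player's choice correctly. `big` and `small` are assumed payoffs.
    """
    # One-boxing pays `big` only if the predictor foresaw one-boxing.
    one_box = accuracy * big
    # Two-boxing always pays `small`, plus `big` if the predictor was wrong.
    two_box = (1 - accuracy) * big + small
    return one_box, two_box


def break_even_accuracy(big=1_000_000, small=1_000):
    """Accuracy at which both choices have the same expected value."""
    # Solve accuracy * big == (1 - accuracy) * big + small for accuracy.
    return Fraction(big + small, 2 * big)


if __name__ == "__main__":
    print("break-even accuracy:", float(break_even_accuracy()))
    for acc in (Fraction(1, 2), Fraction(6, 10), Fraction(9, 10)):
        one, two = expected_values(acc)
        print(float(acc), "one-box:", float(one), "two-box:", float(two))
```

Under these assumed payoffs the break-even point sits just above one half, so on a pure expected-value calculation one-boxing comes out ahead for any accuracy beyond that threshold. This says nothing about the causal versus evidential dispute the comment alludes to; it only reproduces the arithmetic.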

quila on Alexander Gietelink Oldenziel's Shortform

how? edit: maybe you meant just the first kind

viliam on Advice on Communicating Concisely

Just guessing here, because I have a similar problem. You need to know your audience, so that you can skip the parts they already know, and only communicate the new part.

Also, it depends on whether it is a monologue or a dialogue; in a monologue you err on the side of saying more, while in a dialogue you can expect some "if they don't understand, they will ask".

For example, I sometimes realize that I am needlessly defensive, that I am unconsciously expecting the most uncharitable misinterpretation of anything I say -- that's because I have spent a lot of time offline with people who were like that -- so I am trying to make my argument ironclad, include all kinds of disclaimers, etc., which results in many extra words.

On the other hand, it is easy (and frequent) to err on the side of saying too little, making your message ambiguous without noticing it. Sometimes people appreciate that I include some extra context; I have been explicitly praised at work for writing great documentation.