LessWrong 2.0 Reader

Scenario Forecasting Workshop: Materials and Learnings
elifland · 2024-03-08T02:30:46.517Z · comments (3)

[link] A starter guide for evals
Marius Hobbhahn (marius-hobbhahn) · 2024-01-08T18:24:23.913Z · comments (2)

Gemini 1.0
Zvi · 2023-12-07T14:40:05.243Z · comments (7)

[link] Announcing Human-aligned AI Summer School
Jan_Kulveit · 2024-05-22T08:55:10.839Z · comments (0)

AI #52: Oops
Zvi · 2024-02-22T21:50:07.393Z · comments (9)

The Shortest Path Between Scylla and Charybdis
Thane Ruthenis · 2023-12-18T20:08:34.995Z · comments (8)

Unlearning via RMU is mostly shallow
Andy Arditi (andy-arditi) · 2024-07-23T16:07:52.223Z · comments (3)

n of m ring signatures
DanielFilan · 2023-12-04T20:00:06.580Z · comments (7)

[link] on the dollar-yen exchange rate
bhauth · 2024-04-07T04:49:53.920Z · comments (21)

[link] Finding Backward Chaining Circuits in Transformers Trained on Tree Search
abhayesian · 2024-05-28T05:29:46.777Z · comments (1)

Observations on Teaching for Four Weeks
ClareChiaraVincent · 2024-05-06T16:55:59.315Z · comments (14)

Changes in College Admissions
Zvi · 2024-04-24T13:50:03.487Z · comments (11)

GPT-2030 and Catastrophic Drives: Four Vignettes
jsteinhardt · 2023-11-10T07:30:06.480Z · comments (5)

Goal-Completeness is like Turing-Completeness for AGI
Liron · 2023-12-19T18:12:29.947Z · comments (26)

Toy models of AI control for concentrated catastrophe prevention
Fabien Roger (Fabien) · 2024-02-06T01:38:19.865Z · comments (2)

Altman firing retaliation incoming?
trevor (TrevorWiesinger) · 2023-11-19T00:10:15.645Z · comments (23)

On Overhangs and Technological Change
Roko · 2023-11-05T22:58:51.306Z · comments (19)

On Complexity Science
Garrett Baker (D0TheMath) · 2024-04-05T02:24:32.039Z · comments (19)

Apply to the Conceptual Boundaries Workshop for AI Safety
Chipmonk · 2023-11-27T21:04:59.037Z · comments (0)

Vipassana Meditation and Active Inference: A Framework for Understanding Suffering and its Cessation
Benjamin Sturgeon (benjamin-sturgeon) · 2024-03-21T12:32:22.475Z · comments (8)

[link] Peak Human Capital
PeterMcCluskey · 2024-09-30T21:13:30.421Z · comments (2)

Applications of Chaos: Saying No (with Hastings Greer)
Elizabeth (pktechgirl) · 2024-09-21T16:30:07.415Z · comments (16)

AI #82: The Governor Ponders
Zvi · 2024-09-19T13:30:04.863Z · comments (8)

Should rationalists be spiritual / Spirituality as overcoming delusion
Kaj_Sotala · 2024-03-25T16:48:08.397Z · comments (57)

They are made of repeating patterns
quetzal_rainbow · 2023-11-13T18:17:43.189Z · comments (4)

Job listing: Communications Generalist / Project Manager
Gretta Duleba (gretta-duleba) · 2023-11-06T20:21:03.721Z · comments (7)

Public Weights?
jefftk (jkaufman) · 2023-11-02T02:50:18.095Z · comments (19)

[question] why did OpenAI employees sign
bhauth · 2023-11-27T05:21:28.612Z · answers+comments (23)

Notes on control evaluations for safety cases
ryan_greenblatt · 2024-02-28T16:15:17.799Z · comments (0)

AI #58: Stargate AGI
Zvi · 2024-04-04T13:10:06.342Z · comments (9)

An issue with training schemers with supervised fine-tuning
Fabien Roger (Fabien) · 2024-06-27T15:37:56.020Z · comments (12)

The Broken Screwdriver and other parables
bhauth · 2024-03-04T03:34:38.807Z · comments (1)

Bounty: Diverse hard tasks for LLM agents
Beth Barnes (beth-barnes) · 2023-12-17T01:04:05.460Z · comments (31)

Tall Tales at Different Scales: Evaluating Scaling Trends For Deception In Language Models
Felix Hofstätter · 2023-11-08T11:37:43.997Z · comments (0)

AI #67: Brief Strange Trip
Zvi · 2024-06-06T18:50:03.514Z · comments (6)

So you want to work on technical AI safety
gw · 2024-06-24T14:29:57.481Z · comments (3)

[link] DM Parenting
Shoshannah Tekofsky (DarkSym) · 2024-07-16T08:50:08.144Z · comments (4)

Please do not use AI to write for you
Richard_Kennaway · 2024-08-21T09:53:34.425Z · comments (34)

[link] Anthropic announces interpretability advances. How much does this advance alignment?
Seth Herd · 2024-05-21T22:30:52.638Z · comments (4)

[LDSL#0] Some epistemological conundrums
tailcalled · 2024-08-07T19:52:55.688Z · comments (10)

[link] Chapter 1 of How to Win Friends and Influence People
gull · 2024-01-28T00:32:52.865Z · comments (5)

Wrong answer bias
lukehmiles (lcmgcd) · 2024-02-01T20:05:38.573Z · comments (24)

Interoperable High Level Structures: Early Thoughts on Adjectives
johnswentworth · 2024-08-22T21:12:38.223Z · comments (1)

Consent across power differentials
Ramana Kumar (ramana-kumar) · 2024-07-09T11:42:03.177Z · comments (12)

[link] in defense of Linus Pauling
bhauth · 2024-06-03T21:27:43.962Z · comments (8)

Book Review: Righteous Victims - A History of the Zionist-Arab Conflict
Yair Halberstadt (yair-halberstadt) · 2024-06-24T11:02:03.490Z · comments (8)

[link] Web-surfing tips for strange times
eukaryote · 2024-05-31T07:10:25.805Z · comments (19)

Safety First: safety before full alignment. The deontic sufficiency hypothesis.
Chipmonk · 2024-01-03T17:55:19.825Z · comments (3)

The Dunning-Kruger of disproving Dunning-Kruger
kromem · 2024-05-16T10:11:33.108Z · comments (0)

SRE's review of Democracy
Martin Sustrik (sustrik) · 2024-08-03T07:20:01.483Z · comments (2)


Recent comments

christiankl on Why I’m not a Bayesian

Imagine you have a function f that takes a_1, a_2, ..., a_n and returns b_1, b_2, ..., b_m. Here a_1, a_2, ..., a_n are boolean states of the known world, and b_1, b_2, ..., b_m are boolean states of the world you don't yet know. Because f uses predicate logic internally, you can't modify it to take values between 0 and 1; you have to accept that it can only take boolean values.

When you do your probability augmentation you can easily add probabilities to a_1, a_2, ..., a_n and have P(a_1), P(a_2), ..., P(a_n), as those are part of the known world.

On the other hand, how would you get P(b_1), P(b_2), ..., P(b_m)?

cubefox on Alexander Gietelink Oldenziel's Shortform

Follow-up question: If sunglasses are so cool, why do relatively few people wear them? Perhaps they aren't that cool after all?

keltan on Interest in Leetcode, but for Rationality?

I imagine a character (Alice) is constantly used as the rational actor in scenarios. We make Alice a likeable character, give her a personality, a series of events and decisions that lead her to the present.

Then, when the user has been around for a sufficient amount of time, Alice starts to slip. She makes mistakes that harm others; perhaps she has disputes with ‘Stupidus’; maybe she just begins to say untrue things.

How long will it take a user to pry themselves out of the rose-tinted glasses, and update on Alice?

mako-yass on Advice on Communicating Concisely

"if they don't understand, they will ask"

A lot of people have to write for audiences with narcissism, who never ask, because asking constitutes an admission that there might be something important that they don't understand. They're always looking for any reason, however shallow, to dismiss any view that surprises them too much.
So these writers feel like they have to pre-empt every possible objection, even the stupid ones that don't make any sense.

It's best if you can avoid having to write for audiences like that. But it's difficult to avoid them.

mako-yass on leogao's Shortform

You should be more curious about why, when you aim at a goal, you do not aim for the most effective way.

quila on OpenAI defected, but we can take honest actions

If we stand by while OpenAI violates its charter, it signals that their execs can get away with it. Worse, it signals that we don’t care.

what signals you send to OAI execs seems not relevant.

in the case where they really can't get away with it, e.g. where the state will really arrest them, then sending them signals / influencing their information state is not what causes that outcome.

if your advocacy causes the world to change such that "they can't get away with it" becomes true, this also does not route through influencing their information state.

OpenAI is seen as the industry leader, yet projected to lose $5 billion this year

i don't see why this would lead them to downsize, if "the gap between industry investment in deep learning and actual revenue has ballooned to over $600 billion a year"

teatieandhat on Cipolla's Shortform

I’m not quite sure how to answer your question, but at least I have similar feelings: that my conscientiousness is relatively low, and that many people who do cooler stuff than me appear to be more driven, with clearer goals and a better ability to actually go and pursue them. I have various thoughts on this:

To an extent, it’s just an impression. Many people will struggle to do more than a fraction of what they wanted, and yet because they still do quite a lot and remain very upbeat, you don’t notice that they achieve relatively little compared to what they want, but they certainly notice that. Similarly, many people are working on cool projects and apparently having tons of fun doing it, but if you asked, you’d learn that they have no clue about "what they want to do with their lives" or similar super long-term goals.
In fact, I suspect that most people feel at least a little like that at least sometimes, and that we grossly underestimate how likely others are to feel that way.
Yet, some people genuinely are better able to get stuff done and stay relentlessly focused on tasks than others. It can be built from habit, it can come from being really really into the one specific thing you’re working on, etc. If you struggle with that anyway, it might be something to do with mental health: famously ADHD, but also autism, or depression/anxiety can impact conscientiousness, and all these seem somewhat more common among LW readers than in the general population, so I dunno, maybe?
And some people are also better than others at being optimistic, enthusiastic, eager to do cool stuff. I guess there are many causes, and therefore many potential ways of dealing with it, but I personally like the explanation from low self-confidence, fear of failure, etc., making you less willing to try ambitious stuff (notice how you said "it’s like they’re already taking their success for certain", when, yeah that might be the case, but it might also be that they’re aware they can fail, but they think it’s likely they could easily recover from that failure anyway). It’s quite well described (imho) here.
But I’m pretty sure I’m covering only a relatively narrow part of the space of all the things that could be said on that topic, so I hope other people write other replies with completely different takes on the problem :-)

momom2 on Against empathy-by-default

Thanks, it does clarify, both on separating the instantiation of an empathy mechanism in the human brain vs in AI and on considering instantiation separately from the (evolutionary or training) process that leads to it.

shmi on Eliezer Yudkowsky Is Frequently, Confidently, Egregiously Wrong

The argument goes through on probabilities of each possible world; the limit toward perfection is not singular. Given the 1000:1 reward ratio, for any predictor who is substantially better than chance, one ought to one-box to maximize EV. Anyway, this is an old argument where people rarely manage to convince the other side.
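
As a rough illustration of the expected-value claim above, here is a minimal sketch. It assumes the usual Newcomb payoffs of $1,000 in the transparent box and $1,000,000 in the opaque box (consistent with the 1000:1 ratio, but not stated in the comment), and it assumes the predictor is equally accurate whichever choice is made. The function names are hypothetical.

```python
def expected_values(accuracy, small=1_000, large=1_000_000):
    """Return (EV of one-boxing, EV of two-boxing).

    `accuracy` is the assumed probability that the predictor
    correctly foresees the choice; payoffs are assumed values.
    """
    # One-boxing pays `large` only if the predictor foresaw it.
    one_box = accuracy * large
    # Two-boxing always pays `small`, plus `large` if the predictor erred.
    two_box = (1 - accuracy) * large + small
    return one_box, two_box


def break_even_accuracy(small=1_000, large=1_000_000):
    """Accuracy at which both choices have equal expected value."""
    # Solve p * L = (1 - p) * L + S for p.
    return (large + small) / (2 * large)


if __name__ == "__main__":
    for p in (0.5, 0.55, 0.9):
        print(p, expected_values(p))
    print("break-even:", break_even_accuracy())
```

Under these assumptions the break-even accuracy is only slightly above one half, so one-boxing has the higher expected value for any predictor noticeably better than chance.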

quila on Alexander Gietelink Oldenziel's Shortform

how? edit: maybe you meant just the first kind