LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

[question] What are your cruxes for imprecise probabilities / decision rules?
Anthony DiGiovanni (antimonyanthony) · 2024-07-31T15:42:27.057Z · answers+comments (29)

[link] Learning coefficient estimation: the details
Zach Furman (zfurman) · 2023-11-16T03:19:09.013Z · comments (0)

[link] Scalable And Transferable Black-Box Jailbreaks For Language Models Via Persona Modulation
Soroush Pour (soroush-pour) · 2023-11-07T17:59:36.857Z · comments (2)

Introduce a Speed Maximum
jefftk (jkaufman) · 2024-01-11T02:50:04.284Z · comments (28)

Mental Masturbation and the Intellectual Comfort Zone
Declan Molony (declan-molony) · 2024-05-07T05:47:05.257Z · comments (2)

Finding the Wisdom to Build Safe AI
Gordon Seidoh Worley (gworley) · 2024-07-04T19:04:16.089Z · comments (10)

[link] Who is Sam Bankman-Fried (SBF) really, and how could he have done what he did? - three theories and a lot of evidence
spencerg · 2023-11-11T01:04:22.747Z · comments (28)

The "context window" analogy for human minds
Ruby · 2024-02-13T19:29:10.387Z · comments (0)

[link] ∀: a story
Richard_Ngo (ricraz) · 2023-12-17T22:42:32.857Z · comments (1)

[link] Scaling laws for dominant assurance contracts
jessicata (jessica.liu.taylor) · 2023-11-28T23:11:07.631Z · comments (5)

[link] UC Berkeley course on LLMs and ML Safety
Dan H (dan-hendrycks) · 2024-07-09T15:40:00.920Z · comments (1)

Please Bet On My Quantified Self Decision Markets
niplav · 2023-12-01T20:07:38.284Z · comments (6)

A Socratic dialogue with my student
lsusr · 2023-12-05T09:31:05.266Z · comments (14)

Good job opportunities for helping with the most important century
HoldenKarnofsky · 2024-01-18T17:30:03.332Z · comments (0)

[link] Claude 3 Opus can operate as a Turing machine
Gunnar_Zarncke · 2024-04-17T08:41:57.209Z · comments (2)

(Appetitive, Consummatory) ≈ (RL, reflex)
Steven Byrnes (steve2152) · 2024-06-15T15:57:39.533Z · comments (1)

Deeply Cover Car Crashes?
jefftk (jkaufman) · 2023-12-10T22:20:01.133Z · comments (31)

[link] Toki pona FAQ
dkl9 · 2024-03-17T21:44:21.782Z · comments (8)

Closeness To the Issue (Part 5 of "The Sense Of Physical Necessity")
LoganStrohl (BrienneYudkowsky) · 2024-03-09T00:36:47.388Z · comments (0)

AI Safety Camp final presentations
Linda Linsefors · 2024-03-29T14:27:43.503Z · comments (3)

Drone Wars Endgame
RussellThor · 2024-02-01T02:30:46.161Z · comments (71)

[link] "Model UN Solutions"
Arjun Panickssery (arjun-panickssery) · 2023-12-08T23:06:33.490Z · comments (5)

[link] Big tech transitions are slow (with implications for AI)
jasoncrawford · 2024-10-24T14:25:06.873Z · comments (16)

Eye contact is effortless when you’re no longer emotionally blocked on it
Chipmonk · 2024-09-27T21:47:01.970Z · comments (24)

Is the Power Grid Sustainable?
jefftk (jkaufman) · 2024-10-26T02:30:06.612Z · comments (37)

[link] On Fables and Nuanced Charts
Niko_McCarty (niko-2) · 2024-09-08T17:09:07.503Z · comments (2)

Monthly Roundup #22: September 2024
Zvi · 2024-09-17T12:20:08.297Z · comments (10)

Video and transcript of presentation on Otherness and control in the age of AGI
Joe Carlsmith (joekc) · 2024-10-08T22:30:38.054Z · comments (1)

Book Review: On the Edge: The Gamblers
Zvi · 2024-09-24T11:50:06.065Z · comments (1)

Open consultancy: Letting untrusted AIs choose what answer to argue for
Fabien Roger (Fabien) · 2024-03-12T20:38:03.785Z · comments (5)

Open Thread – Winter 2023/2024
habryka (habryka4) · 2023-12-04T22:59:49.957Z · comments (160)

List of strategies for mitigating deceptive alignment
joshc (joshua-clymer) · 2023-12-02T05:56:50.867Z · comments (2)

Dangers of Closed-Loop AI
Gordon Seidoh Worley (gworley) · 2024-03-22T23:52:22.010Z · comments (7)

Doomsday Argument and the False Dilemma of Anthropic Reasoning
Ape in the coat · 2024-07-05T05:38:39.428Z · comments (55)

Proposal for improving the global online discourse through personalised comment ordering on all websites
Roman Leventov · 2023-12-06T18:51:37.645Z · comments (21)

Predictive model agents are sort of corrigible
Raymond D · 2024-01-05T14:05:03.037Z · comments (6)

How predictive processing solved my wrist pain
max_shen (makoshen) · 2024-07-04T01:56:20.162Z · comments (8)

[Valence series] 4. Valence & Social Status (deprecated)
Steven Byrnes (steve2152) · 2023-12-15T14:24:41.040Z · comments (19)

Humans aren't fleeb.
Charlie Steiner · 2024-01-24T05:31:46.929Z · comments (5)

Secondary Risk Markets
Vaniver · 2023-12-11T21:52:46.836Z · comments (4)

Representation Tuning
Christopher Ackerman (christopher-ackerman) · 2024-06-27T17:44:33.338Z · comments (9)

Forecasting AI (Overview)
jsteinhardt · 2023-11-16T19:00:04.218Z · comments (0)

A sketch of acausal trade in practice
Richard_Ngo (ricraz) · 2024-02-04T00:32:54.622Z · comments (4)

My Detailed Notes & Commentary from Secular Solstice
Jeffrey Heninger (jeffrey-heninger) · 2024-03-23T18:48:51.894Z · comments (16)

Empirical vs. Mathematical Joints of Nature
Elizabeth (pktechgirl) · 2024-06-26T01:55:22.858Z · comments (1)

'Theories of Values' and 'Theories of Agents': confusions, musings and desiderata
Mateusz Bagiński (mateusz-baginski) · 2023-11-15T16:00:48.926Z · comments (8)

[link] List of Collective Intelligence Projects
Chipmonk · 2024-07-02T14:10:41.789Z · comments (9)

Index of rationalist groups in the Bay Area July 2024
Lucie Philippon (lucie-philippon) · 2024-07-26T16:32:25.337Z · comments (10)

What Helped Me - Kale, Blood, CPAP, X-tiamine, Methylphenidate
Johannes C. Mayer (johannes-c-mayer) · 2024-01-03T13:22:11.700Z · comments (12)

[link] My article in The Nation — California’s AI Safety Bill Is a Mask-Off Moment for the Industry
garrison · 2024-08-15T19:25:59.592Z · comments (0)

← previous page (newer posts) · next page (older posts) →

Archive

Recent comments

avancil on Should CA, TX, OK, and LA merge into a giant swing state, just for elections?

Except there's more at play than just winning the election. If you're a voter in a swing state, the candidates are paying more attention to you, and making more promises catering to you. The parties are picking candidates they think will appeal to you. Even if your odds of winning stay the same, the prize for winning gets bigger.

It was exiting a few elections ago when Colorado was in play by both parties. We even got to host the Democratic convention in Denver. Now, they just ignore us.

d0themath on The Median Researcher Problem

I really feel like we're talking past each other here, because I have no idea how any of what you said relates to what I said, except the first paragraph.

As for that, what you describe sounds worse than a median researcher problem, instead sounding like a situation ripe for group think instead!

ninety-three on Should CA, TX, OK, and LA merge into a giant swing state, just for elections?

This proposal increases the influence of the states, in the sense of "how much does it matter that any given person bothered to vote?", but does it increase their preference satisfaction? If the 4 states each conceive of themselves as red or blue states, then each of them will be thinking "under the current system I estimate an X% chance that we'll elect my party's president while under the new system I estimate a Y% chance we'll elect my party's president". If both sides are perfect predictors then one will conclude that Y<X so they should not do the deal. If both sides are imperfect predictors such that they both think Y>X, then the outside view still tells them it's equally likely that they're the sucker here and shouldn't participate.

aphyer on Gurkenglas's Shortform

Were whichever markets you're looking at open at this time? Most stuff doesn't trade that much out of hours.

aprilsr on Does the "ancient wisdom" argument have any validity? If a particular teaching or tradition is old, to what extent does this make it more trustworthy?

I mostly think it's too loose a heuristic and that you should dig into more details

thomas-kwa on Thomas Kwa's Shortform

What's the most important technical question in AI safety right now?

npostavs on AI #89: Trump Card

Finding two bugs in a large codebase doesn't seem especially suspicious to me.

sharmake-farah on Anthropic: Three Sketches of ASL-4 Safety Case Components

While I agree that people are in general overconfident, including LessWrongers, I don't particularly think this is because Bayesianism is philosophically incorrect, but rather due to both practical limits on computation combined with sometimes not realizing how data-poor their efforts truly are.

(There are philosophical problems with Bayesianism, but not ones that predict very well the current issues of overconfidence in real human reasoning, so I don't see why Bayesianism is so central here. Separately, while I'm not sure there can ever be a complete theory of epistemology, I do think that Bayesianism is actually quite general, and a lot of the principles of Bayesianism is probably implemented in human brains, allowing for practicality concerns like cost of compute.)

ricraz on Anthropic: Three Sketches of ASL-4 Safety Case Components

We have discussed this dynamic before but just for the record:

I think that if it became industry-standard practice for AGI corporations to write, publish, and regularly update (actual instead of just hypothetical) safety cases at at this level of rigor and detail, my p(doom) would cut in half.

This is IMO not the type of change that should be able to cut someone's P(doom) in half. There are so many different factors that are of this size and importance or bigger (including many that people simply have not thought of yet) such that, if this change could halve your P(doom), then your P(doom) should be oscillating wildly all the time.

I flag this as an example of prioritizing inside-view considerations too strongly in forecasts. I think this is the sort of problem that arises when you "take bayesianism too seriously", which is one of the reasons why I wrote my recent post on why I'm not a bayesian [LW · GW] (and also my earlier post on Knightian uncertainty [LW · GW]).

For context: our previous discussions about this related to Daniel's claim that appointing one specific person to one specific important job could change his P(doom) by double digit percentage points. I similarly think this is not the type of consideration that should be able to swing people's P(doom) that much (except maybe changing the US or Chinese leaders, but we weren't talking about those).

Lastly, since this is a somewhat critical comment, I should flag that I really appreciate and admire Daniel's forecasting, have learned a lot from him, and think he's generally a great guy. The epistemology disagreements just disproportionately bug me.

exceph on Scissors Statements for President?

In my experience, the first step in reconciling conflict is to understand one's own values, before listening to those of others. There are multiple reasons for this step, but the one relevant to your point is that by reflecting on the tradeoffs that I accept or reject and why, I can feel secure in listening to someone else's point of view. If their approach addresses my own concerns, then I can recognize it and that dissolves the disagreement. If it doesn't, then I know enough about what I really want to suggest modifications to their approach that would address my concerns. Either way, it keeps me safe from value-drift, especially on important principles like ethics.

Just because someone else has valid concerns doesn't mean I have to give up any of my own, but it doesn't mean we're at an impasse either. Humans have a habit of turning disagreements into false dichotomies. When they listen to each other, the conversation becomes, "alright, I understand your concerns, but you understand why mine are more important, right?" They are so quick to ask other people to sacrifice their values that they don't think of exploring alternative approaches, ones that can change the situation to fulfill the values of all the stakeholders. That's what I'm working on changing.

Does that all make sense?