LessWrong 2.0 Reader

Introducing Transluce — A Letter from the Founders
jsteinhardt · 2024-10-23T18:10:02.526Z · comments (2)

Interpretability with Sparse Autoencoders (Colab exercises)
CallumMcDougall (TheMcDouglas) · 2023-11-29T12:56:21.608Z · comments (9)

A Simple Toy Coherence Theorem
johnswentworth · 2024-08-02T17:47:50.642Z · comments (16)

Anthropic Fall 2023 Debate Progress Update
Ansh Radhakrishnan (anshuman-radhakrishnan-1) · 2023-11-28T05:37:30.070Z · comments (9)

(Not) Derailing the LessOnline Puzzle Hunt
Error · 2024-06-04T01:28:31.688Z · comments (2)

Idealized Agents Are Approximate Causal Mirrors (+ Radical Optimism on Agent Foundations)
Thane Ruthenis · 2023-12-22T20:19:13.865Z · comments (14)

[link] Nick Bostrom’s new book, “Deep Utopia”, is out today
PeterH · 2024-03-27T11:24:01.401Z · comments (5)

Joshua Achiam Public Statement Analysis
Zvi · 2024-10-10T12:50:06.285Z · comments (14)

Dialogue on the Claim: "OpenAI's Firing of Sam Altman (And Shortly-Subsequent Events) On Net Reduced Existential Risk From AGI"
johnswentworth · 2023-11-21T17:39:17.828Z · comments (84)

The One and a Half Gemini
Zvi · 2024-02-22T13:10:04.725Z · comments (4)

On Dwarkesh’s Podcast with OpenAI’s John Schulman
Zvi · 2024-05-21T17:30:04.332Z · comments (4)

[Full Post] Progress Update #1 from the GDM Mech Interp Team
Neel Nanda (neel-nanda-1) · 2024-04-19T19:06:59.185Z · comments (10)

The World in 2029
Nathan Young · 2024-03-02T18:03:29.368Z · comments (37)

Companies' safety plans neglect risks from scheming AI
Zach Stein-Perlman · 2024-06-03T15:00:20.236Z · comments (4)

Scalable Oversight and Weak-to-Strong Generalization: Compatible approaches to the same problem
Ansh Radhakrishnan (anshuman-radhakrishnan-1) · 2023-12-16T05:49:23.672Z · comments (3)

[link] Soft Nationalization: how the USG will control AI labs
Deric Cheng (deric-cheng) · 2024-08-27T15:11:14.601Z · comments (7)

A Gentle Introduction to Risk Frameworks Beyond Forecasting
pendingsurvival · 2024-04-11T18:03:25.605Z · comments (10)

[link] A Narrow Path: a plan to deal with AI extinction risk
Andrea_Miotti (AndreaM) · 2024-10-07T13:02:15.229Z · comments (10)

Interpreting Preference Models w/ Sparse Autoencoders
Logan Riggs (elriggs) · 2024-07-01T21:35:40.603Z · comments (12)

When "yang" goes wrong
Joe Carlsmith (joekc) · 2024-01-08T16:35:50.607Z · comments (6)

[link] Excerpts from "A Reader's Manifesto"
Arjun Panickssery (arjun-panickssery) · 2024-09-06T22:37:40.254Z · comments (1)

Announcing Suffering For Good
Garrett Baker (D0TheMath) · 2024-04-01T17:08:12.322Z · comments (5)

Prompts for Big-Picture Planning
Raemon · 2024-04-13T03:04:24.523Z · comments (1)

In Defense of Open-Minded UDT
abramdemski · 2024-08-12T18:27:36.220Z · comments (27)

[link] LK-99 in retrospect
bhauth · 2024-07-07T02:06:27.660Z · comments (21)

D&D.Sci Scenario Index
aphyer · 2024-07-23T02:00:43.483Z · comments (0)

[question] Interest in Leetcode, but for Rationality?
Gregory (gregory-eales) · 2024-10-16T17:54:25.578Z · answers+comments (20)

Claude 3 claims it's conscious, doesn't want to die or be modified
Mikhail Samin (mikhail-samin) · 2024-03-04T23:05:00.376Z · comments (113)

AXRP Episode 31 - Singular Learning Theory with Daniel Murfet
DanielFilan · 2024-05-07T03:50:05.001Z · comments (4)

Testbed evals: evaluating AI safety even when it can’t be directly measured
joshc (joshua-clymer) · 2023-11-15T19:00:41.908Z · comments (2)

LW Frontpage Experiments! (aka "Take the wheel, Shoggoth!")
Ruby · 2024-04-23T03:58:43.443Z · comments (27)

AI for Bio: State Of The Field
sarahconstantin · 2024-08-30T18:00:02.187Z · comments (2)

Survey for alignment researchers!
Cameron Berg (cameron-berg) · 2024-02-02T20:41:44.323Z · comments (11)

The Mask Comes Off: At What Price?
Zvi · 2024-10-21T23:50:05.247Z · comments (16)

FarmKind's Illusory Offer
jefftk (jkaufman) · 2024-08-09T11:30:07.082Z · comments (5)

Guide to SB 1047
Zvi · 2024-08-20T13:10:07.408Z · comments (18)

Do sparse autoencoders find "true features"?
Demian Till · 2024-02-22T18:06:59.630Z · comments (33)

We need a Science of Evals
Marius Hobbhahn (marius-hobbhahn) · 2024-01-22T20:30:39.493Z · comments (13)

Some Rules for an Algebra of Bayes Nets
johnswentworth · 2023-11-16T23:53:11.650Z · comments (35)

“Artificial General Intelligence”: an extremely brief FAQ
Steven Byrnes (steve2152) · 2024-03-11T17:49:02.496Z · comments (6)

Dumbing down
Martin Sustrik (sustrik) · 2024-06-09T06:50:47.469Z · comments (0)

[link] [Repost] The Copenhagen Interpretation of Ethics
mesaoptimizer · 2024-01-25T15:20:08.162Z · comments (4)

[link] If far-UV is so great, why isn't it everywhere?
Austin Chen (austin-chen) · 2024-10-19T18:56:58.910Z · comments (23)

Instruction-following AGI is easier and more likely than value aligned AGI
Seth Herd · 2024-05-15T19:38:03.185Z · comments (25)

If we solve alignment, do we die anyway?
Seth Herd · 2024-08-23T13:13:10.933Z · comments (68)

[link] Gwern: Why So Few Matt Levines?
kave · 2024-10-29T01:07:27.564Z · comments (9)

Epistemic Hell
rogersbacon · 2024-01-27T17:13:09.578Z · comments (20)

[link] Yoshua Bengio: Reasoning through arguments against taking AI safety seriously
Judd Rosenblatt (judd) · 2024-07-11T23:53:17.187Z · comments (3)

[link] OpenAI: Preparedness framework
Zach Stein-Perlman · 2023-12-18T18:30:10.153Z · comments (23)

[link] The True Story of How GPT-2 Became Maximally Lewd
Writer · 2024-01-18T21:03:08.167Z · comments (7)

Recent comments

martin-vlach on Survival without dignity

some OpenAI board members who the Office of National AI Strategy was allowed to appoint, and they did in fact try to fire Sam Altman over the UAE move, but somehow a week later Sam was running the Multinational Artificial Narrow Intelligence Alignment Consortium, which sort of morphed into OpenAI's oversight body, which sort of morphed into OpenAI's parent company, and, well, you can guess who was running that.

pretty sassy abbreviations spiced in there.

mr-frege on Both-Sidesism—When Fair & Balanced Goes Wrong

This comment received a lot of backlash. Admittedly, my comment wasn't as diplomatic as it might've been nor did I elaborate on my own reasoning. In my defence, I didn't think the original article was making much of an effort to get at the truth (see other criticisms above). Rather, it is a (very) one-sided account advocating that we should not consider the other side of the story (ie, it is an attack on both-sidesism).

The attack on both-sidesism is consistent with findings referenced in the video below. Both sides are prone to such anti-democratic behaviour, but the findings also suggest that one side is "slightly more willing to sacrifice democracy (by supporting actions that benefit their own party at the expense of democracy)". This might be a case in point.

Feel free to watch the whole thing, but the tldr part starts at 4:05.
https://www.youtube.com/watch?v=PVqjH6MaqRY

The events of Nov 6/7, 2024 might support the argument that the original argument was indeed self-defeating; ie, an argument against the argument against both-sidesism---effectively, an argument for both-sidesism.

avancil on Should CA, TX, OK, and LA merge into a giant swing state, just for elections?

Except there's more at play than just winning the election. If you're a voter in a swing state, the candidates are paying more attention to you, and making more promises catering to you. The parties are picking candidates they think will appeal to you. Even if your odds of winning stay the same, the prize for winning gets bigger.

It was exciting a few elections ago when Colorado was in play for both parties. We even got to host the Democratic convention in Denver. Now, they just ignore us.

d0themath on The Median Researcher Problem

I really feel like we're talking past each other here, because I have no idea how any of what you said relates to what I said, except the first paragraph.

As for that, what you describe sounds worse than a median researcher problem; it sounds like a situation ripe for groupthink!

ninety-three on Should CA, TX, OK, and LA merge into a giant swing state, just for elections?

This proposal increases the influence of the states, in the sense of "how much does it matter that any given person bothered to vote?", but does it increase their preference satisfaction? If the 4 states each conceive of themselves as red or blue states, then each of them will be thinking "under the current system I estimate an X% chance that we'll elect my party's president while under the new system I estimate a Y% chance we'll elect my party's president". If both sides are perfect predictors then one will conclude that Y<X so they should not do the deal. If both sides are imperfect predictors such that they both think Y>X, then the outside view still tells them it's equally likely that they're the sucker here and shouldn't participate.

aphyer on Gurkenglas's Shortform

Were whichever markets you're looking at open at this time? Most stuff doesn't trade that much out of hours.

aprilsr on Does the "ancient wisdom" argument have any validity? If a particular teaching or tradition is old, to what extent does this make it more trustworthy?

I mostly think it's too loose a heuristic and that you should dig into more details.

thomas-kwa on Thomas Kwa's Shortform

What's the most important technical question in AI safety right now?

npostavs on AI #89: Trump Card

Finding two bugs in a large codebase doesn't seem especially suspicious to me.

sharmake-farah on Anthropic: Three Sketches of ASL-4 Safety Case Components

While I agree that people are in general overconfident, including LessWrongers, I don't particularly think this is because Bayesianism is philosophically incorrect, but rather due to practical limits on computation combined with people sometimes not realizing how data-poor their efforts truly are.

(There are philosophical problems with Bayesianism, but not ones that predict very well the current issues of overconfidence in real human reasoning, so I don't see why Bayesianism is so central here. Separately, while I'm not sure there can ever be a complete theory of epistemology, I do think that Bayesianism is actually quite general, and a lot of the principles of Bayesianism are probably implemented in human brains, allowing for practicality concerns like cost of compute.)