LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

next page (older posts) →

What’s the short timeline plan?
Marius Hobbhahn (marius-hobbhahn) · 2025-01-02T14:59:20.026Z · comments (37)

Shallow review of technical AI safety, 2024
technicalities · 2024-12-29T12:01:14.724Z · comments (31)

Maximizing Communication, not Traffic
jefftk (jkaufman) · 2025-01-05T13:00:02.280Z · comments (7)

OpenAI #10: Reflections
Zvi · 2025-01-07T17:00:07.348Z · comments (6)

How will we update about scheming?
ryan_greenblatt · 2025-01-06T20:21:52.281Z · comments (4)

What Indicators Should We Watch to Disambiguate AGI Timelines?
snewman · 2025-01-06T19:57:43.398Z · comments (32)

2024 in AI predictions
jessicata (jessica.liu.taylor) · 2025-01-01T20:29:49.132Z · comments (2)

The Plan - 2024 Update
johnswentworth · 2024-12-31T13:29:53.888Z · comments (27)

Capital Ownership Will Not Prevent Human Disempowerment
beren · 2025-01-05T06:00:23.095Z · comments (9)

Why I'm Moving from Mechanistic to Prosaic Interpretability
Daniel Tan (dtch1997) · 2024-12-30T06:35:43.417Z · comments (34)

[link] Parkinson's Law and the Ideology of Statistics
Benquo · 2025-01-04T15:49:21.247Z · comments (2)

[Fiction] [Comic] Effective Altruism and Rationality meet at a Secular Solstice afterparty
tandem · 2025-01-07T19:11:21.238Z · comments (4)

My AGI safety research—2024 review, ’25 plans
Steven Byrnes (steve2152) · 2024-12-31T21:05:19.037Z · comments (4)

Comment on "Death and the Gorgon"
Zack_M_Davis · 2025-01-01T05:47:30.730Z · comments (27)

Reasons for and against working on technical AI safety at a frontier AI lab
bilalchughtai (beelal) · 2025-01-05T14:49:53.529Z · comments (12)

Introducing Squiggle AI
ozziegooen · 2025-01-03T17:53:42.915Z · comments (13)

Is "VNM-agent" one of several options, for what minds can grow up into?
AnnaSalamon · 2024-12-30T06:36:20.890Z · comments (47)

The subset parity learning problem: much more than you wanted to know
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-03T09:13:59.245Z · comments (17)

[link] The Intelligence Curse
lukedrago · 2025-01-03T19:07:43.493Z · comments (26)

Some arguments against a land value tax
Matthew Barnett (matthew-barnett) · 2024-12-29T15:17:00.740Z · comments (37)

Human study on AI spear phishing campaigns
Simon Lermen (dalasnoin) · 2025-01-03T15:11:14.765Z · comments (8)

2025 Prediction Thread
habryka (habryka4) · 2024-12-30T01:50:14.216Z · comments (18)

[link] Learn to write well BEFORE you have something worth saying
eukaryote · 2024-12-29T23:42:31.906Z · comments (18)

[link] "We know how to build AGI" - Sam Altman
Nikola Jurkovic (nikolaisalreadytaken) · 2025-01-06T02:05:05.134Z · comments (5)

Tips On Empirical Research Slides
James Chua (james-chua) · 2025-01-08T05:06:44.942Z · comments (1)

[link] Testing for Scheming with Model Deletion
Guive (GAA) · 2025-01-07T01:54:13.550Z · comments (12)

Read The Sequences As If They Were Written Today
Peter Berggren (peter-berggren) · 2025-01-02T02:51:36.537Z · comments (3)

o3, Oh My
Zvi · 2024-12-30T14:10:05.144Z · comments (16)

[link] new chinese stealth aircraft
bhauth · 2025-01-01T00:19:10.644Z · comments (3)

The OODA Loop -- Observe, Orient, Decide, Act
Davis_Kingsley · 2025-01-01T08:00:27.979Z · comments (2)

Activation space interpretability may be doomed
bilalchughtai (beelal) · 2025-01-08T12:49:38.421Z · comments (1)

DeekSeek v3: The Six Million Dollar Model
Zvi · 2024-12-31T15:10:06.924Z · comments (6)

[link] On Eating the Sun
jessicata (jessica.liu.taylor) · 2025-01-08T04:57:20.457Z · comments (7)

Considerations on orca intelligence
Towards_Keeperhood (Simon Skade) · 2024-12-29T14:35:16.445Z · comments (5)

Stream Entry
lsusr · 2025-01-07T23:56:13.530Z · comments (0)

[link] Preference Inversion
Benquo · 2025-01-02T18:15:52.938Z · comments (37)

AI #97: 4
Zvi · 2025-01-02T14:10:06.505Z · comments (4)

[link] Began a pay-on-results coaching experiment, made $40,300 since July
Chipmonk · 2024-12-29T21:12:02.574Z · comments (14)

Practicing Bayesian Epistemology with "Two Boys" Probability Puzzles
Liron · 2025-01-02T04:42:20.362Z · comments (14)

[link] Alignment Is Not All You Need
Adam Jones (domdomegg) · 2025-01-02T17:50:00.486Z · comments (10)

My January alignment theory Nanowrimo
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-02T00:07:24.050Z · comments (2)

Role embeddings: making authorship more salient to LLMs
Nina Panickssery (NinaR) · 2025-01-07T20:13:16.677Z · comments (0)

What happens next?
Logan Zoellner (logan-zoellner) · 2024-12-29T01:41:33.685Z · comments (19)

The Laws of Large Numbers
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-04T11:54:16.967Z · comments (6)

AI Safety as a YC Startup
Lukas Petersson (lukas-petersson-1) · 2025-01-08T10:46:29.042Z · comments (3)

Building Big Science from the Bottom-Up: A Fractal Approach to AI Safety
Lauren Greenspan (LaurenGreenspan) · 2025-01-07T03:08:51.447Z · comments (2)

Grammars, subgrammars, and combinatorics of generalization in transformers
Dmitry Vaintrob (dmitry-vaintrob) · 2025-01-02T09:37:23.191Z · comments (0)

Fireplace and Candle Smoke
jefftk (jkaufman) · 2025-01-01T01:50:01.408Z · comments (4)

Estimating the benefits of a new flu drug (BXM)
DirectedEvolution (AllAmericanBreakfast) · 2025-01-06T04:31:16.837Z · comments (2)

[link] Oppression and production are competing explanations for wealth inequality.
Benquo · 2025-01-05T14:13:15.398Z · comments (15)

next page (older posts) →

Archive

Recent comments

yoshinori-okamoto on Open Thread Fall 2024

I introduce my research work, which has been published in the Japanese AI community, in an effort to contribute to the rationality of the world. It would be wonderful if AI advancements could realize an era where researchers from different countries can publish their research in their native languages, with AI translating it into multiple languages. This is such an experimental website (Introduction to AI Research Papers).

gwern on Disagreement on AGI Suggests It’s Near

Specifically, as an antichrist, as the Gospels specifically warn that "false messiahs and false prophets will appear and produce great signs and omens", among other things. (And the position that the second coming has already happened - completely, not merely partially - is hyperpreterism.)

yair-halberstadt on XX by Rian Hughes: Pretentious Bullshit

I did also like the Ascension story. It did a very good job of imitating 1960s sci fi magazine stories. In a way it shows off his talent as an author more than the main story does!

charbel-raphael on ryan_greenblatt's Shortform

Yeah, fair enough. I think someone should try to do a more representative experiment and we could then monitor this metric.

btw, something that bothers me a little bit with this metric is the fact that a very simple AI that just asks me periodically "Hey, do you endorse what you are doing right now? Are you time boxing? Are you following your plan?" makes me (I think) significantly more strategic and productive. Similar to I hired 5 people to sit behind me and make me productive for a month. But this is maybe off topic.

jessica-liu-taylor on On Eating the Sun

I think partially it's meant to go from some sort of abstract model of intelligence as a scalar variable that increases at some rate (like, on a x/y graph) to concrete, material milestones. Like, people can imagine "intelligence goes up rapidly! singularity!" and it's unclear what that implies, I'm saying sufficient levels would imply eating the sun, that makes it harder to confuse with things like "getting higher scores on math tests".

I suppose a more general category would be, the relevant kind of self-improving intelligence would be the sort that can re-purpose mass-energy to creating more computation that can run its intelligence, and "eat the Sun" is an obvious target given this background notion of intelligence.

(Note, there is skepticism about feasibility on Twitter/X, that's some info about how non-singulatarians react)

ryan_greenblatt on ryan_greenblatt's Shortform

This case seems extremely cherry picked for cases where uplift is especially high. (Note that this is in copilot's interest.) Now, this task could probably be solved autonomously by an AI in like 10 minutes with good scaffolding.

I think you have to consider the full diverse range of tasks to get a reasonable sense or at least consider harder tasks. Like RE-bench seems much closer, but I still expect uplift on RE-bench to probably (but not certainly!) considerably overstate real world speed up.

raemon on On Eating the Sun

This seemed like a nice explainer post, though it's somewhat confusing who the post is for – if I imagine being someone who didn't really understand any arguments about superintelligence, I think I might bounce off the opening paragraph or title because I'm like "why would I care about eating the sun."

There is something nice and straightforward about the current phrasing but suspect there's an opening paragraph that would do a better job explaining why you might care about this.

(But I'd be curious to hear from people who weren't really sold on any singularity stuff who read it and can describe how it was for them)

jacobjacob on AI Safety as a YC Startup

Impact = Magnitude * Direction

Surely one should think of this as a vector in a space with more dimensions than 1.

In your equation you can just 1,000,000x magnitude and it will move in the "positive direction".

In the real world you can become a billionaire from selling toothbrushes and still be "overtaken" by a guy who wrote one blog post that happened to be real dang good

I made a drawing but lw won't allow adding it on phone I think

charbel-raphael on ryan_greenblatt's Shortform

I was saying 2x because I've memorised the results from this study. Do we have better numbers today? R&D is harder, so this is an upper bound. However, since this was from one year ago, so perhaps the factors cancel each other out?

Summary of the experiment process and results (described in following paragraph)

sharmake-farah on Disagreement on AGI Suggests It’s Near

My response to this is to focus on when a Dyson Swarm is being built, not AGI, because it's easier to define the term less controversially.

And a large portion of disagreements here fundamentally revolves around being unable to coordinate on what a given word means, which from an epistemic perspective doesn't matter at all, but it does matter from a utility/coordination perspective, where coordination is required for a lot of human feats.