LessWrong 2.0 Reader


Why Don't We Just... Shoggoth+Face+Paraphraser?
Daniel Kokotajlo (daniel-kokotajlo) · 2024-11-19T20:53:52.084Z · comments (52)

BIG-Bench Canary Contamination in GPT-4
Jozdien · 2024-10-22T15:40:48.166Z · comments (13)

What o3 Becomes by 2028
Vladimir_Nesov · 2024-12-22T12:37:20.929Z · comments (15)

[link] The Dangers of Mirrored Life
Niko_McCarty (niko-2) · 2024-12-12T20:58:32.750Z · comments (7)

A bird's eye view of ARC's research
Jacob_Hilton · 2024-10-23T15:50:06.123Z · comments (12)

[link] Miles Brundage resigned from OpenAI, and his AGI readiness team was disbanded
garrison · 2024-10-23T23:40:57.180Z · comments (1)

Passages I Highlighted in The Letters of J.R.R.Tolkien
Ivan Vendrov (ivan-vendrov) · 2024-11-25T01:47:59.071Z · comments (10)

The Dream Machine
sarahconstantin · 2024-12-05T00:00:05.796Z · comments (6)

Scissors Statements for President?
AnnaSalamon · 2024-11-06T10:38:21.230Z · comments (31)

Hire (or become) a Thinking Assistant / Body Double
Raemon · 2024-12-23T03:58:42.061Z · comments (32)

The o1 System Card Is Not About o1
Zvi · 2024-12-13T20:30:08.048Z · comments (5)

Should CA, TX, OK, and LA merge into a giant swing state, just for elections?
Thomas Kwa (thomas-kwa) · 2024-11-06T23:01:48.992Z · comments (35)

You should consider applying to PhDs (soon!)
bilalchughtai (beelal) · 2024-11-29T20:33:12.462Z · comments (19)

DeepSeek beats o1-preview on math, ties on coding; will release weights
Zach Stein-Perlman · 2024-11-20T23:50:26.597Z · comments (26)

Sorry for the downtime, looks like we got DDosd
habryka (habryka4) · 2024-12-02T04:14:30.209Z · comments (13)

AIs Will Increasingly Attempt Shenanigans
Zvi · 2024-12-16T15:20:05.652Z · comments (2)

Ablations for “Frontier Models are Capable of In-context Scheming”
AlexMeinke (Paulawurm) · 2024-12-17T23:58:19.222Z · comments (1)

The Big Nonprofits Post
Zvi · 2024-11-29T16:10:06.938Z · comments (10)

[link] Announcing turntrout.com, my new digital home
TurnTrout · 2024-11-17T17:42:08.164Z · comments (24)

Hierarchical Agency: A Missing Piece in AI Alignment
Jan_Kulveit · 2024-11-27T05:49:04.241Z · comments (20)

I turned decision theory problems into memes about trolleys
Tapatakt · 2024-10-30T20:13:29.589Z · comments (20)

A shortcoming of concrete demonstrations as AGI risk advocacy
Steven Byrnes (steve2152) · 2024-12-11T16:48:41.602Z · comments (27)

Takes on "Alignment Faking in Large Language Models"
Joe Carlsmith (joekc) · 2024-12-18T18:22:34.059Z · comments (8)

LLMs can learn about themselves by introspection
Felix J Binder (fjb) · 2024-10-18T16:12:51.231Z · comments (38)

Why comparative advantage does not help horses
Sherrinford · 2024-09-30T22:27:57.450Z · comments (15)

[link] Advice for journalists
Nathan Young · 2024-10-07T16:46:40.929Z · comments (53)

[link] How to replicate and extend our alignment faking demo
Fabien Roger (Fabien) · 2024-12-19T21:44:13.059Z · comments (2)

Behavioral red-teaming is unlikely to produce clear, strong evidence that models aren't scheming
Buck · 2024-10-10T13:36:53.810Z · comments (4)

MIRI’s 2024 End-of-Year Update
Rob Bensinger (RobbBB) · 2024-12-03T04:33:47.499Z · comments (2)

Bigger Livers?
sarahconstantin · 2024-11-08T21:50:09.814Z · comments (13)

You can, in fact, bamboozle an unaligned AI into sparing your life
David Matolcsi (matolcsid) · 2024-09-29T16:59:43.942Z · comments (171)

[link] Seven lessons I didn't learn from election day
Eric Neyman (UnexpectedValues) · 2024-11-14T18:39:07.053Z · comments (33)

The "Think It Faster" Exercise
Raemon · 2024-12-11T19:14:10.427Z · comments (13)

The case for unlearning that removes information from LLM weights
Fabien Roger (Fabien) · 2024-10-14T14:08:04.775Z · comments (15)

[link] Anthropic: Three Sketches of ASL-4 Safety Case Components
Zach Stein-Perlman · 2024-11-06T16:00:06.940Z · comments (33)

The nihilism of NeurIPS
charlieoneill (kingchucky211) · 2024-12-20T23:58:11.858Z · comments (7)

[link] Finishing The SB-1047 Documentary In 6 Weeks
Michaël Trazzi (mtrazzi) · 2024-10-28T20:17:47.465Z · comments (5)

[link] Sabotage Evaluations for Frontier Models
David Duvenaud (david-duvenaud) · 2024-10-18T22:33:14.320Z · comments (55)

Science advances one funeral at a time
Cameron Berg (cameron-berg) · 2024-11-01T23:06:19.381Z · comments (9)

[question] What are the strongest arguments for very short timelines?
Kaj_Sotala · 2024-12-23T09:38:56.905Z · answers+comments (69)

2024 Unofficial LessWrong Census/Survey
Screwtape · 2024-12-02T05:30:53.019Z · comments (42)

Catastrophic sabotage as a major threat model for human-level AI systems
evhub · 2024-10-22T20:57:11.395Z · comments (11)

Zvi’s Thoughts on His 2nd Round of SFF
Zvi · 2024-11-20T13:40:08.092Z · comments (2)

LLMs Look Increasingly Like General Reasoners
eggsyntax · 2024-11-08T23:47:28.886Z · comments (45)

A very strange probability paradox
notfnofn · 2024-11-22T14:01:36.587Z · comments (26)

Anvil Problems
Screwtape · 2024-11-13T22:57:41.974Z · comments (13)

AIs Will Increasingly Fake Alignment
Zvi · 2024-12-24T13:00:07.770Z · comments (0)

Three Notions of "Power"
johnswentworth · 2024-10-30T06:10:08.326Z · comments (44)

A breakdown of AI capability levels focused on AI R&D labor acceleration
ryan_greenblatt · 2024-12-22T20:56:00.298Z · comments (5)

[link] Self-Help Corner: Loop Detection
adamShimi · 2024-10-02T08:33:23.487Z · comments (6)


Recent comments

yair-halberstadt on The average rationalist IQ is about 122

Here’s the breakdown: a median SAT score of 1490 (from the LessWrong 2014 survey) corresponds to +2.42 SD, which regresses to +1.93 SD for IQ using an SAT-IQ correlation of +0.80. This equates to an IQ of 129.

I don't think that works unless LessWrong specifically selects for high SAT score. If it selects for high IQ, and the high SAT score is a result of the high IQ, then you would have to go the other way and assume an IQ of +3.03 SD.

If, as seems more likely, LessWrong membership correlates with both IQ and SAT score, then the exact number is impossible to calculate, but assuming it correlates with both equally we would estimate IQ at +2.42 SD.
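As a rough check on the arithmetic in the three cases above, here is a minimal sketch. The SAT z-score (+2.42 SD) and the correlation (0.80) are the figures quoted in this thread; the IQ scale (mean 100, SD 15) and all variable and function names are assumptions made for illustration only.

```python
# Figures quoted in the thread.
SAT_Z = 2.42          # median SAT score expressed in standard deviations
CORRELATION = 0.80    # SAT-IQ correlation used in the post

# Assumption: the conventional IQ scale with mean 100 and SD 15.
IQ_MEAN = 100
IQ_SD = 15


def z_to_iq(z):
    """Convert a z-score to an IQ score on the assumed scale."""
    return IQ_MEAN + IQ_SD * z


# Case 1: selection acts on SAT score, so IQ regresses toward the mean.
selected_on_sat = SAT_Z * CORRELATION

# Case 2: selection acts on IQ, so the SAT score is the regressed quantity
# and the implied IQ z-score is larger than the SAT z-score.
selected_on_iq = SAT_Z / CORRELATION

# Case 3: selection correlates equally with both, so neither regresses.
selected_on_both = SAT_Z

cases = [
    ("selected on SAT", selected_on_sat),
    ("selected on IQ", selected_on_iq),
    ("selected on both equally", selected_on_both),
]
for label, z in cases:
    print(f"{label}: z = {z:+.2f} SD, IQ ~ {z_to_iq(z):.0f}")
```

Under these assumptions the three cases come out at roughly 129, 145, and 136, which shows how much the estimate depends on what the community is assumed to select for.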

finalformal2 on Being Present is Not a Skill

How would you recommend learning how to get rid of emotional blocks?

unexpectedvalues on The average rationalist IQ is about 122

These both seem valid to me! Now, if you have multiple predictors (like SAT and height), then things get messy because you have to consider their covariance and stuff.

jenn on Ideas for benchmarking LLM creativity

I wrote up my much less seriously considered tests at https://jenn.site/2024/12/llm-creativity/ in part due to this post.

LLMs for creative work seems to be an area that you're poking at a lot these days and I always enjoy seeing what you get up to with it :]

tsvibt on Alexander Gietelink Oldenziel's Shortform

Alternative: "AI x-derisking"

fer32dwt34r3dfsz on AlphaAndOmega's Shortform

I decided to run the same question through the latest models to gauge their improvements.

I'm not exactly sure there is much advantage in your having done this, but I feel inclined to say thank you for persisting in persuading your cousin to at least consider concerns regarding AI, even if he perceptually filters those concerns to mostly regard job automation over others, such as a global catastrophe.

In my own life, over the last several years, I have found it difficult to persuade those close to me to really consider concerns about AI.

I thought that capabilities advancing observably before them might stoke them to think more about their own future, and about how they might behave and/or live differently conditional on different AI capabilities, but this has been to little avail.

Expanding capabilities seem to best dissolve skepticism, but conversations seem not to have had as large an effect as I would have expected. I've not thought or acted as much as I want to on how to coordinate more of humanity around decision-making regarding AI (or the consequences of AI), partially since I do not have a concrete notion of where to steer humanity, or a justification for where to steer (even if I knew it was highly likely I was actually contributing to the steering through my actions).

petermccluskey on What happens next?

I want to register different probabilities:

Jobs hits first (25%).
AGI race hits first (50%).
Alignment hits first (15%).

archimedes on Review: Planecrash

This review led me to find the following podcast version of Planecrash. I've listened to the first couple of episodes and the quality is quite good.

https://askwhocastsai.substack.com/s/planecrash

fer32dwt34r3dfsz on The average rationalist IQ is about 122

The second paragraph puts into words something I've noticed but not really mentally formalized before. Some anecdotal evidence from my own life in support of the claims made in this paragraph: I've met individuals whose tested IQ exceeds that of others I know (lower, but not much lower) who are more educated or trained in epistemological thinking and tangential disciplines. For none of the pairs I have in mind would I declare that one person "ran circles around" the other; however, the difference in conversational dynamics (advantage going to the lower-IQ but better "trained" individual) was notable enough for me to remember well. The catch here is the accuracy of the IQ claims, as some of these individuals did not personally reveal their scores to me.

cstinesublime on CstineSublime's Shortform

Kantmogorov Imperative - more of a philosophical dad-joke than an actual thing: it is the shortest possible computer program that outputs descriptions of morally consistent behaviors in all/any circumstances