LessWrong 2.0 Reader

What Indicators Should We Watch to Disambiguate AGI Timelines?
snewman · 2025-01-06T19:57:43.398Z · comments (33)

[link] OpenAI's CBRN tests seem unclear
LucaRighetti (Error404Dinosaur) · 2024-11-21T17:28:30.290Z · comments (6)

What o3 Becomes by 2028
Vladimir_Nesov · 2024-12-22T12:37:20.929Z · comments (15)

BIG-Bench Canary Contamination in GPT-4
Jozdien · 2024-10-22T15:40:48.166Z · comments (13)

Hire (or Become) a Thinking Assistant
Raemon · 2024-12-23T03:58:42.061Z · comments (43)

A bird's eye view of ARC's research
Jacob_Hilton · 2024-10-23T15:50:06.123Z · comments (12)

[link] The Dangers of Mirrored Life
Niko_McCarty (niko-2) · 2024-12-12T20:58:32.750Z · comments (7)

[link] Miles Brundage resigned from OpenAI, and his AGI readiness team was disbanded
garrison · 2024-10-23T23:40:57.180Z · comments (1)

Passages I Highlighted in The Letters of J.R.R.Tolkien
Ivan Vendrov (ivan-vendrov) · 2024-11-25T01:47:59.071Z · comments (10)

Scissors Statements for President?
AnnaSalamon · 2024-11-06T10:38:21.230Z · comments (32)

The Dream Machine
sarahconstantin · 2024-12-05T00:00:05.796Z · comments (6)

2024 in AI predictions
jessicata (jessica.liu.taylor) · 2025-01-01T20:29:49.132Z · comments (3)

The o1 System Card Is Not About o1
Zvi · 2024-12-13T20:30:08.048Z · comments (5)

Should CA, TX, OK, and LA merge into a giant swing state, just for elections?
Thomas Kwa (thomas-kwa) · 2024-11-06T23:01:48.992Z · comments (35)

The Plan - 2024 Update
johnswentworth · 2024-12-31T13:29:53.888Z · comments (27)

Capital Ownership Will Not Prevent Human Disempowerment
beren · 2025-01-05T06:00:23.095Z · comments (9)

You should consider applying to PhDs (soon!)
bilalchughtai (beelal) · 2024-11-29T20:33:12.462Z · comments (19)

Ablations for “Frontier Models are Capable of In-context Scheming”
AlexMeinke (Paulawurm) · 2024-12-17T23:58:19.222Z · comments (1)

Hierarchical Agency: A Missing Piece in AI Alignment
Jan_Kulveit · 2024-11-27T05:49:04.241Z · comments (20)

AIs Will Increasingly Attempt Shenanigans
Zvi · 2024-12-16T15:20:05.652Z · comments (2)

[link] Parkinson's Law and the Ideology of Statistics
Benquo · 2025-01-04T15:49:21.247Z · comments (2)

DeepSeek beats o1-preview on math, ties on coding; will release weights
Zach Stein-Perlman · 2024-11-20T23:50:26.597Z · comments (26)

Why I'm Moving from Mechanistic to Prosaic Interpretability
Daniel Tan (dtch1997) · 2024-12-30T06:35:43.417Z · comments (34)

Sorry for the downtime, looks like we got DDosd
habryka (habryka4) · 2024-12-02T04:14:30.209Z · comments (13)

A Three-Layer Model of LLM Psychology
Jan_Kulveit · 2024-12-26T16:49:41.738Z · comments (7)

The Big Nonprofits Post
Zvi · 2024-11-29T16:10:06.938Z · comments (10)

[link] Announcing turntrout.com, my new digital home
TurnTrout · 2024-11-17T17:42:08.164Z · comments (24)

I turned decision theory problems into memes about trolleys
Tapatakt · 2024-10-30T20:13:29.589Z · comments (23)

[link] How to replicate and extend our alignment faking demo
Fabien Roger (Fabien) · 2024-12-19T21:44:13.059Z · comments (5)

Takes on "Alignment Faking in Large Language Models"
Joe Carlsmith (joekc) · 2024-12-18T18:22:34.059Z · comments (7)

[Fiction] [Comic] Effective Altruism and Rationality meet at a Secular Solstice afterparty
tandem · 2025-01-07T19:11:21.238Z · comments (4)

A shortcoming of concrete demonstrations as AGI risk advocacy
Steven Byrnes (steve2152) · 2024-12-11T16:48:41.602Z · comments (27)

LLMs can learn about themselves by introspection
Felix J Binder (fjb) · 2024-10-18T16:12:51.231Z · comments (38)

2024 Unofficial LessWrong Census/Survey
Screwtape · 2024-12-02T05:30:53.019Z · comments (48)

Behavioral red-teaming is unlikely to produce clear, strong evidence that models aren't scheming
Buck · 2024-10-10T13:36:53.810Z · comments (4)

The nihilism of NeurIPS
charlieoneill (kingchucky211) · 2024-12-20T23:58:11.858Z · comments (7)

Bigger Livers?
sarahconstantin · 2024-11-08T21:50:09.814Z · comments (13)

MIRI’s 2024 End-of-Year Update
Rob Bensinger (RobbBB) · 2024-12-03T04:33:47.499Z · comments (2)

The "Think It Faster" Exercise
Raemon · 2024-12-11T19:14:10.427Z · comments (13)

[link] Seven lessons I didn't learn from election day
Eric Neyman (UnexpectedValues) · 2024-11-14T18:39:07.053Z · comments (33)

The case for unlearning that removes information from LLM weights
Fabien Roger (Fabien) · 2024-10-14T14:08:04.775Z · comments (15)

[link] Anthropic: Three Sketches of ASL-4 Safety Case Components
Zach Stein-Perlman · 2024-11-06T16:00:06.940Z · comments (33)

A breakdown of AI capability levels focused on AI R&D labor acceleration
ryan_greenblatt · 2024-12-22T20:56:00.298Z · comments (5)

My AGI safety research—2024 review, ’25 plans
Steven Byrnes (steve2152) · 2024-12-31T21:05:19.037Z · comments (4)

[question] What are the strongest arguments for very short timelines?
Kaj_Sotala · 2024-12-23T09:38:56.905Z · answers+comments (72)

[link] Finishing The SB-1047 Documentary In 6 Weeks
Michaël Trazzi (mtrazzi) · 2024-10-28T20:17:47.465Z · comments (5)

[link] Sabotage Evaluations for Frontier Models
David Duvenaud (david-duvenaud) · 2024-10-18T22:33:14.320Z · comments (55)

Science advances one funeral at a time
Cameron Berg (cameron-berg) · 2024-11-01T23:06:19.381Z · comments (9)

Catastrophic sabotage as a major threat model for human-level AI systems
evhub · 2024-10-22T20:57:11.395Z · comments (11)

Zvi’s Thoughts on His 2nd Round of SFF
Zvi · 2024-11-20T13:40:08.092Z · comments (2)

Recent comments

abstractapplic on D&D.Sci Coliseum: Arena of Data Evaluation and Ruleset

Just realized I forgot to mention this: I really like how the interactive handled the Bonus Objective, i.e. if the player is thinking along the right lines their character automatically makes the in-universe sensible/optimal decision for them (which means you can set up a fair Bonus Objective for players who don't live in that universe and so don't have all the context).

morpheus on On Eating the Sun

I was already sold on the singularity. For what it's worth, I found the post and comments very helpful for understanding why you would want to take the sun apart in the first place, and why it would be feasible and desirable for both superintelligent and non-superintelligent civilizations. (Turning the sun into a smaller sun that doesn't explode seems nicer than having it explode. Fusion gives off way more energy than lifting the material takes; gravity is the weakest of the four forces, after all. In a superintelligent civilization with reversible computers, not taking apart the sun will make readily available mass a taut constraint.)

quetzal_rainbow on On Eating the Sun

If you can use 1kg of hydrogen to lift x>1kg of hydrogen using proton-proton fusion, you get exponential buildup, limited only by "how many proton-proton reactors you can build in the Solar System" and "how willing you are to actually build them", and you can use that exponential buildup to create all the necessary infrastructure.

antontimmer on Nathan Young's Shortform

Here is an example which I believe is directionally correct; it took me roughly 20 minutes to come up with it. The prompt is "how do living systems create meaning?":

1. My life feels like it has meaning (sensory-motor behavior and conceptual intentional aspects). Looking at it through an evolutionary perspective, it is highly likely that meaning assignment is the way through which living systems survived. Thus, there has to be some base biological level at which meaning is created through cell-cell communication / bioelectricity / biochemistry / biosensing etc.
2. Life is just made of atoms. Atoms are just automata. This implies there is no meaning at the atom level, and thus it cannot pop up at higher levels through emergence or some shit. You are delusional to believe there is some meaning assignment in life.
3. Meaning is something that is defined through the language that we speak. It is well known that different cultures have different words and conceptual framings, which implies that meaning is different in different cultures. Meaning thus only depends on language.
4. Meaning is just a social construct and we can define anything to have meaning. Thus it doesn't matter what you find meaningful, since it is just something you inherited through society and parenting.

I believe points 1-3 are fine, point 4 is kinda shaky.

jessica-liu-taylor on On Eating the Sun

Doesn't have to expend the energy. It's about reshaping the matter into machines. Computers take lots of mass-energy to constitute them, not to power them.
Things can go 6 orders of magnitude faster due to intelligence/agency, it's not highly unlikely in general.
I agree that in theory the arguments here could be better. It might require knowing more physics than I do, and has the "how does Kasparov beat you at chess" problem.

auspicious on 2024 in AI predictions

Thank you for putting this together.

Something I find interesting is that even many of the highest-profile skeptics of AI progress are surprisingly bullish (from an objective perspective).

For example, Yann LeCun has said we might get to AGI within a decade or two, and even Gary Marcus has gone on record saying "I do think we will eventually reach AGI (artificial general intelligence), and quite possibly before the end of this century."

"Before the end of this century" might seem pessimistic, but you'd think a true pessimist would say it will take centuries or millennia or even never happen at all. Almost no one seems to be saying that.

ted-sanders on Tips On Empirical Research Slides

Management consulting firms have lots of great ideas on slide design: https://www.theanalystacademy.com/consulting-presentations/

Some things they do well:

They treat slides as documents that can be understood standalone (this is even useful when presenting, as not everyone is following every word)
They employ a lot of hierarchy to help make the content skimmable (helpful for efficiency)
They put conclusions / summaries / action items up front, details behind (helpful for efficiency, especially in high-trust environments)

ted-sanders on Tips On Empirical Research Slides

Additional thoughts:

More than 3 bars/colors is fine
I recommend using horizontal bars on some of those slides, so the labels are written in the same direction as the bars - lets you fill space more efficiently
Put sentences / verbs in titles; noun titles like "Summary" or "Discussion" are low value
If you're measuring deltas between two things, compute the error bar on the delta, don't compute the error bars on the two things; consider coloring by statistical significance (e.g., continuous color scale over range of standard errors of differences of the mean)
In addition to agenda, it can be helpful to start with objectives - why are you here and what are you hoping to get from them? are you trying to inform them? get advice on something specific? get advice on something broad?
Can help to include real data / real prompts / real model outputs - harder to fool yourself when you look at real data instead of relying on abstract metrics and intentions
It's fine to have crummy slides - don't waste 1 hour of your time to save 5 minutes of your audience's time - the slides should serve you, not the other way around
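
The point above about computing the error bar on the delta can be sketched roughly as follows. This is a minimal illustration, assuming the two conditions were measured on the same examples (a paired design); the function names and the per-example scores are hypothetical, not taken from any real experiment.

```python
import math
import statistics


def standard_error(xs):
    """Standard error of the mean of one sample."""
    return statistics.stdev(xs) / math.sqrt(len(xs))


def paired_delta_with_error(a, b):
    """Return (mean delta, standard error of the delta) for paired samples."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 items")
    # Difference per example first, then summarize the differences.
    deltas = [y - x for x, y in zip(a, b)]
    return statistics.fmean(deltas), standard_error(deltas)


# Hypothetical per-example scores for a baseline and a variant on the same 6 examples.
baseline = [0.50, 0.70, 0.30, 0.90, 0.60, 0.40]
variant = [0.55, 0.76, 0.34, 0.95, 0.67, 0.44]

mean_delta, se_delta = paired_delta_with_error(baseline, variant)

# For contrast: combining the two separate error bars as if the samples were independent.
naive_se = math.sqrt(standard_error(baseline) ** 2 + standard_error(variant) ** 2)

print(f"delta = {mean_delta:.3f} +/- {se_delta:.3f} (error bar on the delta)")
print(f"combined separate error bars = {naive_se:.3f}")
```

When the per-example scores are strongly correlated, as in this made-up data, the error bar on the delta is much tighter than what the two separate error bars would suggest, which is the reason to compute it directly.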

quanticle on Ann Altman has filed a lawsuit in US federal court alleging that she was sexually abused by Sam Altman

The content of the complaint caused me to have additional doubt about the truth of Ann Altman's claims. One of the key claims in pythagoras0515's post is that Ann Altman's claims have been self-consistent. That is, Ann Altman has been claiming that approximately the same acts occurred, over a consistent period of time, when given the opportunity to express her views. However, here, there is significant divergence. In the lawsuit complaint, she is alleging that the abuse took place repeatedly over eight to nine years, a claim that is not supported by any of the evidence in pythagoras0515's post. In addition, another claim from the original post is that the reason she's only bringing up these allegations now is because she suppressed the memory of the abuse. The science behind suppressed memory is controversial, but I doubt that even its staunchest advocates would claim that a person could involuntarily suppress the memory of repeated acts carried out consistently over a long period of time. Therefore, I am more inclined to doubt Ann Altman's allegations based on the contents of the initial complaint filed for the lawsuit.

All that said, I do look forward to seeing what other evidence she can bring forth to support her claims, assuming that Sam Altman doesn't settle out of court to avoid the negative publicity of a trial.

drake-thomas on Drake Thomas's Shortform

(TLDR: Recent Cochrane review says zinc lozenges shave 0.5 to 4 days off of cold duration with low confidence, middling results for other endpoints. Some reason to think good lozenges are better than this.)

There's a 2024 Cochrane review on zinc lozenges for colds that's come out since LessWrong posts on the topic from 2019, 2020, and 2021. 34 studies, 17 of which are lozenges, 9/17 are gluconate and I assume most of the rest are acetate but they don't say. Not on sci-hub or Anna's Archive, so I'm just going off the abstract and summary here; would love a PDF if anyone has one.

Dosing ranged between 45 and 276 mg/day, which lines up with 3-15 18mg lozenges per day: basically in the same ballpark as the recommendation on Life Extension's acetate lozenges (the rationalist favorite).
Evidence for prevention is weak (partly bc fewer studies): they looked at risk of developing cold, rate of colds during followup, duration conditional on getting a cold, and global symptom severity. All but the last had CIs just barely overlapping "no effect" but leaning in the efficacious direction; even the optimistic ends of the CIs don't seem great, though.
Evidence for treatment is OK: "there may be a reduction in the mean duration of the cold in days (MD ‐2.37, 95% CI ‐4.21 to ‐0.53; I² = 97%; 8 studies, 972 participants; low‐certainty evidence)". P(cold at end of followup) and global symptom severity look like basically noise and have few studies.

My not very informed takes:

On the model of the podcast in the 2019 post, I should expect several of these studies to be using treatments I think are less or not at all efficacious, be less surprised by study-to-study variation, and increase my estimate of the effect size of using zinc acetate lozenges compared to anything else. Also maybe I worry that some of these studies didn't start zinc early enough? Ideally I could get the full PDF and they'll just have a table of (study, intervention type, effect size).
Even with the caveats around some methods of absorption being worse than others, this seems rough for a theory in which zinc acetate taken early completely obliterates colds - the prevention numbers just don't look very good. (But maybe the prevention studies all used bad zinc?)
I don't know what baseline cold duration is, but assuming it's something like a week, this lines up pretty well with the 33% decrease (40% for acetate) seen in this meta-analysis from 2013 if we assume effect sizes are dragged down by worse forms of zinc in the 2024 review.
- But note these two reviews are probably looking at many of the same studies, so that's more of an indication that nothing damning has come out since 2013 rather than an independent datapoint.
My current best guess for the efficacy of zinc acetate lozenges at 18mg every two waking hours from onset of any symptoms, as measured by "expected decrease in integral of cold symptom disutility", is:
- 15% demolishes colds, like 0.2x disutility
- 25% helps a lot, like 0.5x disutility
- 35% helps some (or helps lots but only for a small subset of people or cases), like 0.75x disutility
- 25% negligible difference from placebo
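
As a rough check on the numbers above, the best-guess distribution can be collapsed into a single expected disutility multiplier and compared with the review's point estimate. This is only a sketch: the 1.0 multiplier for "negligible difference from placebo" and the variable names are assumptions added for illustration, and the one-week baseline is the guess stated earlier in this comment, not a measured value.

```python
# (probability, disutility multiplier) pairs from the best-guess list above.
scenarios = [
    (0.15, 0.20),  # demolishes colds
    (0.25, 0.50),  # helps a lot
    (0.35, 0.75),  # helps some
    (0.25, 1.00),  # negligible difference from placebo (assumed multiplier of 1.0)
]

# The probabilities should form a complete distribution.
total_probability = sum(p for p, _ in scenarios)
assert abs(total_probability - 1.0) < 1e-9

# Probability-weighted average of the disutility multipliers.
expected_multiplier = sum(p * m for p, m in scenarios)
expected_decrease = 1.0 - expected_multiplier

# Cochrane point estimate (MD -2.37 days) against an assumed one-week baseline.
mean_difference_days = 2.37
assumed_baseline_days = 7.0
relative_reduction = mean_difference_days / assumed_baseline_days

print(f"expected disutility multiplier: {expected_multiplier:.4f}")
print(f"expected decrease in disutility: {expected_decrease:.1%}")
print(f"review point estimate vs 7-day baseline: {relative_reduction:.1%}")
```

Under these assumptions both figures land around a one-third reduction, though the disutility integral and cold duration in days are different measures, so the comparison is only indicative.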

I woke up at 2am this morning with my throat feeling bad, started taking Life Extension peppermint flavored 18mg zinc acetate lozenges at noon, expecting to take 5ish lozenges per day for 3 days or while symptoms are worsening. My most recent cold before this was about 6 days: [mild throat tingle, bad, worse, bad, fair bit better, nearly symptomless, symptomless]. I'll follow up about how it goes!