LessWrong 2.0 Reader

← previous page (newer posts) · next page (older posts) →

GPT-o1
Zvi · 2024-09-16T13:40:06.236Z · comments (34)

There is a globe in your LLM
jacob_drori (jacobcd52) · 2024-10-08T00:43:40.300Z · comments (4)

[link] What Depression Is Like
Sable · 2024-08-27T17:43:22.549Z · comments (23)

Circling as practice for “just be yourself”
Kaj_Sotala · 2024-12-16T07:40:04.482Z · comments (5)

5 homegrown EA projects, seeking small donors
Austin Chen (austin-chen) · 2024-10-28T23:24:25.745Z · comments (4)

Newsom Vetoes SB 1047
Zvi · 2024-10-01T12:20:06.127Z · comments (6)

Why you should be using a retinoid
GeneSmith · 2024-08-19T03:07:41.722Z · comments (60)

Self-prediction acts as an emergent regularizer
Cameron Berg (cameron-berg) · 2024-10-23T22:27:03.664Z · comments (4)

[link] Is Deep Learning Actually Hitting a Wall? Evaluating Ilya Sutskever's Recent Claims
garrison · 2024-11-13T17:00:01.005Z · comments (14)

JargonBot Beta Test
Raemon · 2024-11-01T01:05:26.552Z · comments (55)

Secular interpretations of core perennialist claims
zhukeepa · 2024-08-25T23:41:02.683Z · comments (33)

OpenAI o1, Llama 4, and AlphaZero of LLMs
Vladimir_Nesov · 2024-09-14T21:27:41.241Z · comments (25)

A breakdown of AI capability levels focused on AI R&D labor acceleration
ryan_greenblatt · 2024-12-22T20:56:00.298Z · comments (2)

Deep Causal Transcoding: A Framework for Mechanistically Eliciting Latent Behaviors in Language Models
Andrew Mack (andrew-mack) · 2024-12-03T21:19:42.333Z · comments (7)

[question] What are the strongest arguments for very short timelines?
Kaj_Sotala · 2024-12-23T09:38:56.905Z · answers+comments (60)

AI #92: Behind the Curve
Zvi · 2024-11-28T14:40:05.448Z · comments (7)

AI #83: The Mask Comes Off
Zvi · 2024-09-26T12:00:08.689Z · comments (20)

[question] What are the good rationality films?
Ben Pace (Benito) · 2024-11-20T06:04:56.757Z · answers+comments (53)

Release: Optimal Weave (P1): A Prototype Cohabitive Game
mako yass (MakoYass) · 2024-08-17T14:08:18.947Z · comments (21)

How to prevent collusion when using untrusted models to monitor each other
Buck · 2024-09-25T18:58:20.693Z · comments (11)

Remap your caps lock key
bilalchughtai (beelal) · 2024-12-15T14:03:33.623Z · comments (16)

Values Are Real Like Harry Potter
johnswentworth · 2024-10-09T23:42:24.724Z · comments (17)

[Intuitive self-models] 2. Conscious Awareness
Steven Byrnes (steve2152) · 2024-09-25T13:29:02.820Z · comments (48)

Scaffolding for "Noticing Metacognition"
Raemon · 2024-10-09T17:54:13.657Z · comments (4)

Darwinian Traps and Existential Risks
KristianRonn · 2024-08-25T22:37:14.142Z · comments (14)

[link] Not every accommodation is a Curb Cut Effect: The Handicapped Parking Effect, the Clapper Effect, and more
Michael Cohn (michael-cohn) · 2024-09-15T05:27:36.691Z · comments (39)

Testing which LLM architectures can do hidden serial reasoning
Filip Sondej · 2024-12-16T13:48:34.204Z · comments (9)

[link] Gwern Branwen interview on Dwarkesh Patel’s podcast: “How an Anonymous Researcher Predicted AI's Trajectory”
Said Achmiz (SaidAchmiz) · 2024-11-14T23:53:34.922Z · comments (0)

Quick look: applications of chaos theory
Elizabeth (pktechgirl) · 2024-08-18T15:00:07.853Z · comments (51)

Graceful Degradation
Screwtape · 2024-11-05T23:57:53.362Z · comments (8)

[link] Gwern: Why So Few Matt Levines?
kave · 2024-10-29T01:07:27.564Z · comments (10)

[link] Best-of-N Jailbreaking
John Hughes (john-hughes) · 2024-12-14T04:58:48.974Z · comments (6)

Should there be just one western AGI project?
rosehadshar · 2024-12-03T10:11:17.914Z · comments (72)

Rationality Quotes - Fall 2024
Screwtape · 2024-10-10T18:37:55.013Z · comments (26)

[link] Is "superhuman" AI forecasting BS? Some experiments on the "539" bot from the Centre for AI Safety
titotal (lombertini) · 2024-09-18T13:07:40.754Z · comments (3)

AIs Will Increasingly Fake Alignment
Zvi · 2024-12-24T13:00:07.770Z · comments (0)

Bitter lessons about lucid dreaming
avturchin · 2024-10-16T21:27:04.725Z · comments (62)

Dentistry, Oral Surgeons, and the Inefficiency of Small Markets
GeneSmith · 2024-11-01T17:26:06.466Z · comments (16)

LLM chatbots have ~half of the kinds of "consciousness" that humans believe in. Humans should avoid going crazy about that.
Andrew_Critch · 2024-11-22T03:26:11.681Z · comments (53)

The Obliqueness Thesis
jessicata (jessica.liu.taylor) · 2024-09-19T00:26:30.677Z · comments (17)

If we solve alignment, do we die anyway?
Seth Herd · 2024-08-23T13:13:10.933Z · comments (112)

Effective Evil's AI Misalignment Plan
lsusr · 2024-12-15T07:39:34.046Z · comments (9)

My 10-year retrospective on trying SSRIs
Kaj_Sotala · 2024-09-22T20:30:02.483Z · comments (10)

What is malevolence? On the nature, measurement, and distribution of dark traits
David Althaus (wallowinmaya) · 2024-10-23T08:41:33.197Z · comments (15)

The Packaging and the Payload
Screwtape · 2024-11-12T03:07:37.209Z · comments (1)

Secular Solstice Round Up 2024
dspeyer · 2024-11-21T10:49:36.682Z · comments (15)

Why I quit effective altruism, and why Timothy Telleen-Lawton is staying (for now)
Elizabeth (pktechgirl) · 2024-10-22T18:20:01.194Z · comments (79)

[link] Soft Nationalization: how the USG will control AI labs
Deric Cheng (deric-cheng) · 2024-08-27T15:11:14.601Z · comments (7)

[Intuitive self-models] 3. The Homunculus
Steven Byrnes (steve2152) · 2024-10-02T15:20:18.394Z · comments (36)

[link] Video lectures on the learning-theoretic agenda
Vanessa Kosoy (vanessa-kosoy) · 2024-10-27T12:01:32.777Z · comments (0)

Recent comments

akash-wasil on nikola's Shortform

@Nikola Jurkovic I'd be interested in timeline estimates for something along the lines of "AI that substantially increases AI R&D". Not exactly sure what the right way to operationalize this is, but something that says "if there is a period of Rapid AI-enabled AI Progress, this is when we think it would occur."

(I don't really love the "95% of fully remote jobs could be automated" frame, partly because I don't think it captures many of the specific domains we care about (e.g., AI-enabled R&D, other natsec-relevant capabilities) and partly because I suspect people have pretty different views of how easy/hard remote jobs are. Like, some people think that lots of remote jobs today are basically worthless and could already be automated, whereas others disagree. If the purpose of the forecasting question is to get a sense of how powerful AI will be, the disagreements about "how much do people actually contribute in remote jobs" seem like unnecessary noise.)

(Nitpicks aside, this is cool and I appreciate you running this poll!)

raemon on Hire (or become) a Thinking Assistant / Body Double

I imagined "FocusMate + TaskRabbit" specifically to address this issue.

Three types of workers I'm imagining here:

People who are reasonably skilled types, but who are youngish and haven't landed a job yet.
People who actively like doing this sort of work and are good at it.
People who have trouble getting/keeping a fulltime job for various reasons (which would land them in the "unreliable" sector), but... it's FocusMate/TaskRabbit, they don't need to be reliable all the time, there just needs to be one of them online who responds to you within a few hours, who is at least reasonably competent when they're sitting down and paying attention.

And then there are reviews (which I somehow UI-design to elicit honest reactions, rather than just slapping on a 0-5 star rating that everyone feels obligated to rate "5" all the time unless something was actively wrong), and workers have profiles about what they think they're good at and what others thought they were good at.

(where an expectation is that if you don't have active endorsements, or haven't yet been rated, you will probably charge a low rate)

Meanwhile if you're actively good and actively reliable, people can "favorite" you and work out deals where you commit to some schedule.

sharmake-farah on Noosphere89's Shortform

Note I'm not talking about moral weight here, and my point here is that all discussions about counterfactuals (especially human intuitions around counterfactuals) could in principle be executable/actually doable with enough compute and the ability to specify details, so counterfactability/counterfactual worlds isn't special from a philosophical perspective, as it implicitly refers to other real worlds/universes.

Of course, this isn't the only way to do so for a large class of counterfactuals/counterfactual worlds, and sometimes you can run them fully accurately on less compute/data if you can identify simplicities/abstractions that are lossless.

sam-g on Open Thread Fall 2024

Thanks so much for these resources, interesting!!
I'm not saying my only goal is to combat misinformation, but that logical discourse combats misinformation (as well as hegemonic propaganda) as a matter of course.

vladimir_nesov on Noosphere89's Shortform

I think explicitly computing details in full (as opposed to abstract reasoning about approximate properties) has no bearing on moral weight (degree of being real), but some kind of computational irreducibility forces the simulation of interesting things to get quite close to low level detail in order to figure out most global facts about what's going on there, such as values/culture of people living in a world after significant time passes.

zvi on AI: Practical Advice for the Worried

I find myself linking back to this often. I don't still fully endorse quite everything here, but the core messages still seem true even with things seeming further along.

I do think it should likely get updated soon for 2025.

alex-k-chen-parrot on AGI with RL is Bad News for Safety

Can't CoTs be what makes RL safe, however? (if you force the reasoner to self-limit under some recursion depth when it senses that the RL agent might be asking for so much that it becomes unsafe)

philh on What Have Been Your Most Valuable Casual Conversations At Conferences?

My girlfriend and I probably wouldn't have got together if not for a conversation at Less Wrong Community Weekend.

lukas-finnveden on What are the strongest arguments for very short timelines?

One argument I have been making publicly is that I think Ajeya's Bioanchors report greatly overestimated human brain compute. I think a more careful reading of Joe Carlsmith's report that hers was based on supports my own estimates of around 1e15 FLOPs.

Am I getting things mixed up, or isn't that just exactly Ajeya's median estimate? Quote from the report: "Under this definition, my median estimate for human brain computation is ~1e15 FLOP/s."

https://docs.google.com/document/d/1IJ6Sr-gPeXdSJugFulwIpvavc0atjHGM82QjIfUSBGQ/edit

dr_s on What Goes Without Saying

On this issue specifically, I feel like the bar for what counts as an actually sane and non-dysfunctional organization to the average user of this website is probably way too lofty for 95% of workplaces out there (to be generous!) so it's not even that strange that it would be the case.