LessWrong 2.0 Reader

← previous page (newer posts) · next page (older posts) →

[Intuitive self-models] 2. Conscious Awareness
Steven Byrnes (steve2152) · 2024-09-25T13:29:02.820Z · comments (48)

Graceful Degradation
Screwtape · 2024-11-05T23:57:53.362Z · comments (8)

Quick look: applications of chaos theory
Elizabeth (pktechgirl) · 2024-08-18T15:00:07.853Z · comments (51)

LessWrong Community Weekend 2024, open for applications
UnplannedCauliflower · 2024-05-01T10:18:21.992Z · comments (2)

A couple productivity tips for overthinkers
Steven Byrnes (steve2152) · 2024-04-20T16:05:50.332Z · comments (13)

Reward hacking behavior can generalize across tasks
Kei · 2024-05-28T16:33:50.674Z · comments (5)

Corrigibility = Tool-ness?
johnswentworth · 2024-06-28T01:19:48.883Z · comments (8)

[link] Is "superhuman" AI forecasting BS? Some experiments on the "539" bot from the Centre for AI Safety
titotal (lombertini) · 2024-09-18T13:07:40.754Z · comments (3)

[link] Gwern: Why So Few Matt Levines?
kave · 2024-10-29T01:07:27.564Z · comments (10)

EU policymakers reach an agreement on the AI Act
tlevin (trevor) · 2023-12-15T06:02:44.668Z · comments (7)

[link] The Cognitive-Theoretic Model of the Universe: A Partial Summary and Review
jessicata (jessica.liu.taylor) · 2024-03-27T19:59:27.893Z · comments (36)

Attention SAEs Scale to GPT-2 Small
Connor Kissane (ckkissane) · 2024-02-03T06:50:22.583Z · comments (4)

OpenAI: Leaks Confirm the Story
Zvi · 2023-12-12T14:00:04.812Z · comments (9)

Questions for labs
Zach Stein-Perlman · 2024-04-30T22:15:55.362Z · comments (11)

Send us example gnarly bugs
Beth Barnes (beth-barnes) · 2023-12-10T05:23:00.773Z · comments (10)

MATS Summer 2023 Retrospective
utilistrutil · 2023-12-01T23:29:47.958Z · comments (34)

[link] AI takeoff and nuclear war
owencb · 2024-06-11T19:36:24.710Z · comments (6)

JargonBot Beta Test
Raemon · 2024-11-01T01:05:26.552Z · comments (55)

Bitter lessons about lucid dreaming
avturchin · 2024-10-16T21:27:04.725Z · comments (62)

[link] [Linkpost] Practically-A-Book Review: Rootclaim $100,000 Lab Leak Debate
trevor (TrevorWiesinger) · 2024-03-28T16:03:36.452Z · comments (22)

The Parable Of The Fallen Pendulum - Part 2
johnswentworth · 2024-03-12T21:41:30.180Z · comments (8)

Secondary forces of debt
KatjaGrace · 2024-06-27T21:10:06.131Z · comments (18)

Rationality Quotes - Fall 2024
Screwtape · 2024-10-10T18:37:55.013Z · comments (26)

Creating unrestricted AI Agents with Command R+
Simon Lermen (dalasnoin) · 2024-04-16T14:52:50.917Z · comments (13)

ACX Covid Origins Post convinced readers
ErnestScribbler · 2024-05-01T13:06:20.818Z · comments (7)

On Claude 3.0
Zvi · 2024-03-06T18:50:04.766Z · comments (5)

Grief is a fire sale
Nathan Young · 2024-03-04T01:11:06.882Z · comments (1)

Lying Alignment Chart
Zack_M_Davis · 2023-11-29T16:15:28.102Z · comments (17)

Mid-conditional love
KatjaGrace · 2024-04-17T04:00:08.341Z · comments (21)

My 10-year retrospective on trying SSRIs
Kaj_Sotala · 2024-09-22T20:30:02.483Z · comments (10)

[link] Bengio's Alignment Proposal: "Towards a Cautious Scientist AI with Convergent Safety Bounds"
mattmacdermott · 2024-02-29T13:59:34.959Z · comments (19)

The Obliqueness Thesis
jessicata (jessica.liu.taylor) · 2024-09-19T00:26:30.677Z · comments (17)

Coherence of Caches and Agents
johnswentworth · 2024-04-01T23:04:31.320Z · comments (9)

The Packaging and the Payload
Screwtape · 2024-11-12T03:07:37.209Z · comments (1)

Value fragility and AI takeover
Joe Carlsmith (joekc) · 2024-08-05T21:28:07.306Z · comments (5)

Universal Love Integration Test: Hitler
Raemon · 2024-01-10T23:55:35.526Z · comments (65)

What is malevolence? On the nature, measurement, and distribution of dark traits
David Althaus (wallowinmaya) · 2024-10-23T08:41:33.197Z · comments (15)

Dentistry, Oral Surgeons, and the Inefficiency of Small Markets
GeneSmith · 2024-11-01T17:26:06.466Z · comments (16)

[question] What could a policy banning AGI look like?
TsviBT · 2024-03-13T14:19:07.783Z · answers+comments (23)

Analogies between scaling labs and misaligned superintelligent AI
scasper · 2024-02-21T19:29:39.033Z · comments (5)

[Valence series] 3. Valence & Beliefs
Steven Byrnes (steve2152) · 2023-12-11T20:21:30.570Z · comments (11)

Vote on Anthropic Topics to Discuss
Ben Pace (Benito) · 2024-03-06T19:43:47.194Z · comments (55)

[link] Video lectures on the learning-theoretic agenda
Vanessa Kosoy (vanessa-kosoy) · 2024-10-27T12:01:32.777Z · comments (0)

Could randomly choosing people to serve as representatives lead to better government?
John Huang · 2024-10-21T17:10:20.920Z · comments (13)

[link] The Offense-Defense Balance Rarely Changes
Maxwell Tabarrok (maxwell-tabarrok) · 2023-12-09T15:21:23.340Z · comments (23)

[Intuitive self-models] 4. Trance
Steven Byrnes (steve2152) · 2024-10-08T13:30:41.446Z · comments (7)

My guess at Conjecture's vision: triggering a narrative bifurcation
Alexandre Variengien (alexandre-variengien) · 2024-02-06T19:10:42.690Z · comments (12)

[link] The problems with the concept of an infohazard as used by the LW community [Linkpost]
Noosphere89 (sharmake-farah) · 2023-12-22T16:13:54.822Z · comments (43)

On the CrowdStrike Incident
Zvi · 2024-07-22T12:40:05.894Z · comments (14)

[link] Claude 3.5 Sonnet
Zach Stein-Perlman · 2024-06-20T18:00:35.443Z · comments (41)

Recent comments

dagon on Is the mind a program?

I'd call that an empirical problem that has philosophical consequences :)

And it's still not worth a lot of debate about far-mode possibilities, but it MAY be worth exploring what we actually know and what we can test in the near term. They've fully(*) emulated some brains - https://openworm.org/ is fascinating in how far it's come very recently. They're nowhere near emulating a brain big enough to try to compare WRT complex behaviors from which consciousness can be inferred.

* "fully" is not actually claimed nor tested. Only the currently-measurable neural weights and interactions are emulated. More subtle physical properties may well turn out to be important, but we can't tell yet if that's so.

jan_kulveit on Hierarchical Agency: A Missing Piece in AI Alignment

I guess make one? Unclear if hierarchical agency is the true name

mukashi on What are the good rationality films?

12 Angry Men

Connection to rationality:

This is just the perfect movie about rationality. Damn, there is even a fantastic YouTube series discussing this movie in the context of instrumental rationality! And besides, I have never met anyone who did not enjoy this classic film.

This classic film is a masterclass in group decision-making, overcoming biases, and the process of critical thinking. The plot revolves around a jury deliberating the guilt of a young man accused of murder. Initially, 11 out of the 12 jurors vote "guilty," but one juror (played by Henry Fonda) questions the certainty of the evidence. It is an absolute must-watch.

sil-ver on Is the mind a program?

What about if it's a philosophical problem that has empirical consequences? I.e., suppose answering the philosophical questions tells you enough about the brain that you know how hard it would be to simulate it on a digital computer. In this case, the answer can be tested -- but not yet -- and I still think you wouldn't know if someone had the answer already.

joseph-miller on The Big Nonprofits Post

Tarbell Fellowship at PPF

I think you've massively underrated this. My impression is that Tarbell has had a significant effect on the general AI discourse, by allowing a number of articles to be written in mainstream outlets.

dagon on Is the mind a program?

But if someone finds the correct answer to a philosophical question, then they can... try to write essays about it explaining the answer? Which maybe will be slightly more effective than essays arguing for any number of different positions because the answer is true?

I think this is a crux. To the extent that it's a purely philosophical problem (a modeling choice, contingent mostly on opinions and consensus about "useful" rather than "true"), posts like this one make no sense. To the extent that it's expressed as propositions that can be tested (even if not now, it could be described how it will resolve), it's NOT purely philosophical.

This post appears to be about an empirical question - can a human brain be simulated with sufficient fidelity to be indistinguishable from a biological brain. It's not clear whether OP is talking about an arbitrary new person, or if they include the upload problem as part of the unlikelihood. It's also not clear why anyone cares about this specific aspect of it, so maybe your comments are appropriate.

charlie-steiner on Is the mind a program?

Euan seems to be using the phrase to mean (something like) causal closure (as the phrase would normally be used e.g. in talking about physicalism) of the upper level of description - basically saying everything that actually happens makes sense in terms of the emergent theory; it doesn't need to have interventions from outside or below.

danielfilan on The Big Nonprofits Post

If one wants to investigate [the Alignment of Complex Systems research group] further, he has an AXRP podcast episode, which I haven’t listened to.

Note that if you want to investigate further but would rather read a transcript than watch a video, AXRP has you covered [LW · GW].

sil-ver on Is the mind a program?

I don't know the answer, and I'm pretty sure nobody else does either.

I see similar statements all the time (no one has solved consciousness yet / no one knows whether LLMs/chickens/insects/fish are conscious / I can only speculate and I'm pretty sure this is true for everyone else / etc.) ... and I don't see how this confidence is justified. The idea seems to be that no one can have found the correct answers to the philosophical problems yet because if they had, those answers would immediately make waves and soon everyone on LessWrong would know about it. But it's like... do you really think this? People can't even agree on whether consciousness is a well-defined thing or a lossy abstraction [LW · GW]; do we really think that if someone had the right answers, they would automatically convince everyone of them?

There are other fields where those kinds of statements make sense, like physics. If someone finds the correct answer to a physics question, they can run an experiment and prove it. But if someone finds the correct answer to a philosophical question, then they can... try to write essays about it explaining the answer? Which maybe will be slightly more effective than essays arguing for any number of different positions because the answer is true?

I can imagine someone several hundred years ago having figured out, purely based on first-principles reasoning, that life is not a crisp category in the territory but just a lossy conceptual abstraction. I can imagine them being highly confident in this result because they've derived it for correct reasons and they've verified all the steps that got them there. And I can imagine someone else throwing their hands up and saying "I don't know what mysterious force is behind the phenomenon of life, and I'm pretty sure no one else does, either".

Which is all just to say -- isn't it much more likely that the problem has been solved, and there are people who are highly confident in the solution because they have verified all the steps that led them there, and they know with high confidence which features need to be replicated to preserve consciousness... but you just don't know about it because "find the correct solution" and "convince people of a solution" are mostly independent problems, and there's just no reason why the correct solution would organically spread?

(As I've mentioned, the "we know that no one knows" thing is something I see expressed all the time, usually just stated as a self-evident fact -- so I'm equally arguing against everyone else who's expressed it. This just happens to be the first time that I've decided to formulate my objection.)

martin-randall on A shot at the diamond-alignment problem

It's looking like the values of humans are far, far simpler than a lot of evopsych literature and Yudkowsky.

I've missed this. Any particular link to get me started reading about this update? Shard theory seems to imply complex values in individual humans, though certainly less fragile than Yudkowsky proposed.