LessWrong 2.0 Reader

Shallow review of live agendas in alignment & safety
technicalities · 2023-11-27T11:10:27.464Z · comments (69)
Social Dark Matter
[DEACTIVATED] Duncan Sabien (Duncan_Sabien) · 2023-11-16T20:00:00.000Z · comments (112)
OpenAI: The Battle of the Board
Zvi · 2023-11-22T17:30:04.574Z · comments (82)
OpenAI: Facts from a Weekend
Zvi · 2023-11-20T15:30:06.732Z · comments (158)
The 6D effect: When companies take risks, one email can be very powerful.
scasper · 2023-11-04T20:08:39.775Z · comments (40)
AI Timelines
habryka (habryka4) · 2023-11-10T05:28:24.841Z · comments (74)
The 101 Space You Will Always Have With You
Screwtape · 2023-11-29T04:56:40.240Z · comments (20)
What are the results of more parental supervision and less outdoor play?
juliawise · 2023-11-25T12:52:29.986Z · comments (30)
Ability to solve long-horizon tasks correlates with wanting things in the behaviorist sense
So8res · 2023-11-24T17:37:43.020Z · comments (83)
[link] Sam Altman fired from OpenAI
LawrenceC (LawChan) · 2023-11-17T20:42:30.759Z · comments (75)
Propaganda or Science: A Look at Open Source AI and Bioterrorism Risk
1a3orn · 2023-11-02T18:20:29.569Z · comments (79)
Thinking By The Clock
Screwtape · 2023-11-08T07:40:59.936Z · comments (27)
The other side of the tidal wave
KatjaGrace · 2023-11-03T05:40:05.363Z · comments (79)
Vote on Interesting Disagreements
Ben Pace (Benito) · 2023-11-07T21:35:00.270Z · comments (129)
My thoughts on the social response to AI risk
Matthew Barnett (matthew-barnett) · 2023-11-01T21:17:08.184Z · comments (36)
You can just spontaneously call people you haven't met in years
lc · 2023-11-13T05:21:05.726Z · comments (19)
Does davidad's uploading moonshot work?
jacobjacob · 2023-11-03T02:21:51.720Z · comments (32)
Loudly Give Up, Don't Quietly Fade
Screwtape · 2023-11-13T23:30:25.308Z · comments (11)
[link] Moral Reality Check (a short story)
jessicata (jessica.liu.taylor) · 2023-11-26T05:03:18.254Z · comments (44)
EA orgs' legal structure inhibits risk taking and information sharing on the margin
Elizabeth (pktechgirl) · 2023-11-05T19:13:56.135Z · comments (17)
Integrity in AI Governance and Advocacy
habryka (habryka4) · 2023-11-03T19:52:33.180Z · comments (57)
How to (hopefully ethically) make money off of AGI
habryka (habryka4) · 2023-11-06T23:35:16.476Z · comments (75)
Apocalypse insurance, and the hardline libertarian take on AI risk
So8res · 2023-11-28T02:09:52.400Z · comments (36)
8 examples informing my pessimism on uploading without reverse engineering
Steven Byrnes (steve2152) · 2023-11-03T20:03:50.450Z · comments (12)
Experiences and learnings from both sides of the AI safety job market
Marius Hobbhahn (marius-hobbhahn) · 2023-11-15T15:40:32.196Z · comments (4)
How much to update on recent AI governance moves?
habryka (habryka4) · 2023-11-16T23:46:01.601Z · comments (4)
Picking Mentors For Research Programmes
Raymond D · 2023-11-10T13:01:14.197Z · comments (8)
Stuxnet, not Skynet: Humanity's disempowerment by AI
Roko · 2023-11-04T22:23:55.428Z · comments (23)
New LessWrong feature: Dialogue Matching
jacobjacob · 2023-11-16T21:27:16.763Z · comments (22)
Deception Chess: Game #1
Zane · 2023-11-03T21:13:55.777Z · comments (19)
[link] My techno-optimism [By Vitalik Buterin]
habryka (habryka4) · 2023-11-27T23:53:35.859Z · comments (16)
One Day Sooner
Screwtape · 2023-11-02T19:00:58.427Z · comments (5)
On the Executive Order
Zvi · 2023-11-01T14:20:01.657Z · comments (3)
[link] The Soul Key
Richard_Ngo (ricraz) · 2023-11-04T17:51:53.176Z · comments (9)
Learning-theoretic agenda reading list
Vanessa Kosoy (vanessa-kosoy) · 2023-11-09T17:25:35.046Z · comments (0)
Kids or No kids
Kids or no kids (grosseholz.f@gmail.com) · 2023-11-14T18:37:02.799Z · comments (10)
[link] Large Language Models can Strategically Deceive their Users when Put Under Pressure.
ReaderM · 2023-11-15T16:36:04.446Z · comments (8)
Public Call for Interest in Mathematical Alignment
Davidmanheim · 2023-11-22T13:22:09.558Z · comments (9)
Growth and Form in a Toy Model of Superposition
Liam Carroll (liam-carroll) · 2023-11-08T11:08:04.359Z · comments (5)
[link] Dario Amodei’s prepared remarks from the UK AI Safety Summit, on Anthropic’s Responsible Scaling Policy
Zac Hatfield-Dodds (zac-hatfield-dodds) · 2023-11-01T18:10:31.110Z · comments (1)
Coup probes: Catching catastrophes with probes trained off-policy
Fabien Roger (Fabien) · 2023-11-17T17:58:28.687Z · comments (7)
Saying the quiet part out loud: trading off x-risk for personal immortality
disturbance · 2023-11-02T17:43:34.155Z · comments (89)
Bostrom Goes Unheard
Zvi · 2023-11-13T14:11:07.586Z · comments (9)
Agent Boundaries Aren't Markov Blankets. [Unless they're non-causal; see comments.]
abramdemski · 2023-11-20T18:23:40.443Z · comments (8)
Self-Referential Probabilistic Logic Admits the Payor's Lemma
Yudhister Kumar (randomwalks) · 2023-11-28T10:27:29.029Z · comments (13)
Untrusted smart models and trusted dumb models
Buck · 2023-11-04T03:06:38.001Z · comments (12)
Announcing Athena - Women in AI Alignment Research
Claire Short (claire-short) · 2023-11-07T21:46:41.741Z · comments (2)
New report: "Scheming AIs: Will AIs fake alignment during training in order to get power?"
Joe Carlsmith (joekc) · 2023-11-15T17:16:42.088Z · comments (26)
My Criticism of Singular Learning Theory
Joar Skalse (Logical_Lunatic) · 2023-11-19T15:19:16.874Z · comments (56)
Thomas Kwa's research journal
Thomas Kwa (thomas-kwa) · 2023-11-23T05:11:08.907Z · comments (1)