LessWrong 2.0 Reader

My take on Jacob Cannell’s take on AGI safety
Steven Byrnes (steve2152) · 2022-11-28T14:01:15.584Z · comments (15)
K-types vs T-types — what priors do you have?
Cleo Nardo (strawberry calm) · 2022-11-03T11:29:00.809Z · comments (25)
[link] Far-UVC Light Update: No, LEDs are not around the corner (tweetstorm)
Davidmanheim · 2022-11-02T12:57:23.445Z · comments (27)
Don't design agents which exploit adversarial inputs
TurnTrout · 2022-11-18T01:48:38.372Z · comments (64)
Why Would AI "Aim" To Defeat Humanity?
HoldenKarnofsky · 2022-11-29T19:30:07.828Z · comments (10)
[link] Real-Time Research Recording: Can a Transformer Re-Derive Positional Info?
Neel Nanda (neel-nanda-1) · 2022-11-01T23:56:06.215Z · comments (16)
All AGI Safety questions welcome (especially basic ones) [~monthly thread]
Robert Miles (robert-miles) · 2022-11-01T23:23:04.146Z · comments (105)
Against "Classic Style"
Cleo Nardo (strawberry calm) · 2022-11-23T22:10:50.422Z · comments (30)
2022 LessWrong Census?
SurfingOrca · 2022-11-07T05:16:33.207Z · comments (13)
The First Filter
adamShimi · 2022-11-26T19:37:04.607Z · comments (5)
[link] Career Scouting: Dentistry
koratkar · 2022-11-20T15:55:12.431Z · comments (5)
Clarifying wireheading terminology
leogao · 2022-11-24T04:53:23.925Z · comments (6)
Deontology and virtue ethics as "effective theories" of consequentialist ethics
Jan_Kulveit · 2022-11-17T14:11:49.087Z · comments (9)
Announcing AI safety Mentors and Mentees
Marius Hobbhahn (marius-hobbhahn) · 2022-11-23T15:21:12.636Z · comments (7)
Against a General Factor of Doom
Jeffrey Heninger (jeffrey-heninger) · 2022-11-23T16:50:04.229Z · comments (19)
Here's the exit.
Valentine · 2022-11-21T18:07:23.607Z · comments (178)
Alignment allows "nonrobust" decision-influences and doesn't require robust grading
TurnTrout · 2022-11-29T06:23:00.394Z · comments (42)
What’s the Deal with Elon Musk and Twitter?
Zvi · 2022-11-07T13:50:00.991Z · comments (11)
[link] New Frontiers in Mojibake
Adam Scherlis (adam-scherlis) · 2022-11-26T02:37:27.290Z · comments (7)
The Least Controversial Application of Geometric Rationality
Scott Garrabrant · 2022-11-25T16:50:56.497Z · comments (22)
FTX will probably be sold at a steep discount. What we know and some forecasts on what will happen next
Nathan Young · 2022-11-09T02:14:19.623Z · comments (21)
[link] Could a single alien message destroy us?
Writer · 2022-11-25T07:32:24.889Z · comments (23)
Open technical problem: A Quinean proof of Löb's theorem, for an easier cartoon guide
Andrew_Critch · 2022-11-24T21:16:43.879Z · comments (35)
Humans do acausal coordination all the time
Adam Jermyn (adam-jermyn) · 2022-11-02T14:40:39.730Z · comments (35)
A philosopher's critique of RLHF
ThomasW (ThomasWoodside) · 2022-11-07T02:42:51.234Z · comments (8)
Announcing Nonlinear Emergency Funding
KatWoods (ea247) · 2022-11-13T19:02:57.803Z · comments (0)
Human-level Diplomacy was my fire alarm
Lao Mein (derpherpize) · 2022-11-23T10:05:36.127Z · comments (15)
Some advice on independent research
Marius Hobbhahn (marius-hobbhahn) · 2022-11-08T14:46:19.134Z · comments (5)
Kelsey Piper's recent interview of SBF
agucova · 2022-11-16T20:30:35.901Z · comments (29)
What's the Alternative to Independence?
jefftk (jkaufman) · 2022-11-13T15:30:01.186Z · comments (3)
Human-level Full-Press Diplomacy (some bare facts).
Cleo Nardo (strawberry calm) · 2022-11-22T20:59:18.155Z · comments (7)
Noting an unsubstantiated communal belief about the FTX disaster
Yitz (yitz) · 2022-11-13T05:37:03.087Z · comments (52)
Developer experience for the motivation
Adam Zerner (adamzerner) · 2022-11-16T07:12:19.893Z · comments (7)
"Rudeness", a useful coordination mechanic
Raemon · 2022-11-11T22:27:35.023Z · comments (20)
A Mystery About High Dimensional Concept Encoding
Fabien Roger (Fabien) · 2022-11-03T17:05:56.034Z · comments (13)
Information Markets
eva_ · 2022-11-02T01:24:11.639Z · comments (6)
A Short Dialogue on the Meaning of Reward Functions
Leon Lang (leon-lang) · 2022-11-19T21:04:30.076Z · comments (0)
For ELK truth is mostly a distraction
c.trout (ctrout) · 2022-11-04T21:14:52.279Z · comments (0)
[link] The FTX Saga - Simplified
Annapurna (jorge-velez) · 2022-11-16T02:42:55.739Z · comments (10)
Spectrum of Independence
jefftk (jkaufman) · 2022-11-05T02:40:03.822Z · comments (7)
The biological function of love for non-kin is to gain the trust of people we cannot deceive
chaosmage · 2022-11-07T20:26:29.876Z · comments (3)
Rationalist Town Hall: FTX Fallout Edition (RSVP Required)
Ben Pace (Benito) · 2022-11-23T01:38:25.516Z · comments (13)
The optimal angle for a solar boiler is different than for a solar panel
Yair Halberstadt (yair-halberstadt) · 2022-11-10T10:32:47.187Z · comments (4)
Weekly Roundup #4
Zvi · 2022-11-04T15:00:01.096Z · comments (1)
A newcomer’s guide to the technical AI safety field
zeshen · 2022-11-04T14:29:46.873Z · comments (3)
We must be very clear: fraud in the service of effective altruism is unacceptable
evhub · 2022-11-10T23:31:06.422Z · comments (56)
Don't align agents to evaluations of plans
TurnTrout · 2022-11-26T21:16:23.425Z · comments (49)
Why square errors?
Aprillion (Peter Hozák) (Aprillion) · 2022-11-26T13:40:37.318Z · comments (11)
Counterfactability
Scott Garrabrant · 2022-11-07T05:39:05.668Z · comments (4)
[link] Scott Aaronson on "Reform AI Alignment"
shminux · 2022-11-20T22:20:23.895Z · comments (17)