LessWrong 2.0 Reader


[link] The Witness
Richard_Ngo (ricraz) · 2023-12-03T22:27:16.248Z · comments (5)
Short Remark on the (subjective) mathematical 'naturalness' of the Nanda--Lieberum addition modulo 113 algorithm
carboniferous_umbraculum (Spencer Becker-Kahn) · 2023-06-01T11:31:37.796Z · comments (12)
Slightly against aligning with neo-luddites
Matthew Barnett (matthew-barnett) · 2022-12-26T22:46:42.693Z · comments (31)
I Don’t Know How To Count That Low
Elizabeth (pktechgirl) · 2021-10-22T22:00:02.708Z · comments (10)
[link] Direct effects matter!
Aaron Bergman (aaronb50) · 2021-03-14T04:33:11.493Z · comments (28)
Takes on "Alignment Faking in Large Language Models"
Joe Carlsmith (joekc) · 2024-12-18T18:22:34.059Z · comments (7)
Toward A Bayesian Theory Of Willpower
Scott Alexander (Yvain) · 2021-03-26T02:33:55.056Z · comments (28)
Karate Kid and Realistic Expectations for Disagreement Resolution
Raemon · 2019-12-04T23:25:59.608Z · comments (23)
Rapid Increase of Highly Mutated B.1.1.529 Strain in South Africa
dawangy · 2021-11-26T01:05:49.516Z · comments (15)
A shortcoming of concrete demonstrations as AGI risk advocacy
Steven Byrnes (steve2152) · 2024-12-11T16:48:41.602Z · comments (27)
Yes, AI research will be substantially curtailed if a lab causes a major disaster
lc · 2022-06-14T22:17:01.273Z · comments (31)
Money Stuff
Jacob Falkovich (Jacobian) · 2021-11-01T16:08:02.700Z · comments (18)
I Really Don't Understand Eliezer Yudkowsky's Position on Consciousness
J Bostock (Jemist) · 2021-10-29T11:09:20.559Z · comments (120)
Announcing Encultured AI: Building a Video Game
Andrew_Critch · 2022-08-18T02:16:26.726Z · comments (26)
Human takeover might be worse than AI takeover
Tom Davidson (tom-davidson-1) · 2025-01-10T16:53:27.043Z · comments (51)
Frequent arguments about alignment
John Schulman (john-schulman) · 2021-06-23T00:46:38.568Z · comments (17)
[link] A review of Where Is My Flying Car? by J. Storrs Hall
jasoncrawford · 2020-11-06T20:01:55.074Z · comments (23)
The Long Long Covid Post
Zvi · 2022-02-10T13:10:01.452Z · comments (29)
Biosecurity Culture, Computer Security Culture
jefftk (jkaufman) · 2023-08-30T16:40:03.101Z · comments (11)
Testing PaLM prompts on GPT3
Yitz (yitz) · 2022-04-06T05:21:06.841Z · comments (14)
[link] Scaling Laws for Reward Model Overoptimization
leogao · 2022-10-20T00:20:06.920Z · comments (13)
[link] Reproducing ARC Evals' recent report on language model agents
Thomas Broadley (thomas-broadley) · 2023-09-01T16:52:17.147Z · comments (17)
[link] Carl Sagan, nuking the moon, and not nuking the moon
eukaryote · 2024-04-13T04:08:50.166Z · comments (8)
My take on Vanessa Kosoy's take on AGI safety
Steven Byrnes (steve2152) · 2021-09-30T12:23:58.329Z · comments (10)
The Credit Assignment Problem
abramdemski · 2019-11-08T02:50:30.412Z · comments (40)
Experimentally evaluating whether honesty generalizes
paulfchristiano · 2021-07-01T17:47:57.847Z · comments (24)
[link] Turning air into bread
jasoncrawford · 2019-10-21T17:50:00.117Z · comments (12)
[link] [Linkpost] The Story Of VaccinateCA
hath · 2022-12-09T23:54:48.703Z · comments (4)
Solving Math Problems by Relay
Ben Goldhaber (bgold) · 2020-07-17T15:32:00.985Z · comments (26)
Key takeaways from our EA and alignment research surveys
Cameron Berg (cameron-berg) · 2024-05-03T18:10:41.416Z · comments (10)
Applied Linear Algebra Lecture Series
johnswentworth · 2022-12-22T06:57:26.643Z · comments (8)
LW Beta Feature: Side-Comments
jimrandomh · 2022-11-24T01:55:31.578Z · comments (47)
Introducing Leap Labs, an AI interpretability startup
Jessica Rumbelow (jessica-cooper) · 2023-03-06T16:16:22.182Z · comments (12)
Response to nostalgebraist: proudly waving my moral-antirealist battle flag
Steven Byrnes (steve2152) · 2024-05-29T16:48:29.408Z · comments (29)
Dreams of AI alignment: The danger of suggestive names
TurnTrout · 2024-02-10T01:22:51.715Z · comments (59)
Final Version Perfected: An Underused Execution Algorithm
willbradshaw · 2020-11-27T10:43:02.796Z · comments (34)
Open Source Sparse Autoencoders for all Residual Stream Layers of GPT2-Small
Joseph Bloom (Jbloom) · 2024-02-02T06:54:53.392Z · comments (37)
2022 was the year AGI arrived (Just don't call it that)
Logan Zoellner (logan-zoellner) · 2023-01-04T15:19:55.009Z · comments (60)
LLM Applications I Want To See
sarahconstantin · 2024-08-19T21:10:03.101Z · comments (5)
Perishable Knowledge
lsusr · 2021-12-18T05:53:03.343Z · comments (6)
Value systematization: how values become coherent (and misaligned)
Richard_Ngo (ricraz) · 2023-10-27T19:06:26.928Z · comments (48)
Contra shard theory, in the context of the diamond maximizer problem
So8res · 2022-10-13T23:51:29.532Z · comments (19)
Analysis: US restricts GPU sales to China
aogara (Aidan O'Gara) · 2022-10-07T18:38:06.517Z · comments (58)
Omicron Post #5
Zvi · 2021-12-09T21:10:00.469Z · comments (18)
What happens if you present 500 people with an argument that AI is risky?
KatjaGrace · 2024-09-04T16:40:03.562Z · comments (7)
Vegan Nutrition Testing Project: Interim Report
Elizabeth (pktechgirl) · 2023-01-20T05:50:03.565Z · comments (37)
Against "blankfaces"
philh · 2021-08-08T23:00:04.126Z · comments (12)
Safety Implications of LeCun's path to machine intelligence
Ivan Vendrov (ivan-vendrov) · 2022-07-15T21:47:44.411Z · comments (18)
[link] Alignment 201 curriculum
Richard_Ngo (ricraz) · 2022-10-12T18:03:03.454Z · comments (3)
[question] Exercise: Solve "Thinking Physics"
Raemon · 2023-08-01T00:44:48.975Z · answers+comments (30)