LessWrong 2.0 Reader

Shapley Value Attribution in Chain of Thought
leogao · 2023-04-14T05:56:18.208Z · comments (5)
10 reasons why lists of 10 reasons might be a winning strategy
trevor (TrevorWiesinger) · 2023-04-06T21:24:17.896Z · comments (7)
AI #8: People Can Do Reasonable Things
Zvi · 2023-04-20T15:50:00.826Z · comments (16)
AI Alignment Research Engineer Accelerator (ARENA): call for applicants
CallumMcDougall (TheMcDouglas) · 2023-04-17T20:30:03.965Z · comments (9)
The Social Alignment Problem
irving (judith) · 2023-04-28T14:16:17.825Z · comments (13)
Given the Restrict Act, Don’t Ban TikTok
Zvi · 2023-04-04T14:40:03.162Z · comments (9)
Would we even want AI to solve all our problems?
So8res · 2023-04-21T18:04:11.636Z · comments (15)
Scaffolded LLMs as natural language computers
beren · 2023-04-12T10:47:42.904Z · comments (10)
Communicating effectively under Knightian norms
Richard_Ngo (ricraz) · 2023-04-03T22:39:58.350Z · comments (54)
A Confession about the LessWrong Team
Ruby · 2023-04-01T21:47:11.572Z · comments (5)
[link] Singularities against the Singularity: Announcing Workshop on Singular Learning Theory and Alignment
Jesse Hoogland (jhoogland) · 2023-04-01T09:58:22.764Z · comments (0)
You can use GPT-4 to create prompt injections against GPT-4
WitchBOT · 2023-04-06T20:39:51.584Z · comments (7)
Why Simulator AIs want to be Active Inference AIs
Jan_Kulveit · 2023-04-10T18:23:35.101Z · comments (8)
Polio Lab Leak Caught with Wastewater Sampling
Cullen (Cullen_OKeefe) · 2023-04-07T01:06:35.245Z · comments (3)
[link] Anthropic is further accelerating the Arms Race?
sapphire (deluks917) · 2023-04-06T23:29:24.080Z · comments (22)
Capabilities and alignment of LLM cognitive architectures
Seth Herd · 2023-04-18T16:29:29.792Z · comments (18)
The Agency Overhang
Jeffrey Ladish (jeff-ladish) · 2023-04-21T07:47:19.454Z · comments (6)
AI #6: Agents of Change
Zvi · 2023-04-06T14:00:00.702Z · comments (13)
Introducing AlignmentSearch: An AI Alignment-Informed Conversational Agent
BionicD0LPH1N (jumeaux200) · 2023-04-01T16:39:09.643Z · comments (14)
AISafety.world is a map of the AIS ecosystem
Hamish Doodles (hamish-doodles) · 2023-04-06T18:37:15.360Z · comments (0)
The surprising parameter efficiency of vision models
beren · 2023-04-08T19:44:36.186Z · comments (28)
AI Safety via Luck
Jozdien · 2023-04-01T20:13:55.346Z · comments (7)
No convincing evidence for gradient descent in activation space
Blaine (blaine-rogers) · 2023-04-12T04:48:56.459Z · comments (8)
Research agenda: Supervising AIs improving AIs
Quintin Pope (quintin-pope) · 2023-04-29T17:09:21.182Z · comments (5)
[link] I was Wrong, Simulator Theory is Real
Robert_AIZI · 2023-04-26T17:45:03.146Z · comments (7)
Introducing the Nuts and Bolts Of Naturalism
LoganStrohl (BrienneYudkowsky) · 2023-04-22T18:31:25.620Z · comments (1)
[link] Risks from GPT-4 Byproduct of Recursively Optimizing AIs
ben hayum (hayum) · 2023-04-07T00:02:59.185Z · comments (10)
Locating Fulcrum Experiences
LoganStrohl (BrienneYudkowsky) · 2023-04-28T20:14:03.644Z · comments (31)
[question] [link] Is this true? @tyler_m_john: [If we had started using CFCs earlier, we would have ended most life on the planet]
tailcalled · 2023-04-10T14:22:07.230Z · answers+comments (15)
All images from the WaitButWhy sequence on AI
trevor (TrevorWiesinger) · 2023-04-08T07:36:06.044Z · comments (5)
[link] Power laws in Speedrunning and Machine Learning
Jsevillamol · 2023-04-24T10:06:35.332Z · comments (1)
SmartyHeaderCode: anomalous tokens for GPT-3.5 and GPT-4
AdamYedidia (babybeluga) · 2023-04-15T22:35:30.039Z · comments (18)
Japan AI Alignment Conference Postmortem
Chris Scammell (chris-scammell) · 2023-04-20T10:58:34.065Z · comments (8)
SERI MATS - Summer 2023 Cohort
Aris · 2023-04-08T15:32:56.737Z · comments (25)
[link] My experience getting funding for my biological research
Metacelsus · 2023-04-16T22:53:33.453Z · comments (10)
The Computational Anatomy of Human Values
beren · 2023-04-06T10:33:24.989Z · comments (30)
AGI ruin mostly rests on strong claims about alignment and deployment, not about society
Rob Bensinger (RobbBB) · 2023-04-24T13:06:02.255Z · comments (8)
A decade of lurking, a month of posting
Max H (Maxc) · 2023-04-09T00:21:23.321Z · comments (4)
Exposure to Lizardman is Lethal
[DEACTIVATED] Duncan Sabien (Duncan_Sabien) · 2023-04-02T18:57:43.750Z · comments (96)
Romance, misunderstanding, social stances, and the human LLM
Kaj_Sotala · 2023-04-27T12:59:09.229Z · comments (32)
LW Account Restricted: OK for me, but not sure about LessWrong
amelia (314159) · 2023-04-12T19:45:17.042Z · comments (19)
[Linkpost] Sam Altman's 2015 Blog Posts Machine Intelligence Parts 1 & 2
OliviaJ (olivia-jimenez-1) · 2023-04-28T16:02:00.060Z · comments (4)
Why Are Maximum Entropy Distributions So Ubiquitous?
johnswentworth · 2023-04-05T20:12:57.748Z · comments (6)
Approximation is expensive, but the lunch is cheap
Jesse Hoogland (jhoogland) · 2023-04-19T14:19:12.570Z · comments (3)
Mechanistically interpreting time in GPT-2 small
rgould (Rhys Gould) · 2023-04-16T17:57:52.637Z · comments (6)
The Toxoplasma of AGI Doom and Capabilities?
Robert_AIZI · 2023-04-24T18:11:41.576Z · comments (12)
On "aiming for convergence on truth"
gjm · 2023-04-11T18:19:18.086Z · comments (55)
Subscripts for Probabilities
niplav · 2023-04-13T18:32:17.267Z · comments (9)
Top lesson from GPT: we will probably destroy humanity "for the lulz" as soon as we are able.
shminux · 2023-04-16T20:27:19.665Z · comments (28)
Apply to >30 AI safety funders in one application with the Nonlinear Network
KatWoods (ea247) · 2023-04-12T21:23:45.276Z · comments (12)