LessWrong 2.0 Reader


← previous page (newer posts) · next page (older posts) →

[question] Are limited-horizon agents a good heuristic for the off-switch problem?
[deleted] · 2021-12-05T19:27:59.894Z · answers+comments (19)
ML Alignment Theory Program under Evan Hubinger
ozhang (oliver-zhang) · 2021-12-06T00:03:15.443Z · comments (3)
Anti-correlated causation
DirectedEvolution (AllAmericanBreakfast) · 2021-12-06T04:36:17.439Z · comments (2)
A Framework to Explain Bayesian Models
Jsevillamol · 2021-12-06T10:38:25.815Z · comments (1)
Modeling Failure Modes of High-Level Machine Intelligence
Ben Cottier (ben-cottier) · 2021-12-06T13:54:38.147Z · comments (1)
Life, struggle, and the psychological fallout from COVID
Alex Flint (alexflint) · 2021-12-06T16:59:39.611Z · comments (1)
Omicron Post #4
Zvi · 2021-12-06T17:00:01.470Z · comments (66)
Information bottleneck for counterfactual corrigibility
tailcalled · 2021-12-06T17:11:12.984Z · comments (1)
A Possible Resolution To Spurious Counterfactuals
JoshuaOSHickman · 2021-12-06T18:26:41.409Z · comments (5)
[link] Implications of the Grabby Aliens Model
harsimony · 2021-12-06T18:34:44.985Z · comments (3)
Are there alternatives to solving value transfer and extrapolation?
Stuart_Armstrong · 2021-12-06T18:53:52.659Z · comments (8)
More Christiano, Cotra, and Yudkowsky on AI progress
Eliezer Yudkowsky (Eliezer_Yudkowsky) · 2021-12-06T20:33:12.164Z · comments (28)
Declustering, reclustering, and filling in thingspace
Stuart_Armstrong · 2021-12-06T20:53:14.559Z · comments (6)
Leaving Orbit
Rob Bensinger (RobbBB) · 2021-12-06T21:48:41.371Z · comments (17)
Dear Self; We Need To Talk About Social Media
Elizabeth (pktechgirl) · 2021-12-07T00:40:01.949Z · comments (19)
Ordering yourself around with an app
bfinn · 2021-12-07T00:49:11.546Z · comments (2)
Retail Investor Advantages
leogao · 2021-12-07T02:08:20.694Z · comments (13)
Considerations on interaction between AI and expected value of the future
Beth Barnes (beth-barnes) · 2021-12-07T02:46:19.215Z · comments (28)
Interviews on Improving the AI Safety Pipeline
Chris_Leong · 2021-12-07T12:03:04.420Z · comments (15)
Exterminating humans might be on the to-do list of a Friendly AI
RomanS · 2021-12-07T14:15:07.206Z · comments (8)
Counting Lightning
Jsevillamol · 2021-12-07T14:50:55.680Z · comments (8)
Randomness in Science
rogersbacon · 2021-12-07T18:17:51.232Z · comments (15)
HIRING: Inform and shape a new project on AI safety at Partnership on AI
madhu_lika · 2021-12-07T19:37:31.220Z · comments (0)
Let's buy out Cyc, for use in AGI interpretability systems?
Steven Byrnes (steve2152) · 2021-12-07T20:46:10.303Z · comments (10)
Theoretical Neuroscience For Alignment Theory
Cameron Berg (cameron-berg) · 2021-12-07T21:50:10.142Z · comments (18)
Some thoughts on why adversarial training might be useful
Beth Barnes (beth-barnes) · 2021-12-08T01:28:22.974Z · comments (6)
What makes for a good "argument"? (Request for thoughts and comments)
Simon DeDeo (simon-dedeo) · 2021-12-08T02:16:10.805Z · comments (3)
Interpreting the Biobot Spike
jefftk (jkaufman) · 2021-12-08T16:30:07.924Z · comments (1)
[link] Deepmind's Gopher--more powerful than GPT-3
hath · 2021-12-08T17:06:32.650Z · comments (26)
The Last Questions (part 1)
rogersbacon · 2021-12-08T18:09:53.760Z · comments (0)
[AN #170]: Analyzing the argument for risk from power-seeking AI
Rohin Shah (rohinmshah) · 2021-12-08T18:10:04.022Z · comments (1)
Finding the multiple ground truths of CoinRun and image classification
Stuart_Armstrong · 2021-12-08T18:13:01.576Z · comments (4)
Seeing the Invisible (And How to Think About Machine Learning)
Filip Dousek (fidnie) · 2021-12-08T21:04:49.828Z · comments (0)
COVID and the holidays
Connor_Flexman · 2021-12-08T23:13:56.097Z · comments (31)
Introduction to inaccessible information
Ryan Kidd (ryankidd44) · 2021-12-09T01:28:48.154Z · comments (6)
[link] Cat Couplings
mike_hawke · 2021-12-09T01:41:11.646Z · comments (1)
Stop arbitrarily limiting yourself
unoptimal · 2021-12-09T02:42:34.466Z · comments (7)
Austin Winter Solstice
SilasBarta · 2021-12-09T05:01:17.511Z · comments (1)
Supervised learning and self-modeling: What's "superhuman?"
Charlie Steiner · 2021-12-09T12:44:14.004Z · comments (1)
[MLSN #2]: Adversarial Training
Dan H (dan-hendrycks) · 2021-12-09T17:16:49.684Z · comments (0)
[link] The end of Victorian culture, part I: structural forces
David Hugh-Jones (david-hugh-jones) · 2021-12-09T19:25:23.222Z · comments (0)
[question] What alignment-related concepts should be better known in the broader ML community?
Lauro Langosco · 2021-12-09T20:44:09.228Z · answers+comments (4)
LessWrong discussed in New Ideas in Psychology article
rogersbacon · 2021-12-09T21:01:17.920Z · comments (11)
Omicron Post #5
Zvi · 2021-12-09T21:10:00.469Z · comments (18)
Conversation on technology forecasting and gradualism
Richard_Ngo (ricraz) · 2021-12-09T21:23:21.187Z · comments (30)
Covid 12/9: Counting Down the Days
Zvi · 2021-12-09T21:40:01.105Z · comments (12)
Combining Forecasts
jsteinhardt · 2021-12-10T02:10:14.402Z · comments (1)
Are big brains for processing sensory input?
lsusr · 2021-12-10T07:08:31.495Z · comments (20)
The Promise and Peril of Finite Sets
davidad · 2021-12-10T12:29:56.535Z · comments (4)
There is essentially one best-validated theory of cognition.
abramdemski · 2021-12-10T15:51:06.423Z · comments (33)