LessWrong 2.0 Reader

[link] Against Diversification
Jack Malde (jackmalde) · 2022-12-22T13:29:38.765Z · comments (0)
[link] Notes on Meta's Diplomacy-Playing AI
Erich_Grunewald · 2022-12-22T11:34:27.384Z · comments (2)
Take 13: RLHF bad, conditioning good.
Charlie Steiner · 2022-12-22T10:44:06.359Z · comments (4)
Applied Linear Algebra Lecture Series
johnswentworth · 2022-12-22T06:57:26.643Z · comments (8)
Naive Set Theory, Halmos
David Udell · 2022-12-22T02:34:38.509Z · comments (1)
Not Getting Hacked
jefftk (jkaufman) · 2022-12-21T21:40:05.254Z · comments (14)
[link] Metaphor.systems
the gears to ascension (lahwran) · 2022-12-21T21:31:17.373Z · comments (9)
[question] How much is DQC (Dynamic Quantum Clustering) currently looked into in AI Capabilities Research?
macmillan · 2022-12-21T20:46:55.448Z · answers+comments (0)
[link] Think wider about the root causes of progress
jasoncrawford · 2022-12-21T20:05:46.986Z · comments (11)
[question] What readings did you consider best for the happy parts of the secular solstice?
ChristianKl · 2022-12-21T15:45:44.583Z · answers+comments (0)
Recreating logic in type theory
Thomas Kehrenberg (thomas-kehrenberg) · 2022-12-21T15:19:18.275Z · comments (0)
You become the UI you use
Viliam · 2022-12-21T15:04:17.072Z · comments (7)
Price's equation for neural networks
tailcalled · 2022-12-21T13:09:16.527Z · comments (4)
Decisions: Ontologically Shifting to Determinism
Chris_Leong · 2022-12-21T12:41:30.884Z · comments (11)
[link] A Comprehensive Mechanistic Interpretability Explainer & Glossary
Neel Nanda (neel-nanda-1) · 2022-12-21T12:35:08.589Z · comments (6)
[link] Google Search loses to ChatGPT fair and square
shminux · 2022-12-21T08:11:43.287Z · comments (17)
Sazen
[DEACTIVATED] Duncan Sabien (Duncan_Sabien) · 2022-12-21T07:54:51.415Z · comments (83)
[link] Podcast: What's Wrong With LessWrong
Alfred · 2022-12-21T07:06:08.728Z · comments (11)
[link] New AI risk intro from Vox [link post]
JakubK (jskatt) · 2022-12-21T06:00:06.031Z · comments (1)
Local Memes Against Geometric Rationality
Scott Garrabrant · 2022-12-21T03:53:28.196Z · comments (3)
Logging Shell History in Zsh
jefftk (jkaufman) · 2022-12-21T03:30:03.180Z · comments (2)
CIRL Corrigibility is Fragile
Rachel Freedman (rachelAF) · 2022-12-21T01:40:50.232Z · comments (9)
[question] [DISC] Are Values Robust?
DragonGod · 2022-12-21T01:00:29.939Z · answers+comments (9)
Performing an SVD on a time-series matrix of gradient updates on an MNIST network produces 92.5 singular values
Garrett Baker (D0TheMath) · 2022-12-21T00:44:55.373Z · comments (10)
[link] Progress links and tweets, 2022-12-20
jasoncrawford · 2022-12-21T00:35:59.686Z · comments (0)
K-complexity is silly; use cross-entropy instead
So8res · 2022-12-20T23:06:27.131Z · comments (53)
Podcast: Tamera Lanham on AI risk, threat models, alignment proposals, externalized reasoning oversight, and working at Anthropic
Akash (akash-wasil) · 2022-12-20T21:39:41.866Z · comments (2)
[link] Discovering Language Model Behaviors with Model-Written Evaluations
evhub · 2022-12-20T20:08:12.063Z · comments (34)
[link] Reflections: Bureaucratic Hell
Haris Rashid (haris-rashid) · 2022-12-20T19:22:13.606Z · comments (1)
[link] Proliferating Education
Haris Rashid (haris-rashid) · 2022-12-20T19:22:13.492Z · comments (2)
AGI is here, but nobody wants it. Why should we even care?
MGow · 2022-12-20T19:14:25.696Z · comments (0)
Properties of current AIs and some predictions of the evolution of AI from the perspective of scale-free theories of agency and regulative development
Roman Leventov · 2022-12-20T17:13:00.669Z · comments (3)
I believe some AI doomers are overconfident
[deleted] · 2022-12-20T17:09:23.325Z · comments (15)
Note on algorithms with multiple trained components
Steven Byrnes (steve2152) · 2022-12-20T17:08:24.057Z · comments (4)
Marvel Snap: Phase 2
Zvi · 2022-12-20T14:50:00.460Z · comments (1)
(Extremely) Naive Gradient Hacking Doesn't Work
ojorgensen · 2022-12-20T14:35:33.591Z · comments (0)
An Open Agency Architecture for Safe Transformative AI
davidad · 2022-12-20T13:04:06.409Z · comments (22)
[link] Under-Appreciated Ways to Use Flashcards - Part I
Florence Hinder (florence-hinder) · 2022-12-20T12:43:31.387Z · comments (5)
EA & LW Forums Weekly Summary (12th Dec - 18th Dec 22')
Zoe Williams (GreyArea) · 2022-12-20T09:49:51.463Z · comments (0)
[link] [link, 2019] AI paradigm: interactive learning from unlabeled instructions
the gears to ascension (lahwran) · 2022-12-20T06:45:30.035Z · comments (0)
[Fiction] Unspoken Stone
Gordon Seidoh Worley (gworley) · 2022-12-20T05:11:23.231Z · comments (0)
Notice when you stop reading right before you understand
just_browsing · 2022-12-20T05:09:43.224Z · comments (6)
Take 12: RLHF's use is evidence that orgs will jam RL at real-world problems.
Charlie Steiner · 2022-12-20T05:01:50.659Z · comments (1)
More notes from raising a late-talking kid
Steven Byrnes (steve2152) · 2022-12-20T02:13:01.018Z · comments (2)
The "Minimal Latents" Approach to Natural Abstractions
johnswentworth · 2022-12-20T01:22:25.101Z · comments (24)
[link] our deepest wishes
Tamsin Leake (carado-1) · 2022-12-20T00:23:32.892Z · comments (0)
Shard Theory in Nine Theses: a Distillation and Critical Appraisal
LawrenceC (LawChan) · 2022-12-19T22:52:20.031Z · comments (30)
[question] Will research in AI risk jinx it? Consequences of training AI on AI risk arguments
Yann Dubois (yann-dubois) · 2022-12-19T22:42:30.959Z · answers+comments (6)
AGI Timelines in Governance: Different Strategies for Different Timeframes
simeon_c (WayZ) · 2022-12-19T21:31:25.746Z · comments (28)
Towards Hodge-podge Alignment
Cleo Nardo (strawberry calm) · 2022-12-19T20:12:14.540Z · comments (30)