LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

[link] Soft Nationalization: how the USG will control AI labs
Deric Cheng (deric-cheng) · 2024-08-27T15:11:14.601Z · comments (7)

Interpreting Preference Models w/ Sparse Autoencoders
Logan Riggs (elriggs) · 2024-07-01T21:35:40.603Z · comments (12)

Joshua Achiam Public Statement Analysis
Zvi · 2024-10-10T12:50:06.285Z · comments (14)

[link] A Narrow Path: a plan to deal with AI extinction risk
Andrea_Miotti (AndreaM) · 2024-10-07T13:02:15.229Z · comments (10)

[link] Is Deep Learning Actually Hitting a Wall? Evaluating Ilya Sutskever's Recent Claims
garrison · 2024-11-13T17:00:01.005Z · comments (11)

[question] Interest in Leetcode, but for Rationality?
Gregory (gregory-eales) · 2024-10-16T17:54:25.578Z · answers+comments (20)

In Defense of Open-Minded UDT
abramdemski · 2024-08-12T18:27:36.220Z · comments (27)

[link] LK-99 in retrospect
bhauth · 2024-07-07T02:06:27.660Z · comments (21)

[link] Excerpts from "A Reader's Manifesto"
Arjun Panickssery (arjun-panickssery) · 2024-09-06T22:37:40.254Z · comments (1)

D&D.Sci Scenario Index
aphyer · 2024-07-23T02:00:43.483Z · comments (0)

AI for Bio: State Of The Field
sarahconstantin · 2024-08-30T18:00:02.187Z · comments (2)

FarmKind's Illusory Offer
jefftk (jkaufman) · 2024-08-09T11:30:07.082Z · comments (5)

Guide to SB 1047
Zvi · 2024-08-20T13:10:07.408Z · comments (18)

Neutrality
sarahconstantin · 2024-11-13T23:10:05.469Z · comments (0)

The Mask Comes Off: At What Price?
Zvi · 2024-10-21T23:50:05.247Z · comments (16)

The Packaging and the Payload
Screwtape · 2024-11-12T03:07:37.209Z · comments (1)

[link] Yoshua Bengio: Reasoning through arguments against taking AI safety seriously
Judd Rosenblatt (judd) · 2024-07-11T23:53:17.187Z · comments (3)

If we solve alignment, do we die anyway?
Seth Herd · 2024-08-23T13:13:10.933Z · comments (68)

Adam Optimizer Causes Privileged Basis in Transformer LM Residual Stream
Diego Caples (diego-caples) · 2024-09-06T17:55:34.265Z · comments (7)

Automation collapse
Geoffrey Irving · 2024-10-21T14:50:54.500Z · comments (10)

[link] If far-UV is so great, why isn't it everywhere?
Austin Chen (austin-chen) · 2024-10-19T18:56:58.910Z · comments (23)

[link] [Paper] A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders
chanind · 2024-09-25T09:31:03.296Z · comments (15)

[link] Investigating an insurance-for-AI startup
L Rudolf L (LRudL) · 2024-09-21T15:29:10.083Z · comments (0)

Multiplex Gene Editing: Where Are We Now?
sarahconstantin · 2024-07-16T20:50:04.590Z · comments (6)

The King and the Golem - The Animation
Writer · 2024-11-08T18:23:10.935Z · comments (0)

Analyzing DeepMind's Probabilistic Methods for Evaluating Agent Capabilities
Axel Højmark (hojmax) · 2024-07-22T16:17:07.665Z · comments (0)

[link] Video lectures on the learning-theoretic agenda
Vanessa Kosoy (vanessa-kosoy) · 2024-10-27T12:01:32.777Z · comments (0)

The Hessian rank bounds the learning coefficient
Lucius Bushnaq (Lblack) · 2024-08-08T20:55:36.960Z · comments (9)

[link] GPT-4o System Card
Zach Stein-Perlman · 2024-08-08T20:30:52.633Z · comments (11)

Estimating Tail Risk in Neural Networks
Mark Xu (mark-xu) · 2024-09-13T20:00:06.921Z · comments (9)

[link] Peak Human Capital
PeterMcCluskey · 2024-09-30T21:13:30.421Z · comments (3)

Brief notes on the Wikipedia game
Olli Järviniemi (jarviniemi) · 2024-07-14T02:28:22.473Z · comments (9)

Different senses in which two AIs can be “the same”
Vivek Hebbar (Vivek) · 2024-06-24T03:16:43.400Z · comments (1)

AI #79: Ready for Some Football
Zvi · 2024-08-29T13:30:10.902Z · comments (16)

What is it to solve the alignment problem?
Joe Carlsmith (joekc) · 2024-08-24T21:19:34.280Z · comments (17)

Timaeus is hiring!
Jesse Hoogland (jhoogland) · 2024-07-12T23:42:28.651Z · comments (6)

o1-preview is pretty good at doing ML on an unknown dataset
Håvard Tveit Ihle (havard-tveit-ihle) · 2024-09-20T08:39:49.927Z · comments (1)

[link] The economics of space tethers
harsimony · 2024-08-22T16:15:22.699Z · comments (22)

What and Why: Developmental Interpretability of Reinforcement Learning
Garrett Baker (D0TheMath) · 2024-07-09T14:09:40.649Z · comments (4)

Why Large Bureaucratic Organizations?
johnswentworth · 2024-08-27T18:30:07.422Z · comments (52)

[link] Open Source Automated Interpretability for Sparse Autoencoder Features
kh4dien · 2024-07-30T21:11:36.866Z · comments (1)

Indecision and internalized authority figures
Kaj_Sotala · 2024-07-06T10:10:02.528Z · comments (1)

An AI Race With China Can Be Better Than Not Racing
niplav · 2024-07-02T17:57:36.976Z · comments (32)

Friendship is transactional, unconditional friendship is insurance
Ruby · 2024-07-17T22:52:41.967Z · comments (24)

EIS XIV: Is mechanistic interpretability about to be practically useful?
scasper · 2024-10-11T22:13:51.033Z · comments (4)

[link] Static Analysis As A Lifestyle
adamShimi · 2024-07-03T18:29:37.384Z · comments (11)

Fear of centralized power vs. fear of misaligned AGI: Vitalik Buterin on 80,000 Hours
Seth Herd · 2024-08-05T15:38:09.682Z · comments (22)

SAEs (usually) Transfer Between Base and Chat Models
Connor Kissane (ckkissane) · 2024-07-18T10:29:46.138Z · comments (0)

[link] On Shifgrethor
JustisMills · 2024-10-27T15:30:13.688Z · comments (17)

Advice to junior AI governance researchers
Akash (akash-wasil) · 2024-07-08T19:19:07.316Z · comments (1)

← previous page (newer posts) · next page (older posts) →

Archive

Recent comments

tahp on Thoughts after the Wolfram and Yudkowsky discussion

I might as well check out the panel discussion. I didn't know about it.

I think I listened to the Hotz debate. The highlight of that one was when Hotz implied that he was using an LLM to drive a car, Yudkowsky freaks out a bit, and Hotz clarifies that he means the architecture for his learning algorithm is basically the same as an LLM.

I suspect the Destiny discussion is qualitatively similar to the Dwarkesh one.

At this point, maybe I should just read old MIRI papers.

charlie-steiner on Another UFO Bet

Ok, I'm agreeing in principle to make the same bet as with RatsWrongAboutUAP.

("I commit to paying up if I agree there's a >0.4 probability something non-mundane happened in a UFO/UAP case, or if there's overwhelming consensus to that effect and my probability is >0.1.")

milan-w on Heresies in the Shadow of the Sequences

Thank you for clarifying. I appreciate and point out as relevant the fact that Legg-Hutter includes in it's definition "for all environments (ie action:observation mappings)". I can now say I agree with your "heresy" with a high credence for the cases where compute budgets are not ludicrously small relative to I/O scale, and the utility function is not trivial. I'm a bit weirded out by the environment space being conditional on a fixed hardware variable (namely, I/O) in this operationalization, but whathever.

kvmanthinking on The Treacherous Path to Rationality

And I’m not even mentioning the strange sexual dynamics

Is this a joke? I'm confused.

milan-w on Thoughts after the Wolfram and Yudkowsky discussion

I asked GPT4o to perform a web search for podcast appearances by Yudkowsky. It dug up these two lists (apparently, autogenerated from scrapped data). When I asked it to base use these lists as a starting point to look for high quality debates and after some further elicitation and wrangling, the best we could find was this moderated panel discussion featuring Yudkowsky, Liv Boeree, and Joscha Bach. There's also the Yudkowsky v/s George Hotz debate on Lex Fridman, and the time Yudkowsky debated AI risk with the streamer and political commentaror known as Destiny. I have watched none of the three debates I just mentioned; but I know that Hotz is a heavily vibes-based (rather than object-level-based) thinker, and that Destiny has no background in AI risk, but has good epistemics. I think he probably offered reasonable-at-first-approximation-yet-mostly-uninformed pushback.

EDIT: Upon looking a bit more at the Destiny-Yudkowsky discussion, i may have unwittingly misrepresented it a bit. It occurred during Manifest, and was billed as a debate. ChatGPT says Destiny's skepticism was rather active, and did not budge much.

cole-wyeth on Heresies in the Shadow of the Sequences

Perhaps Legg-Hutter intelligence.
I'm not sure how much the goal matters - probably the details depend on the utility function you want to optimize. I think you can do about as well as possible by carving out a utility function module and designing the rest uniformly to pursue the objectives of that module. But perhaps this comes at a fairly significant cost (i.e. you'd need a somewhat larger computer to get the same performance if you insist on doing it this way).
...And yes, there does exist a computer program which is remarkably good at just chess and nothing else, but that's not the kind of thing I'm talking about here.
Yes, the I/O channels should be fixed along with the hardware.

olli-jaerviniemi on The Parable of the Dagger

dagon on nikola's Shortform

Hmm. I think there are two dimensions to the advice (what is a reasonable distribution of timelines to have, vs what should I actually do). It's perfectly fine to have some humility about one while still giving opinions on the other. "If you believe Y, then it's reasonable to do X" can be a useful piece of advice. I'd normally mention that I don't believe Y, but for a lot of conversations, we've already had that conversation, and it's not helpful to repeat it.

tomcatfish on Drawing Less Wrong: Should You Learn to Draw?

People seeing this in the future: Check out Draw a Box for some low-level mechanical stuff.

milan-w on Heresies in the Shadow of the Sequences

Though there are elegant and still practical specifications for intelligent behavior, the most intelligent agent that runs on some fixed hardware has completely unintelligible cognitive structures and in fact its source code is indistinguishable from white noise.

What does "most intelligent agent" mean?
Don't you think we'd also need to specify "for a fixed (basket of) tasks"?
Are the I/O channels fixed along with the hardware?