LessWrong 2.0 Reader

Bitter lessons about lucid dreaming
avturchin · 2024-10-16T21:27:04.725Z · comments (62)
Dentistry, Oral Surgeons, and the Inefficiency of Small Markets
GeneSmith · 2024-11-01T17:26:06.466Z · comments (16)
Effective Evil's AI Misalignment Plan
lsusr · 2024-12-15T07:39:34.046Z · comments (9)
LLM chatbots have ~half of the kinds of "consciousness" that humans believe in. Humans should avoid going crazy about that.
Andrew_Critch · 2024-11-22T03:26:11.681Z · comments (53)
Secular Solstice Round Up 2024
dspeyer · 2024-11-21T10:49:36.682Z · comments (15)
What is malevolence? On the nature, measurement, and distribution of dark traits
David Althaus (wallowinmaya) · 2024-10-23T08:41:33.197Z · comments (15)
The Packaging and the Payload
Screwtape · 2024-11-12T03:07:37.209Z · comments (1)
[link] Video lectures on the learning-theoretic agenda
Vanessa Kosoy (vanessa-kosoy) · 2024-10-27T12:01:32.777Z · comments (0)
Why I quit effective altruism, and why Timothy Telleen-Lawton is staying (for now)
Elizabeth (pktechgirl) · 2024-10-22T18:20:01.194Z · comments (79)
Could randomly choosing people to serve as representatives lead to better government?
John Huang · 2024-10-21T17:10:20.920Z · comments (13)
Counting AGIs
cash (cshunter) · 2024-11-26T00:06:17.845Z · comments (19)
Matryoshka Sparse Autoencoders
Noa Nabeshima (noa-nabeshima) · 2024-12-14T02:52:32.017Z · comments (10)
Introducing Transluce — A Letter from the Founders
jsteinhardt · 2024-10-23T18:10:02.526Z · comments (2)
[link] Cost, Not Sacrifice
Joe Rogero · 2024-11-20T21:32:26.281Z · comments (13)
[question] Interest in Leetcode, but for Rationality?
Gregory (gregory-eales) · 2024-10-16T17:54:25.578Z · answers+comments (20)
[link] SAEBench: A Comprehensive Benchmark for Sparse Autoencoders
Can (Can Rager) · 2024-12-11T06:30:37.076Z · comments (1)
The 2023 LessWrong Review: The Basic Ask
Raemon · 2024-12-04T19:52:40.435Z · comments (25)
🇫🇷 Announcing CeSIA: The French Center for AI Safety
Charbel-Raphaël (charbel-raphael-segerie) · 2024-12-20T14:17:13.104Z · comments (0)
The Mask Comes Off: At What Price?
Zvi · 2024-10-21T23:50:05.247Z · comments (16)
[link] If far-UV is so great, why isn't it everywhere?
Austin Chen (austin-chen) · 2024-10-19T18:56:58.910Z · comments (23)
Automation collapse
Geoffrey Irving · 2024-10-21T14:50:54.500Z · comments (9)
The King and the Golem - The Animation
Writer · 2024-11-08T18:23:10.935Z · comments (0)
[link] "Map of AI Futures" - An interactive flowchart
swante · 2024-11-27T21:31:40.269Z · comments (3)
[link] New o1-like model (QwQ) beats Claude 3.5 Sonnet with only 32B parameters
Jesse Hoogland (jhoogland) · 2024-11-27T22:06:12.914Z · comments (4)
Personal AI Planning
jefftk (jkaufman) · 2024-11-10T14:00:06.837Z · comments (10)
AIs Will Increasingly Fake Alignment
Zvi · 2024-12-24T13:00:07.770Z · comments (0)
The Third Fundamental Question
Screwtape · 2024-11-15T04:01:33.770Z · comments (7)
When AI 10x's AI R&D, What Do We Do?
Logan Riggs (elriggs) · 2024-12-21T23:56:11.069Z · comments (12)
[link] On Shifgrethor
JustisMills · 2024-10-27T15:30:13.688Z · comments (18)
[link] Drexler's Nanotech Software
PeterMcCluskey · 2024-12-02T04:55:20.432Z · comments (9)
Occupational Licensing Roundup #1
Zvi · 2024-10-30T11:00:04.516Z · comments (11)
A Qualitative Case for LTFF: Filling Critical Ecosystem Gaps
Linch · 2024-12-03T21:57:23.597Z · comments (2)
Perils of Generalizing from One's Social Group
localdeity · 2024-11-24T15:31:18.332Z · comments (1)
[Intuitive self-models] 8. Rooting Out Free Will Intuitions
Steven Byrnes (steve2152) · 2024-11-04T18:16:26.736Z · comments (16)
Retrospective: PIBBSS Fellowship 2024
DusanDNesic · 2024-12-20T15:55:24.194Z · comments (1)
AI Craftsmanship
abramdemski · 2024-11-11T22:17:01.112Z · comments (7)
Brief analysis of OP Technical AI Safety Funding
22tom (thomas-barnes) · 2024-10-25T19:37:41.674Z · comments (5)
SAEs are highly dataset dependent: a case study on the refusal direction
Connor Kissane (ckkissane) · 2024-11-07T05:22:18.807Z · comments (4)
[link] RL, but don't do anything I wouldn't do
Gunnar_Zarncke · 2024-12-07T22:54:50.714Z · comments (5)
[link] Electrostatic Airships?
DaemonicSigil · 2024-10-27T04:32:34.852Z · comments (13)
[Intuitive self-models] 6. Awakening / Enlightenment / PNSE
Steven Byrnes (steve2152) · 2024-10-22T13:23:08.836Z · comments (8)
[link] Slightly More Than You Wanted To Know: Pregnancy Length Effects
JustisMills · 2024-10-21T01:26:02.030Z · comments (4)
AI #95: o1 Joins the API
Zvi · 2024-12-19T15:10:05.196Z · comments (1)
[link] Anthropic leadership conversation
Zach Stein-Perlman · 2024-12-20T22:00:45.229Z · comments (16)
[link] Zen and The Art of Semiconductor Manufacturing
Recurrented (rachel-farley) · 2024-12-09T17:19:35.236Z · comments (2)
Cognitive Work and AI Safety: A Thermodynamic Perspective
Daniel Murfet (dmurfet) · 2024-12-08T21:42:17.023Z · comments (7)
Why imperfect adversarial robustness doesn't doom AI control
Buck · 2024-11-18T16:05:06.763Z · comments (26)
[link] electric turbofans
bhauth · 2024-11-02T22:50:59.807Z · comments (2)
A case for donating to AI risk reduction (including if you work in AI)
tlevin (trevor) · 2024-12-02T19:05:06.658Z · comments (2)
Training AI agents to solve hard problems could lead to Scheming
Marius Hobbhahn (marius-hobbhahn) · 2024-11-19T00:10:55.522Z · comments (12)