LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

AI as a powerful meme, via CGP Grey
TheManxLoiner · 2024-10-30T18:31:58.544Z · comments (6)

[link] MIRI's September 2024 newsletter
Harlan · 2024-09-16T18:15:40.785Z · comments (0)

Toy Models of Feature Absorption in SAEs
chanind · 2024-10-07T09:56:53.609Z · comments (7)

Bounty for Evidence on Some of Palisade Research's Beliefs
benwr · 2024-09-23T20:01:20.917Z · comments (4)

I finally got ChatGPT to sound like me
lsusr · 2024-09-17T09:39:59.415Z · comments (18)

Big Picture AI Safety: Introduction
EuanMcLean (euanmclean) · 2024-05-23T11:15:44.037Z · comments (7)

[link] Contra Scott on Abolishing the FDA
Maxwell Tabarrok (maxwell-tabarrok) · 2023-12-15T14:00:17.247Z · comments (3)

AI #41: Bring in the Other Gemini
Zvi · 2023-12-07T15:10:05.552Z · comments (16)

1. The CAST Strategy
Max Harms (max-harms) · 2024-06-07T22:29:13.005Z · comments (19)

On OpenAI’s Model Spec
Zvi · 2024-06-21T13:00:03.014Z · comments (3)

[link] Bayesians Commit the Gambler's Fallacy
Kevin Dorst · 2024-01-07T12:54:59.939Z · comments (28)

Thoughts on "The Offense-Defense Balance Rarely Changes"
Cullen (Cullen_OKeefe) · 2024-02-12T03:26:50.662Z · comments (4)

In Defense of Parselmouths
Screwtape · 2023-11-15T23:02:19.344Z · comments (10)

AI #68: Remarkably Reasonable Reactions
Zvi · 2024-06-13T16:30:02.969Z · comments (11)

AI Safety 101 : Capabilities - Human Level AI, What? How? and When?
markov (markovial) · 2024-03-07T17:29:53.260Z · comments (8)

[link] The Leeroy Jenkins principle: How faulty AI could guarantee "warning shots"
titotal (lombertini) · 2024-01-14T15:03:21.087Z · comments (6)

AI doing philosophy = AI generating hands?
Wei Dai (Wei_Dai) · 2024-01-15T09:04:39.659Z · comments (22)

On the Proposed California SB 1047
Zvi · 2024-02-12T16:40:04.854Z · comments (18)

[link] For Civilization and Against Niceness
Gabriel Alfour (gabriel-alfour-1) · 2023-11-20T10:56:20.352Z · comments (14)

AI #75: Math is Easier
Zvi · 2024-08-01T13:40:05.539Z · comments (25)

So You Created a Sociopath - New Book Announcement!
Garrett Baker (D0TheMath) · 2024-04-01T18:02:18.010Z · comments (3)

Enriched tab is now the default LW Frontpage experience for logged-in users
Ruby · 2024-06-21T00:09:30.441Z · comments (27)

[link] If Clarity Seems Like Death to Them
Zack_M_Davis · 2023-12-30T17:40:42.622Z · comments (191)

The predictive power of dissipative adaptation
dr_s · 2023-12-17T14:01:31.568Z · comments (14)

On Tapping Out
Screwtape · 2023-11-17T03:23:55.880Z · comments (13)

AI #54: Clauding Along
Zvi · 2024-03-07T16:00:05.066Z · comments (11)

Things Solenoid Narrates
Solenoid_Entity · 2024-04-12T23:57:16.169Z · comments (2)

Atlantis: Berkeley event venue available for rent
Jonas V (Jonas Vollmer) · 2023-11-22T01:47:12.026Z · comments (0)

[link] AlphaGeometry: An Olympiad-level AI system for geometry
alyssavance · 2024-01-17T17:17:30.913Z · comments (9)

[link] I'd also take $7 trillion
bhauth · 2024-02-19T03:31:45.552Z · comments (12)

A starting point for making sense of task structure (in machine learning)
Kaarel (kh) · 2024-02-24T01:51:49.227Z · comments (2)

Quick thoughts on the implications of multi-agent views of mind on AI takeover
Kaj_Sotala · 2023-12-11T06:34:06.395Z · comments (14)

[link] Rational Animations' intro to mechanistic interpretability
Writer · 2024-06-14T16:10:57.015Z · comments (1)

[link] Towards Evaluating AI Systems for Moral Status Using Self-Reports
Ethan Perez (ethan-perez) · 2023-11-16T20:18:51.730Z · comments (3)

Startup Roundup #2
Zvi · 2024-08-06T13:30:06.554Z · comments (0)

AI #72: Denying the Future
Zvi · 2024-07-11T15:00:05.865Z · comments (8)

[link] Fluent dreaming for language models (AI interpretability method)
tbenthompson (ben-thompson) · 2024-02-06T06:02:59.296Z · comments (4)

Monthly Roundup #18: May 2024
Zvi · 2024-05-13T12:30:04.863Z · comments (10)

Some open-source dictionaries and dictionary learning infrastructure
Sam Marks (samuel-marks) · 2023-12-05T06:05:21.903Z · comments (7)

[link] Paper: Tell, Don't Show- Declarative facts influence how LLMs generalize
Owain_Evans · 2023-12-19T19:14:26.423Z · comments (4)

We Don't Know Our Own Values, but Reward Bridges The Is-Ought Gap
johnswentworth · 2024-09-19T22:22:05.307Z · comments (47)

Work with me on agent foundations: independent fellowship
Alex_Altair · 2024-09-21T13:59:16.706Z · comments (5)

[link] What Ketamine Therapy Is Like
Sable · 2024-11-11T11:09:08.602Z · comments (6)

~80 Interesting Questions about Foundation Model Agent Safety
RohanS · 2024-10-28T16:37:04.713Z · comments (4)

Dating Roundup #3: Third Time’s the Charm
Zvi · 2024-05-08T13:30:03.232Z · comments (27)

We ran an AI safety conference in Tokyo. It went really well. Come next year!
Blaine (blaine-rogers) · 2024-07-17T06:55:39.620Z · comments (1)

Principled Satisficing To Avoid Goodhart
JenniferRM · 2024-08-16T19:05:27.204Z · comments (2)

The Gemini Incident Continues
Zvi · 2024-02-27T16:00:05.648Z · comments (6)

[link] Book review: Everything Is Predictable
PeterMcCluskey · 2024-05-27T03:33:53.857Z · comments (0)

[link] How people stopped dying from diarrhea so much (& other life-saving decisions)
Writer · 2024-03-16T16:00:47.830Z · comments (0)

← previous page (newer posts) · next page (older posts) →

Archive

Recent comments

sharmake-farah on johnswentworth's Shortform

While I'm not a believer in the scaling has died meme yet, I'm glad you do have a plan for what happens if AI scaling does stop.

benito on Sabotage Evaluations for Frontier Models

But your story lacks any mechanism for accountability if the King is behaving badly. It is important to design systems of power that do not rely on the people in power being good and right, but instead make it so that if they behave badly, they are held to account. I don't think I have to explain why incentives and accountability matter for how the powerful wield their powers.

elityre on Lao Mein's Shortform

I don't think that's a valid inference.

jrockwar on The Best Software For Every Need

On the topic of ffmpeg - additional shoutout to Handbrake, which is essentially ffmpeg with a GUI on top.

maelstrom on Alexander Gietelink Oldenziel's Shortform

One needs only to read 4 or so papers on category theory applied to AI to understand the problem. None of them share a common foundation on what type of constructions to use or formalize in category theory. The core issue is that category theory is a general language for all of mathematics, and as commonly used just exponentially increase the search space for useful mathematical ideas.

I want to be wrong about this, but I have yet to find category theory uniquely useful outside of some subdomains of pure math.

cata on When will computer programming become an unskilled job (if ever)?

One and a half years later it seems like AI tools are able to sort of help humans with very rote programming work (e.g. changing or writing code to accomplish a simple goal, implementing versions of things that are well-known to the AI like a textbook algorithm or a browser form to enter data, answering documentation-like questions about a system) but aren't much help yet on the more skilled labor parts of software engineering.

yonge on The Foraging (Ex-)Bandit [Ruleset & Reflections]

I was expecting earlier choices of foraging location to have a much stronger impact, and mistook some of the randomness for affects of earlier choices. In retrospect it would have been better to spend longer exploreing various possibilites rather than settling on an exploit strategy so soon. Adding an explicit target was a big improvement as it gave some idea of "how good a strategy" we should be searching for.

nate-showell on Heresies in the Shadow of the Sequences

Some other examples:

Agency and embeddedness are fundamentally at odds with each other. Decision theory and physics are incompatible approaches to world-modeling, with each making assumptions that are inconsistent with the other. Attempting to build mathematical models of embedding agency will fail as an attempt to understand advanced AI behavior.
Reductionism is false. If modeling a large-scale system in terms of the exact behavior of its small-scale components would take longer than the age of the universe, or would require a universe-sized computer, the large-scale system isn't explicable in terms of small-scale interactions even in principle. The Sequences are incorrect to describe non-reductionism as ontological realism about large-scale entities [LW · GW] -- the former doesn't inherently imply the latter.
Relatedly, nothing is ontologically primitive. Not even elementary particles: if, for example, you took away the mass of an electron, it would cease to be an electron and become something else. The properties of those particles, as well, depend on having fields to interact with. And if a field couldn't interact with anything, could it still be said to exist?
Ontology creates axiology and axiology creates ontology. We aren't born with fully formed utility functions in our heads telling us what we do and don't value. Instead, we have to explore and model the world over time, forming opinions along the way about what things and properties we prefer. And in turn, our preferences guide our exploration of the world and the models we form of what we experience. Classical game theory, with its predefined sets of choices and payoffs, only has narrow applicability, since such contrived setups are only rarely close approximations to the scenarios we find ourselves in.

thomas-kehrenberg on OpenAI Email Archives (from Musk v. Altman)

I wonder if it would be a good idea to put editor's notes after likely typos, like:

but we want [editor's note: Elon likely meant “won’t” here] do any contract or agree to “evangelize”.

thomas-kehrenberg on OpenAI Email Archives (from Musk v. Altman)

Yes, it sounds that he put too much stock into Andrej's paper-counting argument, and then even left the board because he didn't want to be associated with a failing company?