LessWrong 2.0 Reader

View: New · Old · Top

Restrict date range: Today · This week · This month · Last three months · This year · All time

← previous page (newer posts) · next page (older posts) →

AXRP Episode 37 - Jaime Sevilla on Forecasting AI
DanielFilan · 2024-10-04T21:00:03.077Z · comments (3)

Alignment by default: the simulation hypothesis
gb (ghb) · 2024-09-25T16:26:00.552Z · comments (39)

AXRP Episode 35 - Peter Hase on LLM Beliefs and Easy-to-Hard Generalization
DanielFilan · 2024-08-24T22:30:02.039Z · comments (0)

What program structures enable efficient induction?
Daniel C (harper-owen) · 2024-09-05T10:12:14.058Z · comments (5)

2024 Unofficial LW Community Census, Request for Comments
Screwtape · 2024-11-01T16:34:14.758Z · comments (30)

Boring & straightforward trauma explanation
lukehmiles (lcmgcd) · 2024-11-08T09:45:19.486Z · comments (7)

[link] AI Model Registries: A Foundational Tool for AI Governance
Elliot Mckernon (elliot) · 2024-10-07T19:27:43.466Z · comments (1)

Musings on Text Data Wall (Oct 2024)
Vladimir_Nesov · 2024-10-05T19:00:21.286Z · comments (2)

[link] Compression Moves for Prediction
adamShimi · 2024-09-14T17:51:12.004Z · comments (0)

A necessary Membrane formalism feature
ThomasCederborg · 2024-09-10T21:33:09.508Z · comments (6)

Simon DeDeo on Explore vs Exploit in Science
Elizabeth (pktechgirl) · 2024-09-10T03:40:08.311Z · comments (0)

[link] Does natural selection favor AIs over humans?
cdkg · 2024-10-03T18:47:43.517Z · comments (1)

[question] What is the alpha in one bit of evidence?
J Bostock (Jemist) · 2024-10-22T21:57:09.056Z · answers+comments (12)

[link] Towards the Operationalization of Philosophy & Wisdom
Thane Ruthenis · 2024-10-28T19:45:07.571Z · comments (2)

My decomposition of the alignment problem
Daniel C (harper-owen) · 2024-09-02T00:21:08.359Z · comments (22)

Gell-Mann checks
Cleo Scrolls (cleo-scrolls) · 2024-09-26T22:45:43.569Z · comments (7)

AI Can be “Gradient Aware” Without Doing Gradient hacking.
Sodium · 2024-10-20T21:02:10.754Z · comments (0)

[link] Anthropic is being sued for copying books to train Claude
Remmelt (remmelt-ellen) · 2024-08-31T02:57:27.092Z · comments (4)

[link] Green and golden: a meditation
Richard_Ngo (ricraz) · 2024-08-18T01:36:43.613Z · comments (0)

The Logistics of Distribution of Meaning: Against Epistemic Bureaucratization
Sahil · 2024-11-07T05:27:20.276Z · comments (1)

How Often Does Taking Away Options Help?
niplav · 2024-09-21T21:52:40.822Z · comments (6)

[link] Fragile, Robust, and Antifragile Preference Satisfaction
adamShimi · 2024-11-02T17:25:55.986Z · comments (0)

Economics Roundup #4
Zvi · 2024-10-15T13:20:06.923Z · comments (4)

[link] To Be Born in a Bag
Niko_McCarty (niko-2) · 2024-10-06T17:21:00.605Z · comments (1)

[link] Update on the Mysterious Trump Buyers on Polymarket
Annapurna (jorge-velez) · 2024-11-04T19:22:06.540Z · comments (9)

Why I'm bearish on mechanistic interpretability: the shards are not in the network
tailcalled · 2024-09-13T17:09:25.407Z · comments (40)

Review: “The Case Against Reality”
David Gross (David_Gross) · 2024-10-29T13:13:29.643Z · comments (9)

Why Reflective Stability is Important
Johannes C. Mayer (johannes-c-mayer) · 2024-09-05T15:28:19.913Z · comments (2)

Announcing the PIBBSS Symposium '24!
DusanDNesic · 2024-09-03T11:19:47.568Z · comments (0)

Lab governance reading list
Zach Stein-Perlman · 2024-10-25T18:00:28.346Z · comments (3)

[question] What are the best resources for building gears-level models of how governments actually work?
adamShimi · 2024-08-19T14:05:02.590Z · answers+comments (6)

Scaling Laws and Likely Limits to AI
Davidmanheim · 2024-08-18T17:19:46.597Z · comments (0)

D/acc AI Security Salon
Allison Duettmann (allison-duettmann) · 2024-10-19T22:17:57.067Z · comments (0)

Looking for Goal Representations in an RL Agent - Update Post
CatGoddess · 2024-08-28T16:42:19.367Z · comments (0)

Avoiding the Bog of Moral Hazard for AI
Nathan Helm-Burger (nathan-helm-burger) · 2024-09-13T21:24:34.137Z · comments (12)

"Real AGI"
Seth Herd · 2024-09-13T14:13:24.124Z · comments (20)

Advisors for Smaller Major Donors?
jefftk (jkaufman) · 2024-11-06T14:30:06.187Z · comments (2)

Word Spaghetti
Gordon Seidoh Worley (gworley) · 2024-10-23T05:39:20.105Z · comments (9)

Can Large Language Models effectively identify cybersecurity risks?
emile delcourt (emile-delcourt) · 2024-08-30T20:20:21.345Z · comments (0)

[question] How great is the utility of "saving" endangered languages?
SpectrumDT · 2024-08-20T13:14:32.895Z · answers+comments (29)

Finding Deception in Language Models
Esben Kran (esben-kran) · 2024-08-20T09:42:13.060Z · comments (4)

Bridging the VLM and mech interp communities for multimodal interpretability
Sonia Joseph (redhat) · 2024-10-28T14:41:41.969Z · comments (5)

In the Name of All That Needs Saving
pleiotroth · 2024-11-07T15:26:12.252Z · comments (2)

[link] Should Sports Betting Be Banned?
Maxwell Tabarrok (maxwell-tabarrok) · 2024-09-21T14:13:35.404Z · comments (2)

Heresies in the Shadow of the Sequences
Cole Wyeth (Amyr) · 2024-11-14T05:01:11.889Z · comments (11)

[link] AlignedCut: Visual Concepts Discovery on Brain-Guided Universal Feature Space
Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2024-09-14T23:23:26.296Z · comments (1)

My career exploration: Tools for building confidence
lynettebye · 2024-09-13T11:37:55.843Z · comments (0)

[link] Instruction Following without Instruction Tuning
Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2024-09-24T13:49:09.078Z · comments (0)

Interview with Robert Kralisch on Simulators
WillPetillo · 2024-08-26T05:49:15.543Z · comments (0)

OpenAI defected, but we can take honest actions
Remmelt (remmelt-ellen) · 2024-10-21T08:41:25.728Z · comments (15)

← previous page (newer posts) · next page (older posts) →

Archive

Recent comments

jchan on If I care about measure, choices have additional burden (+AI generated LW-comments)

However, in Many-Worlds Interpretation (MWI), I split my measure between multiple variants, which will be functionally different enough to regard my future selves as different minds. Thus, the act of choice itself lessens my measure by a factor of approximately 10. If I care about this, I'm caring about something unobservable.

If we're going to make sense of living in a branching multiverse, then we'll need to adopt a more fluid concept of personal identity.

Scenario: I take a sleeping pill that will make me fall asleep in 30 minutes. However, the person who wakes up in my bed the next morning will have no memory of that 30-minute period; his last memory will be of taking the pill.

If I imagine myself experiencing that 30-minute interval, intuitively it doesn't at all feel like "I have less than 30 minutes to live." Instead, it feels like I'd be pretty much indifferent to being in this situation - maybe the person who wakes up tomorrow is not "me" in the artificial sense of having a forward-looking continuity of consciousness with my current self, but that's not really what I care about anyway. He is similar enough to current-me that I value his existence and well-being to nearly the same degree as I do my own; in other words, he "is me" for all practical purposes.

The same is true of the versions of me in nearby world branches. I can no longer observe or influence them, but they still "matter" to me. Of course, the degree of self-identification will decrease over time as they diverge, but then again, so does my degree of identification with the "me" many decades in the future, even assuming a single timeline.

askwho on OpenAI Email Archives (from Musk v. Altman)

I've turned this into a full cast recording with ElevenLabs, with individual voices for all the players:
https://open.substack.com/pub/askwhocastsai/p/openai-email-archives-from-musk-v

sharmake-farah on johnswentworth's Shortform

While I'm not a believer in the scaling has died meme yet, I'm glad you do have a plan for what happens if AI scaling does stop.

benito on Sabotage Evaluations for Frontier Models

But your story lacks any mechanism for accountability if the King is behaving badly. It is important to design systems of power that do not rely on the people in power being good and right, but instead make it so that if they behave badly, they are held to account. I don't think I have to explain why incentives and accountability matter for how the powerful wield their powers. The central thing I am talking about is basic measures for accountability, of which I consider very high up to be engaging with criticism, dialogue, and argument (as is somewhat natural given my background philosophy from growing up on LessWrong).

As far as I recall as I write this, I do not believe that the CEOs of the companies building potentially omnicidal machines are committed to any particular plan that public intellectuals are debating and defending, nor have they any commitment mechanism to any such plan that has been put out publicly. The plan is something of their own choosing, the details of which are almost entirely inaccessible by those whose lives are being risked. If you believe otherwise I request you provide links, I would welcome specifics.

elityre on Lao Mein's Shortform

I don't think that's a valid inference.

jrockwar on The Best Software For Every Need

On the topic of ffmpeg - additional shoutout to Handbrake, which is essentially ffmpeg with a GUI on top.

maelstrom on Alexander Gietelink Oldenziel's Shortform

One needs only to read 4 or so papers on category theory applied to AI to understand the problem. None of them share a common foundation on what type of constructions to use or formalize in category theory. The core issue is that category theory is a general language for all of mathematics, and as commonly used just exponentially increase the search space for useful mathematical ideas.

I want to be wrong about this, but I have yet to find category theory uniquely useful outside of some subdomains of pure math.

cata on When will computer programming become an unskilled job (if ever)?

One and a half years later it seems like AI tools are able to sort of help humans with very rote programming work (e.g. changing or writing code to accomplish a simple goal, implementing versions of things that are well-known to the AI like a textbook algorithm or a browser form to enter data, answering documentation-like questions about a system) but aren't much help yet on the more skilled labor parts of software engineering.

yonge on The Foraging (Ex-)Bandit [Ruleset & Reflections]

I was expecting earlier choices of foraging location to have a much stronger impact, and mistook some of the randomness for affects of earlier choices. In retrospect it would have been better to spend longer exploreing various possibilites rather than settling on an exploit strategy so soon. Adding an explicit target was a big improvement as it gave some idea of "how good a strategy" we should be searching for.

nate-showell on Heresies in the Shadow of the Sequences

Some other examples:

Agency and embeddedness are fundamentally at odds with each other. Decision theory and physics are incompatible approaches to world-modeling, with each making assumptions that are inconsistent with the other. Attempting to build mathematical models of embedding agency will fail as an attempt to understand advanced AI behavior.
Reductionism is false. If modeling a large-scale system in terms of the exact behavior of its small-scale components would take longer than the age of the universe, or would require a universe-sized computer, the large-scale system isn't explicable in terms of small-scale interactions even in principle. The Sequences are incorrect to describe non-reductionism as ontological realism about large-scale entities [LW · GW] -- the former doesn't inherently imply the latter.
Relatedly, nothing is ontologically primitive. Not even elementary particles: if, for example, you took away the mass of an electron, it would cease to be an electron and become something else. The properties of those particles, as well, depend on having fields to interact with. And if a field couldn't interact with anything, could it still be said to exist?
Ontology creates axiology and axiology creates ontology. We aren't born with fully formed utility functions in our heads telling us what we do and don't value. Instead, we have to explore and model the world over time, forming opinions along the way about what things and properties we prefer. And in turn, our preferences guide our exploration of the world and the models we form of what we experience. Classical game theory, with its predefined sets of choices and payoffs, only has narrow applicability, since such contrived setups are only rarely close approximations to the scenarios we find ourselves in.