LessWrong 2.0 Reader

Release: Optimal Weave (P1): A Prototype Cohabitive Game
mako yass (MakoYass) · 2024-08-17T14:08:18.947Z · comments (21)

[link] What Depression Is Like
Sable · 2024-08-27T17:43:22.549Z · comments (23)

OpenAI o1, Llama 4, and AlphaZero of LLMs
Vladimir_Nesov · 2024-09-14T21:27:41.241Z · comments (24)

AI #83: The Mask Comes Off
Zvi · 2024-09-26T12:00:08.689Z · comments (19)

[link] Not every accommodation is a Curb Cut Effect: The Handicapped Parking Effect, the Clapper Effect, and more
Michael Cohn (michael-cohn) · 2024-09-15T05:27:36.691Z · comments (39)

Values Are Real Like Harry Potter
johnswentworth · 2024-10-09T23:42:24.724Z · comments (17)

Quick look: applications of chaos theory
Elizabeth (pktechgirl) · 2024-08-18T15:00:07.853Z · comments (45)

How to prevent collusion when using untrusted models to monitor each other
Buck · 2024-09-25T18:58:20.693Z · comments (5)

The case for a negative alignment tax
Cameron Berg (cameron-berg) · 2024-09-18T18:33:18.491Z · comments (20)

[Intuitive self-models] 2. Conscious Awareness
Steven Byrnes (steve2152) · 2024-09-25T13:29:02.820Z · comments (40)

[link] Is "superhuman" AI forecasting BS? Some experiments on the "539" bot from the Centre for AI Safety
titotal (lombertini) · 2024-09-18T13:07:40.754Z · comments (3)

Secular interpretations of core perennialist claims
zhukeepa · 2024-08-25T23:41:02.683Z · comments (32)

Self-prediction acts as an emergent regularizer
Cameron Berg (cameron-berg) · 2024-10-23T22:27:03.664Z · comments (4)

Information vs Assurance
johnswentworth · 2024-10-20T23:16:25.762Z · comments (5)

Value fragility and AI takeover
Joe Carlsmith (joekc) · 2024-08-05T21:28:07.306Z · comments (5)

Darwinian Traps and Existential Risks
KristianRonn · 2024-08-25T22:37:14.142Z · comments (14)

Bitter lessons about lucid dreaming
avturchin · 2024-10-16T21:27:04.725Z · comments (48)

Scaffolding for "Noticing Metacognition"
Raemon · 2024-10-09T17:54:13.657Z · comments (4)

The Obliqueness Thesis
jessicata (jessica.liu.taylor) · 2024-09-19T00:26:30.677Z · comments (16)

A Simple Toy Coherence Theorem
johnswentworth · 2024-08-02T17:47:50.642Z · comments (15)

Rationality Quotes - Fall 2024
Screwtape · 2024-10-10T18:37:55.013Z · comments (21)

My 10-year retrospective on trying SSRIs
Kaj_Sotala · 2024-09-22T20:30:02.483Z · comments (10)

[link] Soft Nationalization: how the USG will control AI labs
Deric Cheng (deric-cheng) · 2024-08-27T15:11:14.601Z · comments (7)

Joshua Achiam Public Statement Analysis
Zvi · 2024-10-10T12:50:06.285Z · comments (14)

[link] A Narrow Path: a plan to deal with AI extinction risk
Andrea_Miotti (AndreaM) · 2024-10-07T13:02:15.229Z · comments (10)

[link] Excerpts from "A Reader's Manifesto"
Arjun Panickssery (arjun-panickssery) · 2024-09-06T22:37:40.254Z · comments (1)

Introducing Transluce — A Letter from the Founders
jsteinhardt · 2024-10-23T18:10:02.526Z · comments (2)

Catastrophic sabotage as a major threat model for human-level AI systems
evhub · 2024-10-22T20:57:11.395Z · comments (4)

In Defense of Open-Minded UDT
abramdemski · 2024-08-12T18:27:36.220Z · comments (27)

AI for Bio: State Of The Field
sarahconstantin · 2024-08-30T18:00:02.187Z · comments (2)

[question] Interest in Leetcode, but for Rationality?
Gregory (gregory-eales) · 2024-10-16T17:54:25.578Z · answers+comments (20)

FarmKind's Illusory Offer
jefftk (jkaufman) · 2024-08-09T11:30:07.082Z · comments (5)

Guide to SB 1047
Zvi · 2024-08-20T13:10:07.408Z · comments (18)

The Mask Comes Off: At What Price?
Zvi · 2024-10-21T23:50:05.247Z · comments (16)

Could randomly choosing people to serve as representatives lead to better government?
John Huang · 2024-10-21T17:10:20.920Z · comments (13)

If we solve alignment, do we die anyway?
Seth Herd · 2024-08-23T13:13:10.933Z · comments (65)

[link] If far-UV is so great, why isn't it everywhere?
Austin Chen (austin-chen) · 2024-10-19T18:56:58.910Z · comments (22)

Adam Optimizer Causes Privileged Basis in Transformer LM Residual Stream
Diego Caples (diego-caples) · 2024-09-06T17:55:34.265Z · comments (7)

[link] [Paper] A is for Absorption: Studying Feature Splitting and Absorption in Sparse Autoencoders
chanind · 2024-09-25T09:31:03.296Z · comments (15)

The Hessian rank bounds the learning coefficient
Lucius Bushnaq (Lblack) · 2024-08-08T20:55:36.960Z · comments (9)

[link] GPT-4o System Card
Zach Stein-Perlman · 2024-08-08T20:30:52.633Z · comments (11)

AI #79: Ready for Some Football
Zvi · 2024-08-29T13:30:10.902Z · comments (16)

Estimating Tail Risk in Neural Networks
Mark Xu (mark-xu) · 2024-09-13T20:00:06.921Z · comments (9)

[link] The economics of space tethers
harsimony · 2024-08-22T16:15:22.699Z · comments (22)

Why Large Bureaucratic Organizations?
johnswentworth · 2024-08-27T18:30:07.422Z · comments (52)

[link] Open Source Automated Interpretability for Sparse Autoencoder Features
kh4dien · 2024-07-30T21:11:36.866Z · comments (1)

o1-preview is pretty good at doing ML on an unknown dataset
Håvard Tveit Ihle (havard-tveit-ihle) · 2024-09-20T08:39:49.927Z · comments (1)

EIS XIV: Is mechanistic interpretability about to be practically useful?
scasper · 2024-10-11T22:13:51.033Z · comments (4)

[Intuitive self-models] 3. The Homunculus
Steven Byrnes (steve2152) · 2024-10-02T15:20:18.394Z · comments (29)

Why I quit effective altruism, and why Timothy Telleen-Lawton is staying (for now)
Elizabeth (pktechgirl) · 2024-10-22T18:20:01.194Z · comments (33)

Recent comments

lsusr on is it possible to comment anonymously on a post?

No, but you can create an alt account.

alenoach on Are we dropping the ball on Recommendation AIs?

What do you think is the main issue preventing companies from making more ethical recommendation algorithms? Is it the difficulty of determining objectively what is accurate and ethical? Or is it more about the incentives, like an unwillingness to sacrifice addictiveness and part of their audience?

zack_m_davis on Claude Sonnet 3.5.1 and Haiku 3.5

The next major update can be Claude 4.0 (and Gemini 2.0) and after that we all agree to use actual normal version numbering rather than dating?

Date-based versions aren't the most popular, but they're not an unheard-of thing that Anthropic just made up: see CalVer, as contrasted with SemVer. (For things that change frequently in small ways, it's convenient to just slap the date on it rather than having to soul-search about whether to increment the second or the third number.)
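The contrast drawn in the comment above can be sketched in a few lines. This is an illustration only: `bump_semver` and `calver` are hypothetical helper names, and the `YYYY.MM.DD` layout is just one of several calendar-versioning schemes, chosen here as an assumption.

```python
from datetime import date


def bump_semver(version: str, part: str) -> str:
    """Increment one component of a MAJOR.MINOR.PATCH string.

    The caller has to decide which part to bump, which is the
    "soul-searching" step the comment refers to.
    """
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


def calver(release_day: date) -> str:
    """Derive a version string from the release date alone (assumed YYYY.MM.DD layout)."""
    return f"{release_day.year}.{release_day.month:02d}.{release_day.day:02d}"


print(bump_semver("3.5.0", "minor"))
print(calver(date(2024, 10, 22)))
```

With a date-based scheme there is no judgment call at release time: the version follows from the calendar.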

nathan-helm-burger on Claude Sonnet 3.5.1 and Haiku 3.5

For raw IQ, sure. I just mean "conversational flavor".

alenoach on Are we dropping the ball on Recommendation AIs?

Good recommendation engines are really important for our epistemic environment, in my opinion more so than, for example, prediction markets, because they affect so much of the content that people ingest in their daily lives, on a large scale.

The tough question is how tractable it is. Tournesol has some audience, but also seems to struggle to scale it up despite pretty mature software. I really don't know how effective it would be to pressure companies like Facebook or TikTok, or to push for regulation, or to conduct more research on how to improve recommendation algorithms. Seems worth investigating whether there are cost-effective opportunities, whether through grants or job recommendations.

nisan on Linkpost: Memorandum on Advancing the United States’ Leadership in Artificial Intelligence

Section 3.3(f)(iii):

Within 120 days of the date of this memorandum, DOE, acting primarily through the National Nuclear Security Administration (NNSA) and in close coordination with AISI and NSA, shall seek to develop the capability to perform rapid systematic testing of AI models’ capacity to generate or exacerbate nuclear and radiological risks. This initiative shall involve the development and maintenance of infrastructure capable of running classified and unclassified tests, including using restricted data and relevant classified threat information. This initiative shall also feature the creation and regular updating of automated evaluations, the development of an interface for enabling human-led red-teaming, and the establishment of technical and legal tooling necessary for facilitating the rapid and secure transfer of United States Government, open-weight, and proprietary models to these facilities.

It sounds like the plan is for AI labs to transmit models to government datacenters for testing. I anticipate at least one government agency will quietly keep a copy for internal use.

brasslion on Hell is wasted on the evil

This is my favorite exchange I have ever read on LessWrong.

mo-putera on Adverse Selection by Life-Saving Charities

To be clear, GiveWell won’t be shocked by anything I’ve said so far. They’ve commissioned work and published reports on this. But as you might expect, these quality of life adjustments wouldn’t feature in GiveWell’s calculations anyway, since the pitch to donors is about the price paid for a life, or a DALY.

Can you clarify what you mean by these quality of life adjustments not featuring in GiveWell's calculations?

To be more concrete, let's take their CEA of HKI's vitamin A supplementation (VAS) program in Burkina Faso. They estimate that a $1M grant would avert 553 under-5 deaths (~80% of total program benefit) and incrementally increase future income for the ~560,000 additional children receiving VAS (~20%). (These figures vary considerably by location, by the way, from 60 deaths averted in Anambra, Nigeria to 1,475 deaths averted in Niger.) They then convert this to 81,811 income-doubling equivalents (their altruistic common denominator; they don't use DALYs in any of their CEAs, so I'm always befuddled when people claim they do), make a lot of leverage- and funging-related adjustments, which reduce this to 75,272 income doublings, and compare it with the 3,355 income doublings they estimate would be generated by donating that $1M to GiveDirectly, to get their 22.4x cash multiplier for HKI VAS in Burkina Faso.

So: are you saying that GiveWell should add a "QoL discount" when converting lives saved and income increases, like what Happier Lives Institute suggests for non-Epicurean accounts of the badness of death?
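The multiplier arithmetic in the comment above can be checked directly. This is a minimal sketch that uses only the figures quoted in the comment; the variable names are illustrative and are not taken from GiveWell's spreadsheets.

```python
# Figures quoted in the comment, per $1M granted (HKI VAS, Burkina Faso).
doublings_before_adjustments = 81_811  # income-doubling equivalents
doublings_after_adjustments = 75_272   # after leverage and funging adjustments
givedirectly_doublings = 3_355         # same $1M donated to GiveDirectly

# Cash multiplier: adjusted program benefit relative to the cash benchmark.
cash_multiplier = doublings_after_adjustments / givedirectly_doublings

# Net effect of the adjustments, as a fraction of the unadjusted estimate.
adjustment_factor = doublings_after_adjustments / doublings_before_adjustments

print(f"cash multiplier: {cash_multiplier:.1f}x")
print(f"share of benefit kept after adjustments: {adjustment_factor:.1%}")
```

The first ratio reproduces the 22.4x figure cited in the comment; the second shows that the adjustments remove roughly 8% of the unadjusted estimate.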

baometrus on Universal Basic Income and Poverty

I have known some rich people. They don't act as a coordinated group almost ever; and the group they don't form, is flatly not capable of accurately predicting and deliberately directing world-historical equilibria over centuries.

Beg to differ. Rich people congregate, populate, and settle in areas, and form orgs and egregores that push away and hide the poor and the problems that make them uncomfortable, and this happens largely unconsciously (phenomenologically), yet there is organizational agency in it. And this kind of wealth phenomenon has certainly steered world-historical equilibria over centuries. My point is: it's not a "conspiracy" of rich people, it's a phenomenon of rich people, and I think you should include the phenomenology of wealthy populations in your modeling.

jason-gross on Are we dropping the ball on Recommendation AIs?

I believe the closest research to this topic is under the heading "Performative Power" (see, e.g., this arXiv paper). I think "The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power" by Shoshana Zuboff is also a pretty good book that seems related.