LessWrong 2.0 Reader


Review: Conor Moreton's "Civilization & Cooperation"
Duncan Sabien (Deactivated) (Duncan_Sabien) · 2024-05-26T19:32:43.131Z · comments (8)

Prediction Markets aren't Magic
SimonM · 2023-12-21T12:54:07.754Z · comments (29)

[link] Introducing METR's Autonomy Evaluation Resources
Megan Kinniment (megan-kinniment) · 2024-03-15T23:16:59.696Z · comments (0)

Partial value takeover without world takeover
KatjaGrace · 2024-04-05T06:20:03.961Z · comments (23)

Based Beff Jezos and the Accelerationists
Zvi · 2023-12-06T16:00:08.380Z · comments (29)

[link] New report: Safety Cases for AI
joshc (joshua-clymer) · 2024-03-20T16:45:27.984Z · comments (14)

[link] Large Language Models can Strategically Deceive their Users when Put Under Pressure.
ReaderM · 2023-11-15T16:36:04.446Z · comments (8)

Towards Multimodal Interpretability: Learning Sparse Interpretable Features in Vision Transformers
hugofry · 2024-04-29T20:57:35.127Z · comments (8)

story-based decision-making
bhauth · 2024-02-07T02:35:27.286Z · comments (11)

AI #73: Openly Evil AI
Zvi · 2024-07-18T14:40:05.770Z · comments (20)

Public Call for Interest in Mathematical Alignment
Davidmanheim · 2023-11-22T13:22:09.558Z · comments (9)

Dragon Agnosticism
jefftk (jkaufman) · 2024-08-01T17:00:06.434Z · comments (60)

Singular learning theory: exercises
Zach Furman (zfurman) · 2024-08-30T20:00:03.785Z · comments (5)

Teaching CS During Take-Off
andrew carle (andrew-carle) · 2024-05-14T22:45:39.447Z · comments (13)

[link] Debating with More Persuasive LLMs Leads to More Truthful Answers
Akbir Khan (akbir-khan) · 2024-02-07T21:28:10.694Z · comments (14)

Live Theory Part 0: Taking Intelligence Seriously
Sahil · 2024-06-26T21:37:10.479Z · comments (3)

[link] Executable philosophy as a failed totalizing meta-worldview
jessicata (jessica.liu.taylor) · 2024-09-04T22:50:18.294Z · comments (40)

Covert Malicious Finetuning
Tony Wang (tw) · 2024-07-02T02:41:51.698Z · comments (4)

Stagewise Development in Neural Networks
Jesse Hoogland (jhoogland) · 2024-03-20T19:54:06.181Z · comments (1)

On the abolition of man
Joe Carlsmith (joekc) · 2024-01-18T18:17:06.201Z · comments (18)

Natural Latents: The Concepts
johnswentworth · 2024-03-20T18:21:19.878Z · comments (18)

2024 Petrov Day Retrospective
Ben Pace (Benito) · 2024-09-28T21:30:14.952Z · comments (25)

I'm a bit skeptical of AlphaFold 3
Oleg Trott (oleg-trott) · 2024-06-25T00:04:41.274Z · comments (14)

[link] Self-Help Corner: Loop Detection
adamShimi · 2024-10-02T08:33:23.487Z · comments (6)

[link] Re: Anthropic's suggested SB-1047 amendments
RobertM (T3t) · 2024-07-27T22:32:39.447Z · comments (13)

[link] More Hyphenation
Arjun Panickssery (arjun-panickssery) · 2024-02-07T19:43:29.086Z · comments (19)

Growth and Form in a Toy Model of Superposition
Liam Carroll (liam-carroll) · 2023-11-08T11:08:04.359Z · comments (7)

Research update: Towards a Law of Iterated Expectations for Heuristic Estimators
Eric Neyman (UnexpectedValues) · 2024-10-07T19:29:29.033Z · comments (2)

[link] Detecting Genetically Engineered Viruses With Metagenomic Sequencing
jefftk (jkaufman) · 2024-06-27T14:01:34.868Z · comments (10)

Solving adversarial attacks in computer vision as a baby version of general AI alignment
Stanislav Fort (stanislavfort) · 2024-08-29T17:17:47.136Z · comments (8)

How well do truth probes generalise?
mishajw · 2024-02-24T14:12:19.729Z · comments (11)

We might be missing some key feature of AI takeoff; it'll probably seem like "we could've seen this coming"
Lukas_Gloor · 2024-05-09T15:43:11.490Z · comments (36)

Apply to be a Safety Engineer at Lockheed Martin!
yanni kyriacos (yanni) · 2024-03-31T21:02:08.499Z · comments (3)

There is a globe in your LLM
jacob_drori (jacobcd52) · 2024-10-08T00:43:40.300Z · comments (4)

GPT-o1
Zvi · 2024-09-16T13:40:06.236Z · comments (34)

OpenAI: Helen Toner Speaks
Zvi · 2024-05-30T21:10:02.938Z · comments (8)

The Aspiring Rationalist Congregation
maia · 2024-01-10T22:52:54.298Z · comments (23)

I turned decision theory problems into memes about trolleys
Tapatakt · 2024-10-30T20:13:29.589Z · comments (13)

Rejecting Television
Declan Molony (declan-molony) · 2024-04-23T04:59:50.253Z · comments (10)

A simple case for extreme inner misalignment
Richard_Ngo (ricraz) · 2024-07-13T15:40:37.518Z · comments (41)

Scalable oversight as a quantitative rather than qualitative problem
Buck · 2024-07-06T17:42:41.325Z · comments (11)

Addressing Feature Suppression in SAEs
Benjamin Wright (Benw8888) · 2024-02-16T18:32:51.927Z · comments (3)

Reflections on Less Online
Error · 2024-07-07T03:49:44.534Z · comments (15)

[link] Environmentalism in the United States Is Unusually Partisan
Jeffrey Heninger (jeffrey-heninger) · 2024-05-13T21:23:10.755Z · comments (26)

5 homegrown EA projects, seeking small donors
Austin Chen (austin-chen) · 2024-10-28T23:24:25.745Z · comments (0)

Fluent, Cruxy Predictions
Raemon · 2024-07-10T18:00:06.424Z · comments (12)

[link] Anxiety vs. Depression
Sable · 2024-03-17T00:15:08.255Z · comments (35)

[question] What are the best arguments for/against AIs being "slightly 'nice'"?
Raemon · 2024-09-24T02:00:19.605Z · answers+comments (51)

[Valence series] 2. Valence & Normativity
Steven Byrnes (steve2152) · 2023-12-07T16:43:49.919Z · comments (5)

[link] "AI Safety for Fleshy Humans" an AI Safety explainer by Nicky Case
habryka (habryka4) · 2024-05-03T18:10:12.478Z · comments (10)


Recent comments

signer on The Compendium, A full argument about extinction risk from AGI

It sure doesn't seem to generalize in the GPT-4o case. But what's the hypothesis for Sonnet 3.5 refusing in 85% of cases? And CoT improving the score and o1 being better in the browser suggest the problem is in models not understanding consequences, not in them not trying to be good. What's the rate of capability generalization to agent environments? Are we going to conclude that Sonnet just demonstrates reasoning, instead of doing it for real, if it solves only 85% of the tasks it correctly talks about?

Also, what's the rate of generalization of unprompted problematic behaviour avoidance? It's much less of a problem if your AI does what you tell it to do - you can just not give it to users, tell it to invent nanotechnology, and win.

metawrong on Ryan Kidd's Shortform

LASR (https://www.lasrlabs.org/) is giving an £11,000 stipend for a 13-week program; assuming 40h/week, it works out to ~$27/hour.
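The arithmetic behind that figure can be checked with a short sketch. The stipend, program length, and weekly hours come from the comment; the exchange rate of 1.27 USD per GBP is an assumption chosen for illustration, and the function name `hourly_rate` is hypothetical.

```python
def hourly_rate(stipend, weeks, hours_per_week, fx=1.0):
    """Return the stipend per hour worked, optionally converted by an exchange rate."""
    total_hours = weeks * hours_per_week  # 13 weeks * 40 h/week = 520 hours
    return stipend / total_hours * fx


# Figures from the comment: £11,000 over 13 weeks at 40 h/week.
gbp_per_hour = hourly_rate(11_000, 13, 40)

# ASSUMPTION: 1.27 USD per GBP; the comment does not state the rate it used.
usd_per_hour = hourly_rate(11_000, 13, 40, fx=1.27)

print(f"£{gbp_per_hour:.2f}/hour, about ${usd_per_hour:.0f}/hour")
```

Under that assumed rate the result lands close to the quoted ~$27; a different exchange rate would shift the dollar figure slightly.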

rhollerith_dot_com on Dentistry, Oral Surgeons, and the Inefficiency of Small Markets

VCs are already doing this. They have offered to buy both the oral surgery practice and the dental practice I use in town.

Investors have offered to buy both, but why do you believe those investors were VCs? It seems very unlikely to me that they were.

sharmake-farah on wrapper-minds are the enemy

The realist in me says that tyrannical souls/tyrannical governments seem likely to be the default state of governance, because the forces that power democracy and liberty will be gone with the rise of advanced AI, so we should start planning to make the future AIs we build, and the people that control AI, and the future AIs that do control the government.

More generally, I expect value alignment to be much more of a generator of outcomes in the 21st century than most other forces with the rise of AI, and this is not just about the classical AI alignment problem, compared to people selfishly doing stuff that generates positive externalities as a side effect.

metachirality on JargonBot Beta Test

Why not generate it after it's posted publicly?

jbash on Dentistry, Oral Surgeons, and the Inefficiency of Small Markets

Any time I am faced with this kind of shocking inefficiency, I ask myself a simple question: why was no one doing this before?

Well, as I understand it, the general belief is that...

The "scaled up" practices are relatively unpleasant to work in, and make people (who went through a lot of education expecting to get "prestige" jobs, mind you...) feel deprived of agency, deprived of choices about the when-where-and-how of their work, and just generally devalued.
The "non business savvy" people who actually generate the value believe, probably entirely correctly, that somewhere between most and actually-more-than-all of the increased income from that kind of scale-up will end up going to MBAs (or to the one or two theoretically-practitioners who actually own a "medium-sized" practice), and not to them^[1].
Healthcare facilities operated by private equity are widely believed, both based on industry rumor and based on actual measurement, to reduce quality of care, and people don't like to be forced to do a bad job if they don't have to?

Why would you voluntarily make your daily life actually unpleasant just to increase an already high income that you'll probably have less time to enjoy anyway?

^[1] ... and it may not drive prices down for the consumer as much as you might think, either, because many consumers have limited price sensitivity as well as very limited ability to evaluate the quality of care.

daniel-kokotajlo on wrapper-minds are the enemy

I continue to think this is a great post. Part of why I think that is that I haven't forgotten it; it keeps circling back into my mind.

Recently this happened and I made a fun connection: What you call wrapper-minds seem similar to what Plato (in The Republic) calls people-with-tyrannical-souls. i.e. people whose minds are organized the way a tyrannical city is organized, with a single desire/individual (or maybe a tiny junta) in total control, and everything else subservient.

I think the concepts aren't exactly the same though -- Plato would have put more emphasis on the single bit, whereas for your concept of wrapper-mind it doesn't matter much if it's e.g. just paperclips vs. some complicated mix of lots of different things, for the concept of wrapper-mind the emphasis is on immutability and in particular insensitivity to reasoned discussion / learning / etc.

dagon on Dentistry, Oral Surgeons, and the Inefficiency of Small Markets

"there was a model that worked ok, and there weren't enough business-savvy people who understood enough of the details to really scale the DSO model."

This applies to a lot of the enshittification of the world. There used to be tons of small/family businesses, where "successful" for the owner was defined as "make a decent living, by working harder than average". There was tons of value left on the table (or rather, lots of unmeasured surplus went to consumers). When things started getting moneyballed - optimized financially and reframed in terms of capital and returns, that surplus got squeezed out.

wbrom42-gmail-com on Dentistry, Oral Surgeons, and the Inefficiency of Small Markets

VCs are already doing this. They have offered to buy both the oral surgery practice and the dental practice I use in town.
The care they provide turns worse and worse because the model you envision turns a professional (someone who should have a fiduciary responsibility to put the patient's best interest above their own) into an employee of a non-professional corporation. All of the pre- and postoperative care that you envision being done by less highly paid individuals in order to free up the surgeon to "generate profit" gets done more cheaply and more slapdash, resulting in worse and worse patient care. Either the oral surgeon fights back, attempts to maintain the physician-patient relationship, and gets fired from the practice they sold (pretty common already with Derm and Ophtho), or they don't, and you get the actual medical version of the plastic surgery chop shops common in Miami. This ethical problem is why non-lawyers cannot own a legal practice, and yet we failed to recognize the same destruction of the professional relationship when it comes to physicians.
Aspen Dental is a franchise-based, venture-capital-funded organization that already does this.
This is where rationalists fall apart. Everything you say makes sense, but it doesn't take into account the sociocultural aspects that make a physician-patient relationship different than the value extractive relationship that you propose.

tiago-macedo on Conservation of Expected Evidence and Random Sampling in Anthropics

On the same day I posted my original comment I later realized what I said was wrong, and I'll soon edit it to reflect that.

Regarding your response: I think I have a guess on the important difference you're referring to. They both seem to be equivalent to an Incubator Sleeping Beauty, but see consideration 2 below.

1

I think another useful (at least to me) way of seeing/stating what is happening here is that all of the following sentences are true, in an ISB and your two experiments:

The probability (from an external POV) that the coin was Heads or Tails is 1/2.
Each individual "me" (however many there are) will experience the coin being Heads or Tails one half of the time.
If every "me" always predicts Heads, all of my mes will be correct 1/3 of the time and wrong 2/3 of the time. Each individual me will only be able to notice this if we get together after the experiments to compare notes.

I think this is equivalent to the difference in scoring methods you used in Anthropical Motte and Bailey in two versions of Sleeping Beauty.
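The gap between the per-experiment 1/2 and the per-person 1/3 can be illustrated with a small simulation. This is only a sketch under the usual Incubator Sleeping Beauty convention, assumed here and not spelled out in the comment, that Heads creates one person and Tails creates two; the function name `simulate_isb` is hypothetical.

```python
import random


def simulate_isb(n_runs, seed=0):
    """Simulate repeated Incubator Sleeping Beauty experiments.

    ASSUMPTION: Heads creates 1 person, Tails creates 2 people.
    Returns (fraction of runs that were Heads,
             fraction of created people who exist in a Heads run).
    """
    rng = random.Random(seed)
    heads_runs = 0
    heads_people = 0
    total_people = 0
    for _ in range(n_runs):
        heads = rng.random() < 0.5
        people = 1 if heads else 2
        total_people += people
        if heads:
            heads_runs += 1
            heads_people += people
    return heads_runs / n_runs, heads_people / total_people


per_run, per_person = simulate_isb(100_000)
print(f"Heads per experiment: {per_run:.3f}")
print(f"'Heads' guesses correct per person: {per_person:.3f}")
```

Scoring per experiment gives a frequency near 1/2, while scoring each created person's "Heads" guess gives a frequency near 1/3, which matches the first and third statements in the list.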

2

With the two experiments in your response, the only significant difference I can see is that, in experiment 1, there are two identical copies of me, and in 2, there are two different people. I don't know if you're implying that this changes any probabilities, and I'm not sure that it does. What I can say is that experiment 2 is, AFAICT, equivalent to the Doomsday argument in its setup: two theories on the number of people that will come to be, with 1:1 prior odds between them, and the question is "should you update on your existing". I have more reflecting to do before I can give any firm answer here, but I'm inclined toward "no".

3

I have a feeling that, even though we agree on the final probabilities, we disagree on some of the internal details of how these experiments work. What would you say is the significant difference between the experiments, and does it change the numbers?