LessWrong 2.0 Reader

The Cognitive Bootcamp Agreement
Raemon · 2024-10-16T23:24:05.509Z · comments (0)

Effectively Handling Disagreements - Introducing a New Workshop
Camille Berger (Camille Berger) · 2024-04-15T16:33:50.339Z · comments (2)

LLMs can strategically deceive while doing gain-of-function research
Igor Ivanov (igor-ivanov) · 2024-01-24T15:45:08.795Z · comments (4)

[link] Provably Safe AI
PeterMcCluskey · 2023-10-05T22:18:26.013Z · comments (15)

AI #63: Introducing Alpha Fold 3
Zvi · 2024-05-09T14:20:03.176Z · comments (2)

[link] AI Safety Memes Wiki
plex (ete) · 2024-07-24T18:53:04.977Z · comments (1)

Regrant up to $600,000 to AI safety projects with GiveWiki
Dawn Drescher (Telofy) · 2023-10-28T19:56:06.676Z · comments (1)

AI Safety Strategies Landscape
Charbel-Raphaël (charbel-raphael-segerie) · 2024-05-09T17:33:45.853Z · comments (1)

[link] Genocide isn't Decolonization
robotelvis · 2023-10-20T04:14:07.716Z · comments (19)

Proveably Safe Self Driving Cars [Modulo Assumptions]
Davidmanheim · 2024-09-15T13:58:19.472Z · comments (26)

How I build and run behavioral interviews
benkuhn · 2024-02-26T05:50:05.328Z · comments (6)

[link] Why you, personally, should want a larger human population
jasoncrawford · 2024-02-23T19:48:10.526Z · comments (32)

Being good at the basics
dominicq · 2023-11-04T14:18:50.976Z · comments (1)

0. The Value Change Problem: introduction, overview and motivations
Nora_Ammann · 2023-10-26T14:36:15.466Z · comments (0)

Preface to the Sequence on LLM Psychology
Quentin FEUILLADE--MONTIXI (quentin-feuillade-montixi) · 2023-11-07T16:12:07.742Z · comments (0)

Video and transcript of presentation on Scheming AIs
Joe Carlsmith (joekc) · 2024-03-22T15:52:03.311Z · comments (1)

Learning Math in Time for Alignment
Nicholas / Heather Kross (NicholasKross) · 2024-01-09T01:02:37.446Z · comments (3)

Padding the Corner
jefftk (jkaufman) · 2023-09-13T01:30:04.009Z · comments (4)

In Defense of Lawyers Playing Their Part
Isaac King (KingSupernova) · 2024-07-01T01:32:58.695Z · comments (9)

[link] the subreddit size threshold
bhauth · 2024-01-23T00:38:13.747Z · comments (3)

The International PauseAI Protest: Activism under uncertainty
Joseph Miller (Josephm) · 2023-10-12T17:36:15.716Z · comments (1)

Investigating the Ability of LLMs to Recognize Their Own Writing
Christopher Ackerman (christopher-ackerman) · 2024-07-30T15:41:44.017Z · comments (0)

Monthly Roundup #13: December 2023
Zvi · 2023-12-19T15:10:08.293Z · comments (5)

[link] How "Pause AI" advocacy could be net harmful
Tamsin Leake (carado-1) · 2023-12-26T16:19:20.724Z · comments (9)

Is suffering like shit?
KatjaGrace · 2024-05-31T01:20:03.855Z · comments (5)

[link] Lying is Cowardice, not Strategy
Connor Leahy (NPCollapse) · 2023-10-24T13:24:25.450Z · comments (73)

[link] NAO Updates, Fall 2024
jefftk (jkaufman) · 2024-10-18T00:00:04.142Z · comments (2)

[link] Manifund: 2023 in Review
Austin Chen (austin-chen) · 2024-01-18T23:50:13.557Z · comments (0)

[link] End Single Family Zoning by Overturning Euclid V Ambler
Maxwell Tabarrok (maxwell-tabarrok) · 2024-07-26T14:08:45.046Z · comments (1)

[question] How unusual is the fact that there is no AI monopoly?
Viliam · 2024-08-16T20:21:51.012Z · answers+comments (15)

A quick experiment on LMs’ inductive biases in performing search
Alex Mallen (alex-mallen) · 2024-04-14T03:41:08.671Z · comments (2)

Being against involuntary death and being open to change are compatible
Andy_McKenzie · 2024-05-27T06:37:27.644Z · comments (5)

DunCon @Lighthaven
Duncan Sabien (Deactivated) (Duncan_Sabien) · 2024-09-29T04:56:27.205Z · comments (0)

Computational Approaches to Pathogen Detection
jefftk (jkaufman) · 2023-11-01T00:30:13.012Z · comments (5)

If you are also the worst at politics
lukehmiles (lcmgcd) · 2024-05-26T20:07:49.201Z · comments (8)

Model of psychosis, take 2
Steven Byrnes (steve2152) · 2023-08-17T19:11:17.386Z · comments (11)

5 Reasons Why Governments/Militaries Already Want AI for Information Warfare
trevor (TrevorWiesinger) · 2023-10-30T16:30:38.020Z · comments (0)

[link] Uncovering Latent Human Wellbeing in LLM Embeddings
ChengCheng (ccstan99) · 2023-09-14T01:40:24.483Z · comments (7)

Some of my predictable updates on AI
Aaron_Scher · 2023-10-23T17:24:34.720Z · comments (8)

[link] Talking With People Who Speak to Congressional Staffers about AI risk
Eneasz · 2023-12-14T17:55:50.606Z · comments (0)

[link] A computational complexity argument for many worlds
jessicata (jessica.liu.taylor) · 2024-08-13T19:35:10.116Z · comments (15)

[link] New Tool: the Residual Stream Viewer
AdamYedidia (babybeluga) · 2023-10-01T00:49:51.965Z · comments (7)

An argument that consequentialism is incomplete
cousin_it · 2024-10-07T09:45:12.754Z · comments (27)

Update to "Dominant Assurance Contract Platform"
moyamo · 2023-09-21T16:09:57.044Z · comments (1)

Could We Automate AI Alignment Research?
Stephen McAleese (stephen-mcaleese) · 2023-08-10T12:17:05.194Z · comments (10)

An Introduction to Representation Engineering - an activation-based paradigm for controlling LLMs
Jan Wehner · 2024-07-14T10:37:21.544Z · comments (4)

[link] OpenAI, DeepMind, Anthropic, etc. should shut down.
Tamsin Leake (carado-1) · 2023-12-17T20:01:22.332Z · comments (48)

UDT1.01: Plannable and Unplanned Observations (3/10)
Diffractor · 2024-04-12T05:24:34.435Z · comments (0)

Attention Output SAEs Improve Circuit Analysis
Connor Kissane (ckkissane) · 2024-06-21T12:56:07.969Z · comments (0)

D&D.Sci (Easy Mode): On The Construction Of Impossible Structures [Evaluation and Ruleset]
abstractapplic · 2024-05-20T09:38:55.228Z · comments (2)

Recent comments

jblack on Noah Birnbaum's Shortform

Yes, that was a pretty terrible take. Markets quite clearly do not price externalities well, and never have done. So long as any given investor rates their specific investment as being unlikely to tip the balance into doom, they get the upside of directly financially benefiting from major economic growth due to AI, and essentially the same downside risk as if they didn't invest. Arguments like "short some markets, or go long volatility, and then send those profits to Somalia to mitigate suffering for a few years before the whole world ends" are obviously not even trying to seriously reflect the widespread investment decisions that affect real markets.

raemon on What are the best arguments for/against AIs being "slightly 'nice'"?

I think he's making a bet that, if people could literally get tasty steak without killing cows for the same price (or cheaper), most of them would not pay for factory farmed cow.

raemon on "Slow" takeoff is a terrible term for "maybe even faster takeoff, actually"

alas not easily

david-matolcsi on You can, in fact, bamboozle an unaligned AI into sparing your life

Here is the promised comment on what kind of "commitment" I want to make given all the responses.

I agree with Buck that no one should make a very direct commitment about this sort of thing, as there might be blackmail-related scary things lurking in the shadows when one does acausal trade. I think we will probably figure out how to handle that, but we shouldn't make any strong promises of specific actions until we figure that out.

However, the promise I'm intending to keep is that if humanity wins and I'm alive to see it, I will remember how scary it was to live in the world before we knew we'd win, and if there is no currently unknown reason not to do something like the simulation scheme described in the post, and the only cost is not utilizing some small fraction of the conquered Universe, I will advocate for doing it and spend a significant fraction of my resources on it if necessary. I think it would be good for sending future AIs a signal about the seriousness of this if some other people who are worried about everyone dying from AI, but are not strongly longtermist absolute utilitarians, also left a comment making a similar vague promise that if they ever get to control full star systems, they will contribute to this plan if no currently unknown serious counterargument comes up. Or at least leave an Agreed sign on this sentence.

Other updates from the comments: If I wrote the post now, I would more strongly emphasize a framing that this plan is basically about entering into an acausal insurance contract with other young civilizations who don't yet know what their chances are for aligning AI, nor how big their Universe is compared to the usual Universe-size in the great prior over possible worlds. Then the civilizations who align their AIs and find that they live in a relatively big Universe bail out everyone else in the insurance contract.

But I think that, while this acausal insurance framing might be more satisfying to the people who are already thinking a lot about acausal trade, in practice the way we implement this "insurance" will likely be very similar to the scheme described in the post. So I maintain that for most people it's better not to think in terms of acausal trade, but just think about the simulation proposal described in the post.

sustrik on What's a good book for a technically-minded 11-year old?

Exactly. You can't make the kid read something, but if he doesn't know the book exists he's not going to read it for sure.

declan-molony on Conversational Signposts—An Antidote to Dull Social Interactions

Conversational signposts are just one technique to improve social interactions. For more advanced techniques, I would recommend checking out:

How to Talk to Anyone: 92 Little Tricks for Big Success in Relationships by Leil Lowndes
How to Win Friends & Influence People by Dale Carnegie

drake-thomas on Dario Amodei — Machines of Loving Grace

I agree it seems unlikely that we'll see coordination on slowing down before one actor or coalition has a substantial enough lead over other actors that it can enforce such a slowdown unilaterally, but I think it's reasonably likely that such a lead will arise before things get really insane.

A few different stories under which one might go from aligned "genius in a datacenter" level AI at time t to outcomes merely at the level of weirdness in this essay at t + 5-10y:

- The techniques that work to align "genius in a datacenter" level AI don't scale to wildly superhuman intelligence (eg because they lose some value fidelity from human-generated oversight signals that's tolerable at one remove but very risky at ten). The alignment problem for serious ASI is quite hard to solve at the mildly superintelligent level, and it genuinely takes a while to work out enough that we can scale up (since the existing AIs, being aligned, won't design unaligned successors).
- If people ask their only-somewhat-superhuman AI what to do next, the AIs say "A bunch of the decisions from this point on hinge on pretty subtle philosophical questions, and frankly it doesn't seem like you guys have figured all this out super well, have you heard of this thing called a long reflection?" That's what I'd say if I were a million copies of me in a datacenter advising a 2024-era US government on what to do about Dyson swarms!
- A leading actor uses their AI to ensure continued strategic dominance and prevent competing AI projects from posing a meaningful threat. Having done so, they just... don't really want crazy things to happen really fast, because the actor in question is mostly composed of random politicians or whatever. (I'm personally sympathetic to astronomical waste arguments, but it's not clear to me that people likely to end up with the levers of power here are.)
- The serial iteration times and experimentation loops are just kinda slow and annoying, and mildly-superhuman AI isn't enough to circumvent experimentation time bottlenecks (some of which end up being relatively slow), and there are stupid zoning restrictions on the land you want to use for datacenters, and some regulation adds lots of mandatory human overhead to some critical iteration loop, etc.
  - This isn't a claim that maximal-intelligence-per-cubic-meter ASI initialized in one datacenter would face long delays in making efficient use of its lightcone, just that it might be tough for a not-that-much-better-than-human AGI that's aligned and trying to respect existing regulations and so on to scale itself all that rapidly.
- Among the tech unlocked in relatively early-stage AGI is better coordination, and that helps Earth get out of unsavory race dynamics and decide to slow down.
- The alignment tax at the superhuman level is pretty steep, and doing self-improvement while preserving alignment goes much slower than unrestricted self-improvement would; since at this point we have many fewer ongoing moral catastrophes (eg everyone who wants to be cryopreserved is, we've transitioned to excellent cheap lab-grown meat), there's little cost to proceeding very cautiously.
  - This is sort of a continuous version of the first bullet point with a finite rather than infinite alignment tax.

All that said, upon reflection I think I was probably lowballing the odds of crazy stuff on the 10y timescale, and I'd go to more like 50-60% that we're seeing mind uploads and Kardashev level 1.5-2 civilizations etc. a decade out from the first powerful AIs.

I do think it's fair to call out the essay for not highlighting the ways in which it might be lowballing things or rolling in an assumption of deliberate slowdown; I'd rather it have given more of a nod to these considerations and made the conditions of its prediction clearer.

thedstrat on What are the best arguments for/against AIs being "slightly 'nice'"?

"'It seems pretty plausible that AI will be at least somewhat nice', similar to how humans are somewhat nice to animals." He must not know about factory farming, that thing where humans systematically hold, harm, and slaughter billions yearly. The following paragraph discusses whether AI will care at all about humans and see us as valuable agents. The author then discusses how we do care about the wellbeing of cows. "We won't kill them for arbitrarily insignificant reasons", because clearly the taste of a good steak being more enjoyable than a plant-based burger is anything but arbitrary. It's objectively more important than their life. To not understand the irony and ridiculousness here says something about the quality of the entire passage and the author's reasoning ability.

remmelt-ellen on OpenAI defected, but we can take honest actions

Resonating with you here! Yes, I think autonomous corporations (and other organisations) would result in society-wide extraction, destabilisation and totalitarianism.

john-huang on Could randomly choosing people to serve as representatives lead to better government?

I'm not that concerned with lobbyists ruining the deliberative proceedings. I think you're underestimating normal people a bit. They have state power to shut down annoying and undesired feedback if they wish. I also think the assembly will tend to trust their own advisors, whom they hired themselves, over outside self-proclaimed expert lobbyists.

My bigger concern is with corruption and bribery. Because we're dealing with very normal people, we also ought to expect normal criminal behavior. We ought to expect assembly members getting arrested from time to time, and doing all the normal things we expect from 500 random people.

I think bribery is a sufficiently high concern that a police force should constantly operate to perform sting operations and monitor illicit behavior from assembly members. IMO, this should already be happening with elected officials too.

Another big concern is whether a purely lottocratic assembly would self-regulate its own corruption. It has some interest in doing so, in that the lottocrats help their future selves, after their term has ended, by creating rules that would regulate corruption. Terrill Bouricius attempts to create a system where layers on layers of assemblies check and re-check the work of other assemblies to mitigate corruption concerns.

I can't easily conclude whether election or sortition would be better at corruption mitigation. With elections, opposition parties have an incentive to investigate their enemies to root out corruption. HOWEVER, the same opposition parties have an incentive to lie about the results of investigations, leading to an environment of fake news, where voters cannot distinguish between a political attack and actual corruption. In the American context, bribery is already all but legalized through campaign donations.

Sortition could possibly lead to a ridiculous scenario:

Imagine the public is outraged at the insane level of corruption of the sortition-assembly. However, as a new assembly is formed by lottery, these anti-corruption sentiments are suddenly rotated into office. The members of the public hate corruption, as does this new assembly! The question is, would the members of the assembly be able to do the Machiavellian about-face and suddenly embrace corruption? I have a hard time believing they would, though I have doubts. In my opinion, normal people, being utterly normal, would rather do the easy thing and yes, go ahead and regulate the corruption while enjoying their government salary. Getting to serve in office is already a win-win, so why not win and also be declared heroes? Alternatively, they can "Go Breaking Bad", embrace corruption and pilfer the state coffers. They can win big (for now) but will become despised. What do you think normal people would do? High risk high reward, or low risk medium reward? I don't think going "Breaking Bad" is the best of ideas.

Elected politicians use their offices to protect themselves from legal challenge. An obvious example is Donald Trump using the presidency to overcome his legal problems. He's obviously not the first politician to cling to office in order to protect himself. Lottocrats can't do the same. Lottocrats soon lose their powers and become vulnerable.