LessWrong 2.0 Reader

Dangers of Closed-Loop AI
Gordon Seidoh Worley (gworley) · 2024-03-22T23:52:22.010Z · comments (9)

Forecasting AI (Overview)
jsteinhardt · 2023-11-16T19:00:04.218Z · comments (0)

[link] Hyperreals in a Nutshell
Yudhister Kumar (randomwalks) · 2023-10-15T14:23:58.027Z · comments (27)

'Theories of Values' and 'Theories of Agents': confusions, musings and desiderata
Mateusz Bagiński (mateusz-baginski) · 2023-11-15T16:00:48.926Z · comments (8)

[Valence series] 4. Valence & Social Status (deprecated)
Steven Byrnes (steve2152) · 2023-12-15T14:24:41.040Z · comments (19)

An explanation for every token: using an LLM to sample another LLM
Max H (Maxc) · 2023-10-11T00:53:55.249Z · comments (5)

[link] Twitter thread on politics of AI safety
Richard_Ngo (ricraz) · 2024-07-31T00:00:34.298Z · comments (2)

Open Thread – Winter 2023/2024
habryka (habryka4) · 2023-12-04T22:59:49.957Z · comments (160)

How I select alignment research projects
Ethan Perez (ethan-perez) · 2024-04-10T04:33:08.092Z · comments (4)

[link] AISN #25: White House Executive Order on AI, UK AI Safety Summit, and Progress on Voluntary Evaluations of AI Risks
aogara (Aidan O'Gara) · 2023-10-31T19:34:54.837Z · comments (1)

My Detailed Notes & Commentary from Secular Solstice
Jeffrey Heninger (jeffrey-heninger) · 2024-03-23T18:48:51.894Z · comments (16)

Categories of leadership on technical teams
benkuhn · 2024-07-22T04:50:04.071Z · comments (0)

Empirical vs. Mathematical Joints of Nature
Elizabeth (pktechgirl) · 2024-06-26T01:55:22.858Z · comments (1)

Representation Tuning
Christopher Ackerman (christopher-ackerman) · 2024-06-27T17:44:33.338Z · comments (9)

[link] My Model of Epistemology
adamShimi · 2024-08-31T17:01:45.472Z · comments (0)

Live Machinery: An Interface Design Philosophy for Wholesome AI Futures (Workshop @ EA Hotel!)
Sahil · 2024-11-01T17:24:09.957Z · comments (2)

Agency in Politics
Martin Sustrik (sustrik) · 2024-07-17T05:30:01.873Z · comments (2)

Book Review: On the Edge: The Gamblers
Zvi · 2024-09-24T11:50:06.065Z · comments (1)

Open consultancy: Letting untrusted AIs choose what answer to argue for
Fabien Roger (Fabien) · 2024-03-12T20:38:03.785Z · comments (5)

What Helped Me - Kale, Blood, CPAP, X-tiamine, Methylphenidate
Johannes C. Mayer (johannes-c-mayer) · 2024-01-03T13:22:11.700Z · comments (12)

List of strategies for mitigating deceptive alignment
joshc (joshua-clymer) · 2023-12-02T05:56:50.867Z · comments (2)

Predictive model agents are sort of corrigible
Raymond D · 2024-01-05T14:05:03.037Z · comments (6)

[link] My article in The Nation — California’s AI Safety Bill Is a Mask-Off Moment for the Industry
garrison · 2024-08-15T19:25:59.592Z · comments (0)

Open Problems in AIXI Agent Foundations
Cole Wyeth (Amyr) · 2024-09-12T15:38:59.007Z · comments (2)

An Introduction to Representation Engineering - an activation-based paradigm for controlling LLMs
Jan Wehner · 2024-07-14T10:37:21.544Z · comments (5)

Proposal for improving the global online discourse through personalised comment ordering on all websites
Roman Leventov · 2023-12-06T18:51:37.645Z · comments (21)

[link] On Fables and Nuanced Charts
Niko_McCarty (niko-2) · 2024-09-08T17:09:07.503Z · comments (2)

[link] List of Collective Intelligence Projects
Chipmonk · 2024-07-02T14:10:41.789Z · comments (9)

Economics Roundup #2
Zvi · 2024-07-02T12:40:05.908Z · comments (5)

Video and transcript of presentation on Otherness and control in the age of AGI
Joe Carlsmith (joekc) · 2024-10-08T22:30:38.054Z · comments (1)

A sketch of acausal trade in practice
Richard_Ngo (ricraz) · 2024-02-04T00:32:54.622Z · comments (4)

Doomsday Argument and the False Dilemma of Anthropic Reasoning
Ape in the coat · 2024-07-05T05:38:39.428Z · comments (55)

Secondary Risk Markets
Vaniver · 2023-12-11T21:52:46.836Z · comments (4)

Monthly Roundup #22: September 2024
Zvi · 2024-09-17T12:20:08.297Z · comments (10)

How predictive processing solved my wrist pain
max_shen (makoshen) · 2024-07-04T01:56:20.162Z · comments (8)

[link] OpenAI appoints Retired U.S. Army General Paul M. Nakasone to Board of Directors
Joel Burget (joel-burget) · 2024-06-13T21:28:18.110Z · comments (10)

The Schumer Report on AI (RTFB)
Zvi · 2024-05-24T15:10:03.122Z · comments (3)

Computational Mechanics Hackathon (June 1 & 2)
Adam Shai (adam-shai) · 2024-05-24T22:18:44.352Z · comments (5)

[link] Robin Hanson & Liron Shapira Debate AI X-Risk
Liron · 2024-07-08T21:45:40.609Z · comments (4)

Flipping Out: The Cosmic Coinflip Thought Experiment Is Bad Philosophy
Joe Rogero · 2024-11-12T23:55:46.770Z · comments (17)

Motivating Alignment of LLM-Powered Agents: Easy for AGI, Hard for ASI?
RogerDearnaley (roger-d-1) · 2024-01-11T12:56:29.672Z · comments (4)

[link] The last era of human mistakes
owencb · 2024-07-24T09:58:42.116Z · comments (2)

[link] Inferring the model dimension of API-protected LLMs
Ege Erdil (ege-erdil) · 2024-03-18T06:19:25.974Z · comments (3)

[link] AI governance needs a theory of victory
Corin Katzke (corin-katzke) · 2024-06-21T16:15:46.560Z · comments (6)

If You Can Climb Up, You Can Climb Down
jefftk (jkaufman) · 2024-07-30T00:00:06.295Z · comments (9)

Augmenting Statistical Models with Natural Language Parameters
jsteinhardt · 2024-09-20T18:30:10.816Z · comments (0)

[link] Romae Industriae
Maxwell Tabarrok (maxwell-tabarrok) · 2024-07-19T13:03:31.536Z · comments (2)

Difficulty classes for alignment properties
Jozdien · 2024-02-20T09:08:24.783Z · comments (5)

Drug development costs can range over two orders of magnitude
rossry · 2024-11-03T23:13:17.685Z · comments (0)

[link] legged robot scaling laws
bhauth · 2024-01-20T05:45:56.632Z · comments (8)


Recent comments

rhollerith_dot_com on Thoughts after the Wolfram and Yudkowsky discussion

Near the end of the interview, Wolfram says that he cannot do much processing of what was discussed "in real time", which strongly suggests to me that he expects to process it slowly over the next days and weeks. I.e., he is now trying to reassure himself that the AI project won't kill his four children or any grandchildren he has or will have. Because Wolfram is much better AFAICT at this kind of slow "strategic rational" deliberation than most people at his level of life accomplishment, there is a good chance he will fail to find his slow deliberations reassuring, in which case he probably then declares himself an AI doomer. Specifically, my probability is .21 that 18 months from now, Wolfram will have come out publicly against allowing ambitious frontier AI research to continue. P = .21 is much, much higher than my P for the average 65-year-old of his intellectual stature who is not specialized in AI, so I thought it worthwhile to announce my probability.

I agree with another comment that Wolfram has not "done the reading" on AI extinction risk. Being able to watch his face while he confronts some of the convolutions for the first time made it easier, not harder, for me to predict where he will come down 18 months from now.

dagon on A Theory of Equilibrium in the Offense-Defense Balance

I think this is the right way to think of most anti-inductive (planner-adversarial or competitive exploitation) situations. Where there are multiple dimensions of asymmetric capabilities, any change is likely to shift the equilibrium, but not necessarily by as much as the shift in the component.

That said, tipping points are real, and sometimes a component shift can have a BIGGER effect, because it shifts the search to a new local minimum. In most cases, this is not actually entirely due to that component change, but the discovery and reconfiguration is triggered by it. The rise of mass shootings in the US is an example - there are a lot of causes, but the shift happened quite quickly.

Offense-defense is further confused as an example, because there are at least two different equilibria involved. When you say:

The offense-defense balance is a concept that compares how easy it is to protect vs conquer or destroy resources.

Conquer control vs retain control is a different thing than destroy vs preserve. Frank Herbert claimed (via fiction) that "The people who can destroy a thing, they control it." but it's actually true in very few cases. The equilibrium of who gets what share of the value from something can shift very separately from the equilibrium of how much total value that thing provides.

sarahconstantin on sarahconstantin's Shortform

links 11/15/2024: https://roamresearch.com/#/app/srcpublic/page/11-15-2024

https://www.reddit.com/r/self/comments/1gleyhg/people_like_me_are_the_reason_trump_won/ a moderate/swing-voter (Obama, Trump, Biden) explains why he voted for Trump this time around:
- he thinks Kamala Harris was an "empty shell" and unlikable and he felt the campaign was manipulative and deceptive.
- he didn't like that she seemed to be a "DEI hire", but doesn't have a problem with black or female candidates generally, it's just that he resents cynical demographic box-checking.
  - this is a coherent POV -- he did vote for Obama, after all. and plenty of people are like "I want the best person regardless of demographics, not a person chosen for their demographics."
    - hm. why doesn't it seem natural to portray Obama as a "DEI hire"? his campaign made a bigger deal about race than Harris's, and he was criticized a lot for inexperience.
      - One guess: it's laughable to think Obama was chosen by anyone besides himself. He was not the Democratic Party's anointed -- that was Hillary. He's clearly an ambitious guy who wanted to be president on his own initiative and beat the odds to get the nomination. He can't be a "DEI hire" because he wasn't a hire at all.
      - another guess: Obama is clearly smart, speaks/writes in complete sentences, and welcomes lots of media attention and talks about his policies, while Harris has a tendency towards word salad, interviews poorly, avoids discussing issues, etc.
      - another guess: everyone seems to reject the idea that people prefer male to female candidates, but I'm still really not sure there isn't a gender effect! This is very vibes-based on my part, and apparently the data goes the other way, so very uncertain here.
https://trevorklee.substack.com/p/if-langurs-can-drink-seawater-can Trevor Klee on adaptations for drinking seawater

habryka4 on Seven lessons I didn't learn from election day

This was a really good analysis of a bunch of election stuff that I hadn't seen presented clearly like this anywhere else. If it wasn't about elections and news I would curate it.

maxwell-peterson on Seven lessons I didn't learn from election day

A good post, of interest to all across the political spectrum, marred by the mistake at the end of becoming explicitly politically opinionated and saying bad things about those who voted differently than OP.

sharmake-farah on Seven lessons I didn't learn from election day

The one thing I'll say on the election is that a lot of people are using Kamala Harris's loss to put in their own reasons for why Kamala Harris lost that are essentially ideological propaganda.

Basically, the story that she was doomed from the start because of global backlash against incumbents over inflation is the one that matches the evidence best, and a lot of other theories are very much there for ideological purposes.

ben-lang on Seven lessons I didn't learn from election day

I think this question is maybe logically flawed.

Say I have a shuffled deck of cards. You say the probability that the top card is the Ace of Spades is 1/52. I show you the top card, it is the 5 of diamonds. I then ask, knowing what you know now, what probability you should have given.

I picked a card analogy, and you picked a dice one. I think the card one is better in this case, for weird idiosyncratic reasons I give below that might just be irrelevant to the train of thought you are on.

Cards vs Dice: If we could reset the whole planet to its exact state 1 week before the election then we would, I think, get the same result (I don't think quantum will mess with us in one week). What if we do a coarser-grained reset? So if there was a kettle of water at 90 degrees a week before the election, that kettle is reset to contain the same volume of water in the same part of my kitchen, and the water is still 90 degrees, but the individual water molecules have different momenta. For some value of "macro", the world is reset to the same macrostate it had 1 week before election day, but not the same microstate. If we imagine this experiment I still think Trump wins every (or almost every) time, given what we know now. For me to think this kind of thermal-level randomness made a difference in one week, it would have to have been much closer.

In my head, things that change on the coarse-grained reset feel more like unrolled dice, and things that don't feel more like facedown cards. Although in detail the distinction is fuzzy: it is based on an arbitrary line between micro and macro, and it is time sensitive, because cards that are going to be shuffled in the future are in the same category as dice.

EDIT: I did as asked, and replied without reading your comments on the EA forum. Reading that I think we are actually in complete agreement, although you actually know the proper terms for the things I gestured at.

abramdemski on o1 is a bad idea

Thanks for this response! I agree with the argument. I'm not sure what it would take to ensure CoT faithfulness, but I agree that it is an important direction to try and take things; perhaps even the most promising direction for near-term frontier-lab safety research (given the incentives pushing those labs in o1-ish directions).

mondsemmel on Lao Mein's Shortform

That might very well help, yes. However, two thoughts, neither at all well thought out:

- If the Trump administration does fight OpenAI, let's hope Altman doesn't manage to judo flip the situation like he did with the OpenAI board saga, and somehow magically end up replacing Musk or Trump in the upcoming administration...
- Musk's own track record on AI x-risk is not great. I guess he did endorse California's SB 1047, so that's better than OpenAI's current position. But he helped found OpenAI, and recently founded another AI company. There's a scenario where we just trade extinction risk from Altman's OpenAI for extinction risk from Musk's xAI.

notfnofn on D0TheMath's Shortform

For example, if you ask mathematicians whether ZFC + not Consistent(ZFC) is consistent, they will say "no, of course not!"

Certainly not a mathematician with any background in logic.

Similarly, if we have the Peano axioms without induction, mathematicians will say that induction should be there, but in fact you cannot prove this fact from within Peano

What exactly do you mean here? That the Peano axioms minus induction do not adequately characterize the natural numbers because they have nonstandard models? Why would I then be surprised that induction (which does characterize the natural numbers) can't be proven from the remaining axioms?

and given induction mathematicians will say transfinite induction should be there.

Transfinite induction is a consequence of ZF that makes sense in the context of sets. Yes, it can prove additional statements about the natural numbers (e.g. Goodstein sequences converge), but why would it be added as an axiom when the natural numbers are already characterized up to isomorphism by the Peano axioms? How would you even add it as an axiom in the language of natural numbers? (That last question is non-rhetorical.)