LessWrong 2.0 Reader

The Bar for Contributing to AI Safety is Lower than You Think
Chris_Leong · 2024-08-16T15:20:19.055Z · comments (1)

[link] [Linkpost] 'The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery'
Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2024-08-15T21:32:59.979Z · comments (1)

AI Can be “Gradient Aware” Without Doing Gradient hacking.
Sodium · 2024-10-20T21:02:10.754Z · comments (0)

[link] Does natural selection favor AIs over humans?
cdkg · 2024-10-03T18:47:43.517Z · comments (1)

[question] Have people given up on iterated distillation and amplification?
Chris_Leong · 2024-07-19T12:23:04.625Z · answers+comments (1)

[link] The Great Organism Theory of Evolution
rogersbacon · 2024-08-10T12:26:02.434Z · comments (0)

Economics Roundup #4
Zvi · 2024-10-15T13:20:06.923Z · comments (4)

D/acc AI Security Salon
Allison Duettmann (allison-duettmann) · 2024-10-19T22:17:57.067Z · comments (0)

Tokenized SAEs: Infusing per-token biases.
tdooms · 2024-08-04T09:17:46.755Z · comments (20)

Review: “The Case Against Reality”
David Gross (David_Gross) · 2024-10-29T13:13:29.643Z · comments (9)

Lab governance reading list
Zach Stein-Perlman · 2024-10-25T18:00:28.346Z · comments (3)

[link] To Be Born in a Bag
Niko_McCarty (niko-2) · 2024-10-06T17:21:00.605Z · comments (1)

Why I'm bearish on mechanistic interpretability: the shards are not in the network
tailcalled · 2024-09-13T17:09:25.407Z · comments (40)

[link] Update on the Mysterious Trump Buyers on Polymarket
Annapurna (jorge-velez) · 2024-11-04T19:22:06.540Z · comments (9)

[link] [Linkpost] A Case for AI Consciousness
cdkg · 2024-07-06T14:52:21.704Z · comments (2)

Looking for Goal Representations in an RL Agent - Update Post
CatGoddess · 2024-08-28T16:42:19.367Z · comments (0)

A Second Wetsuit Summer
jefftk (jkaufman) · 2024-07-13T02:00:05.412Z · comments (2)

[link] How Big a Deal are MatMul-Free Transformers?
JustisMills · 2024-06-27T22:28:40.888Z · comments (6)

Why Reflective Stability is Important
Johannes C. Mayer (johannes-c-mayer) · 2024-09-05T15:28:19.913Z · comments (2)

[question] What are the best resources for building gears-level models of how governments actually work?
adamShimi · 2024-08-19T14:05:02.590Z · answers+comments (6)

Scaling Laws and Likely Limits to AI
Davidmanheim · 2024-08-18T17:19:46.597Z · comments (0)

[link] Fragile, Robust, and Antifragile Preference Satisfaction
adamShimi · 2024-11-02T17:25:55.986Z · comments (0)

Announcing the PIBBSS Symposium '24!
DusanDNesic · 2024-09-03T11:19:47.568Z · comments (0)

Ten counter-arguments that AI is (not) an existential risk (for now)
Ariel Kwiatkowski (ariel-kwiatkowski) · 2024-08-13T22:35:15.341Z · comments (5)

Sustainability of Digital Life Form Societies
Hiroshi Yamakawa (hiroshi-yamakawa) · 2024-07-19T13:59:13.973Z · comments (1)

In the Name of All That Needs Saving
pleiotroth · 2024-11-07T15:26:12.252Z · comments (2)

Games of My Childhood: The Troops
Kaj_Sotala · 2024-07-08T11:20:03.033Z · comments (0)

Avoiding the Bog of Moral Hazard for AI
Nathan Helm-Burger (nathan-helm-burger) · 2024-09-13T21:24:34.137Z · comments (12)

"Real AGI"
Seth Herd · 2024-09-13T14:13:24.124Z · comments (20)

Word Spaghetti
Gordon Seidoh Worley (gworley) · 2024-10-23T05:39:20.105Z · comments (9)

[link] Should Sports Betting Be Banned?
Maxwell Tabarrok (maxwell-tabarrok) · 2024-09-21T14:13:35.404Z · comments (2)

[link] AI existential risk probabilities are too unreliable to inform policy
Oleg Trott (oleg-trott) · 2024-07-28T00:59:59.497Z · comments (5)

Bryan Johnson and a search for healthy longevity
NancyLebovitz · 2024-07-27T15:28:13.117Z · comments (17)

Advisors for Smaller Major Donors?
jefftk (jkaufman) · 2024-11-06T14:30:06.187Z · comments (2)

Determining the power of investors over Frontier AI Labs is strategically important to reduce x-risk
Lucie Philippon (lucie-philippon) · 2024-07-25T01:12:20.518Z · comments (7)

[question] How great is the utility of "saving" endangered languages?
SpectrumDT · 2024-08-20T13:14:32.895Z · answers+comments (29)

Finding Deception in Language Models
Esben Kran (esben-kran) · 2024-08-20T09:42:13.060Z · comments (4)

Bridging the VLM and mech interp communities for multimodal interpretability
Sonia Joseph (redhat) · 2024-10-28T14:41:41.969Z · comments (5)

Rabin's Paradox
Charlie Steiner · 2024-08-14T05:40:25.572Z · comments (40)

Can Large Language Models effectively identify cybersecurity risks?
emile delcourt (emile-delcourt) · 2024-08-30T20:20:21.345Z · comments (0)

D&D.Sci: Whom Shall You Call? [Evaluation and Ruleset]
abstractapplic · 2024-07-17T22:34:25.111Z · comments (5)

[link] Instruction Following without Instruction Tuning
Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2024-09-24T13:49:09.078Z · comments (0)

Is Text Watermarking a lost cause?
egor.timatkov · 2024-10-01T16:20:51.113Z · comments (13)

[link] How (and why) to get tested for CMV
Metacelsus · 2024-07-15T20:06:05.649Z · comments (0)

[link] AlignedCut: Visual Concepts Discovery on Brain-Guided Universal Feature Space
Bogdan Ionut Cirstea (bogdan-ionut-cirstea) · 2024-09-14T23:23:26.296Z · comments (1)

Four ways I've made bad decisions
Sodium · 2024-07-14T22:18:47.630Z · comments (1)

The new UK government's stance on AI safety
Elliot Mckernon (elliot) · 2024-07-31T15:23:59.235Z · comments (0)

My career exploration: Tools for building confidence
lynettebye · 2024-09-13T11:37:55.843Z · comments (0)

[link] AI Safety Newsletter #39: Implications of a Trump Administration for AI Policy Plus, Safety Engineering
Corin Katzke (corin-katzke) · 2024-07-29T17:50:52.454Z · comments (1)

[link] Jonothan Gorard:The territory is isomorphic to an equivalence class of its maps
Daniel C (harper-owen) · 2024-09-07T10:04:47.840Z · comments (18)


Recent comments

rhollerith_dot_com on Thoughts after the Wolfram and Yudkowsky discussion

Near the end of the interview, Wolfram says that he cannot do much processing of what was discussed "in real time", which strongly suggests to me that he expects to process it slowly over the next days and weeks. I.e., he is now trying to reassure himself that the AI project won't kill his four children or any grandchildren he has or will have. Because Wolfram is much better AFAICT at this kind of slow "strategic rational" deliberation than most people at his level of life accomplishment, there is a good chance he will fail to find his slow deliberations reassuring, in which case he probably then declares himself an AI doomer. Specifically, my probability is .21 that 18 months from now, Wolfram will have come out publicly against allowing ambitious frontier AI research to continue. P = .21 is much much higher than my P for the average 65-year-old of his intellectual stature who is not specialized in AI, so I thought it worthwhile to announce my probability.

I agree with another comment that Wolfram has not "done the reading" on AI extinction risk. Being able to watch his face while he confronts some of the convolutions for the first time made it easier, not harder, for me to predict where he will come down 18 months from now.

dagon on A Theory of Equilibrium in the Offense-Defense Balance

I think this is the right way to think of most anti-inductive (planner-adversarial or competitive exploitation) situations. Where there are multiple dimensions of asymmetric capabilities, any change is likely to shift the equilibrium, but not necessarily by as much as the shift in the component.

That said, tipping points are real, and sometimes a component shift can have a BIGGER effect, because it shifts the search to a new local minimum. In most cases, this is not actually entirely due to that component change, but the discovery and reconfiguration is triggered by it. The rise of mass shootings in the US is an example - there are a lot of causes, but the shift happened quite quickly.

Offense-defense is further confused as an example, because there are at least two different equilibria involved. When you say:

> The offense-defense balance is a concept that compares how easy it is to protect vs conquer or destroy resources.

Conquer control vs retain control is a different thing than destroy vs preserve. Frank Herbert claimed (via fiction) that "The people who can destroy a thing, they control it," but it's actually true in very few cases. The equilibrium of who gets what share of the value from something can shift very separately from the equilibrium of how much total value that thing provides.

sarahconstantin on sarahconstantin's Shortform

links 11/15/2024: https://roamresearch.com/#/app/srcpublic/page/11-15-2024

https://www.reddit.com/r/self/comments/1gleyhg/people_like_me_are_the_reason_trump_won/ a moderate/swing-voter (Obama, Trump, Biden) explains why he voted for Trump this time around:
- he thinks Kamala Harris was an "empty shell" and unlikable and he felt the campaign was manipulative and deceptive.
- he didn't like that she seemed to be a "DEI hire", but doesn't have a problem with black or female candidates generally, it's just that he resents cynical demographic box-checking.
  - this is a coherent POV -- he did vote for Obama, after all. and plenty of people are like "I want the best person regardless of demographics, not a person chosen for their demographics."
    - hm. why doesn't it seem natural to portray Obama as a "DEI hire"? his campaign made a bigger deal about race than Harris's, and he was criticized a lot for inexperience.
      - One guess: it's laughable to think Obama was chosen by anyone besides himself. He was not the Democratic Party's anointed -- that was Hillary. He's clearly an ambitious guy who wanted to be president on his own initiative and beat the odds to get the nomination. He can't be a "DEI hire" because he wasn't a hire at all.
      - another guess: Obama is clearly smart, speaks/writes in complete sentences, and welcomes lots of media attention and talks about his policies, while Harris has a tendency towards word salad, interviews poorly, avoids discussing issues, etc.
      - another guess: everyone seems to reject the idea that people prefer male to female candidates, but I'm still really not sure there isn't a gender effect! This is very vibes-based on my part, and apparently the data goes the other way, so very uncertain here.
https://trevorklee.substack.com/p/if-langurs-can-drink-seawater-can Trevor Klee on adaptations for drinking seawater

habryka4 on Seven lessons I didn't learn from election day

This was a really good analysis of a bunch of election stuff that I hadn't seen presented clearly like this anywhere else. If it wasn't about elections and news I would curate it.

maxwell-peterson on Seven lessons I didn't learn from election day

A good post, of interest to all across the political spectrum, marred by the mistake at the end of becoming explicitly politically opinionated and saying bad things about those who voted differently than OP.

sharmake-farah on Seven lessons I didn't learn from election day

The one thing I'll say on the election is that a lot of people are using Kamala Harris's loss to put in their own reasons for why Kamala Harris lost that are essentially ideological propaganda.

Basically, only the story that she was doomed from the start because of global backlash against incumbents over inflation matches the evidence well, and a lot of the other theories are very much there for ideological purposes.

ben-lang on Seven lessons I didn't learn from election day

I think this question is maybe logically flawed.

Say I have a shuffled deck of cards. You say the probability that the top card is the Ace of Spades is 1/52. I show you the top card, it is the 5 of diamonds. I then ask, knowing what you know now, what probability you should have given.

I picked a card analogy, and you picked a dice one. I think the card one is better in this case, for weird idiosyncratic reasons I give below that might just be irrelevant to the train of thought you are on.

Cards vs Dice: If we could reset the whole planet to its exact state 1 week before the election, then we would, I think, get the same result (I don't think quantum effects will mess with us in one week). What if we do a coarser-grained reset? So if there was a kettle of water at 90 degrees a week before the election, that kettle is reset to contain the same volume of water in the same part of my kitchen, and the water is still 90 degrees, but the individual water molecules have different momenta. For some value of "macro", the world is reset to the same macrostate it had 1 week before election day, but not the same microstate. If we imagine this experiment, I still think Trump wins every (or almost every) time, given what we know now. For me to think this kind of thermal-level randomness made a difference in one week, the election would have to have been much closer.

In my head, things that change on the coarse-grained reset feel more like unrolled dice, and things that don't feel more like facedown cards. Although in detail the distinction is fuzzy: it is based on an arbitrary line between micro and macro, and it is time sensitive, because cards that are going to be shuffled in the future are in the same category as dice.

EDIT: I did as asked, and replied without reading your comments on the EA forum. Reading that I think we are actually in complete agreement, although you actually know the proper terms for the things I gestured at.
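
The coarse-grained reset described above can be sketched as a toy simulation. This is only an illustration under loud assumptions: the function name `share_of_reruns_won`, the idea of summarizing the macrostate as a single signed margin, and every number below are hypothetical and are not taken from the comment or from any real data.

```python
import random


def share_of_reruns_won(macro_margin: float, micro_noise_sd: float,
                        reruns: int = 10_000, seed: int = 0) -> float:
    """Fraction of coarse-grained reruns in which the same side still wins.

    macro_margin   -- hypothetical lead fixed by the macrostate (kept on reset)
    micro_noise_sd -- hypothetical spread added by re-drawing the microstate
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(reruns):
        # Each rerun keeps the macrostate and re-draws only the micro-level noise.
        final_margin = macro_margin + rng.gauss(0.0, micro_noise_sd)
        if final_margin > 0:
            wins += 1
    return wins / reruns


# A lead that is large compared with the noise behaves like a facedown card.
clear_lead = share_of_reruns_won(macro_margin=2.0, micro_noise_sd=0.1)

# A dead heat behaves like an unrolled die: the noise decides the outcome.
dead_heat = share_of_reruns_won(macro_margin=0.0, micro_noise_sd=0.1)

print(clear_lead, dead_heat)
```

In this sketch the card-versus-dice distinction comes down to the ratio of the margin to the noise, which is one way to read the remark that the result "would have to have been much closer" for micro-level randomness to matter.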

abramdemski on o1 is a bad idea

Thanks for this response! I agree with the argument. I'm not sure what it would take to ensure CoT faithfulness, but I agree that it is an important direction to try and take things; perhaps even the most promising direction for near-term frontier-lab safety research (given the incentives pushing those labs in o1-ish directions).

mondsemmel on Lao Mein's Shortform

That might very well help, yes. However, two thoughts, neither at all well thought out:

- If the Trump administration does fight OpenAI, let's hope Altman doesn't manage to judo flip the situation like he did with the OpenAI board saga, and somehow magically end up replacing Musk or Trump in the upcoming administration...
- Musk's own track record on AI x-risk is not great. I guess he did endorse California's SB 1047, so that's better than OpenAI's current position. But he helped found OpenAI, and recently founded another AI company. There's a scenario where we just trade extinction risk from Altman's OpenAI for extinction risk from Musk's xAI.

notfnofn on D0TheMath's Shortform

> For example, if you ask mathematicians whether ZFC + not Consistent(ZFC) is consistent, they will say "no, of course not!"

Certainly not a mathematician with any background in logic.

> Similarly, if we have the Peano axioms without induction, mathematicians will say that induction should be there, but in fact you cannot prove this fact from within Peano

What exactly do you mean here? That the Peano axioms minus induction do not adequately characterize the natural numbers because they have nonstandard models? Why would I then be surprised that induction (which does characterize the natural numbers) can't be proven from the remaining axioms?

> and given induction mathematicians will say transfinite induction should be there.

Transfinite induction is a consequence of ZF that makes sense in the context of sets. Yes, it can prove additional statements about the natural numbers (e.g. Goodstein sequences converge), but why would it be added as an axiom when the natural numbers are already characterized up to isomorphism by the Peano axioms? How would you even add it as an axiom in the language of natural numbers? (That last question is non-rhetorical.)
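
For readers who have not met the Goodstein example mentioned above, here is a minimal Python sketch of how such a sequence is generated. The helper names `bump_base` and `goodstein` are illustrative and not from the comment. Running it on small inputs only shows the behaviour; it proves nothing. The general statement that every Goodstein sequence reaches 0 is the part that needs transfinite induction and, by the Kirby-Paris theorem, is not provable in first-order Peano arithmetic.

```python
def bump_base(n: int, base: int) -> int:
    """Write n in hereditary base-`base` notation, then replace each `base` by `base + 1`."""
    result = 0
    exponent = 0
    while n > 0:
        n, digit = divmod(n, base)
        if digit:
            # Exponents are themselves rewritten hereditarily, hence the recursion.
            result += digit * (base + 1) ** bump_base(exponent, base)
        exponent += 1
    return result


def goodstein(start: int, max_steps: int = 50) -> list[int]:
    """Return the Goodstein sequence from `start`, stopping at 0 or after `max_steps` steps."""
    sequence = [start]
    value, base = start, 2
    while value > 0 and len(sequence) <= max_steps:
        value = bump_base(value, base) - 1  # bump the base, then subtract one
        base += 1
        sequence.append(value)
    return sequence


print(goodstein(3))               # reaches 0 after five steps
print(goodstein(4, max_steps=3))  # grows quickly at first, so it is truncated here
```

The `max_steps` cap is a practical choice: already from a starting value of 4 the sequence climbs for an astronomically long time before it comes back down, so the function truncates instead of running to completion.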