Posts

Overview of strong human intelligence amplification methods 2024-10-08T08:37:18.896Z
TsviBT's Shortform 2024-06-16T23:22:54.134Z
Koan: divining alien datastructures from RAM activations 2024-04-05T18:04:57.280Z
What could a policy banning AGI look like? 2024-03-13T14:19:07.783Z
A hermeneutic net for agency 2024-01-01T08:06:30.289Z
What is wisdom? 2023-11-14T02:13:49.681Z
Human wanting 2023-10-24T01:05:39.374Z
Hints about where values come from 2023-10-18T00:07:58.051Z
Time is homogeneous sequentially-composable determination 2023-10-08T14:58:15.913Z
Telopheme, telophore, and telotect 2023-09-17T16:24:03.365Z
Sum-threshold attacks 2023-09-08T17:13:37.044Z
Fundamental question: What determines a mind's effects? 2023-09-03T17:15:41.814Z
Views on when AGI comes and on strategy to reduce existential risk 2023-07-08T09:00:19.735Z
The fraught voyage of aligned novelty 2023-06-26T19:10:42.195Z
Provisionality 2023-06-19T11:49:06.680Z
Explicitness 2023-06-12T15:05:04.962Z
Wildfire of strategicness 2023-06-05T13:59:17.316Z
The possible shared Craft of deliberate Lexicogenesis 2023-05-20T05:56:41.829Z
A strong mind continues its trajectory of creativity 2023-05-14T17:24:00.337Z
Better debates 2023-05-10T19:34:29.148Z
An anthropomorphic AI dilemma 2023-05-07T12:44:48.449Z
The voyage of novelty 2023-04-30T12:52:16.817Z
Endo-, Dia-, Para-, and Ecto-systemic novelty 2023-04-23T12:25:12.782Z
Possibilizing vs. actualizing 2023-04-16T15:55:40.330Z
Expanding the domain of discourse reveals structure already there but hidden 2023-04-09T13:36:28.566Z
Ultimate ends may be easily hidable behind convergent subgoals 2023-04-02T14:51:23.245Z
New Alignment Research Agenda: Massive Multiplayer Organism Oversight 2023-04-01T08:02:13.474Z
Descriptive vs. specifiable values 2023-03-26T09:10:56.334Z
Shell games 2023-03-19T10:43:44.184Z
Are there cognitive realms? 2023-03-12T19:28:52.935Z
Do humans derive values from fictitious imputed coherence? 2023-03-05T15:23:04.065Z
Counting-down vs. counting-up coherence 2023-02-27T14:59:39.041Z
Does novel understanding imply novel agency / values? 2023-02-19T14:41:40.115Z
Please don't throw your mind away 2023-02-15T21:41:05.988Z
The conceptual Doppelgänger problem 2023-02-12T17:23:56.278Z
Control 2023-02-05T16:16:41.015Z
Structure, creativity, and novelty 2023-01-29T14:30:19.459Z
Gemini modeling 2023-01-22T14:28:20.671Z
Non-directed conceptual founding 2023-01-15T14:56:36.940Z
Dangers of deference 2023-01-08T14:36:33.454Z
The Thingness of Things 2023-01-01T22:19:08.026Z
[link] The Lion and the Worm 2022-05-16T20:40:22.659Z
Harms and possibilities of schooling 2022-02-22T07:48:09.542Z
Rituals and symbolism 2022-02-10T16:00:14.635Z
Index of some decision theory posts 2017-03-08T22:30:05.000Z
Open problem: thin logical priors 2017-01-11T20:00:08.000Z
Training Garrabrant inductors to predict counterfactuals 2016-10-27T02:41:49.000Z
Desiderata for decision theory 2016-10-27T02:10:48.000Z
Failures of throttling logical information 2016-02-24T22:05:51.000Z
Speculations on information under logical uncertainty 2016-02-24T21:58:57.000Z

Comments

Comment by TsviBT on How to Make Superbabies · 2025-02-21T07:06:02.669Z · LW · GW

No I mean like a person can't 10x their compute.

Comment by TsviBT on How to Make Superbabies · 2025-02-21T02:49:52.710Z · LW · GW

Can you comment your current thoughts on rare haplotypes?

Comment by TsviBT on How to Make Superbabies · 2025-02-20T23:53:47.026Z · LW · GW

I'd say this is true, in that human misalignments don't threaten the human species, or even billions of people, whereas AI does, so in that regard I admit human misalignment is less impactful than AGI misalignment.

Right, ok, agreed.

the plasticity/sensitivity of values goes way down when you are an adult, and changing values is much, much harder.

I agree qualitatively, but I do mean to say he's in charge of Germany, yet somehow has hours of free time every day to spend with the whisperer. If it's in childhood I would guess you could do it with a lot less contact, though not sure. TBC, the whisperer here would be considered a world-class, like, therapist or coach or something, so I'm not saying it's easy. My point is that I have a fair amount of trust in "human decision theory" working out pretty well in most cases in the long run with enough wisdom.

I even think something like this is worth trying with present-day AGI researchers (what I call "confrontation-worthy empathy"), though that is hard mode because you have so much less access.

Comment by TsviBT on How to Make Superbabies · 2025-02-20T23:13:32.956Z · LW · GW

lots of humans are not in fact aligned with each other,

Ok... so I think I understand and agree with you here. (Though plausibly we'd still have significant disagreement; e.g. I think it would be feasible to bring even Hitler back and firmly away from the death fever if he spent, IDK, a few years or something with a very skilled listener / psychic helper.)

The issue in this discourse, to me, is comparing this with AGI misalignment. It's conceptually related in some interesting ways, but in practical terms they're just extremely quantitatively different. And, naturally, I care about this specific non-comparability being clear because it bears on whether to do human intelligence enhancement; and in fact many people cite this as a reason to not do human IE.

Comment by TsviBT on How to Make Superbabies · 2025-02-20T22:58:48.868Z · LW · GW

I mostly just want people to pay attention to this problem.

Ok. To be clear, I strongly agree with this. I think I've been responding to a claim (maybe explicit, or maybe implicit / imagined by me) from you like: "There's this risk, and therefore we should not do this.". Where I want to disagree with the implication, not the antecedent. (I hope to more gracefully agree with things like this. Also someone should make a LW post with a really catchy term for this implication / antecedent discourse thing, or link me the one that's already been written.)

But I do strongly disagree with the conclusion "...we should not do this", to the point where I say "We should basically do this as fast as possible, within the bounds of safety and sanity.". The benefits are large, the risks look not that bad and largely ameliorable, and in particular the need regarding existential risk is great and urgent.

That said, more analysis is definitely needed. Though in defense of the pro-germline engineering position, there are few resources, and everyone has a different objection.

Comment by TsviBT on How to Make Superbabies · 2025-02-20T22:48:55.613Z · LW · GW

You shouldn't and won't be satisfied with this alone, as it doesn't deal with or even emphasize any particular peril; but to be clear, I have definitely thought about the perils: https://berkeleygenomics.org/articles/Potential_perils_of_germline_genomic_engineering.html

Comment by TsviBT on How to Make Superbabies · 2025-02-20T22:46:53.274Z · LW · GW

Also true, though maybe only for O(99%) of people.

Comment by TsviBT on How to Make Superbabies · 2025-02-20T22:43:02.564Z · LW · GW

Small-scale failures give us data about possible large-scale failures.

But you don't go from a 160 IQ person with a lot of disagreeability and ambition, who ends up being a big commercial player or whatnot, to 195 IQ and suddenly get someone who just sits in their room for a decade and then speaks gibberish into a youtube livestream and everyone dies, or whatever. The large-scale failures aren't feasible for humans acting alone. For humans acting very much not alone, like big AGI research companies, yeah that's clearly a big problem. But I don't think the problem is about any of the people you listed having too much brainpower.

(I feel we're somewhat talking past each other, but I appreciate the conversation and still want to get where you're coming from.)

Comment by TsviBT on How to Make Superbabies · 2025-02-20T22:37:44.396Z · LW · GW

I'm saying that (waves hands vigorously) 99% of people are beneficent or "neutral" (like, maybe not helpful / generous / proactively kind, but not actively harmful, even given the choice) in both intention and in action. That type of neutral already counts as in a totally different league of being aligned compared to AGI.

one human group is vastly unaligned to another human group

Ok, yes, conflict between large groups is something to be worried about, though I don't much see the connection with germline engineering. I thought we were talking about, like, some liberal/techie/weirdo people have some really really smart kids, and then those kids are somehow a threat to the future of humanity that's comparable to a fast unbounded recursive self-improvement AGI foom.

Comment by TsviBT on How to Make Superbabies · 2025-02-20T21:45:10.703Z · LW · GW

Tell that to all the other species that went extinct as a result of our activity on this planet?

Individual humans.

Brent Dill, Ziz, Sam Bankman-Fried, etc.

  1. These are incredibly small peanuts compared to AGI omnicide.
  2. You're somehow leaving out all the people who are smarter than those people, and who were great for the people around them and humanity? You've got like 99% actual alignment or something, and you're like "But there's some chance it'll go somewhat bad!"... Which, yes, we should think about this, and prepare and plan and prevent, but it's just a totally totally different calculus from AGI.

Comment by TsviBT on How to Make Superbabies · 2025-02-20T21:19:42.712Z · LW · GW

It's utterly different.

  • Humans are very far from fooming.
    • Fixed skull size; no in silico simulator.
    • Highly dependent on childhood care.
    • Highly dependent on culturally transmitted info, including in-person.
  • Humans, genomically engineered or not, come with all the stuff that makes humans human. Fear, love, care, empathy, guilt, language, etc. (Removing any human universals should be banned, though; defining that seems tricky.) So new humans are close to us in values-space, and come with the sort of corrigibility that humans have, which is, you know, not a guarantee of safety, but still some degree of (okay I'm going to say something that will trigger your buzzword detector but I think it's a fairly precise description of something clearly real) radical openness to co-creating shared values.

Comment by TsviBT on How to Make Superbabies · 2025-02-20T20:05:05.748Z · LW · GW

This is a big ethical issue. Also, I haven't checked, but I'd guess that generally to have much of a noticeable effect, you're stepping somewhat to the edge of / outside of the natural range, which carries risks. Separately, this might not even be good on purely instrumental grounds; altriciality is quite plausibly really important for intelligence!

Comment by TsviBT on How to Make Superbabies · 2025-02-20T19:33:15.978Z · LW · GW

And then suddenly it's different for personality? Kinda weird.

Comment by TsviBT on How to Make Superbabies · 2025-02-20T18:41:27.872Z · LW · GW

~80% of variance is additive.

Is this what you meant to say? Citation?

Comment by TsviBT on How to Make Superbabies · 2025-02-20T18:17:09.445Z · LW · GW

This is a good argument for not going outside the human envelope in one shot. But if you're firmly within the realm where natural human genomes are, we have 8 billion natural experiments running around, some of which are sibling RCTs.

Comment by TsviBT on How to Make Superbabies · 2025-02-20T18:08:31.570Z · LW · GW

After I finish my methods article, I want to lay out a basic picture of genomic emancipation. Genomic emancipation means making genomic liberty a right and a practical option. In my vision, genomic liberty is quite broad: it would include for example that parents should be permitted and enabled to choose:

  • to enhance their children (e.g. supra-normal health; IQ at the outer edges of the human envelope); and/or
  • to propagate their own state even if others would object (e.g. blind people can choose to have blind children); and/or
  • to make their children more normal even if there's no clear justification through beneficence (I would go so far as to say that, for example, parents can choose to make their kid have a lower IQ than a random embryo from the parents would be in expectation, if that brings the kid closer to what's normal).

These principles are narrower than general genomic liberty ("parents can do whatever they please"), and I think they have stronger justifications. I want to make these narrower "tentpole" principles inside of the genomic liberty tent, because the wider principle isn't really tenable, in part for the reasons you bring up. There are genomic choices that should be restricted--perhaps by law, or by professional ethics for clinicians, or by avoiding making it technically feasible, or by social stigma. (The implementation seems quite tricky; any compromise of full genomic liberty does come with costs as well as preventing costs. And at least to some small extent, it erodes the force of genomic liberty's contraposition to eugenics, which seeks to impose population-wide forces on individuals' procreative choices.)

Examples:

  • As you say, if there's a very high risk of truly egregious behavior, that should be pushed against somehow.
    • Example: People should not make someone who is 170 Disagreeable Quotient and 140 Unconscientiousness Quotient, because that is most of the way to being a violent psychopath.
    • Counterexample: People should, given good information, be able to choose to have a kid who is 130 Disagreeable Quotient and 115 Unconscientiousness Quotient, because, although there might be associated difficulties, that's IIUC a personality profile enriched with creative genius.
  • People should not be allowed to create children with traits specifically designed to make the children suffer. (Imagine for instance a parent who thinks that suffering, in itself, builds character or makes you productive or something.)
  • Case I'm unsure about, needs more investigation: Autism plus IQ might be associated with increased suicidal ideation (https://www.sciencedirect.com/science/article/abs/pii/S1074742722001228). Not sure what the implication should be.

Another thing to point out is that to a significant degree, in the longer term, many of these things should self-correct, through the voice of the children (e.g. if a deaf kid grows up and starts saying "hey, listen, I love my parents and I know they wanted what was best for me, but I really don't like that I didn't get to hear music and my love's voice until I got my brain implant, please don't do the same for your kid"), and through seeing the results in general. If someone is destructively ruthless, it's society's job to punish them, and it's parents' job to say "ah, that is actually not good".

Comment by TsviBT on Martin Randall's Shortform · 2025-02-19T03:15:25.620Z · LW · GW

Good point... Still unsure, I suspect it would still tilt people toward not having the missing mood about AGI x-risk.

Comment by TsviBT on Martin Randall's Shortform · 2025-02-17T22:40:41.984Z · LW · GW

While the object level calculation is central of course, I'd want to note that there's a symbolic value to cryonics. (Symbolic action is tricky, and I agree with not straightforwardly taking symbolic action for the sake of the symbolism, but anyway.) If we (broadly) were more committed to Life then maybe some preconditions for AGI researchers racing to destroy the world would be removed.

Comment by TsviBT on Matthias Dellago's Shortform · 2025-02-14T04:03:48.406Z · LW · GW

Very relevant: https://web.archive.org/web/20090608111223/http://www.paul-almond.com/WhatIsALowLevelLanguage.htm

Comment by TsviBT on ≤10-year Timelines Remain Unlikely Despite DeepSeek and o3 · 2025-02-14T00:09:34.484Z · LW · GW

Ok, I think I see what you're saying. To check part of my understanding: when you say "AI R&D is fully automated", I think you mean something like:

Most major AI companies have fired almost all of their SWEs. They still have staff to physically build datacenters, do business, etc.; and they have a few overseers / coordinators / strategizers of the fleet of AI R&D research gippities; but the overseers are acknowledged to basically not be doing much, and not clearly be even helping; and the overall output of the research group is "as good or better" than in 2025--measured... somehow.

Comment by TsviBT on ≤10-year Timelines Remain Unlikely Despite DeepSeek and o3 · 2025-02-13T23:58:35.322Z · LW · GW

Ok. So I take it you're very impressed with the difficulty of the research that is going on in AI R&D.

we can agree that once the AIs are automating whole companies stuff

(FWIW I don't agree with that; I don't know what companies are up to, some of them might not be doing much difficult stuff and/or the managers might not be able to or care to tell the difference.)

Comment by TsviBT on ≤10-year Timelines Remain Unlikely Despite DeepSeek and o3 · 2025-02-13T23:46:58.127Z · LW · GW

Thanks... but wait, this is among the most impressive things you expect to see? (You know more than I do about that distribution of tasks, so you could justifiably find it more impressive than I do.)

Comment by TsviBT on ≤10-year Timelines Remain Unlikely Despite DeepSeek and o3 · 2025-02-13T21:29:32.087Z · LW · GW

What are some of the most impressive things you do expect to see AI do, such that if you didn't see them within 3 or 5 years, you'd majorly update about time to the type of AGI that might kill everyone?

Comment by TsviBT on Nick Land: Orthogonality · 2025-02-11T01:47:31.194Z · LW · GW

would you think it wise to have TsviBT¹⁹⁹⁹ align contemporary Tsvi based on his values? How about vice versa?

It would be mostly wise either way, yeah, but that's relying on both directions being humble / anapartistic.

Comment by TsviBT on Nick Land: Orthogonality · 2025-02-11T01:31:27.486Z · LW · GW

do you think stable meta-values are to be observed between australopithecines and say contemporary western humans?

on the other hand: do values across primitive tribes or early agricultural empires not look surprisingly similar?

I'm not sure I understand the question, or rather, I don't know how I could know this. Values are supposed to be things that live in an infinite game / Nomic context. You'd have to have these people get relatively more leisure before you'd see much of their values.

Comment by TsviBT on Nick Land: Orthogonality · 2025-02-11T01:12:10.758Z · LW · GW

I mean, I don't know how it works in full, that's a lofty and complex question. One reason to think it's possible is that there's a really big difference between the kind of variation and selection we do in our heads with ideas and the kind evolution does with organisms. (Our ideas die so we don't have to and so forth.) I do feel like some thoughts change some aspects of some of my values, but these are generally "endorsed by more abstract but more stable meta-values", and I also feel like I can learn e.g. most new math without changing any values. Where "values" is, if nothing else, cashed out as "what happens to the universe in the long run due to my agency" or something (it's more confusing when there's peer agents). Mateusz's point is still relevant; there's just lots of different ways the universe can go, and you can choose among them.

Comment by TsviBT on TsviBT's Shortform · 2025-02-10T09:29:27.635Z · LW · GW

I quite dislike earplugs. Partly it's the discomfort, which maybe those can help with; but partly I just don't like being closed away from hearing what's around me. But maybe I'll try those, thanks (even though the last 5 earplugs were just uncomfortable contra promises).

Yeah, I mean I think the music thing is mainly nondistraction. The quiet of night is great for thinking, which doesn't help the sleep situation.

Comment by TsviBT on TsviBT's Shortform · 2025-02-09T09:09:51.585Z · LW · GW

Yep! Without cybernetic control (I mean, melatonin), I have a non-24-hour schedule, and I believe this contributes >10% of that.

Comment by TsviBT on TsviBT's Shortform · 2025-02-08T13:28:56.543Z · LW · GW

(1) was my guess. Another guess is that there's a magazine "GQ".

Comment by TsviBT on TsviBT's Shortform · 2025-02-08T13:00:08.564Z · LW · GW

Ohhh. Thanks. I wonder why I did that.

Comment by TsviBT on TsviBT's Shortform · 2025-02-08T10:26:47.704Z · LW · GW

No yeah that's my experience too, to some extent. But I would say that I can do good mathematical thinking there, including correctly truth-testing; just less good at algebra, and as you say less good at picking up an unfamiliar math concept.

Comment by TsviBT on TsviBT's Shortform · 2025-02-08T09:48:22.264Z · LW · GW

(These are 100% unscientific, just uncritical subjective impressions for fun. CQ = cognitive capacity quotient, like generally good at thinky stuff)

  • Overeat a bit, like 10% more than is satisfying: -4 CQ points for a couple hours.
  • Overeat a lot, like >80% more than is satisfying: -9 CQ points for 20 hours.
  • Sleep deprived a little, like stay up really late but without sleep debt: +5 CQ points.
  • Sleep debt, like a couple days not enough sleep: -11 CQ points.
  • Major sleep debt, like several days not enough sleep: -20 CQ points.
  • Oversleep a lot, like 11 hours: +6 CQ points.
  • Ice cream (without having eaten ice cream in the past week): +5 CQ points.
  • Being outside: +4 CQ points.
  • Being in a car: -8 CQ points.
  • Walking in the hills: +9 CQ points.
  • Walking specifically up a steep hill: -5 CQ points.
  • Too much podcasts: -8 CQ points for an hour.
  • Background music: -6 to -2 CQ points.
  • Kinda too hot: -3 CQ points.
  • Kinda too cold: +2 CQ points.

(stimulants not listed because they tend to pull the features of CQ apart; less good at real thinking, more good at relatively rote thinking and doing stuff)

Comment by TsviBT on So You Want To Make Marginal Progress... · 2025-02-08T07:24:52.843Z · LW · GW

When they're nearing the hotel, Alice gets the car's attention. And she's like, "Listen guys, I have been lying to you. My real name is Mindy. Mindy the Middlechainer.".

Comment by TsviBT on ozziegooen's Shortform · 2025-02-05T19:31:55.811Z · LW · GW

I'm saying that just because we know algorithms that will successfully leverage data and compute to set off an intelligence explosion (...ok I just realized you wrote TAI but IDK what anyone means by anything other than actual AGI), doesn't mean we know much about how they leverage it and how that influences the explody-guy's long-term goals.

Comment by TsviBT on ozziegooen's Shortform · 2025-02-05T18:50:43.476Z · LW · GW

I assume that at [year(TAI) - 3] we'll have a decent idea of what's needed

Why?? What happened to the bitter lesson?

Comment by TsviBT on evhub's Shortform · 2025-02-05T13:45:47.797Z · LW · GW

Isn't this what the "coherent" part is about? (I forget.)

Comment by TsviBT on Nick Land: Orthogonality · 2025-02-05T09:09:29.362Z · LW · GW

A start of one critique is:

It simply means Darwinian processes have no limits that matter to us.

Not true! Roughly speaking, we can in principle just decide to not do that. A body can in principle have an immune system that doesn't lose to infection; there could in principle be a world government that picks the lightcone's destiny. The arguments about novel understanding implying novel values might be partly right, but they don't really cut against Mateusz's point.

Comment by TsviBT on Nick Land: Orthogonality · 2025-02-05T07:55:22.786Z · LW · GW

Reason to care about engaging /acc:

https://www.lesswrong.com/posts/HE3Styo9vpk7m8zi4/evhub-s-shortform?commentId=kDjrYXCXgNvjbJfaa

I've recently been thinking that it's a mistake to think of this type of thing--"what to do after the acute risk period is safed"--as being a waste of time / irrelevant; it's actually pretty important, specifically because you want people trying to advance AGI capabilities to have an alternative, actually-good vision of things. A hypothesis I have is that many of them are in a sense genuinely nihilistic/accelerationist; "we can't imagine the world after AGI, so we can't imagine it being good, so it cannot be good, so there is no such thing as a good future, so we cannot be attached to a good future, so we should accelerate because that's just what is happening".

Comment by TsviBT on Nick Land: Orthogonality · 2025-02-05T07:54:25.830Z · LW · GW

I strong upvoted, not because it's an especially helpful post IMO, but because I think /acc needs better critique, so there should be more communication. I suspect the downvotes are more about the ideological misalignment than the quality.

Given the quality of the post, I think it would not be remotely rude to respond with a comment like "These are well-trodden topics; you should read X and Y and Z if you want to participate in a serious discussion about this.". But no one wrote that comment, and what would X, Y, Z be?? One could probably correct some misunderstandings in the post this way just by linking to the LW wiki on Orthogonality or whatever, but I personally wouldn't know what to link to, to actually counter the actual point.

Comment by TsviBT on evhub's Shortform · 2025-02-05T06:59:35.470Z · LW · GW

(Interesting. FWIW I've recently been thinking that it's a mistake to think of this type of thing--"what to do after the acute risk period is safed"--as being a waste of time / irrelevant; it's actually pretty important, specifically because you want people trying to advance AGI capabilities to have an alternative, actually-good vision of things. A hypothesis I have is that many of them are in a sense genuinely nihilistic/accelerationist; "we can't imagine the world after AGI, so we can't imagine it being good, so it cannot be good, so there is no such thing as a good future, so we cannot be attached to a good future, so we should accelerate because that's just what is happening".)

Comment by TsviBT on Do you have High-Functioning Asperger's Syndrome? · 2025-02-05T05:31:35.593Z · LW · GW

It's not clear why anyone would want to claim a self-diagnosis of that, since little about it is 'egosyntonic', as the psychiatrists say.

Since a friend mentioned I might be schizoid, I've been like "...yeah? somewhat? maybe? seems mixed? aren't I just avoidant? but I feel more worried about relating than about being rejected?", though I'm not very motivated to learn a bunch about it. So IDK. But anyway, re/ egosyntonicity:

  • Compared to avoidant, schizoid seems vaguely similar, but less pathetic; less needy or cowardly.
  • Schizoid has some benefits of "disagreeability, but not as much of an asshole". Thinking for yourself, not being taken in by common modes of being.
  • Schizoid is maybe kinda like "I have a really really high bar for who I want to relate to", which is kinda high-status.

Comment by TsviBT on Thread for Sense-Making on Recent Murders and How to Sanely Respond · 2025-02-04T21:11:56.272Z · LW · GW

Just FYI none of what you said responds to anything I said, AFAICT. Are you just arguing "Ziz is bad"? My comment is about what causes people to end up the way Ziz ended up, which is relevant to your question "Is there a lesson to learn?".

By the way, is there an explanation somewhere what actually happened? (Not just what Ziz believed.)

Somewhere on this timeline I think https://x.com/jessi_cata/with_replies

Comment by TsviBT on Viliam's Shortform · 2025-02-04T14:06:04.473Z · LW · GW

The rest of the team should update about listening to her concerns.

I believe (though my memory in general is very very fuzzy) that I was the one who most pushed for Ziz and Gwen to be at that workshop. (I don't recall talking with Anna about that at all before the workshop, and don't see anything about that in my emails on a quick look, though I could very easily have forgotten. I do recall her saying at the workshop that she thought Ziz shouldn't be there.) I did indeed update later (also based on other things) quite strongly that I really truly cannot detect various kinds of pathology, and therefore have to be much more circumspect, deferent to others about this, not making lots of decisions like this, etc. (I do think there was good reason to have Ziz at the workshop though, contra others; Ziz was indeed smart and an interesting original insightful thinker, which is in short supply. I'm also unclear on what evidence Anna had at the time, and the extent to which the community's behavior with Ziz was escalatorily causing Ziz's eventual fate.)

Anna seems like a good judge of character,

Uh, for the record, I would definitely take it very seriously, probably as pretty strong or very strong evidence, if Anna says that someone is harmful / evil / etc.; but I would not take it as strong evidence of someone not being so, if she endorses them.

Comment by TsviBT on Thread for Sense-Making on Recent Murders and How to Sanely Respond · 2025-02-04T13:46:37.402Z · LW · GW

There's a lot more complexity, obviously, but one thing that sticks out to me is this paragraph, from https://sinceriously.blog-mirror.com/net-negative/ :

I described how I felt like I was the only one with my values in a world of flesh eating monsters, how it was horrifying seeing the amoral bullet biting consistency of the rationality community, where people said it was okay to eat human babies as long as they weren’t someone else’s property if I compared animals to babies. How I was constantly afraid that their values would leak into me and my resolve would weaken and no one would be judging futures according to sentient beings in general. How it was scary Eliezer Yudkowsky seemed to use “sentient” to mean “sapient”. How I was constantly afraid if I let my brain categorize them as my “in-group” then I’d lose my values.

This is one among several top hypothesis-parts for something at the core of how Ziz, and by influence other Zizians, gets so far gone from normal structures of relating. It is indeed true that normal people (I mean, including the vast majority of rationalists) live deep in an ocean of {algorithm, stance, world, god}-sharing with people around them. And it's true that this can infect you in various ways, erode structures in you, erode values. So you can see how someone might think it's a good idea to become oppositional to many normal structures of relating; and how that can be in a reinforcing feedback loop with other people's reactions to your oppositionality and rejection.

(As an example of another top hypothesis-part: I suspect Ziz views betrayal of values (the blackmail payout thing) and betrayal of trust (the behavior of Person A) sort of as described here: https://sideways-view.com/2016/11/14/integrity-for-consequentialists/ In other words, if someone's behavior is almost entirely good, but then in some subtle sneaky ways or high-stakes ways bad, that's an extreme mark against them. Many would agree with the high-stakes part of (my imagined) Ziz's stance here, but many fewer would agree so strongly with the subtle sneaky part.)

If that's a big part of what was going on, it poses a general question (which is partly a question of community behavior and partly a question of mental technology for individuals): How to make it more feasible to get the goods of being in a community, without the bads of value erosion?

Comment by TsviBT on Views on when AGI comes and on strategy to reduce existential risk · 2025-02-03T08:00:12.462Z · LW · GW

really smart people

Differences between people are less directly revelatory of what's important in human intelligence. My guess is that all or very nearly all human children have all or nearly all the intelligence juice. We just, like, don't appreciate how much a child is doing in constructing zer world.

the current models have basically all the tools a moderately smart human have, with regards to generating novel ideas

Why on Earth do you think this? (I feel like I'm in an Asch Conformity test, but with really really high production value. Like, after the experiment, they don't tell you what the test was about. They let you take the card home. On the walk home you ask people on the street, and they all say the short line is long. When you get home, you ask your housemates, and they all agree, the short line is long.)

I don't see what's missing that a ton of training on a ton of diverse, multimodal tasks + scaffolding + data flywheel isn't going to figure out.

My response is in the post.

Comment by TsviBT on Views on when AGI comes and on strategy to reduce existential risk · 2025-02-03T02:40:49.017Z · LW · GW

I'm curious if you have a sense from talking to people.

More recently I've mostly disengaged (except for making kinda-shrill LW comments). Some people say that "concepts" aren't a thing, or similar. E.g. by recentering on performable tasks, by pointing to benchmarks going up and saying that the coarser category of "all benchmarks" or similar is good enough for predictions. (See e.g. Kokotajlo's comment here https://www.lesswrong.com/posts/oC4wv4nTrs2yrP5hz/what-are-the-strongest-arguments-for-very-short-timelines?commentId=QxD5DbH6fab9dpSrg, though his actual position is of course more complex and nuanced.) Some people say that the training process is already concept-gain-complete. Some people say that future research, such as "curiosity" in RL, will solve it. Some people say that the "convex hull" of existing concepts is already enough to set off FURSI (fast unbounded recursive self-improvement).

(though I feel confused about how to update on the conjunction of those, and the things LLMs are good at — all the ways they don't behave like a person who doesn't understand X, either, for many X.)

True; I think I've heard various people discussing how to more precisely think about the class of LLM capabilities, but maybe there should be more.

if that's less sample-efficient than what humans are doing, it's not apparent to me that it can't still accomplish the same things humans do, with a feasible amount of brute force

It's often awkward discussing these things, because there's sort of a "seeing double" that happens. In this case, the "double" is:

"AI can't FURSI because it has poor sample efficiency...

  1. ...and therefore it would take k orders of magnitude more data / compute than a human to do AI research."
  2. ...and therefore more generally we've not actually gotten that much evidence that the AI has the algorithms which would have caused both good sample efficiency and also the ability to create novel insights / skills / etc."

The same goes mutatis mutandis for "can make novel concepts".

I'm more saying 2. rather than 1. (Of course, this would be a very silly thing for me to say if we observed the gippities creating lots of genuine novel useful insights, but with low sample complexity (whatever that should mean here). But I would legit be very surprised if we soon saw a thing that had been trained on 1000x less human data, and performs at modern levels on language tasks (allowing it to have breadth of knowledge that can be comfortably fit in the training set).)

can't still accomplish the same things humans do

Well, I would not be surprised if it can accomplish a lot of the things. It already can of course. I would be surprised if there weren't some millions of jobs lost in the next 10 years from AI (broadly, including manufacturing, driving, etc.). In general, there's a spectrum/space of contexts / tasks, where on the one hand you have tasks that are short, clear-feedback, and common / stereotyped, and not that hard; on the other hand you have tasks that are long, unclear-feedback, uncommon / heterogeneous, and hard. The way humans do things is that we practice the short ones in some pattern to build up for the variety of long ones. I expect there to be a frontier of AIs crawling from short to long ones. I think at any given time, pumping in a bunch of brute force can expand your frontier a little bit, but not much, and it doesn't help that much with more permanently ratcheting out the frontier.

AI that's narrowly superhuman on some range of math & software tasks can accelerate research

As you're familiar with, if you have a computer program that has 3 resource bottlenecks A (50%), B (25%), and C (25%), and you optimize the fuck out of A down to ~1%, you ~double your overall efficiency; but then if you optimize the fuck out of A again down to .1%, you've basically done nothing. The question to me isn't "does AI help a significant amount with some aspects of AI research", but rather "does AI help a significant and unboundedly growing amount with all aspects of AI research, including the long-type tasks such as coming up with really new ideas".
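
(A minimal sketch of that arithmetic, essentially Amdahl's law; the 50/25/25 split and the speedup factors are just the made-up numbers from the paragraph above:)

```python
def total_time(fractions, speedups):
    """Remaining runtime after speeding up each component of a workload.

    fractions: each component's share of the original runtime (sums to 1.0)
    speedups:  factor by which each component is accelerated
    """
    return sum(f / s for f, s in zip(fractions, speedups))

baseline = [0.50, 0.25, 0.25]  # A, B, C as shares of total runtime

# Optimize A from 50% down to ~1% of the total (a ~50x speedup): ~2x overall.
print(1 / total_time(baseline, [50, 1, 1]))   # ~1.96

# Optimize A again, down to ~0.1% (a 500x speedup): barely any further gain.
print(1 / total_time(baseline, [500, 1, 1]))  # ~1.996
```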

AI is transformative enough to motivate a whole lot of sustained attention on overcoming its remaining limitations

This certainly makes me worried in general, and it's part of why my timelines aren't even longer; I unfortunately don't expect a large "naturally-occurring" AI winter.

seems bizarre if whatever conceptual progress is required takes multiple decades

Unfortunately I haven't addressed your main point well yet... Quick comments:

  • Strong minds are the most structurally rich things ever. That doesn't mean they have high algorithmic complexity; obviously brains are less algorithmically complex than entire organisms, and the relevant aspects of brains are presumably considerably simpler than actual brains. But still, IDK, it just seems weird to me to expect to make such an object "by default" or something? Craig Venter made a quasi-synthetic lifeform--but how long would it take us to make a minimum viable unbounded invasive organic replicator actually from scratch, like without copying DNA sequences from existing lifeforms?
  • I think my timelines would have been considered normalish among X-risk people 15 years ago? And would have been considered shockingly short by most AI people.
  • I think most of the difference is in how we're updating, rather than on priors? IDK.

Comment by TsviBT on Views on when AGI comes and on strategy to reduce existential risk · 2025-02-03T01:53:40.935Z · LW · GW

It's a good question. Looking back at my example, now I'm just like "this is a very underspecified/confused example". This deserves a better discussion, but IDK if I want to do that right now. In short the answer to your question is

  • I at least would not be very surprised if gippity-seek-o5-noAngular could do what I think you're describing.
  • That's not really what I had in mind, but I had in mind something less clear than I thought. The spirit is about "can the AI come up with novel concepts", but the issue here is that "novel concepts" are big things, and their material and functioning and history are big and smeared out.

I started writing out a bunch of thoughts, but they felt quite inadequate because I knew nothing about the history of the concept of angular momentum; so I googled around a tiny little bit. The situation seems quite awkward for the angular momentum lesion experiment. What did I "mean to mean" by "scrubbed all mention of stuff related to angular momentum"--presumably this would have to include deleting all subsequent ideas that use angular momentum in their definitions, but e.g. did I also mean to delete the notion of cross product?

It seems like angular momentum was worked on in great detail well before the cross product was developed at all explicitly. See https://arxiv.org/pdf/1511.07748 and https://en.wikipedia.org/wiki/Cross_product#History. Should I still expect gippity-seek-o5-noAngular to notice the idea if it doesn't have the cross product available? Even if not, what does and doesn't this imply about this decade's AI's ability to come up with novel concepts?

(I'm going to mull on why I would have even said my previous comment above, given that on reflection I believe that "most" concepts are big and multifarious and smeared out in intellectual history. For some more examples of smearedness, see the subsection here: https://tsvibt.blogspot.com/2023/03/explicitness.html#the-axiom-of-choice)

Comment by TsviBT on The Failed Strategy of Artificial Intelligence Doomers · 2025-02-02T18:25:15.662Z · LW · GW

I can't tell what you mean by much of this (e.g. idk what you mean by "pretty simple heuristics" or "science + engineering SI" or "self-play-ish regime"). (Not especially asking you to elaborate.) Most of my thoughts are here, including the comments:

https://www.lesswrong.com/posts/sTDfraZab47KiRMmT/views-on-when-agi-comes-and-on-strategy-to-reduce

I would take a bet with you about what we expect to see in the next 5 years.

Not really into formal betting, but what are a couple Pareto[impressive, you're confident we'll see within 5 years] things?

But more than that, what kind of epistemology do you think I should be doing that I'm not?

Come on, you know. Actually doubt, and then think it through.

I mean, I don't know. Maybe you really did truly doubt a bunch. Maybe you could argue me from 5% omnicide in next ten years to 50%. Go ahead. I'm speaking from informed priors and impressions.

Comment by TsviBT on The Failed Strategy of Artificial Intelligence Doomers · 2025-02-02T18:10:55.193Z · LW · GW

Jessica I'm less sure about. Sam, from large quantities of insights in many conversations. If you want something more legible, I'm what, >300 ELO points better than you at math; Sam's >150 ELO points better than me at math if I'm trained up, now probably more like >250 or something.

Not by David's standard though, lol.
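
(Purely for a sense of scale, and taking the "ELO points" framing as a loose analogy rather than a real rating system: under the standard Elo expected-score formula, gaps of those sizes would translate roughly as follows.)

```python
def elo_expected_score(rating_gap):
    """Expected score for the higher-rated player under the standard Elo model."""
    return 1 / (1 + 10 ** (-rating_gap / 400))

for gap in (150, 250, 300):
    print(gap, round(elo_expected_score(gap), 2))
# 150 -> 0.7, 250 -> 0.81, 300 -> 0.85
```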

Comment by TsviBT on The Failed Strategy of Artificial Intelligence Doomers · 2025-02-02T12:08:58.091Z · LW · GW

Broadly true, I think.

almost any X that is not trivially verifiable

I'd probably quibble a lot with this.

E.g. there are many activities that many people engage in frequently--eating, walking around, reading, etc etc. Knowledge and skill related to those activities is usually not vibes-based, or only half vibes-based, or something, even if not trivially verifiable. For example, after a few times accidentally growing mold on some wet clothes or under a sink, very many people learn not to leave areas wet.

E.g. anyone who studies math seriously must learn to verify many very non-trivial things themselves. (There will also be many things they will believe partly based on vibes.)

I don't think AI timelines are an unusual topic in that regard.

In that regard, technically, yes, but it's not very comparable. It's unusual in that it's a crucial question that affects very many people's decisions. (IIRC, EVERY SINGLE ONE of the >5 EA / LW / X-derisking adjacent funder people that I've talked to about human intelligence enhancement says "eh, doesn't matter, timelines short".) And it's in an especially uncertain field, where consensus should much less strongly be expected to be correct. And it's subject to especially strong deference and hype dynamics and disinformation. For comparison, you can probably easily find entire communities in which the vibe is very strongly "COVID came from the wet market" and others where it's very strongly "COVID came from the lab". You can also find communities that say "AGI a century away". There are some questions where the consensus is right for the right reasons and it's reasonable to trust the consensus on some class of beliefs. But vibes-based reasoning is just not robust, and nearly all the resources supposedly aimed at X-derisking in general are captured by a largely vibes-based consensus.