Posts

Overview of strong human intelligence amplification methods 2024-10-08T08:37:18.896Z
TsviBT's Shortform 2024-06-16T23:22:54.134Z
Koan: divining alien datastructures from RAM activations 2024-04-05T18:04:57.280Z
What could a policy banning AGI look like? 2024-03-13T14:19:07.783Z
A hermeneutic net for agency 2024-01-01T08:06:30.289Z
What is wisdom? 2023-11-14T02:13:49.681Z
Human wanting 2023-10-24T01:05:39.374Z
Hints about where values come from 2023-10-18T00:07:58.051Z
Time is homogeneous sequentially-composable determination 2023-10-08T14:58:15.913Z
Telopheme, telophore, and telotect 2023-09-17T16:24:03.365Z
Sum-threshold attacks 2023-09-08T17:13:37.044Z
Fundamental question: What determines a mind's effects? 2023-09-03T17:15:41.814Z
Views on when AGI comes and on strategy to reduce existential risk 2023-07-08T09:00:19.735Z
The fraught voyage of aligned novelty 2023-06-26T19:10:42.195Z
Provisionality 2023-06-19T11:49:06.680Z
Explicitness 2023-06-12T15:05:04.962Z
Wildfire of strategicness 2023-06-05T13:59:17.316Z
The possible shared Craft of deliberate Lexicogenesis 2023-05-20T05:56:41.829Z
A strong mind continues its trajectory of creativity 2023-05-14T17:24:00.337Z
Better debates 2023-05-10T19:34:29.148Z
An anthropomorphic AI dilemma 2023-05-07T12:44:48.449Z
The voyage of novelty 2023-04-30T12:52:16.817Z
Endo-, Dia-, Para-, and Ecto-systemic novelty 2023-04-23T12:25:12.782Z
Possibilizing vs. actualizing 2023-04-16T15:55:40.330Z
Expanding the domain of discourse reveals structure already there but hidden 2023-04-09T13:36:28.566Z
Ultimate ends may be easily hidable behind convergent subgoals 2023-04-02T14:51:23.245Z
New Alignment Research Agenda: Massive Multiplayer Organism Oversight 2023-04-01T08:02:13.474Z
Descriptive vs. specifiable values 2023-03-26T09:10:56.334Z
Shell games 2023-03-19T10:43:44.184Z
Are there cognitive realms? 2023-03-12T19:28:52.935Z
Do humans derive values from fictitious imputed coherence? 2023-03-05T15:23:04.065Z
Counting-down vs. counting-up coherence 2023-02-27T14:59:39.041Z
Does novel understanding imply novel agency / values? 2023-02-19T14:41:40.115Z
Please don't throw your mind away 2023-02-15T21:41:05.988Z
The conceptual Doppelgänger problem 2023-02-12T17:23:56.278Z
Control 2023-02-05T16:16:41.015Z
Structure, creativity, and novelty 2023-01-29T14:30:19.459Z
Gemini modeling 2023-01-22T14:28:20.671Z
Non-directed conceptual founding 2023-01-15T14:56:36.940Z
Dangers of deference 2023-01-08T14:36:33.454Z
The Thingness of Things 2023-01-01T22:19:08.026Z
[link] The Lion and the Worm 2022-05-16T20:40:22.659Z
Harms and possibilities of schooling 2022-02-22T07:48:09.542Z
Rituals and symbolism 2022-02-10T16:00:14.635Z
Index of some decision theory posts 2017-03-08T22:30:05.000Z
Open problem: thin logical priors 2017-01-11T20:00:08.000Z
Training Garrabrant inductors to predict counterfactuals 2016-10-27T02:41:49.000Z
Desiderata for decision theory 2016-10-27T02:10:48.000Z
Failures of throttling logical information 2016-02-24T22:05:51.000Z
Speculations on information under logical uncertainty 2016-02-24T21:58:57.000Z

Comments

Comment by TsviBT on Matthias Dellago's Shortform · 2025-02-14T04:03:48.406Z · LW · GW

Very relevant: https://web.archive.org/web/20090608111223/http://www.paul-almond.com/WhatIsALowLevelLanguage.htm

Comment by TsviBT on ≤10-year Timelines Remain Unlikely Despite DeepSeek and o3 · 2025-02-14T00:09:34.484Z · LW · GW

Ok, I think I see what you're saying. To check part of my understanding: when you say "AI R&D is fully automated", I think you mean something like:

Most major AI companies have fired almost all of their SWEs. They still have staff to physically build datacenters, do business, etc.; and they have a few overseers / coordinators / strategizers of the fleet of AI R&D research gippities; but the overseers are acknowledged to basically not be doing much, and not clearly be even helping; and the overall output of the research group is "as good or better" than in 2025--measured... somehow.

Comment by TsviBT on ≤10-year Timelines Remain Unlikely Despite DeepSeek and o3 · 2025-02-13T23:58:35.322Z · LW · GW

Ok. So I take it you're very impressed with the difficulty of the research that is going on in AI R&D.

we can agree that once the AIs are automating whole companies stuff

(FWIW I don't agree with that; I don't know what companies are up to, some of them might not be doing much difficult stuff and/or the managers might not be able to or care to tell the difference.)

Comment by TsviBT on ≤10-year Timelines Remain Unlikely Despite DeepSeek and o3 · 2025-02-13T23:46:58.127Z · LW · GW

Thanks... but wait, this is among the most impressive things you expect to see? (You know more than I do about that distribution of tasks, so you could justifiably find it more impressive than I do.)

Comment by TsviBT on ≤10-year Timelines Remain Unlikely Despite DeepSeek and o3 · 2025-02-13T21:29:32.087Z · LW · GW

What are some of the most impressive things you do expect to see AI do, such that if you didn't see them within 3 or 5 years, you'd majorly update about time to the type of AGI that might kill everyone?

Comment by TsviBT on Nick Land: Orthogonality · 2025-02-11T01:47:31.194Z · LW · GW

would you think it wise to have TsviBT¹⁹⁹⁹ align contemporary Tsvi based on his values? How about vice versa?

It would be mostly wise either way, yeah, but that's relying on both directions being humble / anapartistic.

Comment by TsviBT on Nick Land: Orthogonality · 2025-02-11T01:31:27.486Z · LW · GW

do you think stable meta-values are to be observed between australopiteci and say contemporary western humans?

on the other hand: do values across primitive tribes or early agricultural empires not look surprisingly similar?

I'm not sure I understand the question, or rather, I don't know how I could know this. Values are supposed to be things that live in an infinite game / Nomic context. You'd have to have these people get relatively more leisure before you'd see much of their values.

Comment by TsviBT on Nick Land: Orthogonality · 2025-02-11T01:12:10.758Z · LW · GW

I mean, I don't know how it works in full, that's a lofty and complex question. One reason to think it's possible is that there's a really big difference between the kind of variation and selection we do in our heads with ideas and the kind evolution does with organisms. (Our ideas die so we don't have to and so forth.) I do feel like some thoughts change some aspects of some of my values, but these are generally "endorsed by more abstract but more stable meta-values", and I also feel like I can learn e.g. most new math without changing any values. Where "values" is, if nothing else, cashed out as "what happens to the universe in the long run due to my agency" or something (it's more confusing when there's peer agents). Mateusz's point is still relevant; there's just lots of different ways the universe can go, and you can choose among them.

Comment by TsviBT on TsviBT's Shortform · 2025-02-10T09:29:27.635Z · LW · GW

I quite dislike earplugs. Partly it's the discomfort, which maybe those can help with; but partly I just don't like being closed away from hearing what's around me. But maybe I'll try those, thanks (even though the last 5 earplugs were just uncomfortable contra promises).

Yeah, I mean I think the music thing is mainly nondistraction. The quiet of night is great for thinking, which doesn't help the sleep situation.

Comment by TsviBT on TsviBT's Shortform · 2025-02-09T09:09:51.585Z · LW · GW

Yep! Without cybernetic control (I mean, melatonin), I have a non-24-hour schedule, and I believe this contributes >10% of that.

Comment by TsviBT on TsviBT's Shortform · 2025-02-08T13:28:56.543Z · LW · GW

(1) was my guess. Another guess is that there's a magazine "GQ".

Comment by TsviBT on TsviBT's Shortform · 2025-02-08T13:00:08.564Z · LW · GW

Ohhh. Thanks. I wonder why I did that.

Comment by TsviBT on TsviBT's Shortform · 2025-02-08T10:26:47.704Z · LW · GW

No yeah that's my experience too, to some extent. But I would say that I can do good mathematical thinking there, including correctly truth-testing; just less good at algebra, and as you say less good at picking up an unfamiliar math concept.

Comment by TsviBT on TsviBT's Shortform · 2025-02-08T09:48:22.264Z · LW · GW

(These are 100% unscientific, just uncritical subjective impressions for fun. CQ = cognitive capacity quotient, like generally good at thinky stuff)

  • Overeat a bit, like 10% more than is satisfying: -4 CQ points for a couple hours.
  • Overeat a lot, like >80% more than is satisfying: -9 CQ points for 20 hours.
  • Sleep deprived a little, like stay up really late but without sleep debt: +5 CQ points.
  • Sleep debt, like a couple days not enough sleep: -11 CQ points.
  • Major sleep debt, like several days not enough sleep: -20 CQ points.
  • Oversleep a lot, like 11 hours: +6 CQ points.
  • Ice cream (without having eaten ice cream in the past week): +5 CQ points.
  • Being outside: +4 CQ points.
  • Being in a car: -8 CQ points.
  • Walking in the hills: +9 CQ points.
  • Walking specifically up a steep hill: -5 CQ points.
  • Too much podcasts: -8 CQ points for an hour.
  • Background music: -6 to -2 CQ points.
  • Kinda too hot: -3 CQ points.
  • Kinda too cold: +2 CQ points.

(stimulants not listed because they tend to pull the features of CQ apart; less good at real thinking, more good at relatively rote thinking and doing stuff)

Comment by TsviBT on So You Want To Make Marginal Progress... · 2025-02-08T07:24:52.843Z · LW · GW

When they're nearing the hotel, Alice gets the car's attention. And she's like, "Listen guys, I have been lying to you. My real name is Mindy. Mindy the Middlechainer.".

Comment by TsviBT on ozziegooen's Shortform · 2025-02-05T19:31:55.811Z · LW · GW

I'm saying that just because we know algorithms that will successfully leverage data and compute to set off an intelligence explosion (...ok I just realized you wrote TAI but IDK what anyone means by anything other than actual AGI), doesn't mean we know much about how they leverage it and how that influences the explody-guy's long-term goals.

Comment by TsviBT on ozziegooen's Shortform · 2025-02-05T18:50:43.476Z · LW · GW

I assume that at [year(TAI) - 3] we'll have a decent idea of what's needed

Why?? What happened to the bitter lesson?

Comment by TsviBT on evhub's Shortform · 2025-02-05T13:45:47.797Z · LW · GW

Isn't this what the "coherent" part is about? (I forget.)

Comment by TsviBT on Nick Land: Orthogonality · 2025-02-05T09:09:29.362Z · LW · GW

A start of one critique is:

It simply means Darwinian processes have no limits that matter to us.

Not true! Roughly speaking, we can in principle just decide to not do that. A body can in principle have an immune system that doesn't lose to infection; there could in principle be a world government that picks the lightcone's destiny. The arguments about novel understanding implying novel values might be partly right, but they don't really cut against Mateusz's point.

Comment by TsviBT on Nick Land: Orthogonality · 2025-02-05T07:55:22.786Z · LW · GW

Reason to care about engaging /acc:

https://www.lesswrong.com/posts/HE3Styo9vpk7m8zi4/evhub-s-shortform?commentId=kDjrYXCXgNvjbJfaa

I've recently been thinking that it's a mistake to think of this type of thing--"what to do after the acute risk period is safed"--as being a waste of time / irrelevant; it's actually pretty important, specifically because you want people trying to advance AGI capabilities to have an alternative, actually-good vision of things. A hypothesis I have is that many of them are in a sense genuinely nihilistic/accelerationist; "we can't imagine the world after AGI, so we can't imagine it being good, so it cannot be good, so there is no such thing as a good future, so we cannot be attached to a good future, so we should accelerate because that's just what is happening".

Comment by TsviBT on Nick Land: Orthogonality · 2025-02-05T07:54:25.830Z · LW · GW

I strong upvoted, not because it's an especially helpful post IMO, but because I think /acc needs better critique, so there should be more communication. I suspect the downvotes are more about the ideological misalignment than the quality.

Given the quality of the post, I think it would not be remotely rude to respond with a comment like "These are well-trodden topics; you should read X and Y and Z if you want to participate in a serious discussion about this.". But no one wrote that comment, and what would X, Y, Z be?? One could probably correct some misunderstandings in the post this way just by linking to the LW wiki on Orthogonality or whatever, but I personally wouldn't know what to link to, to actually counter the actual point.

Comment by TsviBT on evhub's Shortform · 2025-02-05T06:59:35.470Z · LW · GW

(Interesting. FWIW I've recently been thinking that it's a mistake to think of this type of thing--"what to do after the acute risk period is safed"--as being a waste of time / irrelevant; it's actually pretty important, specifically because you want people trying to advance AGI capabilities to have an alternative, actually-good vision of things. A hypothesis I have is that many of them are in a sense genuinely nihilistic/accelerationist; "we can't imagine the world after AGI, so we can't imagine it being good, so it cannot be good, so there is no such thing as a good future, so we cannot be attached to a good future, so we should accelerate because that's just what is happening".)

Comment by TsviBT on Do you have High-Functioning Asperger's Syndrome? · 2025-02-05T05:31:35.593Z · LW · GW

It's not clear why anyone would want to claim a self-diagnosis of that, since little about it is 'egosyntonic', as the psychiatrists say.

Since a friend mentioned I might be schizoid, I've been like "...yeah? somewhat? maybe? seems mixed? aren't I just avoidant? but I feel more worried about relating than about being rejected?", though I'm not very motivated to learn a bunch about it. So IDK. But anyway, re/ egosyntonicity:

  • Compared to avoidant, schizoid seems vaguely similar, but less pathetic; less needy or cowardly.
  • Schizoid has some benefits of "disagreeability, but not as much of an asshole". Thinking for yourself, not being taken in by common modes of being.
  • Schizoid is maybe kinda like "I have a really really high bar for who I want to relate to", which is kinda high-status.

Comment by TsviBT on Thread for Sense-Making on Recent Murders and How to Sanely Respond · 2025-02-04T21:11:56.272Z · LW · GW

Just FYI none of what you said responds to anything I said, AFAICT. Are you just arguing "Ziz is bad"? My comment is about what causes people to end up the way Ziz ended up, which is relevant to your question "Is there a lesson to learn?".

By the way, is there an explanation somewhere what actually happened? (Not just what Ziz believed.)

Somewhere on this timeline I think https://x.com/jessi_cata/with_replies

Comment by TsviBT on Viliam's Shortform · 2025-02-04T14:06:04.473Z · LW · GW

The rest of the team should update about listening to her concerns.

I believe (though my memory in general is very very fuzzy) that I was the one who most pushed for Ziz and Gwen to be at that workshop. (I don't recall talking with Anna about that at all before the workshop, and don't see anything about that in my emails on a quick look, though I could very easily have forgotten. I do recall her saying at the workshop that she thought Ziz shouldn't be there.) I did indeed update later (also based on other things) quite strongly that I really truly cannot detect various kinds of pathology, and therefore have to be much more circumspect, deferent to others about this, not making lots of decisions like this, etc. (I do think there was good reason to have Ziz at the workshop though, contra others; Ziz was indeed smart and an interesting original insightful thinker, which is in short supply. I'm also unclear on what evidence Anna had at the time, and the extent to which the community's behavior with Ziz was escalatorily causing Ziz's eventual fate.)

Anna seems like a good judge of character,

Uh, for the record, I would definitely take it very seriously, probably as pretty strong or very strong evidence, if Anna says that someone is harmful / evil / etc.; but I would not take it as strong evidence of someone not being so, if she endorses them.

Comment by TsviBT on Thread for Sense-Making on Recent Murders and How to Sanely Respond · 2025-02-04T13:46:37.402Z · LW · GW

There's a lot more complexity, obviously, but one thing that sticks out to me is this paragraph, from https://sinceriously.blog-mirror.com/net-negative/ :

I described how I felt like I was the only one with my values in a world of flesh eating monsters, how it was horrifying seeing the amoral bullet biting consistency of the rationality community, where people said it was okay to eat human babies as long as they weren’t someone else’s property if I compared animals to babies. How I was constantly afraid that their values would leak into me and my resolve would weaken and no one would be judging futures according to sentient beings in general. How it was scary Eliezer Yudkowsky seemed to use “sentient” to mean “sapient”. How I was constantly afraid if I let my brain categorize them as my “in-group” then I’d lose my values.

This is one among several top hypothesis-parts for something at the core of how Ziz, and by influence other Zizians, gets so far gone from normal structures of relating. It is indeed true that normal people (I mean, including the vast majority of rationalists) live deep in an ocean of {algorithm, stance, world, god}-sharing with people around them. And it's true that this can infect you in various ways, erode structures in you, erode values. So you can see how someone might think it's a good idea to become oppositional to many normal structures of relating; and how that can be in a reinforcing feedback loop with other people's reactions to your oppositionality and rejection.

(As an example of another top hypothesis-part: I suspect Ziz views betrayal of values (the blackmail payout thing) and betrayal of trust (the behavior of Person A) sort of as described here: https://sideways-view.com/2016/11/14/integrity-for-consequentialists/ In other words, if someone's behavior is almost entirely good, but then in some subtle sneaky ways or high-stakes ways bad, that's an extreme mark against them. Many would agree with the high-stakes part of (my imagined) Ziz's stance here, but many fewer would agree so strongly with the subtle sneaky part.)

If that's a big part of what was going on, it poses a general question (which is partly a question of community behavior and partly a question of mental technology for individuals): How to make it more feasible to get the goods of being in a community, without the bads of value erosion?

Comment by TsviBT on Views on when AGI comes and on strategy to reduce existential risk · 2025-02-03T08:00:12.462Z · LW · GW

really smart people

Differences between people are less directly revealing of what's important in human intelligence. My guess is that all or very nearly all human children have all or nearly all the intelligence juice. We just, like, don't appreciate how much a child is doing in constructing zer world.

the current models have basically all the tools a moderately smart human have, with regards to generating novel ideas

Why on Earth do you think this? (I feel like I'm in an Asch Conformity test, but with really really high production value. Like, after the experiment, they don't tell you what the test was about. They let you take the card home. On the walk home you ask people on the street, and they all say the short line is long. When you get home, you ask your housemates, and they all agree, the short line is long.)

I don't see what's missing that a ton of training on a ton of diverse, multimodal tasks + scaffolding + data flywheel isn't going to figure out.

My response is in the post.

Comment by TsviBT on Views on when AGI comes and on strategy to reduce existential risk · 2025-02-03T02:40:49.017Z · LW · GW

I'm curious if you have a sense from talking to people.

More recently I've mostly disengaged (except for making kinda-shrill LW comments). Some people say that "concepts" aren't a thing, or similar. E.g. by recentering on performable tasks, by pointing to benchmarks going up and saying that the coarser category of "all benchmarks" or similar is good enough for predictions. (See e.g. Kokotajlo's comment here https://www.lesswrong.com/posts/oC4wv4nTrs2yrP5hz/what-are-the-strongest-arguments-for-very-short-timelines?commentId=QxD5DbH6fab9dpSrg, though his actual position is of course more complex and nuanced.) Some people say that the training process is already concept-gain-complete. Some people say that future research, such as "curiosity" in RL, will solve it. Some people say that the "convex hull" of existing concepts is already enough to set off FURSI (fast unbounded recursive self-improvement).

(though I feel confused about how to update on the conjunction of those, and the things LLMs are good at — all the ways they don't behave like a person who doesn't understand X, either, for many X.)

True; I think I've heard various people discussing how to more precisely think of the class of LLM capabilities, but maybe there should be more.

if that's less sample-efficient than what humans are doing, it's not apparent to me that it can't still accomplish the same things humans do, with a feasible amount of brute force

It's often awkward discussing these things, because there's sort of a "seeing double" that happens. In this case, the "double" is:

"AI can't FURSI because it has poor sample efficiency...

  1. ...and therefore it would take k orders of magnitude more data / compute than a human to do AI research."
  2. ...and therefore more generally we've not actually gotten that much evidence that the AI has the algorithms which would have caused both good sample efficiency and also the ability to create novel insights / skills / etc."

The same goes mutatis mutandis for "can make novel concepts".

I'm more saying 2. rather than 1. (Of course, this would be a very silly thing for me to say if we observed the gippities creating lots of genuine novel useful insights, but with low sample complexity (whatever that should mean here). But I would legit be very surprised if we soon saw a thing that had been trained on 1000x less human data, and performs at modern levels on language tasks (allowing it to have breadth of knowledge that can be comfortably fit in the training set).)

can't still accomplish the same things humans do

Well, I would not be surprised if it can accomplish a lot of the things. It already can of course. I would be surprised if there weren't some millions of jobs lost in the next 10 years from AI (broadly, including manufacturing, driving, etc.). In general, there's a spectrum/space of contexts / tasks, where on the one hand you have tasks that are short, clear-feedback, and common / stereotyped, and not that hard; on the other hand you have tasks that are long, unclear-feedback, uncommon / heterogeneous, and hard. The way humans do things is that we practice the short ones in some pattern to build up for the variety of long ones. I expect there to be a frontier of AIs crawling from short to long ones. I think at any given time, pumping in a bunch of brute force can expand your frontier a little bit, but not much, and it doesn't help that much with more permanently ratcheting out the frontier.

AI that's narrowly superhuman on some range of math & software tasks can accelerate research

As you're familiar with, if you have a computer program that has three resource bottlenecks A (50%), B (25%), and C (25%), and you optimize the fuck out of A down to ~1%, you ~double your overall efficiency; but then if you optimize the fuck out of A again down to 0.1%, you've basically done nothing. The question to me isn't "does AI help a significant amount with some aspects of AI research", but rather "does AI help a significant and unboundedly growing amount with all aspects of AI research, including the long-type tasks such as coming up with really new ideas".
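
(Spelled out as a toy calculation, with the same made-up 50/25/25 split as above; overall_speedup is just an illustrative helper, nothing beyond restating the bottleneck point:)

    # Toy illustration of the bottleneck arithmetic above (made-up numbers).
    # The program's runtime is split A = 50%, B = 25%, C = 25%.
    def overall_speedup(a_share, b_share=0.25, c_share=0.25):
        """Speedup when A's share of the original runtime shrinks to a_share."""
        return 1.0 / (a_share + b_share + c_share)

    print(overall_speedup(0.50))   # baseline: 1.0x
    print(overall_speedup(0.01))   # A cut to ~1% of the original total: ~1.96x, roughly double
    print(overall_speedup(0.001))  # A cut further to 0.1%: ~2.0x, basically no additional gain

However hard you optimize A, you never get past 2x while B and C are untouched.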

AI is transformative enough to motivate a whole lot of sustained attention on overcoming its remaining limitations

This certainly makes me worried in general, and it's part of why my timelines aren't even longer; I unfortunately don't expect a large "naturally-occurring" AI winter.

seems bizarre if whatever conceptual progress is required takes multiple decades

Unfortunately I haven't addressed your main point well yet... Quick comments:

  • Strong minds are the most structurally rich things ever. That doesn't mean they have high algorithmic complexity; obviously brains are less algorithmically complex than entire organisms, and the relevant aspects of brains are presumably considerably simpler than actual brains. But still, IDK, it just seems weird to me to expect to make such an object "by default" or something? Craig Venter made a quasi-synthetic lifeform--but how long would it take us to make a minimum viable unbounded invasive organic replicator actually from scratch, like without copying DNA sequences from existing lifeforms?
  • I think my timelines would have been considered normalish among X-risk people 15 years ago? And would have been considered shockingly short by most AI people.
  • I think most of the difference is in how we're updating, rather than on priors? IDK.

Comment by TsviBT on Views on when AGI comes and on strategy to reduce existential risk · 2025-02-03T01:53:40.935Z · LW · GW

It's a good question. Looking back at my example, now I'm just like "this is a very underspecified/confused example". This deserves a better discussion, but IDK if I want to do that right now. In short the answer to your question is

  • I at least would not be very surprised if gippity-seek-o5-noAngular could do what I think you're describing.
  • That's not really what I had in mind, but I had in mind something less clear than I thought. The spirit is about "can the AI come up with novel concepts", but the issue here is that "novel concepts" are big things, and their material and functioning and history are big and smeared out.

I started writing out a bunch of thoughts, but they felt quite inadequate because I knew nothing about the history of the concept of angular momentum; so I googled around a tiny little bit. The situation seems quite awkward for the angular momentum lesion experiment. What did I "mean to mean" by "scrubbed all mention of stuff related to angular momentum"--presumably this would have to include deleting all subsequent ideas that use angular momentum in their definitions, but e.g. did I also mean to delete the notion of cross product?

It seems like angular momentum was worked on in great detail well before the cross product was developed at all explicitly. See https://arxiv.org/pdf/1511.07748 and https://en.wikipedia.org/wiki/Cross_product#History. Should I still expect gippity-seek-o5-noAngular to notice the idea if it doesn't have the cross product available? Even if not, what does and doesn't this imply about this decade's AI's ability to come up with novel concepts?

(I'm going to mull on why I would have even said my previous comment above, given that on reflection I believe that "most" concepts are big and multifarious and smeared out in intellectual history. For some more examples of smearedness, see the subsection here: https://tsvibt.blogspot.com/2023/03/explicitness.html#the-axiom-of-choice)

Comment by TsviBT on The Failed Strategy of Artificial Intelligence Doomers · 2025-02-02T18:25:15.662Z · LW · GW

I can't tell what you mean by much of this (e.g. idk what you mean by "pretty simple heuristics" or "science + engineering SI" or "self-play-ish regime"). (Not especially asking you to elaborate.) Most of my thoughts are here, including the comments:

https://www.lesswrong.com/posts/sTDfraZab47KiRMmT/views-on-when-agi-comes-and-on-strategy-to-reduce

I would take a bet with you about what we expect to see in the next 5 years.

Not really into formal betting, but what are a couple Pareto[impressive, you're confident we'll see within 5 years] things?

But more than that, what kind of epistemology do you think I should be doing that I'm not?

Come on, you know. Actually doubt, and then think it through.

I mean, I don't know. Maybe you really did truly doubt a bunch. Maybe you could argue me from 5% omnicide in next ten years to 50%. Go ahead. I'm speaking from informed priors and impressions.

Comment by TsviBT on The Failed Strategy of Artificial Intelligence Doomers · 2025-02-02T18:10:55.193Z · LW · GW

Jessica I'm less sure about. Sam, from large quantities of insights in many conversations. If you want something more legible, I'm what, >300 ELO points better than you at math; Sam's >150 ELO points better than me at math if I'm trained up, now probably more like >250 or something.

Not by David's standard though, lol.
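
(In case the Elo framing is opaque: under the standard Elo model, a rating gap converts to an expected head-to-head score as sketched below. expected_score is just an illustrative helper, and "math Elo" is obviously a loose metaphor, so treat this as a gloss rather than a measurement:)

    # Standard Elo expected score for the stronger player, given a rating gap
    # (win probability, counting draws as half); a rough gloss on the gaps above.
    def expected_score(rating_gap):
        return 1.0 / (1.0 + 10 ** (-rating_gap / 400.0))

    print(expected_score(150))  # ~0.70
    print(expected_score(250))  # ~0.81
    print(expected_score(300))  # ~0.85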

Comment by TsviBT on The Failed Strategy of Artificial Intelligence Doomers · 2025-02-02T12:08:58.091Z · LW · GW

Broadly true, I think.

almost any X that is not trivially verifiable

I'd probably quibble a lot with this.

E.g. there are many activities that many people engage in frequently--eating, walking around, reading, etc etc. Knowledge and skill related to those activities is usually not vibes-based, or only half vibes-based, or something, even if not trivially verifiable. For example, after a few times accidentally growing mold on some wet clothes or under a sink, very many people learn not to leave areas wet.

E.g. anyone who studies math seriously must learn to verify many very non-trivial things themselves. (There will also be many things they will believe partly based on vibes.)

I don't think AI timelines are an unusual topic in that regard.

In that regard, technically, yes, but it's not very comparable. It's unusual in that it's a crucial question that affects very many people's decisions. (IIRC, EVERY SINGLE ONE of the >5 EA / LW / X-derisking adjacent funder people that I've talked to about human intelligence enhancement says "eh, doesn't matter, timelines short".) And it's in an especially uncertain field, where consensus should much less strongly be expected to be correct. And it's subject to especially strong deference and hype dynamics and disinformation. For comparison, you can probably easily find entire communities in which the vibe is very strongly "COVID came from the wet market" and others where it's very strongly "COVID came from the lab". You can also find communities that say "AGI a century away". There are some questions where the consensus is right for the right reasons and it's reasonable to trust the consensus on some class of beliefs. But vibes-based reasoning is just not robust, and nearly all the resources supposedly aimed at X-derisking in general are captured by a largely vibes-based consensus.

Comment by TsviBT on The Failed Strategy of Artificial Intelligence Doomers · 2025-02-02T08:52:23.422Z · LW · GW

Oh ok lol. Ok on a quick read I didn't see too much in this comment to disagree with.

(One possible point of disagreement is that I think you plausibly couldn't gather any set of people alive today and solve the technical problem; plausibly you need many, like many hundreds, of people you call geniuses. Obviously "hundreds" is made up, but I mean to say that the problem, "come to understand minds--the most subtle/complex thing ever--at a pretty deep+comprehensive level", is IMO extremely difficult, like it's harder than anything humanity has done so far by a lot, not just an ordinary big science project. Possibly contra Soares, IDK.)

(Another disagreement would be

[Scott] has unarguably done a large amount of the most valuable work in the area in the past decade

I don't actually think logical induction is that valuable for the AGI alignment problem, to the point where random philosophy is on par in terms of value to alignment, though I expect most people to disagree with this. It's just a genius technical insight in general.)

Comment by TsviBT on The Failed Strategy of Artificial Intelligence Doomers · 2025-02-02T08:35:01.104Z · LW · GW

I think people are doing those checks?

No. You can tell because they can't have an interesting conversation about it, because they don't have surrounding mental content (such as analyses of examples that stand up to interrogation, or open questions, or cruxes that aren't stupid). (This is in contrast to several people who can have an interesting conversation about it, even if I think they're wrong and making mistakes and so on.)

But I did think about those ideas, and evaluate if they seemed true.

Of course I can't tell from this sentence, but I'm pretty skeptical both of you in particular and of other people in the broad reference class, that most of them have done this in a manner that really does greatly attenuate the dangers of deference.

Comment by TsviBT on The Failed Strategy of Artificial Intelligence Doomers · 2025-02-02T08:29:40.037Z · LW · GW

Of course not. I mean, any reasonable standard? Garrabrant induction, bro. "Produces deep novel (ETA: important difficult) insight"

Comment by TsviBT on The Failed Strategy of Artificial Intelligence Doomers · 2025-02-01T23:27:26.727Z · LW · GW

the arguments for short timelines are definitely weaker than their proponents usually assume, but they aren't totally vibes based

Each person with short timelines can repeat sentences that were generated by a legitimate reason to expect short timelines, but many of them did not generate any of those sentences themselves as the result of trying to figure out when AGI would come; their repeating those sentences is downstream of their timelines. In that sense, for many such people, short timelines actually are totally vibes based.

Comment by TsviBT on The Failed Strategy of Artificial Intelligence Doomers · 2025-02-01T23:13:13.811Z · LW · GW

aren't academic superstars or geniuses

IDK if this is relevant to much, but anyway, given the public record, saying that Scott Garrabrant isn't a genius is just incorrect. Sam Eisenstat is also a genius. Also Jessica Taylor I think. (Pace other members of AF such as myself.)

Comment by TsviBT on Fertility Will Never Recover · 2025-01-30T11:50:24.119Z · LW · GW

If compute is linear in space, then in the obvious way of doing things, you have your Nth kid in your ∛N-th year.
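
(The arithmetic I have in mind, assuming you expand through space at some bounded speed, so reachable volume, and hence compute, and hence kids, grows roughly like the cube of elapsed time; year_of_nth_kid and kids_per_cubic_year are made-up illustrative names and normalizations:)

    # Toy model: radius grows linearly in time at a bounded expansion speed, so
    # volume ~ t**3; compute is linear in volume, kids are linear in compute,
    # hence cumulative kids ~ t**3 and the Nth kid arrives around year N**(1/3).
    def year_of_nth_kid(n, kids_per_cubic_year=1.0):
        return (n / kids_per_cubic_year) ** (1.0 / 3.0)

    print(year_of_nth_kid(8))     # ~2.0
    print(year_of_nth_kid(1000))  # ~10.0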

Comment by TsviBT on Fertility Will Never Recover · 2025-01-30T09:24:21.137Z · LW · GW

How many are excited and aiming for 3+ children?

Given modern technology and old style community, raising 5--7 would be a joy, IDK what you're talking about. (Not a parent, could be wrong.)

Comment by TsviBT on Views on when AGI comes and on strategy to reduce existential risk · 2025-01-30T08:42:49.761Z · LW · GW

I'd probably allow something like the synthetic data generation used for AlphaGeometry (Fig. 3) except in base ZFC and giving away very little human math inside the deduction engine

IIUC yeah, that definitely seems fair; I'd probably also allow various other substantial "quasi-mathematical meta-ideas" to seep in, e.g. other tricks for self-generating a curriculum of training data.

But I wouldn't be surprised if like >20% of the people on LW who think A[G/S]I happens in like 2-3 years thought that my thing could totally happen in 2025 if the labs were aiming for it (though they might not expect the labs to aim for it), with your things plausibly happening later

Mhm, that seems quite plausible, yeah, and that does make me want to use your thing as a go-to example.

whether such a system would prove Cantor's theorem (stated in base ZFC) (imo this would still be pretty crazy to see)?

This one I feel a lot less confident of, though I could plausibly get more confident if I thought about the proof in more detail.

Part of the spirit here, for me, is something like: Yes, AIs will do very impressive things on "highly algebraic" problems / parts of problems. (See "Algebraicness".) One of the harder things for AIs is, poetically speaking, "self-constructing its life-world", or in other words "coming up with lots of concepts to understand the material it's dealing with, and then transitioning so that the material it's dealing with is those new concepts, and so on". For any given math problem, I could be mistaken about how algebraic it is (or, how much of its difficulty for humans is due to the algebraic parts), and how much conceptual progress you have to do to get to a point where the remaining work is just algebraic. I assume that human math is a big mix of algebraic and non-algebraic stuff. So I get really surprised when an AlphaMath can reinvent most of the definitions that we use, but I'm a lot less sure about a smaller subset because I'm less sure if it just has a surprisingly small non-algebraic part. (I think that someone with a lot more sense of the math in general, and formal proofs in particular, could plausibly call this stuff in advance significantly better than just my pretty weak "it's hard to do all of a wide variety of problems".)

Comment by TsviBT on The Game Board has been Flipped: Now is a good time to rethink what you’re doing · 2025-01-30T02:22:14.153Z · LW · GW

we already have AI that does every qualitative kind of thing you say AIs qualitatively can't do

As I mentioned, my response is here https://www.lesswrong.com/posts/sTDfraZab47KiRMmT/views-on-when-agi-comes-and-on-strategy-to-reduce#_We_just_need_X__intuitions:

just because an idea is, at a high level, some kind of X, doesn't mean the idea is anything like the fully-fledged, generally applicable version of X that one imagines when describing X

I haven't heard a response / counterargument to this yet, and many people keep making this logic mistake, including AFAICT you.

Comment by TsviBT on The Game Board has been Flipped: Now is a good time to rethink what you’re doing · 2025-01-30T02:19:52.585Z · LW · GW

requiring the benchmarks to be when the hardest things are solved

My definition is better than yours, and you're too triggered or something to think about it for 2 minutes and understand what I'm saying. I'm not saying "it's not AGI until it kills us", I'm saying "the simplest way to tell that something is an AGI is that it kills us; now, AGI is whatever that thing is, and could exist some time before it kills us".

Comment by TsviBT on The Game Board has been Flipped: Now is a good time to rethink what you’re doing · 2025-01-30T02:16:20.099Z · LW · GW

I tried to explain it in DM and you dismissed the evidence,

What do you mean? According to me we barely started the conversation, you didn't present evidence, I tried to explain that to you, we made a bit of progress on that, and then you ended the conversation.

Comment by TsviBT on Views on when AGI comes and on strategy to reduce existential risk · 2025-01-29T23:43:21.285Z · LW · GW

human proofs, problems, or math libraries

(I'm not sure whether I'm supposed to nitpick. If I were nitpicking I'd ask things like: Wait are you allowing it to see preexisting computer-generated proofs? What counts as computer generated? Are you allowing it to see the parts of papers where humans state and discuss propositions and just cutting out the proofs? Is this system somehow trained on a giant human text corpus, but just without the math proofs?)

But if you mean basically "the AI has no access to human math content except a minimal game environment of formal logic, plus whatever abstract priors seep in via the training algorithm+prior, plus whatever general thinking patterns in [human text that's definitely not mathy, e.g. blog post about apricots]", then yeah, this would be really crazy to see. My points are trying to be, not minimally hard, but at least easier-ish in some sense. Your thing seems significantly harder (though nicely much more operationalized); I think it'd probably imply my "come up with interesting math concepts"? (Note that I would not necessarily say the same thing if it was >25% of IMO problems; there I'd be significantly more unsure, and would defer to you / Sam, or someone who has a sense for the complexity of the full proofs there and the canonicalness of the necessary lemmas and so on.)

Comment by TsviBT on The Game Board has been Flipped: Now is a good time to rethink what you’re doing · 2025-01-29T22:40:59.631Z · LW · GW

You referred to "others' definition (which is similar but doesn't rely on the game over clause)", and I'm saying no, it's not relevantly similar, and it's not just my definition minus doom.

Comment by TsviBT on The Game Board has been Flipped: Now is a good time to rethink what you’re doing · 2025-01-29T12:09:50.049Z · LW · GW

I also dispute that genuine HLMI refers to something meaningfully different from my definition. I think people are replacing HLMI with "thing that can do all stereotyped, clear-feedback, short-feedback tasks", and then also claiming that this thing can replace many human workers (probably true of 5 or 10 million, false of 500 million) or cause a bunch of unemployment by making many people 5x effective (maybe, IDK), and at that point IDK why we're talking about this, when X-risk is the important thing.

Comment by TsviBT on The Game Board has been Flipped: Now is a good time to rethink what you’re doing · 2025-01-29T08:37:05.643Z · LW · GW

nearly everyone I know or have heard of who was expecting longer timelines has updated significantly toward short timelines (<5 years).

You're in an echo chamber. They don't have very good reasons for thinking this. https://www.lesswrong.com/posts/sTDfraZab47KiRMmT/views-on-when-agi-comes-and-on-strategy-to-reduce 

Comment by TsviBT on TsviBT's Shortform · 2025-01-28T04:05:02.931Z · LW · GW

It is still the case that some people don't sign up for cryonics simply because it takes work to figure out the process / financing. If you do sign up, it would therefore be a public service to write about the process.

Comment by TsviBT on Yudkowsky on The Trajectory podcast · 2025-01-26T04:48:38.354Z · LW · GW

  1. You people are somewhat crazy overconfident about humanity knowing enough to make AGI this decade. https://www.lesswrong.com/posts/sTDfraZab47KiRMmT/views-on-when-agi-comes-and-on-strategy-to-reduce

  2. One hope on the scale of decades is that strong germline engineering should offer an alternative vision to AGI. If the options are "make supergenius non-social alien" and "make many genius humans", it ought to be clear that the latter is both much safer and gets most of the hypothetical benefits of the former.

Comment by TsviBT on Is there such a thing as an impossible protein? · 2025-01-25T05:36:09.533Z · LW · GW

How many proteins are left after these 60 seconds?

I wonder if there's a large-ish (not by percentage, but still) class of mechanically self-destructing proteins? E.g. suppose you have something like this:

RRRRRAARRRRRRAAAEEEEEEEXXXXXXXXXXXXXXXEEEEEEEE

where R could be any basic amino acid, E any acidic one, A any neutral one. And then the Xs are some sequence that eventually forms a strong extended thing. So the idea is that you get strong bonds between the second R island and the first E island, and between the first R island and the second E island. Then the X segment pulls the two E islands apart, ripping the protein between the two R islands, like:

ARRRRRA <<--------------->> ARRRRRR

A.| | | | | | ......................................... | | | | | |

AEEEEEEXXXXXXXXXXXXEEEEEEE

This is 100% made up, no idea if anything like it could happen.
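
(To make the layout explicit, here's a toy annotation of that example sequence; the associations noted in the code comments are just the electrostatic pairings I'm imagining, not any kind of prediction about folding:)

    import re

    # The example sequence from above: R = basic (positive), E = acidic (negative),
    # A = neutral, X = a segment that eventually forms something strong and extended.
    sequence = "RRRRRAARRRRRRAAAEEEEEEEXXXXXXXXXXXXXXXEEEEEEEE"

    # Print each island in order (letter, run length, position), to make the layout explicit.
    for match in re.finditer(r"R+|E+|X+|A+", sequence):
        print(match.group()[0], len(match.group()), (match.start(), match.end()))

    # Imagined associations: the second R island binds the first E island, the first
    # R island binds the second E island; the X segment then pulls the two E islands
    # apart, putting tension on the backbone between the two R islands.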