Posts

Overview of strong human intelligence amplification methods 2024-10-08T08:37:18.896Z
TsviBT's Shortform 2024-06-16T23:22:54.134Z
Koan: divining alien datastructures from RAM activations 2024-04-05T18:04:57.280Z
What could a policy banning AGI look like? 2024-03-13T14:19:07.783Z
A hermeneutic net for agency 2024-01-01T08:06:30.289Z
What is wisdom? 2023-11-14T02:13:49.681Z
Human wanting 2023-10-24T01:05:39.374Z
Hints about where values come from 2023-10-18T00:07:58.051Z
Time is homogeneous sequentially-composable determination 2023-10-08T14:58:15.913Z
Telopheme, telophore, and telotect 2023-09-17T16:24:03.365Z
Sum-threshold attacks 2023-09-08T17:13:37.044Z
Fundamental question: What determines a mind's effects? 2023-09-03T17:15:41.814Z
Views on when AGI comes and on strategy to reduce existential risk 2023-07-08T09:00:19.735Z
The fraught voyage of aligned novelty 2023-06-26T19:10:42.195Z
Provisionality 2023-06-19T11:49:06.680Z
Explicitness 2023-06-12T15:05:04.962Z
Wildfire of strategicness 2023-06-05T13:59:17.316Z
The possible shared Craft of deliberate Lexicogenesis 2023-05-20T05:56:41.829Z
A strong mind continues its trajectory of creativity 2023-05-14T17:24:00.337Z
Better debates 2023-05-10T19:34:29.148Z
An anthropomorphic AI dilemma 2023-05-07T12:44:48.449Z
The voyage of novelty 2023-04-30T12:52:16.817Z
Endo-, Dia-, Para-, and Ecto-systemic novelty 2023-04-23T12:25:12.782Z
Possibilizing vs. actualizing 2023-04-16T15:55:40.330Z
Expanding the domain of discourse reveals structure already there but hidden 2023-04-09T13:36:28.566Z
Ultimate ends may be easily hidable behind convergent subgoals 2023-04-02T14:51:23.245Z
New Alignment Research Agenda: Massive Multiplayer Organism Oversight 2023-04-01T08:02:13.474Z
Descriptive vs. specifiable values 2023-03-26T09:10:56.334Z
Shell games 2023-03-19T10:43:44.184Z
Are there cognitive realms? 2023-03-12T19:28:52.935Z
Do humans derive values from fictitious imputed coherence? 2023-03-05T15:23:04.065Z
Counting-down vs. counting-up coherence 2023-02-27T14:59:39.041Z
Does novel understanding imply novel agency / values? 2023-02-19T14:41:40.115Z
Please don't throw your mind away 2023-02-15T21:41:05.988Z
The conceptual Doppelgänger problem 2023-02-12T17:23:56.278Z
Control 2023-02-05T16:16:41.015Z
Structure, creativity, and novelty 2023-01-29T14:30:19.459Z
Gemini modeling 2023-01-22T14:28:20.671Z
Non-directed conceptual founding 2023-01-15T14:56:36.940Z
Dangers of deference 2023-01-08T14:36:33.454Z
The Thingness of Things 2023-01-01T22:19:08.026Z
[link] The Lion and the Worm 2022-05-16T20:40:22.659Z
Harms and possibilities of schooling 2022-02-22T07:48:09.542Z
Rituals and symbolism 2022-02-10T16:00:14.635Z
Index of some decision theory posts 2017-03-08T22:30:05.000Z
Open problem: thin logical priors 2017-01-11T20:00:08.000Z
Training Garrabrant inductors to predict counterfactuals 2016-10-27T02:41:49.000Z
Desiderata for decision theory 2016-10-27T02:10:48.000Z
Failures of throttling logical information 2016-02-24T22:05:51.000Z
Speculations on information under logical uncertainty 2016-02-24T21:58:57.000Z

Comments

Comment by TsviBT on Consciousness as a conflationary alliance term for intrinsically valued internal experiences · 2024-11-22T18:25:12.018Z · LW · GW

I'm curious how satisfied people seemed to be with the explanations/descriptions of consciousness that you elicited from them. E.g., on a scale from

"Oh! I figured it out; what I mean when I talk about myself being consciousness, and others being conscious or not, I'm referring to affective states / proprioception / etc.; I feel good about restricting away other potential meanings."

to

"I still have no idea, maybe it has something to do with X, that seems relevant, but I feel there's a lot I'm not understanding."

where did they tend to land, and what was the variance?

Comment by TsviBT on lemonhope's Shortform · 2024-11-22T17:33:28.274Z · LW · GW

We agree this is a crucial lever, and we agree that the bar for funding has to be in some way "high". I'm arguing for a bar that's differently shaped. The set of "people established enough in AGI alignment that they get 5 [fund a person for 2 years and maybe more depending how things go in low-bandwidth mentorship, no questions asked] tokens" would hopefully include many people who understand that understanding constraints is key and that past research understood some constraints.

build on past agent foundations research

I don't really agree with this. Why do you say this?

a lot of wasted effort if you asked for out-of-paradigm ideas.

I agree with this in isolation. I think some programs do state something about OOP ideas, and I agree that the statement itself does not come close to solving the problem.

(Also I'm confused about the discourse in this thread (which is fine), because I thought we were discussing "how / how much should grantmakers let the money flow".)

Comment by TsviBT on lemonhope's Shortform · 2024-11-21T17:41:41.049Z · LW · GW

upskilling or career transition grants, especially from LTFF, in the last couple of years

Interesting; I'm less aware of these.

How are they falling short?

I'll answer as though I know what's going on in various private processes, but I don't, and therefore could easily be wrong. I assume some of these are sort of done somewhere, but not enough and not together enough.

  • Favor insightful critiques and orientations as much as constructive ideas. If you have a large search space and little traction, a half-plane of rejects is at least as valuable as a guessed point that you happened to know how to generate.
  • Explicitly allow acceptance by trajectory of thinking, assessed by at least a year of low-bandwidth mentorship; deemphasize agenda-ish-ness.
  • For initial exploration periods, give longer commitments with fewer required outputs; something like at least 2 years. Explicitly allow continuation of support by trajectory.
  • Give a path forward for financial support for out-of-paradigm things. (The Vitalik fellowship, for example, probably does not qualify, as the professors, when I glanced at the list, seem unlikely to support this sort of work; but I could be wrong.)
  • Generally emphasize judgement of experienced AGI alignment researchers, and deemphasize judgement of grantmakers.
  • Explicitly ask for out-of-paradigm things.
  • Do a better job of connecting people. (This one is vague but important.)

(TBC, from my full perspective this is mostly a waste because AGI alignment is too hard; you want to instead put resources toward delaying AGI, trying to talk AGI-makers down, and strongly amplifying human intelligence + wisdom.)

Comment by TsviBT on lemonhope's Shortform · 2024-11-21T16:36:31.156Z · LW · GW

grantmakers have tried pulling that lever a bunch of times

What do you mean by this? I can think of lots of things that seem in some broad class of pulling some lever that kinda looks like this, but most of the ones I'm aware of fall greatly short of being an appropriate attempt to leverage smart young creative motivated would-be AGI alignment insight-havers. So the update should be much smaller (or there's a bunch of stuff I'm not aware of).

Comment by TsviBT on What are the good rationality films? · 2024-11-21T01:19:22.929Z · LW · GW

(FWIW this was my actual best candidate for a movie that would fit, but I remembered so few details that I didn't want to list it.)

Comment by TsviBT on What are the good rationality films? · 2024-11-20T08:25:12.062Z · LW · GW

I'm struggling to think of any. Some runners-up:

Comment by TsviBT on What are the good rationality films? · 2024-11-20T06:55:05.277Z · LW · GW

Cf. Moneyball.

Comment by TsviBT on What are Emotions? · 2024-11-15T04:50:59.953Z · LW · GW

Emotions are hardwired stereotyped syndromes of hardwired blunt-force cognitive actions. E.g. fear makes your heart beat faster and puts an expression on your face and makes you consider negative outcomes more and maybe makes you pay attention to your surroundings. So it doesn't make much sense to value emotions, but emotions are good ways of telling that you value something; e.g. if you feel fear in response to X, probably X causes something you don't want, or if you feel happy when / after doing Y, probably Y causes / involves something you want.

Comment by TsviBT on Daniel Kokotajlo's Shortform · 2024-11-14T03:04:25.428Z · LW · GW

we've checked for various forms of funny business and our tools would notice if it was happening.

I think it's a high bar due to the nearest unblocked strategy problem and alienness.

I agree that when AGI R&D starts to 2x or 5x due to AI automating much of the process, that's when we need the slowdown/pause

If you start stopping proliferation when you're a year away from some runaway thing, then everyone has the tech that's one year away from the thing. That makes it even harder to ensure that no one does the remaining research, compared to if the tech everyone has is 5 or 20 years away from the thing.

Comment by TsviBT on Daniel Kokotajlo's Shortform · 2024-11-12T23:43:41.840Z · LW · GW

10 more years till interpretability? That's crazy talk. What do you mean by that and why do you think it? (And if it's a low bar, why do you have such a low bar?)

"Pre-AGI we should be comfortable with proliferation" Huh? Didn't you just get done saying that pre-AGI AI is going to contribute meaningfully to research (such as AGI research)?

Comment by TsviBT on Scissors Statements for President? · 2024-11-12T13:39:30.906Z · LW · GW

I think you might have been responding to

Susan could try to put focal attention on the scissor origins; but one way that would be difficult is that she'd get pushback from her community.

which I did say in a parenthetical, but I was mainly instead saying

Susan's community is a key substrate for the scissor origins, maybe more than Susan's interaction with Robert. Therefore, to put focal attention on the scissor origins, a good first step might be looking at her community--how it plays the role of one half of a scissor statement.

Your reasons for hope make sense.

hope/memory of the previous society that (Susan and Tusan and Vusan) and (Robert and Sobert and Tobert) all shared, which she has some hope of reaccessing here

Anecdata: In my case it would be mostly a hope, not a memory. E.g. I don't remember a time when "I understand what you're saying, but..." was a credible statement... Maybe it never was? E.g. I don't remember a time when I would expect people to be sufficiently committed to computing "what would work for everyone to live together" that they kept doing so in political contexts.

Comment by TsviBT on Goal: Understand Intelligence · 2024-11-09T23:34:17.902Z · LW · GW

(generic comment that may not apply too much to Mayer's work in detail, but that I think is useful for someone to hear:) I agree with the basic logic here. But someone trying to follow this path should keep in mind that there's philosophical thorniness here.

A bit more specifically, the questions one asks about "how intelligence works" will always be at risk of streetlighting. As an example/analogy, think of someone trying to understand how the mind works by analyzing mental activity into "faculties", as in: "So then the object recognition faculty recognizes the sofa and the doorway, and it extracts their shapes, and sends their shapes to the math faculty, which performs a search for rotations that allow the sofa to pass through the doorway, and when it finds one it sends that to the executive faculty, which then directs the motor-planning faculty to make an execution plan, and that plan is sent to the motor faculty...". This person may or may not be making genuine progress on something; but either way, if they are trying to answer questions like "which faculties are there and how do they interoperate to perform real-world tasks", they're missing a huge swath of key questions. (E.g.: "how does the sofa concept get produced in the first place? how does the desire to not damage the sofa and the door direct the motor planner? where do those desires come from, and how do they express themselves in general, and how do they respond to conflict?")

Some answers to "how intelligence works" are very relevant, and some are not very relevant, to answering fundamental questions of alignment, such as what determines the ultimate effects of a mind.

Comment by TsviBT on What are the primary drivers that caused selection pressure for intelligence in humans? · 2024-11-08T13:49:16.736Z · LW · GW

Intelligence also has costs and has components that have to be invented, which explains why not all species are already human-level smart. One of the questions here is which selection pressures were so especially and exceptionally strong in the case of humans, that humans fell off the cliff.

Comment by TsviBT on What are the primary drivers that caused selection pressure for intelligence in humans? · 2024-11-08T08:06:16.689Z · LW · GW

IDK, fields don't have to have names, there's just lots of work on these topics. You could start here https://en.wikipedia.org/wiki/Evolutionary_anthropology and google / google-scholar around.

See also https://www.youtube.com/watch?v=tz-L2Ll85rM&list=PL1B24EADC01219B23&index=556 (I'm linking to the whole playlist, linking to a random old one because those are the ones I remember being good, IDK about the new ones).

Comment by TsviBT on Scissors Statements for President? · 2024-11-08T06:04:17.559Z · LW · GW

My hope is that this can become more feasible if we can provide accurate patterns for how the scissors-generating-process is trying to trick Susan(/Robert). And that if Susan is trying to figure out how she and Robert were tricked, by modeling the tricking process, this can somehow help undo the trick, without needing to empathize at any point with "what if candidate X is great."

This is clarifying...

Does it actually have much to do with Robert? Maybe it would be more helpful to talk with Tusan and Vusan, who are also A-blind, B-seeing, candidate Y supporters. They're the ones who would punish non-punishers of supporting candidate X / talking about A. (Which Susan would become, if she were talking to an A-seer without pushing back, let alone if she could see into her A-blindspot.) You could talk to Robert about how he's embedded in threats of punishment for non-punishment of supporting candidate Y / talking about B, but that seems more confusing? IDK.

Comment by TsviBT on Scissors Statements for President? · 2024-11-08T05:44:04.711Z · LW · GW

I think I agree, but

  • It's hard to get clear enough on your values. In practice (and maybe also in theory) it's an ongoing process.
  • Values aren't the only thing going on. There are stances that aren't even close to being either a value, a plan, or a belief. An example is a person who thinks/acts in terms of who they trust, and who seems good; if a lot of people that they know who seem good also think some other person seems good, then they'll adopt that stance.
Comment by TsviBT on An alternative approach to superbabies · 2024-11-08T05:29:00.726Z · LW · GW

I don't care about doing this bet. We can just have a conversation though, feel free to DM me.

Comment by TsviBT on An alternative approach to superbabies · 2024-11-08T05:28:30.603Z · LW · GW

(e.g. 1 billon dollars and a few very smart geniuses going into trying to make communication with orcas work well)

That would give more like a 90% chance of superbabies born in <10 years.

Comment by TsviBT on What are the primary drivers that caused selection pressure for intelligence in humans? · 2024-11-08T05:27:01.438Z · LW · GW
  • Fighting wars with neighboring tribes
  • Extractive foraging
  • Persistence hunting (which involves empathy, imagination (cf cave paintings), and tracking)
  • Niche expansion/travel (i.e. moving between habitat types)
  • In particular, sometimes entering harsh habitats imposes various new selection pressures
  • Growing up around people with cultural knowledge (advantage to altriciality, language, learning, imitation, intent-sharing)
  • Altriciality demands parents coordinate
  • Children's learning ability incentivizes parents to learn to teach well

etc.

There's a whole research field on this FYI.

Comment by TsviBT on An alternative approach to superbabies · 2024-11-07T05:24:58.726Z · LW · GW

I'm not gonna read the reddit post because

  • it's an eyebleed wall of text,
  • the author spent hours being excited about this stuff without bothering to learn that we have ~20 billion cortical neurons, not 20 trillion,
  • yeah.

I don't know whether orcas are supersmart. A couple remarks:

  • I don't think it makes that much sense to just look at cortical neuron counts. Big bodies ask for many neurons, including cortical motor neurons. Do cetaceans have really big motor cortices? Visual cortices? Olfactory bulbs? Keyword "allometry" (a rough numerical sketch follows this list). Yes, brains are plastic, but that doesn't mean orcas are actually ever doing higher mathematics with their brains.
  • Scale matters, but I doubt it's very close to being the only thing! Humans likely had genetic adaptations for neuroanatomical phenotypes selected-for by some of: language; tool-making; persisting transient mental content; intent-inference; intent-sharing; mental simulation; prey prediction; deception; social learning; teaching; niche construction/expansion/migration. Orcas have a few of these. But how many, how much, for how long, in what range of situations and manifestations? Or do you think a cow brain scaled to 40 billion neurons would be superhuman?
  • Culture matters. The Greeks could be great philosophers... But could a kid living in 8000 BCE, who gets to text message with an advanced alien civilization of kinda dumb people, become a cutting edge philosopher in the alien culture? Even though almost everyone ze interacts with is preagricultural, preliterate? I dunno, maybe? Still seems kinda hard actually?
  • Regardless of all this, talking to orcas would be super cool, go for it lol.
  • Superbabies is good. It would actually work. It's not actually that hard. There's lots of investment already in component science/tech. Orcas doesn't scale. No one cares about orcas. There's not hundreds of scientists and hundreds of millions in orca communications research. Etc. The sense of this plan being weird is a good sense to investigate further. It's possible for superficial weirdness to be wrong, but don't dismiss the weirdness out of hand.
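
(To put rough numbers on the allometry point from the first bullet--a ballpark sketch only, with assumed illustrative masses rather than figures cited anywhere in this thread--Jerison's encephalization quotient normalizes brain mass by the brain mass expected for a given body mass:)

  \mathrm{EQ} = \frac{E_{\mathrm{brain}}}{0.12\, M_{\mathrm{body}}^{2/3}} \quad (\text{masses in grams})

  \mathrm{EQ}_{\mathrm{human}} \approx \frac{1350}{0.12 \cdot 65000^{2/3}} \approx 7 \qquad \mathrm{EQ}_{\mathrm{orca}} \approx \frac{5600}{0.12 \cdot (3\times 10^{6})^{2/3}} \approx 2

On that crude measure, a large share of an orca's neurons plausibly goes to servicing a very large body rather than to anything like abstract cognition--one way to quantify the sense in which big bodies "ask for" many neurons.
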
Comment by TsviBT on An alternative approach to superbabies · 2024-11-07T04:58:46.021Z · LW · GW

I appreciate you being relatively clear about this, but yeah, I think it's probably better to spend more time learning facts and thinking stuff through, compared to writing breathless LW posts. More like a couple weeks rather than a couple days. But that's just my stupid opinion. The thing is, there's probably gonna be like ten other posts in the reference class of this post, and they just... don't leave much of a dent in things? There's a lot that needs serious thinking-through, let's get to work on that! But IDK, maybe someone will be inspired by this post to think through orca stuff more thoroughly.

Comment by TsviBT on Scissors Statements for President? · 2024-11-07T04:49:40.575Z · LW · GW

IIUC, I agree with your vision being desirable. (And, IDK, it's sort of plausible that you can basically do it with a good toolbox that could be developed straightforwardly-ish.)

But there might be a gnarly, fundamental-ish "levers problem" here:

  • It's often hard to do [the sort of empathy whereby you see into your blindspot that they can see]
  • without also doing [the sort of empathy that leads to you adopting some of their values, or even blindspots].

(A levers problem is analogous to a buckets problem, but with actions instead of beliefs. You have an available action VW which does both V and W, but you don't have V and W available as separate actions. V seems good to do and W seems bad to do, so you're conflicted, aahh.)

I would guess that what we call empathy isn't exactly well-described as "a mental motion whereby one tracks and/or mirrors the emotions and belief-perspective of another". The primordial thing--the thing that comes first evolutionarily and developmentally, and that is simpler--is more like "a mental motion whereby one adopts whatever aspects of another's mind are available for adoption". Think of all the mysterious bonding that happens when people hang out, and copying mannerisms, and getting a shoulder-person, and gaining loyalty. This is also far from exactly right. Obviously you don't just copy everything, it matters what you pay attention to and care about, and there's probably more prior structure, e.g. an emphasis on copying aspects that are important for coordinating / synching up values. IDK the real shape of primordial empathy.

But my point is just: Maybe, if you deeply empathize with someone, then by default, you'll also adopt value-laden mental stances from them. If you're in a conflict with someone, adopting value-laden mental stances from them feels and/or is dangerous.

To say it another way, you want to entertain propositions from another person. But your brain doesn't neatly separate propositions from values and plans. So entertaining a proposition is also sort of questioning your plans, which bleeds into changing your values. Empathy good enough to show you blindspots involves entertaining propositions that you care about and that you disagree with.

Or anyway, this was my experience of things, back when I tried stuff like this.

Comment by TsviBT on Advisors for Smaller Major Donors? · 2024-11-06T14:41:12.639Z · LW · GW

Well, anyone who wants could pay me to advise them about giving to decrease X-risk by creating smarter humans. Funders less constrained by PR would of course be advantaged in that area.

Comment by TsviBT on Scissors Statements for President? · 2024-11-06T11:06:59.967Z · LW · GW

IDK, but I'll note that IME, calling for empathy for "the other side" (in either direction) is received with incuriosity / indifference at best, often hostility.

One thing that stuck with me is one of those true crime Youtube videos, where at some stage of the interrogation, the investigator stops being nice, and instead will immediately and harshly contradict anything that the suspect Bob is saying to paint a story where he's innocent. The commentator claimed that the reason the investigator does this is to avoid giving Bob confidence: if Bob's statements hung in the air unchallenged, Bob might think he's successfully creating a narrative and getting that narrative bought. Even if the investigator is not in danger of being fooled (e.g. because she already has video evidence contradicting some of Bob's statements), Bob might get more confident and spend more time lying instead of just confessing.

A conjecture is that for Susan, empathizing with Robert seems like giving room for him to gain more political steam; and the deeper the empathy, the more room you're giving Robert.

Comment by TsviBT on johnswentworth's Shortform · 2024-10-28T11:53:17.958Z · LW · GW

Closeness is the operating drive, but it's not the operating telos. The drive is towards some sort of state or feeling--of relating, standing shoulder-to-shoulder looking out at the world, standing back-to-back defending against the world; of knowing each other, of seeing the same things, of making the same meaning; of integrated seeing / thinking. But the telos is tikkun olam (repairing/correcting/reforming the world)--you can't do that without a shared idea of better.

As an analogy, curiosity is a drive, which is towards confusion, revelation, analogy, memory; but the telos is truth and skill.

In your example, I would say that someone could be struggling with "moral responsibility" while also doing a bunch of research or taking a bunch of action to fix what needs to be fixed; or they could be struggling with "moral responsibility" while eating snacks and playing video games. Vibes are signals and signals are cheap and hacked.

Comment by TsviBT on johnswentworth's Shortform · 2024-10-28T05:24:19.573Z · LW · GW

Hm. This rings true... but also I think that selecting [vibes, in this sense] for attention also selects against [things that the other person is really committed to]. So in practice you're just giving up on finding shared commitments. I've been updating that stuff other than shared commitments is less good (healthy, useful, promising, etc.) than it seems.

Comment by TsviBT on johnswentworth's Shortform · 2024-10-28T05:21:09.830Z · LW · GW

Ok but how do you deal with the tragedy of the high dimensionality of context-space? People worth thinking with have wildly divergent goals--and even if you share goals, you won't share background information.

Comment by TsviBT on Overview of strong human intelligence amplification methods · 2024-10-27T04:16:47.059Z · LW · GW

Are you claiming that this would help significantly with conceptual thinking? E.g., doing original higher math research, or solving difficult philosophical problems? If so, how would it help significantly? (Keep in mind that you should be able to explain how it brings something that you can't already basically get. So, something that just regular old Gippity use doesn't get you.)

Comment by TsviBT on johnswentworth's Shortform · 2024-10-24T02:09:41.458Z · LW · GW

I didn't read this carefully--but it's largely irrelevant. Adult editing probably can't have very large effects because developmental windows have passed; but either way the core difficulty is in editor delivery. Germline engineering does not require better gene targets--the ones we already have are enough to go as far as we want. The core difficulty there is taking a stem cell and making it epigenomically competent to make a baby (i.e. make it like a natural gamete or zygote).

Comment by TsviBT on Why I quit effective altruism, and why Timothy Telleen-Lawton is staying (for now) · 2024-10-24T02:01:56.554Z · LW · GW

Ben's responses largely cover what I would have wanted to say. But on a meta note: I wrote specifically

I think a hypothesis that does have to be kept in mind is that some people don't care.

I do also think the hypothesis is true (and it's reasonable for this thread to discuss that claim, of course).

But the reason I said it that way, is that it's a relatively hard hypothesis to evaluate. You'd probably have to have several long conversations with several different people, in which you successfully listen intensely to who they are / what they're thinking / how they're processing what you say. Probably only then could you even have a chance at reasonably concluding something like "they actually don't care about X", as distinct from "they know something that implies X isn't so important here" or "they just don't get that I'm talking about X" or "they do care about X but I wasn't hearing how" or "they're defensive in this moment, but will update later" or "they just hadn't heard why X is important (but would be open to learning that)", etc.

I agree that it's a potentially mindkilly hypothesis. And because it's hard to evaluate, the implicature of assertions about it is awkward--I wanted to acknowledge that it would be difficult to find a consensus belief state, and I wanted to avoid implying that the assertion is something we ought to be able to come to consensus about right now. And, more simply, it would take substantial work to explain the evidence for the hypothesis being true (in large part because I'd have to sort out my thoughts). For these reasons, my implied request is less like "let's evaluate this hypothesis right now", and more like "would you please file this hypothesis away in your head, and then if you're in a long conversation, on the relevant topic with someone in the relevant category, maybe try holding up the hypothesis next to your observations and seeing if it explains things or not".

In other words, it's a request for more data and a request for someone to think through the hypothesis more. It's far from perfectly neutral--if someone follows that request, they are spending their own computational resources and thereby extending some credit to me and/or to the hypothesis.

Comment by TsviBT on Why I quit effective altruism, and why Timothy Telleen-Lawton is staying (for now) · 2024-10-23T02:40:13.816Z · LW · GW

don't see the downstream impacts of their choices,

This could be part of it... but I think a hypothesis that does have to be kept in mind is that some people don't care. They aren't trying to follow action-policies that lead to good outcomes, they're doing something else. Primarily, acting on an addiction to Steam. If a recruitment strategy works, that's a justification in and of itself, full stop. EA is good because it has power, more people in EA means more power to EA, therefore more people in EA is good. Given a choice between recruiting 2 agents and turning them both into zombies, vs recruiting 1 agent and keeping them an agent, you of course choose the first one--2 is more than 1.

Comment by TsviBT on The Hidden Complexity of Wishes · 2024-10-22T15:58:50.692Z · LW · GW

The main difficulty, if there is one, is in "getting the function to play the role of the AGI values," not in getting the AGI to compute the particular function we want in the first place.

Right, that is the problem (and IDK of anyone discussing this who says otherwise).

Another position would be that it's probably easy to influence a few bits of the AI's utility function, but not others. For example, it's conceivable that, by doing capabilities research in different ways, you could increase the probability that the AGI is highly ambitious--e.g. tries to take over the whole lightcone, tries to acausally bargain, etc., rather than being more satisficy. (IDK how to do that, but plausibly it's qualitatively easier than alignment.) Then you could claim that it's half a bit more likely that you've made an FAI, given that an FAI would probably be ambitious. In this case, it does matter that the utility function is complex.

Comment by TsviBT on The Hidden Complexity of Wishes · 2024-10-21T07:49:03.102Z · LW · GW

Here's an argument that alignment is difficult which uses complexity of value as a subpoint:

  • A1. If you try to manually specify what you want, you fail.

  • A2. Therefore, you want something algorithmically complex.

  • B1. When humanity makes an AGI, the AGI will have gotten values via some process; that process induces some probability distribution over what values the AGI ends up with.

  • B2. We want to affect the values-distribution, somehow, so that it ends up with our values.

  • B3. We don't understand how to affect the values-distribution toward something specific.

  • B4. If we don't affect the values-distribution toward something specific, then the values-distribution probably puts large penalties on absolute algorithmic complexity; any specific utility function with higher absolute algorithmic complexity will be less likely to be the one that the AGI ends up with. (A toy quantification follows this argument.)

  • C1. Because of A2 (our values are algorithmically complex) and B4 (a complex utility function is unlikely to show up in an AGI without us skillfully intervening), an AGI is unlikely to have our values without us skillfully intervening.

  • C2. Because of B3 (we don't know how to skillfully intervene on an AGI's values) and C1, an AGI is unlikely to have our values.
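
(A toy way to quantify B4, assuming the values-distribution behaves like a simplicity prior; the argument doesn't depend on this exact form:)

  \Pr[\text{AGI ends up with utility function } U] \sim 2^{-K(U)}

where K(U) is the algorithmic complexity of U. If, per A2, anything that counts as our values has K(U) of (say) hundreds of bits, then absent skillful intervention the default probability of the AGI landing on our values is on the order of 2^{-\text{hundreds}}, which is what C1 is saying.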

I think that you think that the argument under discussion is something like:

  • (same) A1. If you try to manually specify what you want, you fail.

  • (same) A2. Therefore, you want something algorithmically complex.

  • (same) B1. When humanity makes an AGI, the AGI will have gotten values via some process; that process induces some probability distribution over what values the AGI ends up with.

  • (same) B2. We want to affect the values-distribution, somehow, so that it ends up with our values.

  • B'3. The greater the complexity of our values, the harder it is to point at our values.

  • B'4. The harder it is to point at our values, the more work or difficulty is involved in B2.

  • C'1. By B'3 and B'4: the greater the complexity of our values, the more work or difficulty is involved in B2 (determining the AGI's values).

  • C'2. Because of A2 (our values are algorithmically complex) and C'1, it would take a lot of work to make an AGI pursue our values.

These are different arguments, which make use of the complexity of values in different ways. You dispute B'3 on the grounds that it can be easy to point at complex values. B'3 isn't used in the first argument though.

Comment by TsviBT on How to have Polygenically Screened Children · 2024-10-18T20:51:11.375Z · LW · GW

I am quite interested in how (dangers from) cell division are different in the embryonic stage as compared to at a later stage.

I don't know much about this, but two things (that don't directly answer your question):

  • Generally, cells accumulate damage over time.
    • This happens both genetically and epigenetically. Genetically, damage accumulates (I think the main cause is cosmic rays hitting DNA that's exposed for transcription and knocking nucleotides out? Maybe also other copying errors?), so that adult somatic cells have (I think) several hundred new mutations that they weren't born with. Epigenetically, I imagine that various markers that should be there get lost over time for some reason (I think this is a major hypothesis about the sort of mechanism behind various forms of aging).
    • This means that generally, ESCs are more healthy than adult somatic cells.
  • One major function of the reproductive system is to remove various forms of damage.
    • You can look up gametogenesis (oogenesis, spermatogenesis). Both processes are complicated, in that they involve many distinct steps, various checks of integrity (I think oocytes + their follicles are especially stringently checked?), and a lot of attrition (a fetus has several million oocytes; an adult woman ovulates at most a few hundred oocytes in her lifetime, without exogenous hormones as in IVF).
    • So, ESCs (from an actual embryo, rather than from some longer-term culture) will be heavily selected for genetic (and epigenetic?) integrity. Mutations that would have been severely damaging to development will have been weeded out. (Though there will also be many miscarriages.)
Comment by TsviBT on How to have Polygenically Screened Children · 2024-10-18T20:05:35.867Z · LW · GW

That's a reasonable point... But I don't think we can just count number of divisions either? For one thing, there are several populations of stem cells in an adult. For another, people who are 50% bigger than other people don't live 2/3 as long (right? though maybe that's not the prediction?). I think maybe embryonic stem cells protect their telomeres--not sure.

Comment by TsviBT on How to have Polygenically Screened Children · 2024-10-18T16:32:24.488Z · LW · GW

Wouldn't it age them by at most 1 day (which is about how long mitosis takes)?

Comment by TsviBT on Overview of strong human intelligence amplification methods · 2024-10-18T10:16:04.250Z · LW · GW

I'm not sure how well curated and indexed most information is.

Working memory allows for looking at the whole picture at once better with the full might of human intelligence (which is better at many things than LLMs), while removing frictions that come from delays and effort expended in search for data and making calculations.

How specifically would you use BCIs to improve this situation?

Comment by TsviBT on Overview of strong human intelligence amplification methods · 2024-10-17T21:29:20.838Z · LW · GW

Curated reservoirs of practical and theoretical information, well indexed, would be very useful to super geniuses.

You don't actually need to hook them up physically. Having multiple people working on different parts of a problem lets them all bounce ideas off each other.

But both of these things are basically available currently, so apparently our current level isn't enough. LLMs + google (i.e. what Perplexity is trying to be) are already a pretty good index; what would a BCI add?

Overall: The goal should be to create a number of these people, then let them plan out the next round if their intelligence doesn't do it.

I commented on a similar topic here: https://www.lesswrong.com/posts/jTiSWHKAtnyA723LE/overview-of-strong-human-intelligence-amplification-methods?commentId=uZg9s2FfP7E7TMTcD

Comment by TsviBT on The Hidden Complexity of Wishes · 2024-10-17T18:55:14.084Z · LW · GW

That's incorrect, but more importantly it's off topic. The topic is "what does the complexity of value have to do with the difficulty of alignment". Barnett AFAIK in this comment is not saying (though he might agree, and maybe he should be taken as saying so implicitly or something) "we have lots of ideas for getting an AI to care about some given values". Rather he's saying "if you have a simple pointer to our values, then the complexity of values no longer implies anything about the difficulty of alignment because values effectively aren't complex anymore".

Comment by TsviBT on The Hidden Complexity of Wishes · 2024-10-17T12:33:05.891Z · LW · GW

Alice: I want to make a bovine stem cell that can be cultured at scale in vats to make meat-like tissue. I could use directed evolution. But in my alternate universe, genome sequencing costs $1 billion per genome, so I can't straightforwardly select cells to amplify based on whether their genome looks culturable. Currently the only method I have is to do end-to-end testing: I take a cell line, I try to culture a great big batch, and then see if the result is good quality edible tissue, and see if the cell line can last for a year without mutating beyond repair. This is very expensive, but more importantly, it doesn't work. I can select for cells that make somewhat more meat-like tissue; but when I do that, I also heavily select for other very bad traits, such as forming cancer-like growths. I estimate that it takes on the order of 500 alleles optimized relative to the wild type to get a cell that can be used for high-quality, culturable-at-scale edible tissue. Because that's a large complex change, it won't just happen by accident; something about our process for making the cells has to put those bits there.

Bob: In a recent paper, a polygenic score for culturable meat is given. Since we now have the relevant polygenic score, we actually have a short handle for the target: namely, a pointer to an implementation of this polygenic score as a computer program.

Alice: That seems of limited relevance. It's definitely relevant in that, if I grant the premise that this is actually the right polygenic score (which I don't), we now know what exactly we would put in the genome if we could. That's one part of the problem solved, but it's not the part I was talking about. I'm talking about the part where I don't know how to steer the genome precisely enough to get anywhere complex.

Bob: You've been bringing up the complexity of the genomic target. I'm saying that actually the target isn't that complex, because it's just a function call to the PGS.

Alice: Ok, yes, we've greatly decreased the relative algorithmic complexity of the right genome, in some sense. It is indeed the case that if I ran a computer program randomly sampled from strings I could type into a python file, it would be far more likely to output the right genome if I have the PGS file on my computer compared to if I don't. True. But that's not very relevant because that's not the process we're discussing. We're discussing the process that creates a cell with its genome, not the process that randomly samples computer programs weighted by [algorithmic complexity in the python language on my computer]. The problem is that I don't know how to interface with the cell-creation process in a way that lets me push bits of selection into it. Instead, the cell-creation process just mostly does its own thing. Even if I do end-to-end phenotype selection, I'm not really steering the core process of cell-genome-selection.

Bob: I understand, but you were saying that the complexity of the target makes the whole task harder. Now that we have the PGS, the target is not very complex; we just point at the PGS.

Alice: The point about the complexity is to say that cells growing in my lab won't just spontaneously start having the 500 alleles I want. I'd have to do something to them--I'd have to know how to pump selection power into them. It's some specific technique I need to have but don't have, for dealing with cells. It doesn't matter that the random-program complexity has decreased, because we're not talking about random programs, we're talking about cell-genome-selection. Cell-genome-selection is the process that I don't know how to consistently pump bits into, and it's the process that doesn't by chance get the 500 alleles. It's the process against which I'm measuring complexity.
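
(A toy calculation in the spirit of Alice's "pump bits of selection" framing, with made-up round numbers: roughly one bit of selection needed per target allele, and idealized noise-free end-to-end selection:)

  \text{bits needed} \approx 500 \text{ alleles} \times 1 \text{ bit} = 500 \text{ bits}

  \text{bits per round of keeping the best of } N = 1024 \text{ lines} \leq \log_2 N = 10 \text{ bits}

  \text{rounds needed} \gtrsim 500 / 10 = 50

And in practice those ~10 bits per round go toward whatever the end-to-end assay actually correlates with (e.g. cancer-like growth), not specifically toward the 500 target alleles--which is Alice's point about not knowing how to interface with the cell-creation process.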

Comment by TsviBT on TsviBT's Shortform · 2024-10-16T23:30:11.922Z · LW · GW

Protip: You can prevent itchy skin from being itchy for hours by running it under very hot water for 5-30 seconds. (Don't burn yourself; I use tap water with some cold water, and turn down the cold water until it seems really hot.)

Comment by TsviBT on Overview of strong human intelligence amplification methods · 2024-10-16T23:20:35.516Z · LW · GW

I don't know enough to evaluate your claims, but more importantly, I can't even just take your word for everything because I don't actually know what you're saying without asking a whole bunch of followup questions. So hopefully we can hash some of this out on the phone.

Comment by TsviBT on Overview of strong human intelligence amplification methods · 2024-10-16T22:52:48.332Z · LW · GW

exogenous driving of a fraction of cortical tissue to result in suffering of the subjects

My reason is that suffering in general seems related to [intentions pushing hard, but with no traction or hope]. A subspecies of that is [multiple drives pushing hard against each other, with nobody pulling the rope sideways]. A new subspecies would be "I'm trying to get my brain tissue to do something, but it's being externally driven, so I'm just scrabbling my hands futilely against a sheer blank cliff wall." and "Bits of my mind are being shredded because I create them successfully by living and demanding stuff of my brain, but the bits are exogenously driven / retrained and forget to do what I made them to do.".

Comment by TsviBT on Overview of strong human intelligence amplification methods · 2024-10-16T22:34:25.144Z · LW · GW

It is quite impractical. A weird last ditch effort to save the world. It wouldn't be scalable, you'd be enhancing just a handful of volunteers who would then hopefully make rapid progress on alignment.

Gotcha. Yeah, I think these strategies probably just don't work.

It seems less problematic to me than a single ordinary pig farm, since you'd be treating these pigs unusually well.

The moral differences are:

  • Humanized neurons.
  • Animals with parts of their brains being exogenously driven; this could cause large amounts of suffering.
  • Animals with humanized thinking patterns (which is part of how the scheme would be helpful in the first place).

Weird that you'd feel good about letting the world get destroyed in order to have one fewer pig farm in it.

Where did you get the impression that I'd feel good about, or choose, that? My list of considerations is a list of considerations.

That said, I think morality matters, and ignoring morality is a big red flag.

Separately, even if you're pretending to be a ruthless consequentialist, you still want to track morality and ethics and ickiness, because it's a very strong determiner of whether or not other people will want to work on something, which is a very strong determiner of success or failure.

Comment by TsviBT on leogao's Shortform · 2024-10-16T22:19:51.119Z · LW · GW

the possibility that a necessary ingredient in solving really hard problems is spending a bunch of time simply not doing any explicit reasoning

I have a pet theory that there are literally physiological events that take minutes, hours, or maybe even days or longer, to happen, which are basically required for some kinds of insight. This would look something like:

  • First you do a bunch of explicit work trying to solve the problem. This makes a bunch of progress, and also starts to trace out the boundaries of where you're confused / missing info / missing ideas.

  • You bash your head against that boundary even more.

    • You make much less explicit progress.
    • But, you also leave some sort of "physiological questions". I don't know the neuroscience at all, but to make up a story to illustrate what sort of thing I mean: One piece of your brain says "do I know how to do X?". Some other pieces say "maybe I can help". The seeker talks to the volunteers, and picks the best one or two. The seeker says "nah, that's not really what I'm looking for, you didn't address Y". And this plays out as some pattern of electrical signals which mean "this and this and this neuron shouldn't have been firing so much" (like a backprop gradient, kinda), or something, and that sets up some cell signaling state, which will take a few hours to resolve (e.g. downregulating some protein production, which will eventually make the neuron a bit less excitable by changing the number of ion pumps, or decreasing the number of synaptic vesicles, or something).
  • Then you chill, and the physiological questions mostly don't do anything, but some of them answer themselves in the background; neurons in some small circuit can locally train themselves to satisfy the question left there exogenously.

See also "Planting questions".

Comment by TsviBT on Overview of strong human intelligence amplification methods · 2024-10-16T22:05:28.575Z · LW · GW

Pigs

That's creative. But

  • It seems immoral, maybe depending on details. Depending on how humanized the neurons are, and what you do with the pigs (especially the part where human thinking could get trained into them!), you might be creating moral patients and then maiming and torturing them.
  • It has a very high ick factor. I mean, I'm icked out by it; you're creating monstrosities.
  • I assume it has a high taboo factor.
  • It doesn't seem that practical. I don't immediately see an on-ramp for the innovation; in other words, I don't see intermediate results that would be interesting or useful, e.g. in an academic or commercial context. That's in contrast to germline engineering or brain-brain interfaces, which have lots of component technologies and partial successes that would be useful and interesting. Do you see such things here?
  • Further, it seems far far less scalable than other methods. That means you get way less adoption, which means you get way fewer geniuses. Also, importantly, it means that complaints about inequality become true. With, say, germline engineering, anyone who can lease-to-own a car can also have genetically super healthy, sane, smart kids. With networked-modified-pig-brain-implant-farm-monster, it's a very niche thing only accessible to the rich and/or well-connected. Or is there a way this eventually results in a scalable strong intelligence boost?

You probably get something like 10000x as much brain tissue per dollar using pigs than neurons in a petri dish.

That's compelling though, for sure.

On the other hand, the quality is going to be much lower compared to human brains. (Though presumably higher quality compared to in vitro brain tissue.) My guess is that quality is way more important in our context. I wouldn't think so as strongly if connection bandwidth were free; in that case, plausibly you can get good work out of the additional tissue. Like, on one end of the spectrum of "what might work", with low-quality high-bandwidth, you're doing something like giving each of your brain's microcolumns an army of 100 additional, shitty microcolumns for exploration / invention / acceleration / robustness / fan-out / whatever. On the other end, you have high-quality low-bandwidth: multiple humans connected together, and it's maybe fine that bandwidth is low because both humans are capable of good thinking on their own. But low-quality low-bandwidth seems like it might not help much--it might be similar to trying to build a computer by training pigs to run around in certain patterns.

How important is it to humanize the neurons, if the connections to humans will be remote by implant anyway? Why use pigs rather than cows? (I know people say pigs are smarter, but that's also more of a moral cost; and I'm wondering if that actually matters in this context. Plausibly the question is really just, can you get useful work out of an animal's brain, at all; and if so, a normal cow is already "enough".)

Comment by TsviBT on Overview of strong human intelligence amplification methods · 2024-10-16T21:41:04.374Z · LW · GW

My studies of the compute graph of the brain

This is interesting, but I don't understand what you're trying to say and I'm skeptical of the conclusion. How does this square with half the brain being myelinated axons? Are you talking about adult brains or child brains? If you're up for it, maybe let's have a call at some point.

Comment by TsviBT on Overview of strong human intelligence amplification methods · 2024-10-16T21:36:55.609Z · LW · GW

Growing brain tissue in a vat is relatively hard and expensive compared to growing brain tissue in an animal. Also, it's going to be less well-ordered neural nets, which matters a lot. Well organized cortical microcolumns work well, disordered brain tissue works much less well.

Yep, I agree. I vaguely alluded to this by saying "The main additional obstacle [...] is growing cognitively useful tissue in vitro."; what I have in mind is stuff like:

  • Well-organized connectivity, as you say.
  • Actually emulating 5-minute and 5-day behavior of neurons--which I would guess relies on being pretty neuron-like, including at the epigenetic level. IIUC current in vitro neural organoids are kind of shitty--epigenetically speaking they're definitely more like neurons than like hepatocytes, but they're not very close to being neurons.
  • Appropriate distribution of cell types (related to well-organized connectivity). This adds a whole additional wrinkle. Not only do you have to produce a variety of epigenetic states, but also you have to have them be assorted correctly (different regions, layers, connections, densities...). E.g. the right amount of glial cells...
Comment by TsviBT on Overview of strong human intelligence amplification methods · 2024-10-16T21:26:50.390Z · LW · GW

The experiments that have been tried in humans have been extremely conservative, aiming to fix problems in the most well-understood but least-relevant-to-intelligence areas of the brain (sensory input, motor output). [....] This is not evidence that the tech itself is actually this limited.

Your characterization of the current state of research matches my impressions (though it's good to hear from someone who knows more). My reasons for thinking BCIs are weaksauce have never been about that, though. The reasons are that:

  • I don't see any compelling case for anything you can do on a computer which, when you hook it up to a human brain, makes the human brain very substantially better at solving philosophical problems. I can think of lots of cool things you can do with a good BCI, and I'm sure you and others can think of lots of other cool things, but that's not answering the question. Do you see a compelling case? What is it? (To be more precise, I do see compelling cases for the few areas I mentioned: prosthetic intrabrain connectivity and networking humans. But those both seem quite difficult technically, and would plausibly be capped in their success by connection bandwidth, which is technically difficult to increase.)
  • It doesn't seem like we understand nearly as much about intelligence compared to evolution (in a weak sense of "understand", that includes stuff encoded in the human genome cloud). So stuff that we'll program in a computer will be qualitatively much less helpful for real human thinking, compared to just copying evolution's work. (If you can't see that LLMs don't think, I don't expect to make progress debating that here.)
Comment by TsviBT on Overview of strong human intelligence amplification methods · 2024-10-16T21:15:35.482Z · LW · GW

You could induce an increase in skull size by putting a special helmet on the infant that kept a slight negative air pressure outside the skull.

BTW, do not maim children in the name of X-risk reduction (or in any other name).