Comments

Comment by espoire on Core Pathways of Aging · 2024-08-09T16:12:33.452Z · LW · GW

If there's any reason to suspect that grant-givers are uninformed on the topic, or biased against it, crowd-sourcing a sum of that size sounds possible.

Comment by espoire on AI Alignment Metastrategy · 2024-08-01T05:11:03.113Z · LW · GW

Agreed.

...which would imply that dangers should be minimal from either slow augmentation, which has time to become ubiquitous in the gene pool, or limited augmentation that does not exceed a few standard deviations from the current mean. Assuming, of course, that our efforts don't cause unwanted value shifts.

I think none of the currently progressing human enhancement projects I'm aware of expect gains large enough to be dangerous, so they all seem worthy of support.

Comment by espoire on The Redaction Machine · 2024-07-25T07:19:33.140Z · LW · GW

I'd imagine that could be arranged. I live with an unusually fast rate of forgetting. With effort, I suspect my condition could be reverse-engineered and replicated.

After a couple years, I can re-experience something not knowing where the whole plot goes, but always knowing where the current scene will go. In games, I experience that with the plot, but still have my muscle memory. Great time for re-plays on hard mode; by this time I have almost all the skills, and almost none of the plot spoilers.

After about 5 years, I'll remember roughly how something made me feel overall, but little else. This is often about the time when I seek out a re-exposure for the things I remember as being unusually high quality.

It's unclear how long it takes exactly -- matters are often unclear for me where they rely on my memory as a key input -- but after some amount of time, my level of recall fades to "vaguely familiar" and then I completely forget that I've seen a thing at all. I'd estimate on the order of a decade or so.

I have the exact superpower people so often jokingly wish for around great media. As you can imagine from the nature of memory loss, I completely fail to appreciate my situation.

Comment by espoire on Generalizing From One Example · 2024-07-20T07:51:34.486Z · LW · GW

Wow, thanks for sharing! I had been taking my ability to imagine sounds completely for granted and now I find myself appreciating this ability.

Comment by espoire on Working hurts less than procrastinating, we fear the twinge of starting · 2024-07-02T18:39:06.516Z · LW · GW

It's semi-fictional evidence, but the core rules of the Neverworld tabletop roleplaying game include a rule to the effect that success restores willpower points. This suggests that at least one other person, long ago, made the same observation.

Comment by espoire on Working hurts less than procrastinating, we fear the twinge of starting · 2024-07-02T18:32:52.402Z · LW · GW

I do have that problem with swimming. I share the tendency Eliezer points out, but I think we are both atypical in this shared way, rather than Eliezer being on to a new explanation for a ubiquitous mental phenomenon.

Comment by espoire on Try to solve the hard parts of the alignment problem · 2024-05-28T07:08:41.992Z · LW · GW

The "sharp left turn" refers to a breakdown in alignment caused by capabilities gain.

An example: the sex drive was a pretty excellent adaptation for promoting inclusive genetic fitness, but when human capabilities expanded far enough, we invented condoms. "Inventing condoms" is not the sort of behavior that an agent properly aligned with the "maximize inclusive genetic fitness" goal ought to execute.

At lower levels of capability, proxy goals may suffice to produce aligned behavior. The hypothesis is that most or all proxy goals will suddenly break down beyond some level of capability, as soon as the agent is powerful enough to find strategies that come close enough to maximizing the proxy.
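
A toy illustration of that breakdown (just a sketch I made up, not anything from the original post): the proxy below tracks the true goal almost everywhere, and "capability" is crudely modeled as how many candidate strategies the agent can search. A weak search settles on strategies where the proxy and the true goal agree; a strong search reliably finds the narrow exploit where they come apart.

```python
import random

random.seed(0)

def true_goal(x):
    # What the designer actually wants: higher x is genuinely better.
    return x

def proxy_goal(x):
    # Tracks the true goal almost everywhere, except for a narrow exploit
    # where the measurement can be gamed for a huge proxy score.
    if abs(x - 0.123) < 1e-4:
        return 10.0
    return x

def best_strategy(search_budget):
    # "Capability" modeled crudely as how many candidate strategies the
    # agent can evaluate before committing to the best-looking one.
    candidates = [random.random() for _ in range(search_budget)]
    return max(candidates, key=proxy_goal)

for budget in (10, 100, 100_000):
    x = best_strategy(budget)
    print(f"search budget {budget:>7}: proxy={proxy_goal(x):5.2f}, true={true_goal(x):.3f}")
```

The narrowness of the exploit stands in for strategies that nearly maximize the proxy but are hard to find until the agent is capable enough.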

This can cause many AI plans to fail, because most plans (all known so far?) fail to ensure the agent is actually pursuing the implementor's true goal, and not just a proxy goal.

Comment by espoire on The Cartoon Guide to Löb's Theorem · 2024-03-07T15:10:36.557Z · LW · GW

I think the missing piece is that it's really hard to formally specify a scale of physical change.

I think the notion of "minimizing change" secretly invokes multiple human brain abilities, each of which I suspect will turn out to be very difficult to formalize. Given partial knowledge of a current situation S, these are: (1) predicting the future states of the world if we take some hypothetical action, (2) inventing a concrete default / null action appropriate to S, and (3) informally feeling which of two hypothetical worlds is more or less "changed" with respect to the predicted null-action world.

I think (1), (2), and (3) feel so introspectively unobtrusive because we have no introspective access into them; they're cognitive black boxes. We just see that their outputs are nearly always available when we need them, and fail to notice the existence of the black boxes entirely.

You'll also require an additional ability, a stronger form of (3) which I'm not sure even humans implement: (4) given two hypothetical worlds H1 and H2, and the predicted null-action world W0, compute the ratio difference(H1, W0) / difference(H2, W0) without dangerous corner cases.

If you can formally specify (1), (2), and (4), then yes! I think you could then use them to construct a utility function that won't obsess (won't "tile the universe") using the plan you described -- though I recommend investing more effort than my 30-minute musings to prove safety, if you seem poised to actually implement this plan.
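
To make the shape concrete, here's a minimal sketch of how those abilities might compose into a change-penalized utility. Everything here -- `predict`, `default_action`, `difference`, `task_utility`, `penalty_weight` -- is a hypothetical stand-in for the abilities discussed above, not the actual plan you described:

```python
def change_penalized_utility(situation, action,
                             predict,         # ability (1): (situation, action) -> predicted world
                             default_action,  # ability (2): situation -> concrete null action
                             difference,      # ability (3)/(4): (world, world) -> scalar "how changed"
                             task_utility,    # the goal we actually want pursued
                             penalty_weight=1.0):
    # Predict the world if we do nothing in particular, and the world if we
    # take the proposed action; then penalize the action by how much it moves
    # the world away from the null-action baseline.
    null_world = predict(situation, default_action(situation))
    hypothetical_world = predict(situation, action)
    impact = difference(hypothetical_world, null_world)
    return task_utility(hypothetical_world) - penalty_weight * impact
```

The composition itself is trivial; all the difficulty lives inside the stand-in functions.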

Some issues I foresee:

  • Humans are imperfect at (1) and (2), and the outputs of (1) and (2) are critical not just to ensuring non-obsession, but also to the overall quality of the AI's intelligence. While formalizing the human (1) and (2) algorithms may enable human-level general AI (a big win in its own right), superhuman AI will require non-human formalizations of (1) and (2). Inventing non-human formalizations here feels difficult and risky -- though perhaps unavoidable.

  • The hypothetical world states in (4) are extremely high-dimensional objects, so corner cases in (4) seem non-trivial to rule out. A formalization of the human (3)-implementation might be sufficient for some viable alternative plan, in which case the difficulty of formalizing (3) is bounded above by the difficulty of reverse-engineering the human (3) neurology. By contrast, inventing an inhuman (4) could be much more difficult and risky. This may be weak evidence that plans merely requiring (3) ought to be preferred over plans requiring (4).

Comment by espoire on Changing Emotions · 2024-02-07T06:39:13.335Z · LW · GW

I'm transgender myself, currently a few years into transition, and I actually experienced some of the issues you predicted above.

I did need to relearn basic locomotion as my body shape changed over months. I started hormone replacement in early winter, and when I resumed distance running in the late spring, I was surprised to discover that I needed to relearn how to run. My gait was different enough that running took actual focus just to avoid falling down.

I also experienced a pretty bizarre period of about a year where my body had changed substantially, but my sensory map of my body hadn't. That issue eventually corrected itself, and as it did, I became unable to remember what it felt like to have my original configuration. A bunch of old memories lost that detail, though the remainder of those memories remain intact.

...and that's just from bodily changes.

I strongly agree with your thesis. Altering the mind is hard. Faced with a mismatch between my body and my mind, changing the mind to match the body or changing the body to match the mind would have been equally good solutions. Changing the body was so much easier, which is why I chose that path.