Lorxus's Shortform 2024-05-18T17:57:19.721Z
(Geometrically) Maximal Lottery-Lotteries Are Probably Not Unique 2024-05-10T16:00:08.217Z
(Geometrically) Maximal Lottery-Lotteries Exist 2024-05-03T19:29:01.775Z
My submission to the ALTER Prize 2023-09-30T16:07:35.190Z
Untangling Infrabayesianism: A redistillation [PDF link; ~12k words + lots of math] 2023-08-01T12:42:35.744Z


Comment by Lorxus on D&D.Sci Alchemy: Archmage Anachronos and the Supply Chain Issues Evaluation & Ruleset · 2024-06-19T20:13:40.555Z · LW · GW

Perhaps, but don't make a virtue of not using the more powerful tools, the objective is to find the truth, not to find it with handicaps...

I'm obviously seeking out more powerful tools, too - I just haven't got them yet. I don't think it's intrinsically good to stick to less powerful tools, but I do think that it's intrinsically good to be able to fall back to those tools if you can still win.

And when I need to go out and find truth for real, I don't deny myself tools, and I rarely go it alone. But this is not that. 

Comment by Lorxus on D&D.Sci Alchemy: Archmage Anachronos and the Supply Chain Issues Evaluation & Ruleset · 2024-06-19T03:48:03.129Z · LW · GW

...Lorxus managed to get a perfect score with relatively little in the way of complicated methods/tools...

I have struck through part of the previous comment, given the edit. I need no longer stand by it as a complaint.

Comment by Lorxus on D&D.Sci Alchemy: Archmage Anachronos and the Supply Chain Issues Evaluation & Ruleset · 2024-06-19T03:45:13.721Z · LW · GW

If the Big Bad is disguised as your innkeeper while the real innkeeper is tied up in the cellar, I think I can say 'The innkeeper tells you it'll be six silver for a room', I don't think I need to say 'The man who introduced himself to you as the innkeeper.'

Perhaps, but you could also simply say "Yeah, the guy at the counter tells you the room will be 6 silver."

Comment by Lorxus on D&D.Sci Alchemy: Archmage Anachronos and the Supply Chain Issues Evaluation & Ruleset · 2024-06-18T03:43:58.785Z · LW · GW

...bearing this out, it looks like Lorxus managed to get a perfect score with relatively little actual Data Science just by thinking about what it might mean that including lots of ingredients led to Magical Explosions and including few ingredients led to Inert Glop.

Not quite true! That's where I started to break through, but after that I noticed the Mutagenic Ooze issue as well. It also took me a lot of very careful graceful use of pivot tables. Gods beyond, that table chugged. (And if I can pull the same Truth from the void with less powerful tools, should that not mark me as more powerous in the Art? :P)

I guess I'm not clear on what "actual Data Science" would involve, if not making hypotheses and then conducting observational-experiments? I figured out the MO mechanic specifically by looking at brews that coded for pairs of potions, for the major example. The only thing that would have changed if I'd known SQL would be speed, I suspect.

...and documented his thought process very well, thank you Lorxus!

Always a pleasure! I had a lot of fun with this one. I was a little annoyed by the undeclared bonus objective - I would have wanted any indication at all in the problem statement that anything was not as it appeared. I did notice the correspondence in (i.a.) the Farsight Potion but in the absence of any reason to suspect that the names were anything but fluff, I abstracted away anything past the ingredients being a set of names. Maybe be minimally more obvious? At any rate I'd be happy to be just as detailed in future, if that's something you want. 

Comment by Lorxus on TsviBT's Shortform · 2024-06-17T01:47:34.486Z · LW · GW

I'd go stronger than just "not for certain, not forever", and I'd worry you're not hearing my meaning (agree or not).

That's entirely possible. I've thought about this deeply for entire tens of minutes, after all. I think I might just be erring (habitually) on the side of caution in qualities of state-changes I describe expecting to see from systems I don't fully understand. OTOH... I have a hard time believing that even (especially?) an extremely capable mind would find it worthwhile to repeatedly rebuild itself from the ground up, such that few of even the ?biggest?/most salient features of a mind stick around for long at all.

Comment by Lorxus on TsviBT's Shortform · 2024-06-17T01:28:49.667Z · LW · GW

You might complain that the reason it doesn't solve stability is just that the thing doesn't have goal-pursuits.

Not so - I'd just call it the trivial case and implore us to do better literally at all!

Apart from that, thanks - I have a better sense of what you meant there. "Deep change" as in "no, actually, whatever you pointed to as the architecture of what's Really Going On... can't be that, not for certain, not forever."

Comment by Lorxus on TsviBT's Shortform · 2024-06-17T00:47:16.586Z · LW · GW

Say more about point 2 there? Thinking about 5 and 6 though - I think I now maybe have a hopeworthy intuition worth sharing later.

Comment by Lorxus on My AI Model Delta Compared To Christiano · 2024-06-16T00:40:12.478Z · LW · GW

At a meta level, I find it pretty funny that so many smart people seem to disagree on the question of whether questions usually have easily verifiable answers.

And at a twice-meta level, that's strong evidence for questions not generically having verifiable answers (though not for them generically not having those answers).

Comment by Lorxus on The Leopold Model: Analysis and Reactions · 2024-06-14T22:38:52.573Z · LW · GW

A reckless China-US race is far less inevitable than Leopold portrayed in his situational awareness report. We’re not yet in a second Cold War, and as things get crazier and leaders get more stressed, a “we’re all riding the same tiger” mentality becomes plausible.

I don't really get why people keep saying this. They do realize that the US's foreign policy starting in ~2010 has been to treat China as an adversary, right? To the extent that they arguably created the enemy they feared within just a couple of years? And that China is not in fact going to back down because it'd be really, really nice of them if they did, or because they're currently on the back foot with respect to AI?

At some point, "what if China decides that the west's chip advantage is unacceptable and glasses Taiwan and/or Korea about it" becomes a possible future outcome worth tracking. It's not a nice or particularly long one, but "flip the table" is always on the table.

Leopold’s is just one potential unfolding, but a strikingly plausible one. Reading it feels like getting early access to Szilard’s letter in 1939.

What, and that triggered no internal valence-washing alarms in you?


Getting a 4.18 means that a majority of your grades were A+, and that is if every grade was no worse than an A. I got plenty of As, but I got maybe one A+. They do not happen by accident.

One knows how the game is played, and is curious whether he took Calc I at Columbia (say). Obviously not sufficient, but there's kinds and kinds of 4.18 GPAs.

Comment by Lorxus on Spatial attention as a “tell” for empathetic simulation? · 2024-06-12T12:41:02.420Z · LW · GW

If we momentarily pay attention to something about our own feelings, consciousness, and state of mind, then (I claim) our spatial attention is at that moment centered somewhere in our own bodies—more specifically, in modern western culture, it’s very often the head, but different cultures vary. Actually, that’s a sufficiently interesting topic that I’ll go on a tangent: here’s an excerpt from the book Impro by Keith Johnstone:

The placing of the personality in a particular part of the body is cultural. Most Europeans place themselves in the head, because they have been taught that they are the brain. In reality of course the brain can’t feel the concave of the skull, and if we believed with Lucretius that the brain was an organ for cooling the blood, we would place ourselves somewhere else. The Greeks and Romans were in the chest, the Japanese a hand’s breadth below the navel, Witla Indians in the whole body, and even outside it. We only imagine ourselves as ‘somewhere’.

Meditation teachers in the East have asked their students to practise placing the mind in different parts of the body, or in the Universe, as a means of inducing trance.… Michael Chekhov, a distinguished acting teacher…suggested that students should practise moving the mind around as an aid to character work. He suggested that they should invent ‘imaginary bodies’ and operate them from ‘imaginary centres’…

Johnstone continues from here, discussing at length how moving the implicit spatial location of introspection seems to go along with rebooting the personality and sense-of-self. Is there a connection to the space-referenced implementation of innate social drives that I’m hypothesizing in this post? I’m not sure—food for thought. Also possibly related: Julian Jaynes’s Origin of Consciousness in the Breakdown of the Bicameral Mind, and the phenomenon of hallucinated voices.

@WhatsTrueKittycat Potentially useful cogtech for both meditation and mental-proscenium-training.

Comment by Lorxus on "Metastrategic Brainstorming", a core building-block skill · 2024-06-11T23:48:51.171Z · LW · GW

@WhatsTrueKittycat (meta?-)cogtech worth looking at, for effectiveness, elegance, and sheer breadth of applicability.

Comment by Lorxus on [Valence series] 3. Valence & Beliefs · 2024-06-11T21:16:14.764Z · LW · GW

Here's the thing - I don't really think it does work all that well in a milder setting, at least not until you've gone through the hypervigilant hell of the full-flavor version and only then got your anxiety back down. If you can't set that dial to "placid equanimity" or anything in the same zipcode, and you don't crank that dial all the way to near-max (to the point where it eventually just plain burns in), then I posit that you won't actually end up sufficiently desperate to find all your plan's important flaws, and may well immediately fail to coalesce (if it's set way too low) or just plain get overwhelmed and shut down/quit too soon (if it's set only a little too low). You need to end up - at least at the start - in the land of anxiety-beyond-anxiety, apprised of the certain knowledge that there exists no correct direction but forwards, but that all the wrong directions look a little like "forwards", too.

Comment by Lorxus on [Valence series] 3. Valence & Beliefs · 2024-06-11T14:43:34.092Z · LW · GW

OK I've definitely been misunderstood here. I'm using the impersonal-you to describe what other things have to be true for powering murphy-jitsu with anxious-rumination to work at all, partially based off personal experience.

Comment by Lorxus on [Valence series] 5. “Valence Disorders” in Mental Health & Personality · 2024-06-11T13:41:46.469Z · LW · GW

I don’t understand what you have in mind here. Why would a slight negative bias turn into a big negative bias? What causes the snowball? Sometimes I feel kinda lousy, and then the next day I feel great, right?

Sure, but if you're a little kid, I predict that your spread of valences is larger than for an adult, and if anything prone to some polarization; additionally, you might not yet even think you should distinguish "things are going poorly for me" from "I am bad". On top of that, you end up thinking about yourself in the context of the negative-valenced thing, and your self-concept takes a hit. (I predict that it's probably equally easy in principle to make a little kid enduringly manic, but that world conditions and the greater ease of finding suffering over pleasure mean you get depression more often.)


I’m not sure; that’s not so obvious to me. You seem to be referring to irritability and anger, which are different from valence. They’re “moods”, I guess?

I think I've been misunderstood here. I'm talking about having someone blocking the aisle in a grocery store if you're negative-biased vs positive-biased on valence. If you're positive-biased, oh well, whatever, you'll find another way around, or even maybe take the risk of asking them politely to move. If you're negative-biased, though, screw this, screw this whole situation, screw that inconsiderate jerk for blocking the one aisle I need to get at, no I'm not going to go ask them to move - they have no reason to listen to me - have you lost your mind?

Rather than, say, bursting into rage, which I agree is not something negative valence would predict.

Irritability is, umm, I guess low-level anger, and/or being quick to anger?

Not really how I'm trying to use that here. I'm trying to gesture at the downstream effects of having a mind that experiences negatively-biased valences - being quicker to reject a situation, or to give up, or to permit contagious negative valences to spread to entities only sort of involved with whatever's going on.

Comment by Lorxus on What if a tech company forced you to move to NYC? · 2024-06-11T03:34:36.984Z · LW · GW

Counterpoint: "so you're saying I could guarantee taking every single last one of those motherfuckers to the grave with me?"

Comment by Lorxus on What happens to existing life sentences under LEV? · 2024-06-11T03:20:06.813Z · LW · GW

My genuine best guess is "they don't, actually, get offered longevity extension; prisoners can't even expect to get life-saving prescribed medicine or obviously indicated treatments (eg to get a broken leg splinted), let alone HRT or anything elective; also, approximately no one has incentive to let prisoners get (presumably expensive) longevity extension therapy, unless it's to make damn sure they serve every last one of those 30,000 years."

Comment by Lorxus on D&D.Sci Alchemy: Archmage Anachronos and the Supply Chain Issues · 2024-06-11T03:08:33.875Z · LW · GW

Potentially? I'd be worried that that would be too obvious of us and he'd notice immediately. I think I weakly prefer giving him

actual Barkskin using the two woods and three magically charged ingredients coding for no potion

instead - no use complaining about getting what you asked for!

Comment by Lorxus on [Intro to brain-like-AGI safety] 5. The “long-term predictor”, and TD learning · 2024-06-11T00:48:28.729Z · LW · GW

Thus, pretty much any instance where an experimenter has measured that a dopamine neuron is correlated with some behavioral variable, it’s probably consistent with my picture too.

I don't think this is nearly as good a sign as you seem to think. Maybe I haven't read closely enough, but surely we shouldn't be excited by the fact that your model doesn't constrain its expectation of dopaminergic neuronal firing any more or any differently than existing observations have? Like, I'd expect to have plausible-seeming neuronal firing that your model predicts not to happen, or something deeply weird about the couple of exceptional cases of dopaminergic neuronal firing that your model doesn't predict, or maybe some weird second-order effect where yes actually it looks like my model predicts this perfectly but actually it's the previous two distributional-overlap-failures "cancelling out", but "my model can totally account for all the instances of dopaminergic neuronal firing we've observed" makes me worried.

Comment by Lorxus on A Theory of Laughter · 2024-06-10T23:00:29.342Z · LW · GW


  • (A) IF my hypothalamus & brainstem are getting some evidence that I’m in danger
    • (the “evidence” here would presumably be some of the same signals that, by themselves, would tend to cause physiological arousal / increase my heart rate / activate my sympathetic nervous system)
  • (B) AND my hypothalamus & brainstem are simultaneously getting stronger evidence that I’m safe
    • (the “evidence” here would presumably be some of the same signals that, by themselves, would tend to activate my parasympathetic nervous system)
  • (C) AND my hypothalamus & brainstem have evidence that I’m in a social situation
  • (D) THEN I will emit innate play signals (e.g. laughter in humans), and also I will feel more energetic (on the margin), and more safe, less worried, etc.

This makes me wonder about PTSD/trauma responses. If I shake your model of laughter here a bit, it does produce the etiology that trauma can damage or destroy (B) (I felt this safe just before our position got overrun and almost everyone died) or (C) (social situations are horribly dangerous! that's where we got assaulted!), which produces a lot of the rest; it also suggests that if you wanted to treat trauma responses maximally effectively, you should go figure out which of (B) and (C) got damaged, and specifically target interventions to fix them. (Or possibly also something about (A) getting overly strongly procced with respect to (B)? But my guess would be that strengthening (B) would be easier-per-unit-effect than weakening (A) and have fewer/less-bad side effects.)

Comment by Lorxus on [Valence series] 3. Valence & Beliefs · 2024-06-10T22:44:29.201Z · LW · GW

(Per the previous subsection, anxious rumination can work, and I think really does work for some people, but that’s not an ideal strategy, for many reasons. Isn’t there any other way?)

It only works if one of the things you're most afraid of, most anxious about, most desperate to avoid ending up having happen, is self-delusion/having an [inaccurate/unclear/dangerously incomplete] view of the world/being wrong in public. (This has painful second-order effects and also tends to get caught chasing its own tail; these failure modes require additional cognitive technology to mitigate.)

Then, you have extreme negative valence on "just stop thinking about the bad thing that might happen" and weak-but-surprisingly-at-all positive valence on "I should really really make sure to dot every i and cross every t in stress-testing this plan". 

Comment by Lorxus on [Valence series] 5. “Valence Disorders” in Mental Health & Personality · 2024-06-10T21:39:39.435Z · LW · GW

I do actually want to hear some more about (clinically-attested) aspects of depression you think seem to match Valence Theory well, and also about any you think match it poorly. There's definitely some prediction there about depression being horrifyingly trivially easy to induce in (eg) young children, who - due to a difficult home life, or the cruelty of children, or a neurodivergence - end up with thoughts very slightly negatively biased in valence... and then the snowball's off and away down the mountainside.

Here's another thing you didn't note that Valence theory gets right - it predicts irritability as a major symptom of depression, especially irritability towards otherwise valence-neutral randos. You, An Depressive, are a tiny bit skewed valence-negative towards everything, and that means your thoughts about the randos you encounter are skewed a tiny bit valence-negative. But on the margin, that turns valence-neutral randos into that fucking asshole who won't get out of the way in the grocery store.

Comment by Lorxus on [Valence series] 5. “Valence Disorders” in Mental Health & Personality · 2024-06-10T21:36:23.142Z · LW · GW

There seems to be a close link between energy levels, belief in ones ability to achieve ones goals, and confidence.

It seems to me like this might be tangled up in the whole thing where (IIRC) the feeling of tiredness/fatigue/"whole-body burn" during exercise is generally not because your muscles really are giving out or you're about to damage yourself, but because your brain doesn't think that the risk of harm and very certain tiredness and vulnerability are worth it. It even points towards why depressed people (among others) tend not to enjoy exercise as much and often give up earlier or do less, even if they benefit from it just as much.

Comment by Lorxus on My AI Model Delta Compared To Yudkowsky · 2024-06-10T20:55:44.057Z · LW · GW

If I've understood you correctly, you consider your only major delta with Eliezer Yudkowsky to be whether or not natural abstractions basically always work or reliably exist harnessably, to put it in different terms. Is that a fair restatement?

If so, I'm (specifically) a little surprised that that's all. I would have expected whatever reasoning the two of you did differently or whatever evidence the two of you weighted differently (or whatever else) would have also given you some other (likely harder to pin down) generative-disagreements (else maybe it's just really narrow really strong evidence that one of you saw and the other didn't???).

Maybe that's just second-order though. But I would still like to hear what the delta between NADoom!John and EY still is, if there is one. If there isn't, that's surprising, too, and I'd be at least a little tempted to see what pairs of well-regarded alignment researchers still seem to agree on (and then if there are nonobvious commonalities there).

Also, to step back from the delta a bit here -

  • Why are you as confident as you are - more confident than the median alignment researcher, I think - about natural abstractions existing to a truly harnessable extent?
  • What makes you be ~85% sure that even really bizarrely[1] trained AIs will have internal ontologies that humanish ontologies robustly and faithfully map into? Are there any experiments, observations, maxims, facts, or papers you can point to?
  • What non-obvious things could you see that would push that 85ish% up or down; what reasonably-plausible (>1-2%, say) near-future occurrences would kill off the largest blocks of your assigned probability mass there?
  1. ^

    For all we know, all our existing training methods are really good at producing AIs with alien ontologies, and there's some really weird unexpected procedure you need to follow that does produce nice ontology-sharing aligned-by-default AIs. I wouldn't call it likely, but if we feel up to positing that possibility at all, we should also be willing to posit the reverse.

Comment by Lorxus on Good ways to monetarily profit from the increasing demand for power? · 2024-06-10T17:16:52.734Z · LW · GW

Recently, it's become clear to me that power will be much more of a bottleneck than GPUs (and therefore even more valuable).

Please show your work.

Comment by Lorxus on Announcing ILIAD — Theoretical AI Alignment Conference · 2024-06-10T16:27:56.834Z · LW · GW

Also: if I get accepted to come to ILIAD I am going to make delicious citrus sodas.[1] Maybe I could even run a pair of panels about that?[2] That seemed extremely out of scope though so I didn't put it in the application.

  1. ^

    Better than you've had before. Like, ever. Yes I am serious, I've got lost lore. Also, no limit on the flavor as long as it's a citrus fruit we can go and physically acquire on-site. Also, no need at all for a stove or heating element.

  2. ^

    There is a crucially important time-dependent step on the scale of hours, so a matched pair of panels would be the best format.

Comment by Lorxus on Game Theory without Argmax [Part 1] · 2024-06-10T14:08:40.243Z · LW · GW

Likewise, higher-order game theory promises a normalisation of game theory.

I don't know why, but this smells correct to me.

I have the same intuition - I agree that this smells correct. I suspect that the answer has something to do with the thing where when you take measurements (for whatever that should mean) of a system, what that actually looks like is some spectacularly non-injective map from [systems of interest in the same grouping as the system we're looking at] to ℝ, which inevitably destroys some information, by noninjectivity.

So I agree that it smells right to quantify over the actual objects of interest and only then apply information-destroying maps that take you outside spacetime (into math), rather than applying your quantifiers outside of spacetime after you've already applied your map and then crossing your fingers that the map is as regular/well-behaved as it needs to be.

Comment by Lorxus on Sev, Sevteen, Sevty, Sevth · 2024-06-08T16:22:57.880Z · LW · GW

I do sort of the same thing, except I'm pretty sure I'm pronouncing the "e" in "sen" as a long "e" in the literal linguistic vowel-length sense. I wonder if this is how English gets phonemic vowel length back?

Comment by Lorxus on D&D.Sci Alchemy: Archmage Anachronos and the Supply Chain Issues · 2024-06-08T00:27:53.902Z · LW · GW

Giving this a try. I'll document my progress towards a solution.

(2024-6-7) [Poking at the foundations?]

Looks like you only get magical explosions and mutagenic ooze with >=4 ingredients, and also like potions are determined by a specific pair of ingredients. List is the next spoiler:

Farsight: Beholder Eye and Eye of Newt
Firebreathing: Dragon Spleen and Dragon's Blood
Fire Resistance: Crushed Ruby and Dragon Scale
Glibness: Dragon Tongue and Powdered Silver
Growth: Giant Toe and Redwood Sap
Invisibility: Crushed Diamond and Ectoplasm
Necromantic Power: Beech Bark and Oaken Twigs
Rage: Badger Skull and Demon Claw
Regeneration: Troll Blood and Vampire Fang

Barkskin: Crushed Onyx and Ground Bone - both of which we have!

So... why aren't we just done? Why don't we just tell Anachronos to add Crushed Onyx and Ground Bone and have done with it?
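For concreteness, here's the hypothesized code table as a quick Python sketch. (The dict and the `coded_potions` helper are my own constructions, not anything from the ruleset; ingredient names are copied straight from the list above.)

```python
# Hypothesized potion "codes": each potion looks to be determined by one
# specific pair of ingredients. This dict and the helper below are my own
# constructions; ingredient names are copied from the list above.
POTION_CODES = {
    "Farsight": {"Beholder Eye", "Eye of Newt"},
    "Firebreathing": {"Dragon Spleen", "Dragon's Blood"},
    "Fire Resistance": {"Crushed Ruby", "Dragon Scale"},
    "Glibness": {"Dragon Tongue", "Powdered Silver"},
    "Growth": {"Giant Toe", "Redwood Sap"},
    "Invisibility": {"Crushed Diamond", "Ectoplasm"},
    "Necromantic Power": {"Beech Bark", "Oaken Twigs"},
    "Rage": {"Badger Skull", "Demon Claw"},
    "Regeneration": {"Troll Blood", "Vampire Fang"},
    "Barkskin": {"Crushed Onyx", "Ground Bone"},
}

def coded_potions(brew):
    """Return the set of potions whose code-pair is contained in the brew."""
    brew = set(brew)
    return {potion for potion, pair in POTION_CODES.items() if pair <= brew}
```

e.g. `coded_potions({"Crushed Onyx", "Ground Bone", "Dragon Spleen", "Dragon's Blood"})` gives `{"Barkskin", "Firebreathing"}` - exactly the kind of multi-code collision I end up worrying about.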

(2024-6-8) [Early analysis, first solution proposal]

Well: I still don't know yet what makes a brew succeed or fail.

My working hypothesis is that a major thing determining what happens when you try to brew a potion is what pairs of ingredients are found in it.

I suspect it's even deterministic - i.e., the same subset of potion ingredients results in the same outcome every time - but I haven't yet actually checked that - I'm doing all this in Google Sheets and my data-manipulation-fu is pathetically weak. (Probably someone could check this easily and I look dumb; I'll poke at looking for identical ingredient-sets resulting in different outcomes later.) Lorxus from like an hour later sez: "Nope, decidedly nondeterministic, given the filter-based checking I did later. Still gotta figure out what causes it and why."

For instance, I suspect that Inert Glop is the default outcome in some sense, i.e., you get Inert Glop IF no pair of ingredients in the ingredient-set codes for any potion AND no more interesting failure condition is met. Acidic Slurry would be the next-to-default, where nothing more interesting happened BUT something was different such that the result of the brew was AS and not IG.

I suspect this specifically because AS and IG are the two simplest results - each show up in the result table with only 3 ingredients.

Poking a little more with some better use of filters, it looks like my earlier theory was maybe incomplete. It's true that e.g. Barkskin Potion requires Crushed Onyx and Ground Bone, but that's not actually sufficient.

Sometimes you add those to a brew and you get nothing. An earlier working hypothesis I had that maybe Magical Explosions happen to a brew precisely when it codes for two or more potions got falsified - there's plenty of brews with e.g. Crushed Onyx and Ground Bone and also Dragon Spleen and Dragon's Blood. Some of those make Barkskin Potions. Some of them make Firebreathing potions. Sometimes you get a Mutagenic Ooze or a Magical Explosion.

So maybe that falsified theory was less falsified than I thought, and there's something about a failure rate (and likely also type) dependent on how many of the valid potions your brew codes for? Maybe something like "if a brew codes for at least two valid potions, then there's a 25% chance of a ME, a 25% chance of a MO, and equal chances of every other valid potion your brew codes for."

I didn't note this above, but notably, four ingredients don't code for any of the potions: Angel Feather, Crushed Sapphire, Faerie Tears, and Quicksilver. I can't legibly explain why, but I feel like they'll still matter - have an effect on failure types or rates, or be required as part of a potion base, or maybe even just very clearly have no effect at all.

As a terminological note you may or may not have picked up on by now: I'm going to use "brew" for some subset of potion ingredients, "potion" for one of the 10 potions we can brew, and say that a brew "codes for a potion" if a subset of the brew is a pair that looks to me to determine which of the 10 potions get brewed. The "outcomes of a brew" is the set of things that have observably resulted from a given brew.

Did a quick observational-experiment:

Among brews consisting of Crushed Onyx, Dragon Spleen, Dragon's Blood, Ground Bone, and Powdered Silver, the outcomes consisted of 4 Barkskin, 9 Firebreathing, 20 Inert Glop, 8 Mutagenic Ooze, and 0 Magical Explosion.

Among brews consisting of Beech Bark, Beholder Eye, Eye of Newt, Troll Blood, and Vampire Fang, the outcomes consisted of 3 Farsight, 4 Regeneration, 0 Inert Glop, 11 Mutagenic Ooze, and 0 Magical Explosion.

(this is the point at which if my data-manipulation-fu were stronger, I'd probably just generate all possible results for all brews of the form [Putative Coding-Pair for Potion 1] + [Putative Coding-Pair for Potion 2] + [Arbitrary Other Ingredient] and look at the spread of results; I would strongly expect that to be something like "sometimes you get potion 1, sometimes you get potion 2, and there's patterns in the ratios between them and the failure rates; the failure rates, summed up, are always greater than the chance of getting any individual potion and are generally greater than getting any successful potion at all".)
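(For the record, the kind of tally I did by hand above amounts to something like the following sketch - `records` is a hypothetical stand-in for however the spreadsheet rows get loaded, as (ingredient-set, outcome) pairs:)

```python
from collections import Counter

def tally_outcomes(records, ingredients):
    """Count outcomes among recorded brews whose ingredient set is exactly
    `ingredients`. `records` is assumed to be an iterable of
    (ingredient_set, outcome) pairs - a stand-in for the spreadsheet rows."""
    target = frozenset(ingredients)
    return Counter(outcome for ings, outcome in records
                   if frozenset(ings) == target)
```

Running this over the full record set with the five-ingredient brews above should reproduce the 4/9/20/8/0 and 3/4/0/11/0 tallies, assuming I counted right the first time.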

...actually, that means that I (think that I) understand this problem well enough to make an initial solution-proposal! 

OK OK less of a solution-proposal, more of what has turned into another observational-experiment. What happens if we filter by requiring Crushed Onyx and Ground Bone (like we have to) and forbidding the ingredients we don't have any of anyway?[1] Basically, let's just look at the observational-space of brews that already seem kind of like a good idea anyway - what happens in all of those possible-worlds?

We get 201 brews, and we should already spot poor Anachronos's problem: none of the reliably working 3-ingredient brews that make Barkskin are found among them.

Perhaps worse, of the 4-ingredient brews, 19 make Barkskin and 44 fail as Inert Glop; a lot of these are the specific brew {[Crushed Onyx, Ground Bone], Demon Claw, Vampire Fang}, which is responsible for all of the successes and 10 of the failures for unclear reasons. He was right to call through the void at right angles to try to find Math-Doers, damn the risks!

Another promising-looking brew among the observed ones is the 5-ingredient brew {[Crushed Onyx, Ground Bone], Giant's Toe, (Troll Blood, Vampire Fang)}, which makes Barkskin 18 times, Regeneration 22 times, and Mutagenic Ooze 35 times. Maybe Anachronos likes those odds and a Regen potion will do in a pinch?

So I guess I'd say if I were under truly tight time constraints I'd present Anachronos with those two options - [a ~2/3 chance of straight out success and a ~1/3 chance of harmless failure] vs [a ~1/4 chance of straight out success, a ~1/4 chance of lesser failure, and a ~1/2 chance of dangerous failure] and probably push him towards the former of the two - {[Crushed Onyx, Ground Bone], Demon Claw, Vampire Fang}. I have the luxury of existing outside of his time, though, so I'm going to keep trying.
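The arithmetic behind those odds, straight from the observed counts (variable names mine):

```python
# Observed counts for the two candidate brews:
# {[Crushed Onyx, Ground Bone], Demon Claw, Vampire Fang} and
# {[Crushed Onyx, Ground Bone], Giant's Toe, (Troll Blood, Vampire Fang)}.
demon_claw_brew = {"Barkskin": 19, "Inert Glop": 10}
giants_toe_brew = {"Barkskin": 18, "Regeneration": 22, "Mutagenic Ooze": 35}

def rates(counts):
    """Observed outcome frequencies for one brew."""
    total = sum(counts.values())
    return {outcome: n / total for outcome, n in counts.items()}

# rates(demon_claw_brew): Barkskin ~0.66, Inert Glop ~0.34
# rates(giants_toe_brew): Barkskin 0.24, Regeneration ~0.29, Mutagenic Ooze ~0.47
```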

Interestingly - the modified brew {[Crushed Onyx, Ground Bone], Demon Claw, Troll Blood} only gets tested once - a failure. That's kind of a shame - I'd predict it to be pretty good given the lack of potion-code collision with Regen (which the 5-ingredient brew suffers from). That means there's something more at work here. {[Crushed Onyx, Ground Bone], Giant's Toe, Vampire Fang} never even gets tested at all!

Note to future Lorxus - run this same process with the three other potions that we could possibly brew at all - Growth, Rage, and Regen.

{[Giant's Toe, Redwood Sap], Quicksilver, Vampire Fang}: 24 Growth, 29 IG
{[Giant's Toe, Redwood Sap], Ground Bone, Oaken Twig, Quicksilver, Vampire Fang}: 19 Growth, 16 IG

{[Badger Skull, Demon Claw], Quicksilver, Troll Blood}: 21 Rage, 13 IG
(notably, modified brews with only Quicksilver or Troll Blood fail more often, and fail altogether if they have nothing else - this looks to be a pattern that holds for secondary ingredients? "Require A and B; require at least one of C and D, plus something else - maybe one of the four miscellaneous ones?")

{[Troll Blood, Vampire Fang], Crushed Diamond, Giant's Toe, Oaken Twig}
{[Troll Blood, Vampire Fang], Crushed Diamond, Demon Claw, Ground Bone}
(neither of these brews ever fail???)

(2024-6-9) [More analysis, contains a final answer]

Ah, the joys of existing outside of (Anachronos's) space-time!

I think I've squeezed about as much as I can from pure associational observation, and need to actually buckle down and figure out why brews succeed or fail. Here's my working model:

  • There are precisely ten codeable potions: Barkskin, Farsight, Firebreathing, Fire Resistance, Glibness, Growth, Invisibility, Necromantic Power, Rage, and Regeneration.
    • For each of these potions, there's two ingredients that are absolutely required (the potion's code, in my terms). I list them off above.
  • If a brew codes for two potions, it has a 1/4 chance of producing potion A, a 1/4 chance of producing potion B, and a 1/2 chance of melting down into Mutagenic Ooze.
    • I also hypothesize one of three things:
      • If a brew codes for n>2 potions, it has a 1/(2n) chance of producing each of those potions, and a 1/2 chance of melting down into Mutagenic Ooze.
      • If a brew codes for n>2 potions, it has a 1/(n+2) chance of producing each of those potions, and a 2/(n+2) chance of melting down into Mutagenic Ooze.
      • If a brew codes for three or more potions, then whenever it would turn out as Mutagenic Ooze, it instead turns out as a Magical Explosion. (This one I think is much less likely, given that none of the observed brews that we could replicate result in a ME.)
  • A successful brew cannot, for whatever reason, consist of any fewer than 3 ingredients.
    • 4-5 in particular seems to be a sweet spot: 2 ingredients for the code, and another 2-3 for ??reasons?? which also don't code for any different potion.
  • No brew is guaranteed to succeed. Every potion has some base chance of failing as Inert Glop.
    • As a corollary, we should thus reject any brew with a success rate less than that of {[Crushed Onyx, Ground Bone], Demon Claw, Vampire Fang}, that is, ~2/3.
    • This is likely wrong though! {[Troll Blood, Vampire Fang], Crushed Diamond, Giant's Toe, Oaken Twig} and {[Troll Blood, Vampire Fang], Crushed Diamond, Demon Claw, Ground Bone} both seem to be no-fail brews for Regen...
  • A brew which codes for no potion... gods I wish my data-manipulation-fu were stronger
    • guaranteed to be Acidic Slurry? Maybe?
      • No, that can't be right - {Ground Bone, Oaken Twig, Quicksilver, Vampire Fang} doesn't code for anything and yet we see it in the IG list.
    • ...will turn out as IG if it lacks some factor and as AS if it has that factor?
      • Yeah this isn't a guess so much as an entire guessing schema.
        • Which still might not even be correct.
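The two live meltdown hypotheses above can be written out as outcome distributions, which makes it explicit that they agree at n=2 (the 1/4-1/4-1/2 rule) and first diverge at n=3 - a sketch of the competing models, not settled rules:

```python
from fractions import Fraction

def hypothesis_a(n):
    """Each of the n coded potions: 1/(2n); Mutagenic Ooze: 1/2."""
    return [Fraction(1, 2 * n)] * n + [Fraction(1, 2)]

def hypothesis_b(n):
    """Each of the n coded potions: 1/(n+2); Mutagenic Ooze: 2/(n+2)."""
    return [Fraction(1, n + 2)] * n + [Fraction(2, n + 2)]
```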

So... maybe I want to go looking for relations in or commonalities among those other 2-3 ingredients? Or possibly there's specific peculiar properties to each ingredient that I need to keep track of? ...Actually that sounds a lot more plausible when I think about it that way.

(After a lot more thinking)

Y'know... there's enough test cases that the proportions I found in the last recipe should really be a lot closer to exact than they are. What's up with that?
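One way to put a number on "closer to exact": a normal-approximation z-score for each outcome count of the 5-ingredient brew under the hypothesized 1/4-1/4-1/2 model. (The counts are the observed 18/22/35 out of 75 from above.)

```python
import math

def z_score(observed, n, p):
    """How many binomial standard deviations the observed count sits
    from its expectation under success probability p."""
    return (observed - n * p) / math.sqrt(n * p * (1 - p))

n = 75  # total trials of the 5-ingredient brew
zs = [z_score(18, n, 0.25),  # Barkskin under the 1/4 hypothesis
      z_score(22, n, 0.25),  # Regeneration under the 1/4 hypothesis
      z_score(35, n, 0.5)]   # Mutagenic Ooze under the 1/2 hypothesis
# |z| > 2 would be genuinely surprising under the model.
```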

Also, I still have no idea why MEs happen. What's up with those?

A quick look at the set of observed brews we could go make right now that resulted in MEs:
{Crushed Onyx, Demon Claw, Giant's Toe, Oaken Twig, Redwood Sap, Troll Blood, Vampire Fang}: 16 MEs, 13 MOs, 4 Growth, 9 Regen
{Demon Claw, Giant's Toe, Troll Blood, Vampire Fang}: 1 ME/1 brew (!)

So maybe there's something special about these four ingredients? You (can) get a ME if you include all four, with the rate of that increasing as the proportion of "magically charged" ingredients in a brew increases?

It'd explain why we probably can't just tell Anachronos to brew {Crushed Onyx, Ground Bone}: it wouldn't have sufficient "magical charge"; in this model, a potion goes off (in the sense of "is not IG") never/sometimes/always if it has 0-1/2/3+ magically charged ingredients.
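That guess can be pinned down as a tiny predicate. The CHARGED set here is my assumption - the four ingredients from the ME brews, with Crushed Onyx deliberately left out since its status is exactly the open question:

```python
CHARGED = {"Demon Claw", "Giant's Toe", "Troll Blood", "Vampire Fang"}

def goes_off(ingredients) -> str:
    """Predicted activation under the charge model: a brew is non-inert
    'never' with 0-1 charged ingredients, 'sometimes' with 2,
    'always' with 3 or more."""
    n = len(CHARGED & set(ingredients))
    if n <= 1:
        return "never"
    return "sometimes" if n == 2 else "always"
```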

The problem is that I'll need to expand my search to the rest of the dataset (of things we can't possibly brew now) to figure out whether Crushed Onyx is magically charged, given that it's not in the codes for Growth or Regen (because codes are nonoverlapping). If that theory is correct, though, the existence of sometimes-working 3-ingredient brews that can make Barkskin or IG suggests that Crushed Onyx probably is magically charged (and if so, good gods, Anachronos, what were you doing brewing the same maximally overcharged potion that many times?). Countervailing that is the fact that {[Crushed Onyx, Ground Bone], Demon Claw, Vampire Fang} doesn't always actually work, which suggests that the charged ingredients are only the smaller set of four (out of what we have left).

Assuming Crushed Onyx is magically charged: {[Crushed Onyx, Ground Bone], Giant's Toe, Demon Claw}.
Assuming Crushed Onyx is not magically charged: {[Crushed Onyx, Ground Bone], Giant's Toe, Demon Claw, Troll Blood} and {[Crushed Onyx, Ground Bone], Giant's Toe, Demon Claw, Vampire Fang} should both work equally well - all we need to do is avoid code-clash with Regen.

If I had to give a solution right now - given the fact that Troll Blood and Vampire Fang are absent from all of the (4) {[Crushed Onyx, Ground Bone], Giant's Toe, Demon Claw}-containing brews, and those tend to fail as IG (or MO from code-clash) - I'm pretty sure Crushed Onyx is magically uncharged.

So I'd tell Anachronos to brew {[Crushed Onyx, Ground Bone], Giant's Toe, Demon Claw, Troll Blood} or {[Crushed Onyx, Ground Bone], Giant's Toe, Demon Claw, Vampire Fang}, whichever [he feels better about]/[is cheaper]/[he likes better]/[he has better stocks of]/[he can buy more easily]/[idk my dude flip a coin]. Either should work with certainty.

(after reading simon's observation that maybe Barkskin and Necromantic Power are swapped around) If we do in fact doubt Anachronos here, then we can easily put together a 5-ingredient brew that would guarantee getting him actual Barkskin: {[Beech Bark, Oaken Twig], Giant's Toe, Demon Claw, [Troll Blood XOR Vampire Fang]}

  1. ^

    For reference, these are: Angel Feather, Beholder Eye, Crushed Ruby, Crushed Sapphire, all four of the Dragon gibblies, Ectoplasm, Eye of Newt, Faerie Tears, and Powdered Silver.

Comment by Lorxus on Weeping Agents · 2024-06-06T21:48:00.261Z · LW · GW

Agents are mechanisms by which the future influences the past.

Better to say - "Agents are a mechanism by which possible futures can influence their own logical pasts."

Comment by Lorxus on Announcing ILIAD — Theoretical AI Alignment Conference · 2024-06-05T23:31:50.849Z · LW · GW

It's the Independently-Led Interactive Alignment Discussion, surely.

Comment by Lorxus on Announcing ILIAD — Theoretical AI Alignment Conference · 2024-06-05T23:30:08.149Z · LW · GW

Comment by Lorxus on OpenAI: Fallout · 2024-05-29T20:58:40.505Z · LW · GW

The wording of that canary is perhaps less precise and less broad than you wanted it to be in many possible worlds. Given obvious possible inferences one could reasonably make from the linguistic pragmatics - and what's left out - you are potentially passively representing a(n overly?) polarized set of possible worlds you claim to maybe live in, and may not have thought about the full ramifications of that.

Comment by Lorxus on jacquesthibs's Shortform · 2024-05-22T00:07:04.854Z · LW · GW

I am very very vaguely in the Natural Abstractions area of alignment approaches. I'll give this paper a closer read tomorrow (because I promised myself I wouldn't try to get work done today) but my quick quick take is - it'd be huge if true, but there's not much more than that there yet, and it also has no argument that even if representations are converging for now, that it'll never be true that (say) adding a whole bunch more effectively-usable compute means that the AI no longer has to chunk objectspace into subtypes rather than understanding every individual object directly.

Comment by Lorxus on AI #64: Feel the Mundane Utility · 2024-05-21T12:54:25.912Z · LW · GW

This was fun to see:

More than a quarter of the applications answered it anyway.

I wonder how many of that 25% simply missed the note. People make mistakes like this all the time. And I also wonder how many people noticed this before feeding it to their AI.

It's fun to see until you think about how many of that 25% - already jobless and kinda depressed and with very little ability to tell Contra to take any exploitative or manipulative hiring practices and shove them - saw the note and concluded it was an oversight, an obvious arbitrary filter for applicants who wanted to put in less effort, or some other stupid-clever scheme by Contra's hiring managers; and thus decided that it'd be safest to just answer it anyway.

Comment by Lorxus on OpenAI: Exodus · 2024-05-21T12:07:22.023Z · LW · GW

Would you be added to a list of bad kids?

That would seem to be the "nice" outcome here, yes.

What is the typical level of shadiness of American VCs?

If you're asking that question, I claim that you already suspect the answer and should stop fighting it.

Comment by Lorxus on OpenAI: Exodus · 2024-05-20T15:04:39.977Z · LW · GW

A wise man does not cut the ‘get the AI to do what you want it to do’ department when it is working on AIs it will soon have trouble controlling. When I put myself in ‘amoral investor’ mode, I notice this is not great, a concern that most of the actual amoral investors have not noticed.

My actual expectation is that for raising capital and doing business generally this makes very little difference. There are effects in both directions, but there was overwhelming demand for OpenAI equity already, and there will be so long as their technology continues to impress.

No one ever got fired buying ~~IBM~~ OpenAI. ML is flashy and investors seem to care less about gears-level understanding of why something is potentially profitable than about whether they can justify it. It seems to work out well enough for them.

What about employee relations and ability to hire? Would you want to work for a company that is known to have done this? I know that I would not. What else might they be doing? What is the company culture like?

Here's a sad story of a plausible possible present: OAI fires a lot of people who care more-than-average about AI safety/NKE/x-risk. They (maybe unrelatedly) also have a terrible internal culture such that anyone who can leave, does. People changing careers to AI/ML work are likely leaving careers that were even worse, for one reason or another - getting mistreated as postdocs or adjuncts in academia has gotta be one example, and I can't speak to it but it seems like repeated immediate moral injury in defense or finance might be another. So... those people do not, actually, care, or at least they can be modelled as not caring because anyone who does care doesn't make it through interviews.

What else might they be doing? Can't be worse than callously making the guidance systems for the bombs for blowing up schools or hospitals or apartment blocks. How bad is the culture? Can't possibly be worse than getting told to move cross-country for a one-year position and then getting talked down to and ignored by the department when you get there.

It pays well if you have the skills, and it looks stable so long as you don't step out of line. I think their hiring managers are going to be doing brisk business.

Comment by Lorxus on OpenAI: Exodus · 2024-05-20T14:52:17.375Z · LW · GW

If OpenAI and Sam Altman want to fix this situation, it is clear what must be done as the first step. The release of claims must be replaced, including retroactively, by a standard release of claims. Daniel’s vested equity must be returned to him, in exchange for that standard release of claims. All employees of OpenAI, both current employees and past employees, must be given unconditional release from their non-disparagement agreements, all NDAs modified to at least allow acknowledging the NDAs, and all must be promised in writing the unconditional ability to participate as sellers in all future tender offers. 

Then the hard work can begin to rebuild trust and culture, and to get the work on track. 

Alright - suppose they don't. What then?

I don't think I misstep in positing that we (for however you want to construe "we") should model OAI as - jointly but independently - meriting zero trust and functioning primarily to make Sam Altman personally more powerful. I'm also pretty sure that asking Sam to pretty please be nice and do the right thing is... perhaps strategically counterindicated.

Suppose you, Zvi (or anyone else reading this! yes, you!) were Unquestioned Czar of the Greater Ratsphere, with a good deal of money, compute, and soft power, but basically zero hard power. Sam Altman has rejected your ultimatum to Do The Right Thing and cancel the nondisparagements, modify the NDAs, not try to sneakily fuck over ex-employees when they go to sell and are made to sell for a dollar per PPU, etc, etc.

What's the line?

Comment by Lorxus on D&D.Sci (Easy Mode): On The Construction Of Impossible Structures [Evaluation and Ruleset] · 2024-05-20T13:41:41.902Z · LW · GW

I really liked this one! I'd kept wanting to jump in on a DnD.Science thing for a while (both because it looked fun and I'm trying to improve myself in ways strongly consistent with learning more about how to do data science) and this was a perfect start. IMO you totally should run easier and/or shorter puzzles sometimes going forward, and maybe should mark ones particularly amenable to a first-timer as being so.

Comment by Lorxus on Lorxus's Shortform · 2024-05-18T17:57:19.805Z · LW · GW

Wait, some of y'all were still holding your breaths for OpenAI to be net-positive in solving alignment?

After the whole "initially having to be reminded alignment is A Thing"? And going back on its word to go for-profit? And spinning up a weird and opaque corporate structure? And people being worried about Altman being power-seeking? And everything to do with the OAI board debacle? And OAI Very Seriously proposing what (still) looks to me to be like a souped-up version of Baby Alignment Researcher's Master Plan B (where A involves solving physics and C involves RLHF and cope)? That OpenAI? I just want to be very sure. Because if it took the safety-ish crew of founders resigning to get people to finally pick up on the issue... it shouldn't have. Not here. Not where people pride themselves on their lightness.

Comment by Lorxus on romeostevensit's Shortform · 2024-05-18T13:05:09.276Z · LW · GW

Any recommendations on how I should do that? You may assume that I know what a gas chromatograph is and what a Petri dish is and why you might want to use either or both of those for data collection, but not that I have any idea of how to most cost-effectively access either one as some rando who doesn't even have a MA in Chemistry.

Comment by Lorxus on romeostevensit's Shortform · 2024-05-17T22:27:20.079Z · LW · GW

Surely so! Hit me up if you ever end up doing this - I'm likely getting the Lumina treatment in a couple months.

Comment by Lorxus on D&D.Sci (Easy Mode): On The Construction Of Impossible Structures · 2024-05-17T13:56:31.721Z · LW · GW

Finally, I get to give one a try! I'll edit this post with my analysis and strategy. But first, a clarifying question - are the new plans supposed to be lacking costs?

First off, it looks to me like you only get impossible structures if you were apprenticed to "Bloody Stupid" Johnson or Peter Stamatin, or if you're self-taught. No love for Dr. Seuss, Escher, or Penrose. Also, while being apprenticed to either of those two lunatics guarantees you an impossible structure, being self-taught looks to do it only half the time. We can thus immediately reject plans B, C, F, J, and M.

Next, I started thinking about cost. Looks like nightmares are horrifyingly expensive - small wonder - and silver and glass are only somewhat better. Cheaper options for materials look to include wood, dreams, and steel. That rules out plan G as a good idea if I want to keep costs low, and makes suggestions about the other plans that I'll address later.

I'm not actually sure what the relationship is between [pair of materials] and [cost], but my snap first guess - given how nightmares dominate the expensive end of the past plans, how silver and glass seem to show up somewhat more often at the top end and wood/dreams/steel show up at the bottom end fairly reliably - is that it's some additive relation on secret prices by material, maybe modified by the type of structure?
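If the additive-secret-prices guess is right, the per-material prices drop out of a least-squares fit over the past plans. The data below is made up purely to show the shape of the fit, not the puzzle's real numbers:

```python
import numpy as np

materials = ["wood", "dreams", "steel", "silver", "glass", "nightmares"]
# (materials used, total cost) per past plan - illustrative values only.
plans = [({"wood", "steel"}, 900), ({"dreams", "glass"}, 2100),
         ({"silver", "nightmares"}, 9500), ({"wood", "dreams"}, 1100)]

# Design matrix: one row per plan, 1 where the plan uses that material.
X = np.array([[1.0 if m in used else 0.0 for m in materials]
              for used, _ in plans])
y = np.array([cost for _, cost in plans], dtype=float)
prices, *_ = np.linalg.lstsq(X, y, rcond=None)  # fitted per-material prices
```

With more plans than materials this would overdetermine the system and the residual would tell you how well the additive model actually fits (and whether structure type needs to enter as another term).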

A little more perusing of the Self-Taught crowd suggests that... they're kind of a crapshoot? I'm sure I'm going to feel like an idiot when there turns out to be some obvious relationship that predicts when their structures will turn out impossible, but it doesn't look to me like building type, blueprint quality, material, or final price are determinative.

Maybe it has something to do with that seventh data column in the past plans, which fell both before apprentice-status and after price, which I couldn't pry open more than a few pixels' crack and from which then issued forth endless surreal blasphemies, far too much space, and the piping of flutes; ia! ia! the swollen and multifarious geometries of Tindalos eagerly welcome a wayward and lonely fox home once more. yeah sorry no idea how this got here but I can't remove it

Regardless, I'd rather take the safe option here and limit my options to D, E, H, and K, the four plans which are: 1) drawn up by architects who apprenticed with either of the two usefully crazy masters (and not simply self-taught) and 2) not making use of Nightmares, because those are expensive.

For a bonus round, I'll estimate costs by comparing to whatever's closest from past projects. Using this heuristic, I think K is going to cost 60-80k, D and H (which are the same plan???) will both cost ~65k, and E is going to be stupid cheap (<5k). EDIT: also that means that the various self-taught people's plans are likely to be pretty cheap, given their materials, so... if this were a push-your-luck dealie based on trying to get as much value per dollar as possible, maybe it's even worth chancing it on the chancers (A, I, L, and N)?

Comment by Lorxus on Dyslucksia · 2024-05-15T12:16:58.265Z · LW · GW

On the object level I agree. On the meta level, though, making the seemingly-dumb object-level move (~here specifically) of announcing that you think that all minds are the same in some specific way means that people will come out of the woodwork to correct you, which results in everyone getting better models about what minds are like.

Comment by Lorxus on (Geometrically) Maximal Lottery-Lotteries Exist · 2024-05-13T13:08:28.325Z · LW · GW

I gave a short and unpolished response privately.

Comment by Lorxus on (Geometrically) Maximal Lottery-Lotteries Exist · 2024-05-12T00:00:44.543Z · LW · GW

Dang. I wasn't entirely sure whether you were firm on the definition of lottery-lottery dominance or if that was more speculative. I guess I wasn't clear that MLLs were specifically meant to be "majoritarianism but better"? Given that you meant for it to be, this post sure doesn't prove that they exist. You're absolutely right that you can cook up electorates where the majority-favored candidate isn't the Nash bargaining/Geometric MLL favored candidate.

Comment by Lorxus on ChristianKl's Shortform · 2024-05-11T15:45:13.136Z · LW · GW

The body uses up sodium and potassium as two major cations. You need them for neural firing to work, among many other things; potassium is the body's go-to for "I need a single-charge cation but sodium doesn't work for whatever reason". As such, you lose plenty in urine and sweat. Because modern table salt (i.e., neither rock salt nor, better yet, sea salt) contains basically no potassium, people can end up slightly deficient, though we do still get some from foods - lots of types of produce like tomatoes, root vegetables, and some fruits are rich in it, for instance.

Comment by Lorxus on (Geometrically) Maximal Lottery-Lotteries Exist · 2024-05-11T11:48:16.305Z · LW · GW

To avoid confusion: this post and my reply to it were also on a past version of this post; that version lacked any investigation of dominance criterion desiderata for lottery-lotteries.

Comment by Lorxus on Dyslucksia · 2024-05-11T11:46:01.895Z · LW · GW

Yeah, I myself subvocalize absolutely everything and I am still horrified when I sometimes try any "fast" reading techniques - those drain all of the enjoyment out of reading for me, as if instead of characters in a story I would imagine them as p-zombies.

I speed-read fiction, too. When I do, though, I'll stop for a bit whenever something or someone new is being described, to give myself a moment to picture it in a way that my mind can bring up again as set dressing.

Comment by Lorxus on Open Thread Spring 2024 · 2024-05-10T16:03:42.816Z · LW · GW

I'm neither of these users, but for temporarily secret reasons I care a lot about having the Geometric Rationality and Maximal Lottery-Lottery sequences be slightly higher-quality.

The reason is not secret anymore! I have finished and published a two-post sequence on maximal lottery-lotteries.

Comment by Lorxus on Dyslucksia · 2024-05-10T12:35:38.930Z · LW · GW

Anyway, my prediction is that non-dyslectics do not subvocalize - it's much too slow. You can't read faster than you speak in that case.

Maybe I'm just weird, but I totally do sometimes subvocalize, but incredibly quickly. Almost clipped or overlapping to an extent, in a way that can only really work inside your head? And that way it can go faster than you can physically speak. Why should your mental voice be limited by the limits of physical lips, tongue, and glottis, anyway?