## Posts

## Comments

**Diffractor** on "Infra-Bayesianism with Vanessa Kosoy" – Watch/Discuss Party · 2021-04-09T20:39:38.185Z · LW · GW

Re point 1, 2: Check this out. For the specific case of 0 to even bits, ??? to odd bits, I think Solomonoff can probably get that, but not more general relations.

Re: point 3, Solomonoff is about stochastic environments that just take your action as an input, and aren't reading your policy. For infra-Bayes, you can deal with policy-dependent environments without issue, as you can consider hard-coding in every possible policy to get a family of stochastic environments, and UDT behavior naturally falls out as a result from this encoding. There's still some open work to be done on which sorts of policy-dependent environments like this are learnable (inferrable from observations), but it's pretty straightforward to cram all sorts of weird decision-theory scenarios in as infra-Bayes hypotheses, and do the right thing in them.

**Diffractor** on "Infra-Bayesianism with Vanessa Kosoy" – Watch/Discuss Party · 2021-04-09T19:33:24.063Z · LW · GW

Ah. So, low expected utility alone isn't too much of a problem. The amount of weight a hypothesis has in a prior after updating depends on the *gap* between the best-case values and worst-case values. Ie, "how much does it matter what happens here". So, the stuff that withers in the prior as you update are the hypotheses that are like "what happens now has negligible impact on improving the worst-case". So, hypotheses that are like "you are screwed no matter what" just drop out completely, as if it doesn't matter what you do, you might as well pick actions that optimize the *other* hypotheses that aren't quite so despondent about the world.

In particular, if all the probability distributions in a set are like "this thing that just happened was improbable", the hypothesis takes a big hit in the posterior, as all the a-measures are like "ok, we're in a low-measure situation now, what happens after this point has negligible impact on utility".

I still need to better understand how updating affects hypotheses which are a big set of probability distributions so there's always one probability distribution that's like "I correctly called it!".

The motivations for different g are:

If g is your actual utility function, then updating with g as your off-event utility function grants you dynamic consistency. Past-you never regrets turning over the reins to future you, and you act just as UDT would.

If g is the constant-1 function, then that corresponds to updates where you don't care at all what happens off-history (the closest thing to normal updates), and both the "diagonalize against knowing your own action" behavior in decision theory and the Nirvana trick pop out for free from using this update.

**Diffractor** on "Infra-Bayesianism with Vanessa Kosoy" – Watch/Discuss Party · 2021-04-09T19:24:50.920Z · LW · GW

"mixture of infradistributions" is just an infradistribution, much like how a mixture of probability distributions is a probability distribution.

Let's say we've got a prior $\zeta$, a probability distribution over indexed hypotheses.

If you're working in a vector space, you can take any countable collection of sets in said vector space, and mix them together according to a prior giving a weight to each set. Just make the set of all points which can be made by the process "pick a point from each set, and mix the points together according to the probability distribution $\zeta$".

For infradistributions as sets of probability distributions or a-measures or whatever, that's a subset of a vector space. So you have a bunch of sets $\Psi_i$, and you just mix the sets together according to $\zeta$, that gives you your set $\Psi_\zeta$.

If you want to think about the mixture in the concave functional view, it's even nicer. You have a bunch of $\psi_i$ which are "hypothesis i can take a function $f$ and output what its worst-case expectation value is". The mixture of these, $\psi_\zeta$, is simply defined as $\psi_\zeta(f) := \mathbb{E}_{i\sim\zeta}[\psi_i(f)]$. This is just mixing the functions together!

Both of these ways of thinking of mixtures of infradistributions are equivalent, and recover mixture of probability distributions as a special case.
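To make the equivalence concrete, here's a minimal sketch under some simplifying assumptions: each hypothesis is a finite list of probability distributions over three outcomes (standing in for its convex hull, which has the same worst-case expectations), and the names `mix_sets`, `mix_functionals`, `psi_1`, `psi_2` are made up for illustration. It mixes two hypotheses both ways and compares the results.

```python
from itertools import product

def expect(p, f):
    """Expectation of f under the probability distribution p."""
    return sum(pi * fi for pi, fi in zip(p, f))

def worst_case(dists, f):
    """Worst-case expectation of f over a finite set of distributions."""
    return min(expect(p, f) for p in dists)

def mix_sets(sets, prior):
    """Set view: pick one point from each set, mix the points by the prior."""
    mixed = []
    for choice in product(*sets):
        n = len(choice[0])
        mixed.append(tuple(sum(w * p[k] for w, p in zip(prior, choice))
                           for k in range(n)))
    return mixed

def mix_functionals(sets, prior, f):
    """Functional view: mix the worst-case expectations by the prior."""
    return sum(w * worst_case(s, f) for w, s in zip(prior, sets))

# Two toy hypotheses over the outcomes {0, 1, 2} (illustrative numbers).
psi_1 = [(0.5, 0.5, 0.0), (0.2, 0.3, 0.5)]
psi_2 = [(0.1, 0.1, 0.8), (0.0, 1.0, 0.0), (0.6, 0.2, 0.2)]
prior = (0.25, 0.75)
f = (1.0, 0.0, 0.5)

set_view = worst_case(mix_sets([psi_1, psi_2], prior), f)
functional_view = mix_functionals([psi_1, psi_2], prior, f)
print(set_view, functional_view)
```

The two numbers agree (up to floating point) because the expectation is linear, so the minimum over "one point from each set" splits into a prior-weighted sum of per-set minima.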

**Diffractor** on "Infra-Bayesianism with Vanessa Kosoy" – Watch/Discuss Party · 2021-04-09T19:06:16.952Z · LW · GW

The concave functional view is "the thing you do with a probability distribution is take expectations of functions with it. In fact, it's actually possible to identify a probability distribution with the function mapping a function to its expectation. Similarly, the thing we do with an infradistribution is taking expectations of functions with it. Let's just look at the behavior of the function we get, and neglect the view of everything as a set of a-measures."

As it turns out, this view makes proofs a whole lot cleaner and tidier, and you only need a few conditions on a function like that for it to have a corresponding set of a-measures.
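As a rough illustration of the functional view (a sketch with toy numbers, not the general definition), take a finite set of probability distributions and look only at the map from functions to worst-case expectations. The names `psi`, `dists`, `f`, `g` are assumptions for the example. A single distribution gives a linear map, while a set of them gives a concave one.

```python
def expect(p, f):
    """Expectation of f under the probability distribution p."""
    return sum(pi * fi for pi, fi in zip(p, f))

def psi(dists, f):
    """Worst-case expectation of f over a finite set of distributions."""
    return min(expect(p, f) for p in dists)

def blend(f, g, lam):
    """Pointwise mixture lam * f + (1 - lam) * g of two functions."""
    return tuple(lam * a + (1 - lam) * b for a, b in zip(f, g))

dists = [(0.7, 0.3), (0.2, 0.8)]   # two distributions over two outcomes
f = (1.0, 0.0)
g = (0.0, 1.0)
lam = 0.5

# Concavity: psi(lam f + (1-lam) g) >= lam psi(f) + (1-lam) psi(g)
lhs = psi(dists, blend(f, g, lam))
rhs = lam * psi(dists, f) + (1 - lam) * psi(dists, g)

# With a single distribution the inequality is an equality (linearity).
single = [(0.7, 0.3)]
lhs_single = psi(single, blend(f, g, lam))
rhs_single = lam * psi(single, f) + (1 - lam) * psi(single, g)
print(lhs, rhs, lhs_single, rhs_single)
```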

**Diffractor** on Stuart_Armstrong's Shortform · 2021-04-04T01:04:26.818Z · LW · GW

Sounds like a special case of crisp infradistributions (ie, all partial probability distributions have a unique associated crisp infradistribution)

Given some partial probability distribution, we can consider the (nonempty) set of probability distributions that are equal to it wherever it is defined. This set is convex (clearly, a mixture of two probability distributions which agree with the partial distribution about the probability of an event will also agree with it about the probability of that event).

Convex (compact) sets of probability distributions = crisp infradistributions.
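A tiny sanity check of the convexity claim, assuming a toy encoding where a partial probability distribution is a dict from events (frozensets of outcomes) to probabilities; the encoding and the names `partial`, `agrees`, `mix` are made up for the example.

```python
def prob(p, event):
    """Probability that the distribution p assigns to an event."""
    return sum(p[x] for x in event)

def agrees(p, partial, tol=1e-12):
    """True if p matches the partial distribution on every event it defines."""
    return all(abs(prob(p, ev) - val) <= tol for ev, val in partial.items())

def mix(p, q, lam):
    """Mixture lam * p + (1 - lam) * q of two distributions."""
    return tuple(lam * a + (1 - lam) * b for a, b in zip(p, q))

# The partial distribution only pins down P({0}) = 0.3 on outcomes {0, 1, 2}.
partial = {frozenset({0}): 0.3}
p = (0.3, 0.7, 0.0)
q = (0.3, 0.0, 0.7)

mixtures_agree = all(agrees(mix(p, q, k / 10), partial) for k in range(11))
print(mixtures_agree)
```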

**Diffractor** on Introduction To The Infra-Bayesianism Sequence · 2021-03-31T23:10:40.408Z · LW · GW

You're completely right that hypotheses with unconstrained Murphy get ignored because you're doomed no matter what you do, so you might as well optimize for just the other hypotheses where what you do matters. Your "-1,000,000 vs -999,999 is the same sort of problem as 0 vs 1" reasoning is good.

Again, you are making the serious mistake of trying to think about Murphy verbally, rather than thinking of Murphy as the personification of the "inf" part of the definition of expected value, and writing actual equations. $\Psi$ is the available set of possibilities for a hypothesis. If you really want to, you can think of this as constraints on Murphy, and Murphy picking from available options, but it's highly encouraged to just work with the math.

For mixing hypotheses (several different sets of possibilities $\Psi_i$) according to a prior distribution $\zeta$, you can write it as an expectation functional via $\psi_\zeta(f) := \mathbb{E}_{i\sim\zeta}[\psi_i(f)]$ (mix the expectation functionals of the component hypotheses according to your prior on hypotheses), or as a set via $\Psi_\zeta := \{\mathbb{E}_{i\sim\zeta}[p_i] \mid \forall i: p_i \in \Psi_i\}$ (the available possibilities for the mix of hypotheses are all of the form "pick a possibility from each hypothesis, mix them together according to your prior on hypotheses").

This is what I meant by "a constraint on Murphy is picked according to this probability distribution/prior, then Murphy chooses from the available options of the hypothesis they picked", that set $\Psi_\zeta$ (your mixture of hypotheses according to a prior) corresponds to selecting one of the $\Psi_i$ sets according to your prior $\zeta$, and then Murphy picking freely from the set $\Psi_i$.

Using $\psi_\zeta(f) = \mathbb{E}_{i\sim\zeta}[\psi_i(f)]$ (and considering our choice of what to do affecting the choice of $f$, we're trying to pick the best function $f$) we can see that if the prior is composed of a bunch of "do this sequence of actions or bad things happen" hypotheses, the details of what you do sensitively depend on the probability distribution over hypotheses. Just like with AIXI, really.

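Informal proof: if $\psi_i(f_i) = 1$ and $\psi_i(f_j) = 0$ (assuming $i \neq j$), then we can see that

$$\psi_\zeta(f_i) = \mathbb{E}_{j\sim\zeta}[\psi_j(f_i)] = \zeta_i$$

and so, the best sequence of actions to do would be the one associated with the "you're doomed if you don't do blahblah action sequence" hypothesis with the highest prior. Much like AIXI does.

Using the same sort of thing, we can also see that if there's a maximally adversarial hypothesis in there somewhere that's just like "you get 0 reward, screw you" no matter what you do (let's say this is $\psi_0$), then we have

$$\psi_\zeta(f) = \mathbb{E}_{i\sim\zeta}[\psi_i(f)] = \zeta_0 \cdot 0 + \sum_{i>0} \zeta_i \psi_i(f) = (1 - \zeta_0)\, \mathbb{E}_{i\sim\zeta'}[\psi_i(f)], \qquad \zeta'_i := \frac{\zeta_i}{1 - \zeta_0} \text{ for } i > 0$$

And so, that hypothesis drops out of the process of calculating the expected value, for all possible functions/actions. Just do a scale-and-shift, and you might as well be dealing with the prior $\zeta'$ (the original prior conditioned on not being in hypothesis 0), which a-priori assumes you aren't in the "screw you, you lose" environment.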
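Here's a small numerical sketch of that drop-out, with made-up numbers. Each candidate policy is summarized by its worst-case value in each hypothesis, hypothesis 0 is the "you get 0 no matter what" one, and the names `values`, `prior`, `conditioned` are assumptions for the example. The ranking of policies is the same whether or not the doomed hypothesis is kept in the prior.

```python
def mixture_value(prior, vals):
    """Prior-weighted average of per-hypothesis worst-case values."""
    return sum(w * v for w, v in zip(prior, vals))

# Worst-case value of each policy in hypotheses 0, 1, 2.
# Hypothesis 0 gives 0 regardless of policy.
values = {
    "policy_a": (0.0, 0.9, 0.2),
    "policy_b": (0.0, 0.4, 0.8),
    "policy_c": (0.0, 0.6, 0.6),
}
prior = (0.5, 0.2, 0.3)

# Prior conditioned on "not hypothesis 0".
conditioned = tuple(w / (1 - prior[0]) for w in prior[1:])

full = {k: mixture_value(prior, v) for k, v in values.items()}
cond = {k: mixture_value(conditioned, v[1:]) for k, v in values.items()}

best_full = max(full, key=full.get)
best_cond = max(cond, key=cond.get)
print(best_full, best_cond)
```

The full values are just the conditioned values scaled by the prior weight left over after removing hypothesis 0, so the argmax can't change.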

Hm, what about if you've just got two hypotheses, one where you're like "my knightian uncertainty scales with the amount of energy in the universe so if there's lots of energy available, things could be really bad, while if there's little energy available, Murphy can't make things bad" ($\psi_1$) and one where reality behaves pretty much as you'd expect it to ($\psi_2$)? And your two possible options would be "burn energy freely so Murphy can't use it" (the choice $f_1$, attaining a worst-case expected utility of $u_1$ in $\psi_1$ and $u_2$ in $\psi_2$), and "just try to make things good and don't worry about the environment being adversarial" (the choice $f_2$, attaining 0 utility in $\psi_1$, 1 utility in $\psi_2$).

The expected utility of $f_1$ (burn energy) would be

$$\zeta_1 u_1 + \zeta_2 u_2$$

And the expected utility of $f_2$ (act normally) would be

$$\zeta_1 \cdot 0 + \zeta_2 \cdot 1 = \zeta_2$$

So "act normally" wins if $\zeta_2 > \zeta_1 u_1 + \zeta_2 u_2$, which can be rearranged as $\zeta_2 (1 - u_2) > \zeta_1 u_1$. Ie, you'll act normally if the probability of "things are normal" times the loss from burning energy when things are normal exceeds the probability of "Murphy's malice scales with amount of available energy" times the gain from burning energy in that universe.

So, assuming you assign a high enough probability to "things are normal" in your prior, you'll just act normally. Or, making the simplifying assumption that "burn energy" has similar expected utilities in both cases (ie, $u_1 = u_2$), then it would come down to questions like "is the utility of burning energy closer to the worst-case where Murphy has free reign, or the best-case where I can freely optimize?"
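The comparison above is easy to play with numerically. This is a sketch with assumed names: `p_murphy` is the prior weight on the "malice scales with energy" hypothesis, and `u_burn_murphy`, `u_burn_normal` are the worst-case expected utilities of burning energy in the two hypotheses. Acting normally is taken to give 0 and 1 respectively, as in the example.

```python
def act_normally_wins(p_murphy, u_burn_murphy, u_burn_normal):
    """Compare the two options by prior-weighted worst-case expected utility."""
    p_normal = 1 - p_murphy
    eu_burn = p_murphy * u_burn_murphy + p_normal * u_burn_normal
    eu_normal = p_murphy * 0.0 + p_normal * 1.0
    return eu_normal > eu_burn

def act_normally_wins_rearranged(p_murphy, u_burn_murphy, u_burn_normal):
    """Same condition: P(normal) * loss-if-normal > P(Murphy) * gain-if-Murphy."""
    return (1 - p_murphy) * (1 - u_burn_normal) > p_murphy * u_burn_murphy

# Mostly confident things are normal: just act normally.
print(act_normally_wins(0.1, 0.5, 0.5))
# Mostly confident Murphy scales with energy: burn it.
print(act_normally_wins(0.9, 0.5, 0.5))
```

With equal burn utilities in both hypotheses, the condition collapses to "probability of normal exceeds the utility of burning energy", which is the "closer to worst-case or best-case" question.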

And this is assuming there's just two options, the actual strategy selected would probably be something like "act normally, if it looks like things are going to shit, start burning energy so it can't be used to optimize against me"

Note that, in particular, the hypothesis where the level of attainable badness scales with available energy is very different from the "screw you, you lose" hypothesis, since there are actions you can take that do better and worse in the "level of attainable badness scales with energy in the universe" hypothesis, while the "screw you, you lose" hypothesis just makes you lose. And both of these are very different from a "you lose if you don't take this exact sequence of actions" hypothesis. *Murphy is not a physical being, it's a personification of an equation, thinking verbally about an actual Murphy doesn't help because you start confusing very different hypotheses, think purely about what the actual set of probability distributions $\Psi_i$ corresponding to hypothesis $i$ looks like*. I can't stress this enough.

Also, remember, the goal is to maximize worst-case *expected* value, not worst-case value.

**Diffractor** on Introduction To The Infra-Bayesianism Sequence · 2021-03-25T02:54:33.564Z · LW · GW

There's actually an upcoming post going into more detail on what the deal is with pseudocausal and acausal belief functions, among several other things, I can send you a draft if you want. "Belief Functions and Decision Theory" is a post that hasn't held up nearly as well to time as "Basic Inframeasure Theory".

**Diffractor** on Introduction To The Infra-Bayesianism Sequence · 2021-03-24T19:17:02.796Z · LW · GW

If you use the Anti-Nirvana trick, your agent just goes "nothing matters at all, the foe will mispredict and I'll get -infinity reward" and rolls over and cries since all policies are optimal. Don't do that one, it's a bad idea.

For the concave expectation functionals: Well, there's another constraint or two, like monotonicity, but yeah, LF duality basically says that you can turn any (monotone) concave expectation functional into an inframeasure. Ie, all risk aversion can be interpreted as having radical uncertainty over some aspects of how the environment works and assuming you get worst-case outcomes from the parts you can't predict.

For your concrete example, that's why you have multiple hypotheses that are learnable. Sure, one of your hypotheses might have complete knightian uncertainty over the odd bits, but another hypothesis might not. Betting on the odd bits is advised by a more-informative hypothesis, for sufficiently good bets. And the policy selected by the agent would probably be something like "bet on the odd bits occasionally, and if I keep losing those bets, stop betting", as this wins in the hypothesis where some of the odd bits are predictable, and doesn't lose too much in the hypothesis where the odd bits are completely unpredictable and out to make you lose.

**Diffractor** on Introduction To The Infra-Bayesianism Sequence · 2021-03-22T05:33:01.912Z · LW · GW

Maximin, actually. You're maximizing your worst-case result.

It's probably worth mentioning that "Murphy" isn't an actual foe where it makes sense to talk about destroying resources lest Murphy use them, it's just a personification of the fact that we have a set of options, any of which could be picked, and we want to get the highest lower bound on utility we can for that set of options, so we assume we're playing against an adversary with perfectly opposite utility function for intuition. For that last paragraph, translating it back out from the "Murphy" talk, it's "wouldn't it be good to use resources in order to guard against worst-case outcomes within the available set of possibilities?" and this is just ordinary risk aversion.

For that equation, B can be *any* old set of probabilistic environments you want. You're not spending any resources or effort, a hypothesis just **is** a set of constraints/possibilities for what reality will do, a guess of the form "Murphy's operating under these constraints/must pick an option from this set."

You're completely right that for constraints like "environment must be a valid chess board", that's too loose of a constraint to produce interesting behavior, because Murphy is always capable of screwing you there.

This isn't too big of an issue in practice, because it's possible to mix together several infradistributions with a prior, which is like "a constraint on Murphy is picked according to this probability distribution/prior, then Murphy chooses from the available options of the hypothesis they picked". And as it turns out, you'll end up completely ignoring hypotheses where Murphy can screw you over no matter what you do. You'll choose your policy to do well in the hypotheses/scenarios where Murphy is more tightly constrained, and write the "you automatically lose" hypotheses off because it doesn't matter *what* you pick, you'll lose in those.

But there *is* a big unstudied problem of "what sorts of hypotheses are nicely behaved enough that you can converge to optimal behavior in them", that's on our agenda.

An example that might be an intuition pump, is that there's a very big difference between the hypothesis that is "Murphy can pick a coin of unknown bias at the start, and I have to win by predicting the coinflips accurately" and the hypothesis "Murphy can bias each coinflip individually, and I have to win by predicting the coinflips accurately". The important difference between those seems to be that past performance is indicative of future behavior in the first hypothesis and not in the second. For the first hypothesis, betting according to Laplace's law of succession would do well in the long run no matter *what* weighted coin Murphy picks, because you'll catch on pretty fast. For the second hypothesis, no strategy you can do can possibly help in that situation, because past performance isn't indicative of future behavior.
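A quick simulation of the two coin hypotheses, as a sketch under assumptions: the predictor guesses heads whenever the Laplace estimate (heads + 1) / (flips + 2) is at least 1/2, the first hypothesis is played with a fixed bias and a seeded random generator, and the second is played by an adversary that sets each flip against the (deterministic) guess. The function names are made up for the example.

```python
import random

def laplace_guess(heads, flips):
    """Guess heads iff Laplace's rule of succession puts heads at >= 1/2."""
    return (heads + 1) / (flips + 2) >= 0.5

def accuracy_fixed_bias(bias, n, seed=0):
    """Murphy picks one coin bias up front; flips are then i.i.d."""
    rng = random.Random(seed)
    heads = correct = 0
    for t in range(n):
        guess = laplace_guess(heads, t)
        outcome = rng.random() < bias
        correct += (guess == outcome)
        heads += outcome
    return correct / n

def accuracy_per_flip_adversary(n):
    """Murphy biases each flip individually, here always against the guess."""
    heads = correct = 0
    for t in range(n):
        guess = laplace_guess(heads, t)
        outcome = not guess
        correct += (guess == outcome)
        heads += outcome
    return correct / n

print(accuracy_fixed_bias(0.9, 1000))
print(accuracy_fixed_bias(0.1, 1000))
print(accuracy_per_flip_adversary(1000))
```

Against any fixed bias the predictor locks onto the majority side after a few flips, while the per-flip adversary makes every guess wrong, since the history carries no information about the next flip.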

**Diffractor** on Dark Matters · 2021-03-17T21:46:49.097Z · LW · GW

I found this Quanta magazine article about it which seems to indicate that it fits the CMB spectrum well but required a fair deal of fiddling with gravity to do so, but I lamentably lack the physics capabilities to evaluate the original paper.

**Diffractor** on Dark Matters · 2021-03-15T12:03:14.546Z · LW · GW

If there's something wrong with some theory, isn't it quite odd that looking around at different parts of the universe seems to produce such a striking level of agreement on how much missing mass there is? If there was some out-of-left-field thing, I'd expect it to have confusing manifestations in many different areas and astronomers angsting about dramatically inconsistent measurements, I would *not* expect the CMB to end up explained away (and the error bars on those measurements are really really small) by the same 5:1 mix of non-baryonic matter vs baryonic matter the astronomers were postulating for everything else.

In other words, if you were starting out blind, the "something else will be found for a theory" bucket would *not* start out with most of its probability mass on "and in every respect, including the data that hasn't come in yet since it's the 1980's now, it's gonna look exactly like the invisible mass scenario". It's certainly not ruled out, but it has taken a bit of a beating.

Also, physics is not obligated to make things easy to find. Like how making a particle accelerator capable of reaching the GUT scale to test Grand Unified Theories takes a particle accelerator the size of a solar system.

**Diffractor** on Dark Matters · 2021-03-15T11:53:33.819Z · LW · GW

Yes, pink is gas and purple is mass, but also the gas there makes up the dominant component of the visible mass in the Bullet Cluster, far outweighing the stars.

Also, physicists have come up with a *whole lot* of possible candidates for dark matter particles. The supersymmetry-based ones took a decent kicking at the LHC, and I'm unsure of the motivations for some of the other ones, but the two that look most promising (to me, others may differ in opinion) are axions and sterile neutrinos, as those were conjectured to plug holes in the Standard Model, so they've got a stronger physics motivation than the rest. But again, it might be something no physicist saw coming.

For axions, there's something in particle physics called the strong CP problem, where there's no theoretical reason whatsoever why strong-force interactions shouldn't break CP symmetry. And yet, as far as we can tell, the CP-symmetry-breakingness of the strong-force interaction is precisely zero. Axions were postulated as a way to deal with this, and for certain mass ranges, they would work. They'd be extremely light particles.

And for sterile neutrinos, there's a weird thing we've noticed where all the other quarks and leptons can have left-handed or right-handed chirality, but neutrinos *only* come in the left-handed form, nobody's ever found a right-handed neutrino. Also, in the vanilla Standard Model, neutrinos are supposed to be massless. And as it turns out, if you introduce some right-handed neutrinos and do a bit of physics fiddling, something called the seesaw mechanism shows up, which has the two effects of making ordinary neutrinos very light (and they are indeed thousands of times lighter than any other elementary particle with mass), and the right-handed neutrinos very heavy (so it's hard to make them at a particle accelerator). Also, since the weak interaction (the major way we know neutrinos are a thing) is sensitive to chirality, the right-handed neutrinos don't really do much of anything besides have gravity and have slight interactions with neutrinos, which are already hard to detect. So that's another possibility.

**Diffractor** on Avoid Unnecessarily Political Examples · 2021-01-13T21:45:55.619Z · LW · GW

I'd go with number 2, because my snap reaction was "ooh, there's a "show personal blogposts" button?"

EDIT: Ok, I found the button. The problem with that button is that it looks identical to the other tags, and is at the right side of the screen when the structure of "Latest" draws your eyes to the left side of the screen. I'd make it a bit bigger and on the left side of the screen.

**Diffractor** on Belief Functions And Decision Theory · 2021-01-05T06:06:48.975Z · LW · GW

So, first off, I should probably say that a lot of the formalism overhead involved in *this post in particular* feels like the sort of thing that will get a whole lot more elegant as we work more things out, but "Basic inframeasure theory" still looks pretty good at this point and worth reading, and the basic results (ability to translate from pseudocausal to causal, dynamic consistency, capturing most of UDT, definition of learning) will still hold up.

Yes, your current understanding is correct, it's rebuilding probability theory in more generality to be suitable for RL in nonrealizable environments, and capturing a much broader range of decision-theoretic problems, as well as whatever spin-off applications may come from having the basic theory worked out, like our infradistribution logic stuff.

It copes with unrealizability because its hypotheses are not probability distributions, but sets of probability distributions (actually more general than that, but it's a good mental starting point), corresponding to properties that reality may have, without fully specifying everything. In particular, if a class of belief functions (read: properties the environment may fulfill) is learned, this implies that for all properties within that class that the true environment fulfills (you don't know the true environment exactly), the infrabayes agent will match or exceed the expected utility lower bound that can be guaranteed if you know reality has that property (in the low-time-discount limit).

There's another key consideration which Vanessa was telling me to put in which I'll post in another comment once I fully work it out again.

Also, thank you for noticing that it took a lot of work to write all this up, the proofs took a while. n_n

**Diffractor** on Less Basic Inframeasure Theory · 2020-12-26T08:58:29.955Z · LW · GW

So, we've also got an analogue of KL-divergence for crisp infradistributions.

We'll be using and for crisp infradistributions, and and for probability distributions associated with them. will be used for the KL-divergence of infradistributions, and will be used for the KL-divergence of probability distributions. For crisp infradistributions, the KL-divergence is defined as

I'm not entirely sure why it's like this, but it has the basic properties you would expect of the KL-divergence, like concavity in both arguments and interacting well with continuous pushforwards and semidirect product.

Straight off the bat, we have:

**Proposition 1:**

Proof: KL-divergence between probability distributions is always nonnegative, by Gibbs' inequality.
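For reference, the classical facts about ordinary probability distributions that these propositions lean on are easy to check numerically. This sketch only covers the probability-distribution KL-divergence (not the infradistribution version), on a three-point space with made-up numbers: nonnegativity, zero at equality, and the identity that the divergence from the uniform distribution is ln|X| minus the entropy.

```python
import math

def kl(p, q):
    """KL-divergence D(p || q) in nats, for distributions on a finite set."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def entropy(p):
    """Shannon entropy of p in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

p = (0.5, 0.25, 0.25)
q = (0.2, 0.3, 0.5)
uniform = (1 / 3, 1 / 3, 1 / 3)

print(kl(p, q))                       # strictly positive since p != q
print(kl(p, p))                       # zero when the distributions coincide
print(kl(p, uniform), math.log(3) - entropy(p))   # these two agree
```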

**Proposition 2:**

And now, because KL-divergence between probability distributions is 0 only when they're equal, we have:

**Proposition 3:** *If ** is the uniform distribution on **, then *

And the cross-entropy of any distribution with the uniform distribution is always , so:

**Proposition 4:** *is a concave function over* .

Proof: Let's use as our number in in order to talk about mixtures. Then,

Then we apply concavity of the KL-divergence for probability distributions to get:

**Proposition 5: **

At this point we can abbreviate the KL-divergence, and observe that we have a multiplication by 1, to get:

And then pack up the expectation

Then, with the choice of and fixed, we can move the choice of the all the way inside, to get:

Now, there's something else we can notice. When choosing , it doesn't matter what is selected, you want to take every and maximize the quantity inside the expectation, that consideration selects your . So, then we can get:

And pack up the KL-divergence to get:

And distribute the min to get:

And then, we can pull out that fixed quantity and get:

And pack up the KL-divergence to get:

**Proposition 6:**

To do this, we'll go through the proof of proposition 5 to the first place where we have an inequality. The last step before inequality was:

Now, for a direct product, it's like semidirect product but all the and are the same infradistribution, so we have:

Now, this is a constant, so we can pull it out of the expectation to get:

**Proposition 7:**

For this, we'll need to use the Disintegration Theorem (the classical version for probability distributions), and adapt some results from Proposition 5. Let's show as much as we can before showing this.

Now, hypothetically, if we had

then we could use that result to get

and we'd be done. So, our task is to show

for any pair of probability distributions and . Now, here's what we'll do. The and gives us probability distributions over , and the and are probability distributions over . So, let's take the joint distribution over given by selecting a point from according to the relevant distribution and applying . By the classical version of the disintegration theorem, we can write it either way as starting with the marginal distribution over and a semidirect product to , or by starting with the marginal distribution over and you take a semidirect product with some markov kernel to to get the joint distribution. So, we have:

for some Markov kernels . Why? Well, the joint distribution over is given by or respectively (you have a starting distribution, and lets you take an input in and get an output in ). But, breaking it down the other way, we start with the marginal distribution of those joint distributions on (the pushforward w.r.t. ), and can write the joint distribution as semidirect product going the other way. Basically, it's just two different ways of writing the same distributions, so that's why KL-divergence doesn't vary at all.

Now, it is also a fact that, for semidirect products (sorry, we're gonna let be arbitrary here and unconnected to the fixed ones we were looking at earlier, this is just a general property of semidirect products), we have:

To see this, run through the proof of Proposition 5, because probability distributions are special cases of infradistributions. Running up to right up before the inequality, we had

But when we're dealing with probability distributions, there's only one possible choice of probability distribution to select, so we just have

Applying this, we have:

The first equality is our expansion of semidirect product for probability distributions, second equality is the probability distributions being equal, and third equality is, again, expansion of semidirect product for probability distributions. Contracting the two sides of this, we have:

Now, the KL-divergence between a distribution and itself is 0, so the expectation on the left-hand side is 0, and we have

And bam, we have which is what we needed to carry the proof through.

**Diffractor** on CO2 Stripper Postmortem Thoughts · 2020-12-15T18:58:16.577Z · LW · GW

It is currently disassembled in my garage, will be fully tested when the 2.0 version is built, and the 2.0 version has had construction stalled for this year because I've been working on other projects. The 1.0 version did remove CO2 from a room as measured by a CO2 meter, but the size and volume made it not worthwhile.

**Diffractor** on John_Maxwell's Shortform · 2020-11-03T04:10:06.479Z · LW · GW

Potential counterargument: Second-strike capabilities are still relevant in the interstellar setting. You could build a bunch of hidden ships in the Oort cloud to ram the foe and do equal devastation if the other party does it first, deterring a first strike even with tensions and an absence of communication. Further, while the "ram with high-relativistic objects" idea works pretty well for preemptively ending a civilization confined to a handful of planets, AIs would be able to colonize a bunch of little asteroids and KBOs and comets in the Oort cloud, and the higher level of dispersal would lead to preemptive total elimination being less viable.

**Diffractor** on Introduction to Cartesian Frames · 2020-10-22T19:16:06.355Z · LW · GW

I will be hosting a readthrough of this sequence on MIRIxDiscord again, PM for a link.

**Diffractor** on The rationalist community's location problem · 2020-09-30T09:54:26.100Z · LW · GW

Reno has 90F daily highs during summer. Knocking 10 degrees off is a non-negligible improvement over Las Vegas, though.

**Diffractor** on Needed: AI infohazard policy · 2020-09-21T20:50:53.784Z · LW · GW

So, here's some considerations (not an actual policy)

It's instructive to look at the case of nuclear weapons, and the key analogies or disanalogies to math work. For nuclear weapons, the basic theory is pretty simple and building the hardware is the hard part, while for AI, the situation seems reversed. The hard part there is knowing what to do in the first place, not scrounging up the hardware to do it.

First, a chunk from Wikipedia

Most of the current ideas of the Teller–Ulam design came into public awareness after the DOE attempted to censor a magazine article by U.S. anti-weapons activist Howard Morland in 1979 on the "secret of the hydrogen bomb". In 1978, Morland had decided that discovering and exposing this "last remaining secret" would focus attention onto the arms race and allow citizens to feel empowered to question official statements on the importance of nuclear weapons and nuclear secrecy. Most of Morland's ideas about how the weapon worked were compiled from highly accessible sources—the drawings which most inspired his approach came from the *Encyclopedia Americana*. Morland also interviewed (often informally) many former Los Alamos scientists (including Teller and Ulam, though neither gave him any useful information), and used a variety of interpersonal strategies to encourage informational responses from them (i.e., asking questions such as "Do they still use sparkplugs?" even if he wasn't aware what the latter term specifically referred to)....

When an early draft of the article, to be published in *The Progressive* magazine, was sent to the DOE after falling into the hands of a professor who was opposed to Morland's goal, the DOE requested that the article not be published, and pressed for a temporary injunction. After a short court hearing in which the DOE argued that Morland's information was (1). likely derived from classified sources, (2). if not derived from classified sources, itself counted as "secret" information under the "born secret" clause of the 1954 Atomic Energy Act, and (3). dangerous and would encourage nuclear proliferation...Through a variety of more complicated circumstances, the DOE case began to wane, as it became clear that some of the data they were attempting to claim as "secret" had been published in a students' encyclopedia a few years earlier....

Because the DOE sought to censor Morland's work—one of the few times they violated their usual approach of not acknowledging "secret" material which had been released—it is interpreted as being at least partially correct, though to what degree it lacks information or has incorrect information is not known with any great confidence.

So, broad takeaways from this: The Streisand effect is real. A huge part of keeping something secret is just having nobody suspect that there *is* a secret there to find. This is much trickier for nuclear weapons, which are of high interest to the state, while it's more doable for AI stuff (and I don't know how biosecurity has managed to stay so low-profile). This doesn't mean you can just wander around giving the rough sketch of the insight; in math, it's not too hard to reinvent things once you know what you're looking for. But, AI math does have a huge advantage in that it's a really broad field and hard to search through (I think my roommate said that so many papers get submitted to NeurIPS that you couldn't read through them all in time for the next NeurIPS conference), and, in order to reinvent something from scratch without having the fundamental insight, you need to be pointed in the *exact* right direction and even then you've got a good shot at missing it (see: the time-lag between the earliest neural net papers and the development of backpropagation, or, in the process of making the Infra-Bayes post, stumbling across concepts that could have been found months earlier if some time-traveler had said the right three sentences at the time.)

Also, secrets can get out through *really* dumb channels. Putting important parts of the H-bomb structure in a student's encyclopedia? Why would you do that? Well, probably because there's a lot of people in the government and people in different parts have different memories of which stuff is secret and which stuff isn't.

So, due to AI work being insight/math-based, security would be based a lot more on just... not telling people things, or even alluding to them. Although, there is an interesting possibility raised by the presence of so much other work in the field. For nuclear weapons work, things seem to be either secret or well-known among those interested in nuclear weapons. But AI has a big intermediate range between "secret" and "well-known". See all those Arxiv papers with like, 5 citations. So, for something that's kinda iffy (not serious enough (given the costs of the slowdown in research with full secrecy) to apply full secrecy, not benign enough to be comfortable giving a big presentation at NeurIPS about it), it might be possible to intentionally target that range. I don't think it's a binary between "full secret" and "full publish", there's probably intermediate options available.

Of course, if it's *known* that an organization is trying to fly under the radar with a result, you get the Streisand effect in full force. But, just as well-known authors may have pseudonyms, it's probably possible to just publish a paper on Arxiv (or something similar) under a pseudonym and not have it referenced anywhere by the organization as an official piece of research they funded. And it would be available for viewing and discussion and collaborative work in that form, while also (with high probability) remaining pretty low-profile.

Anyways, I'm gonna set a 10-minute timer to have thoughts about the guidelines:

Ok, the first thought I'm having is that this is probably a case where Inside View is just strictly better than Outside View. Making a policy ahead of time that can just be followed requires whoever came up with the policy to have a good classification, in advance, of all the relevant categories of result and what to do with them, and that seems pretty dang hard to do, especially because novel insights, almost by definition, are not something you expected to see ahead of time.

The next thought is that working something out for a while and then going "oh, this is roughly adjacent to something I wouldn't want to publish, when developed further" isn't *quite* as strong of an argument for secrecy as it looks like, because, as previously mentioned, even fairly basic additional insights (in retrospect) are pretty dang tricky to find ahead of time if you don't know what you're looking for. Roughly, the odds of someone finding the thing you want to hide scale with the number of people actively working on it, so that case seems to weigh in favor of publishing the result, but not actively publicizing it to the point where you can't befriend everyone else working on it. If one of the papers published by an organization could be built on to develop a serious result... well, you'd still have the problem of not knowing which paper it is, or what unremarked-on direction to go in to develop the result, if it was published as normal and not flagged as anything special. But if the paper got a whole bunch of publicity, the odds go up that someone puts the pieces together spontaneously. And, if you know everyone working on the paper, you've got a saving throw if someone runs across the thing.

There *is* a *very* strong argument for talking to several other people if you're unsure whether it'd be good to publish/publicize, because it reduces the problem of "person with laxest safety standards publicizes" to "organization with the laxest safety standards publicizes". This isn't a full solution, because there's still a coordination problem at the organization level, and it gives incentives for organizations to be really defensive about sharing their stuff, including safety-relevant stuff. Further work on the inter-organization level of "secrecy standards" is very much needed. But within an organization, "have personal conversation with senior personnel" sounds like the obvious thing to do.

So, current thoughts: There's some intermediate options available instead of just "full secret" or "full publish" (publish under pseudonym and don't list it as research, publish as normal but don't make efforts to advertise it broadly) and I haven't seen anyone mention that, and they seem preferable for results that would benefit from more eyes on them, that could also be developed in bad directions. I'd be skeptical of attempts to make a comprehensive policy ahead of time, this seems like a case where inside view on the details of the result would outperform an ahead-of-time policy. But, one essential aspect that *would* be critical on a policy level is "talk it out with a few senior people first to make the decision, instead of going straight for personal judgement", as that tamps down on the coordination problem considerably.

**Diffractor**on CO2 Stripper Postmortem Thoughts · 2020-09-08T20:02:28.910Z · LW · GW

Person in a room: -35 g of O2/hr from room

Person in a room with a CO2 stripper: -35 g of O2/hr from room

How does the presence of a CO2 stripper do *anything at all* to the oxygen amount in the air?

**Diffractor**on Introduction To The Infra-Bayesianism Sequence · 2020-09-05T05:01:29.794Z · LW · GW

Do you think this problem is essentially different from "suppose Omega asks you for 10 bucks. You say no. Then Omega says 'actually I flipped a fair coin that came up tails; if it had come up heads, I would have given you 100 dollars if I predicted you'd give me 10 dollars on tails'"?

(I think I can motivate "reconsider choosing heads" if you're like "yeah, this is just counterfactual mugging with belated notification of what situation you're in, and I'd pay up in that circumstance")
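To make the numbers in that version of the problem concrete, here's a minimal sketch of the expected value of each policy as evaluated *before* the coin flip. It assumes a fair coin and a perfectly accurate predictor; the function and parameter names are made up for illustration.

```python
from fractions import Fraction

def policy_value(pays_on_tails, reward=100, cost=10, p_heads=Fraction(1, 2)):
    """Expected value of a policy, evaluated before the coin is flipped.

    Assumes a perfect predictor: on heads, Omega hands over `reward`
    only if the policy is one that would pay `cost` on tails.
    """
    heads_payoff = reward if pays_on_tails else 0
    tails_payoff = -cost if pays_on_tails else 0
    return p_heads * heads_payoff + (1 - p_heads) * tails_payoff

pay = policy_value(True)       # policy: hand over the 10 bucks on tails
refuse = policy_value(False)   # policy: always say no
print(pay, refuse)
```

The paying policy comes out ahead on average, even though in the tails branch, taken by itself, paying is a pure loss.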

**Diffractor**on Introduction To The Infra-Bayesianism Sequence · 2020-09-05T03:11:16.319Z · LW · GW

Maximin over outcomes would lead to the agent devoting all its efforts towards avoiding the worst outcomes, sacrificing overall utility, while maximin over expected value pushes towards policies that do acceptably on average in all of the environments that it may find itself in.
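Here's a toy sketch of the difference, with made-up policies, environments, and utilities (none of these numbers come from the posts). Each policy induces a probability distribution over utilities in each environment.

```python
# Each policy maps each environment to a list of (probability, utility) pairs.
# All names and numbers here are hypothetical, chosen only to show the contrast.
policies = {
    "cautious": {
        "env_a": [(1.0, 0.30)],
        "env_b": [(1.0, 0.30)],
    },
    "bold": {
        "env_a": [(0.95, 0.90), (0.05, 0.10)],
        "env_b": [(0.90, 0.80), (0.10, 0.10)],
    },
}

def worst_outcome(envs):
    """Worst single utility that can occur in any environment."""
    return min(u for dist in envs.values() for _, u in dist)

def worst_expected_value(envs):
    """Expected utility in the least favorable environment."""
    return min(sum(p * u for p, u in dist) for dist in envs.values())

by_outcome = max(policies, key=lambda name: worst_outcome(policies[name]))
by_expectation = max(policies, key=lambda name: worst_expected_value(policies[name]))
print(by_outcome, by_expectation)
```

Maximin over outcomes picks the cautious policy just to dodge a rare bad draw, while maximin over expected value picks the bold policy, which does acceptably on average in both environments.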

Regarding "why listen to past me", I guess to answer this question I'd need to ask about your intuitions on Counterfactual mugging. What would you do if it's one-shot? What would you do if it's repeated? If you were told about the problem beforehand, would you pay money for a commitment mechanism to make future-you pay up the money if asked? (for +EV)

**Diffractor**on Basic Inframeasure Theory · 2020-09-01T04:58:47.146Z · LW · GW

Yeah, looking back, I should probably fix the m- part and have the signs being consistent with the usual usage where it's a measure minus another one, instead of the addition of two signed measures, one a measure and one a negative measure. May be a bit of a pain to fix, though, the proof pages are extremely laggy to edit.

Wikipedia's definition can be matched up with our definition by fixing a partial order where iff there's a that's a sa-measure s.t. , and this generalizes to any closed convex cone. I lifted the definition of "positive functional" from Vanessa, though, you'd have to chat with her about it.

We're talking about linear functions, not affine ones. is linear, not affine (regardless of f and c, as long as they're in and , respectively). Observe that it maps the zero of to 0.

**Diffractor**on Basic Inframeasure Theory · 2020-09-01T04:45:29.013Z · LW · GW

We go to the trouble of sa-measures because it's possible to add a sa-measure to an a-measure, and get another a-measure where the expectation values of all the functions went up, while the new a-measure we landed at would be impossible to make by adding an a-measure to an a-measure.

Basically, we've gotta use sa-measures for a clean formulation of "we added all the points we possibly could to this set", getting the canonical set in your equivalence class.

Admittedly, you *could* intersect with the cone of a-measures again at the end (as we do in the next post) but then you wouldn't get the nice LF-duality tie-in.

Adding the cone of a-measures instead would correspond to being able to take expectation values of continuous functions in , instead of in [0,1], so I guess you could reformulate things this way, but IIRC the 0-1 normalization doesn't work as well (ie, there's no motive for why you're picking 1 as the thing to renormalize to 1 instead of, say, renormalizing 10 to 10). We've got a candidate other normalization for that case, but I remember being convinced that it doesn't work for belief functions, but for the Nirvana-as-1-reward-forever case, I remember getting really confused about the relative advantages of the two normalizations. And apparently, when working on the internal logic of infradistributions, this version of things works better.

So, basically, if you drop sa-measures from consideration you don't get the nice LF-duality tie in and you don't have a nice way to express how upper completion works. And maybe you could work with a-measures and use upper completion w.r.t. a different cone and get a slightly different LF-duality, but then that would make normalization have to work differently and we haven't really cleaned up the picture of normalization in that case yet and how it interacts with everything else. I remember me and Vanessa switched our opinions like 3 times regarding which upper completion to use as we kept running across stuff and going "wait no, I take back my old opinion, the other one works better with this".

**Diffractor**on Introduction To The Infra-Bayesianism Sequence · 2020-08-28T05:48:29.752Z · LW · GW

Can you elaborate on what you meant by locally distinguishing between hypotheses?

**Diffractor**on Coronavirus: Justified Practical Advice Thread · 2020-03-01T23:35:21.435Z · LW · GW

If hospitals are overwhelmed, it's valuable to have a component of the hospital treatment plan for pneumonia on-hand to treat either yourself or others who have it especially bad. One of these is oxygen concentrators, which are not sold out yet and are ~$400 on Amazon. This doesn't deal with especially severe cases, but for cases which fall in the "shortness of breath, low blood oxygen" class without further medical complications, it'd probably be useful if you can't or don't want to go to a hospital due to overload. https://www.who.int/publications-detail/clinical-management-of-severe-acute-respiratory-infection-when-novel-coronavirus-(ncov)-infection-is-suspected mentions oxygen treatment as the first thing to do for low blood oxygen levels.

**Diffractor**on (A -> B) -> A · 2020-02-15T06:35:15.343Z · LW · GW

I found a paper about this exact sort of thing. Escardó and Oliva call that type signature a "selection functional", and the type signature (A -> B) -> B is called a "quantification functional", and there's several interesting things you can do with them, like combining multiple selection functionals into one in a way that looks reminiscent of game theory (ie, if ε has type signature (A -> C) -> A, and δ has type signature (B -> C) -> B, then ε ⊗ δ has type signature (A × B -> C) -> A × B).
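A rough sketch of what this looks like in code, as I understand the construction (the helper names `argmax_over`, `argmin_over`, `to_quantifier`, and `product` are mine, not the paper's). A selection functional takes a scoring function and returns a point; the product of two of them takes a scoring function on pairs and returns a pair, with the second selector responding optimally to whatever the first one picks.

```python
def argmax_over(candidates):
    """Selection functional (A -> B) -> A: pick the candidate with the highest score."""
    def select(score):
        return max(candidates, key=score)
    return select

def argmin_over(candidates):
    """Selection functional (A -> B) -> A: pick the candidate with the lowest score."""
    def select(score):
        return min(candidates, key=score)
    return select

def to_quantifier(select):
    """Turn a selection functional into a quantification functional (A -> B) -> B."""
    return lambda score: score(select(score))

def product(select_a, select_b):
    """Combine selectors over A and B into one selector over pairs (a, b)."""
    def select(score):
        def best_b(a):
            # What the second selector would pick once `a` is fixed.
            return select_b(lambda b: score((a, b)))
        # The first selector chooses `a` anticipating the second one's response.
        a = select_a(lambda a: score((a, best_b(a))))
        return (a, best_b(a))
    return select

# Two-move sequential game: first player maximizes, second player minimizes.
moves = [0, 1]
table = [[3, 1], [2, 2]]
payoff = lambda pair: table[pair[0]][pair[1]]
play = product(argmax_over(moves), argmin_over(moves))
print(play(payoff), to_quantifier(play)(payoff))
```

The game-theory flavor shows up in `product`: it amounts to backward induction on a two-step game, and the quantifier version returns the value of the game rather than the moves.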

**Diffractor**on Counterfactual Induction · 2019-12-19T07:45:05.299Z · LW · GW

Oh, I see what the issue is. Propositional tautology given means , not . So yeah, when A is a boolean that is equivalent to via boolean logic alone, we can't use that A for the exact reason you said, but if A isn't equivalent to via boolean logic alone (although it may be possible to infer by other means), then the denominator isn't necessarily small.

**Diffractor**on Counterfactual Induction · 2019-12-18T20:21:02.188Z · LW · GW

Yup, a monoid, because and , so it acts as an identity element, and we don't care about the order. Nice catch.

You're also correct about what propositional tautology given A means.

**Diffractor**on Counterfactual Induction (Algorithm Sketch, Fixpoint proof) · 2019-12-18T08:17:42.580Z · LW · GW

Yup! The subscript is the counterfactual we're working in, so you can think of it as a sort of conditional pricing.

The prices aren't necessarily unique, we set them anew on each turn, and there may be multiple valid prices for each turn. Basically, the prices are just set so that the supertrader doesn't earn money in any of the "possible" worlds that we might be in. Monotonicity is just "the price of a set of possibilities is greater than the price of a subset of possibilities"

**Diffractor**on Counterfactual Induction · 2019-12-18T06:58:05.392Z · LW · GW

If there's a short proof of from and a short proof of from and they both have relatively long disproofs, then counterfacting on , should have a high value, and counterfacting on , should have a high value.

The way to read is that the stuff on the left is your collection of axioms ( is a finite collection of axioms and just means we're using the stuff in as well as the statement as our axioms), and it proves some statement.

For the first formulation of the value of a statement, the value would be 1 if adding doesn't provide any help in deriving a contradiction from A. Or, put another way, the shortest way of proving , assuming A as your axioms, is to derive and use principle of explosion. It's "independent" of A, in a sense.

There's a technicality for "equivalent" statements. We're considering "equivalent" as "propositionally equivalent given A" (Ie, it's possible to prove an iff statement with only the statements in A and boolean algebra alone. For example, is a statement provable with only boolean algebra alone. If you can prove the iff but you can't do it with boolean algebra alone, it doesn't count as equivalent. Unless is propositionally equivalent to , then is not equivalent to , (because maybe is false and is true) which renders the equality you wrote wrong, as well as making the last paragraph incoherent.

In classical probability theory, holds iff is 0. Ie, if it's impossible for both things to happen, the probability of "one of the two things happen" is the same as the sum of the probabilities for event 1 and event 2.

In our thing, we only guarantee equality for when (assuming A). This is because (first two = by propositionally equivalent statements getting the same value, the third = by being propositionally equivalent to assuming , fourth = by being propositionally equivalent to , final = by unitarity). Equality may hold in some other cases, but you don't have a guarantee of such, even if the two events are disjoint, which is a major difference from standard probability theory.

The last paragraph is confused, as previously stated. Also, there's a law of boolean algebra that is the same as . Also, the intuition is wrong, should be less than , because "probability of event 1 happens" is greater than "probability that event 1 and event 2 happens".

Highlighting something and pressing ctrl-4 turns it to LaTeX.

**Diffractor**on CO2 Stripper Postmortem Thoughts · 2019-12-01T23:34:47.690Z · LW · GW

Yup, this turned out to be a *crucial consideration* that makes the whole project look a lot less worthwhile. If ventilation at a bad temperature is available, it's cheaper to just get a heat exchanger and ventilate away and eat the increased heating costs during winter than to do a CO2 stripper.

There's still a remaining use case for rooms without windows that aren't amenable to just feeding an air duct outside, but that's a lot more niche than my original expectations. Gonna edit the original post now.

**Diffractor**on CO2 Stripper Postmortem Thoughts · 2019-12-01T00:46:05.951Z · LW · GW

Also, a paper on extremely high-density algal photobioreactors quotes algal concentration by volume as being as high as 6% under optimal conditions. The dry mass is about 1/8 of the wet mass of algae, so that's a dry-mass concentration of 0.75% by weight. If the algal inventory in your reactor is 9 kg dry mass (you'd need to waste about 3 kg/day of dry weight or 24 kg/day of wet weight, to keep up with 2 people worth of CO2, or a third of the algae each day), that's 1200 kg of water in your reactor. Since a gallon is about 4 kg of water, that's... 300 gallons, or six 55-gallon drums, footprint 4 ft x 6 ft x 4 ft high, at a bare minimum (probably 3x that volume in practice), so we get the same general sort of result from a different direction.

I'd be quite surprised if you could do that in under a thousand dollars.
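The reactor-size arithmetic above, spelled out as a quick sketch. The constant names are mine, and it treats wet algae as having roughly the density of water, which is what converting the volume percentage into a weight percentage implicitly does.

```python
import math

ALGAE_VOLUME_FRACTION = 0.06   # wet algae by volume, under optimal conditions
DRY_TO_WET_RATIO = 1 / 8       # dry mass is about 1/8 of wet mass
DRY_INVENTORY_KG = 9.0         # dry algae held in the reactor
KG_PER_GALLON = 4.0            # rounded, as in the comment
DRUM_GALLONS = 55

# Assumption: wet algae is about as dense as water, so volume % ~ weight %.
dry_fraction = ALGAE_VOLUME_FRACTION * DRY_TO_WET_RATIO    # about 0.75%
water_kg = DRY_INVENTORY_KG / dry_fraction                 # about 1200 kg
gallons = water_kg / KG_PER_GALLON                         # about 300 gallons
drums = math.ceil(gallons / DRUM_GALLONS)                  # whole drums needed

print(round(water_kg), round(gallons), drums)
```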

**Diffractor**on CO2 Stripper Postmortem Thoughts · 2019-12-01T00:30:44.731Z · LW · GW

[EDIT: I see numbers as high as 4 g/L/day quoted for algae growth rates, I updated the reasoning accordingly]

The numbers don't quite add up on an algae bioreactor for personal use. The stated growth rate for chlorella algae is 0.6 g/L/day, and there are about 4 liters in a gallon, so 100 gallons of algae solution is 400 liters, which is 240 g of algae grown per day, and since about 2/3 of new biomass comes from CO2 via the 6CO2+6H2O->C6H12O6 reaction, that's 160 g of CO2 locked up per day, or... about 1/6 of a person worth of CO2 in a 24 hour period. [EDIT: 1 person worth of CO2 in a 24 hour period, looks more plausible]

Plants are inefficient at locking up CO2 relative to chemical reactions!

Also you wouldn't be able to just have the algae as a giant vat, because light has to penetrate in, so the resulting reactor to lock up 1/6 [EDIT: 1] of a person worth of CO2 would be substantially larger than the footprint of two 55-gallon drums.
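Here's the same estimate as a small sketch, run at both the original 0.6 g/L/day and the edited 4 g/L/day growth rate. The 1 kg/day of CO2 per person is an assumption, backed out from "160 g is about 1/6 of a person"; the function and constant names are made up.

```python
LITERS_PER_GALLON = 4            # rounded, as in the comment
CO2_SHARE_OF_BIOMASS = 2 / 3     # share of new biomass that comes from CO2
PERSON_CO2_G_PER_DAY = 1000      # ASSUMED: about 1 kg/day, implied by the 1/6 figure

def people_supported(gallons, growth_g_per_l_day):
    """How many people's daily CO2 output a reactor of this size could lock up."""
    liters = gallons * LITERS_PER_GALLON
    algae_g_per_day = liters * growth_g_per_l_day
    co2_g_per_day = algae_g_per_day * CO2_SHARE_OF_BIOMASS
    return co2_g_per_day / PERSON_CO2_G_PER_DAY

low = people_supported(100, 0.6)    # original growth rate
high = people_supported(100, 4.0)   # growth rate from the EDIT
print(round(low, 2), round(high, 2))
```

With the lower growth rate, 100 gallons handles roughly a sixth of a person; with the higher one, roughly one person.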

**Diffractor**on CO2 Stripper Postmortem Thoughts · 2019-12-01T00:21:36.060Z · LW · GW

I have the relevant air sensor, it'd be really hard to blind it because it makes noise, and the behavioral effects thing is a good idea, thank you.

It's not currently with me.

I think the next thing to do is build the 2.0 design, because it should perform better and will also be present with me, then test the empirical CO2 reduction and behavioral effects (although, again, blinding will be difficult), and reevaluate at that point.

**Diffractor**on So You Want to Colonize The Universe Part 5: The Actual Design · 2019-08-23T05:59:57.495Z · LW · GW

Good point on phase 6. For phase 3, smaller changes in velocity further out are fine, but I still think that even with smaller velocity changes, you'll still have difficulty finding an engine that gets sufficient delta-V that isn't fission/fusion/antimatter based. (also in the meantime I realized that neutron damage over those sorts of timescales is going to be *really* bad.) For phase 5, I don't think a lightsail would provide enough deceleration, because you've got inverse-square losses. Maybe you could decelerate with a lightsail in the inner stellar system, but I think you'd just breeze right through since the radius of the "efficiently slow down" sphere is too small relative to how much you slow down, and in the outer stellar system, light pressure is too low to slow you down meaningfully.

**Diffractor**on So You Want To Colonize The Universe Part 3: Dust · 2019-08-23T05:53:19.550Z · LW · GW

Very good point!

**Diffractor**on 87,000 Hours or: Thoughts on Home Ownership · 2019-07-08T05:05:39.761Z · LW · GW

I'd be extremely interested in the quantitative analysis you've done so far.

**Diffractor**on Open Problems Regarding Counterfactuals: An Introduction For Beginners · 2019-03-25T21:41:31.947Z · LW · GW

See if this works.

**Diffractor**on So You Want to Colonize The Universe Part 4: Velocity Changes and Energy · 2019-03-02T18:17:30.135Z · LW · GW

I'm talking about using a laser sail to get up to near c (0.1 g acceleration for 40 lightyears is pretty strong) in the first place, and slowing down by other means.

This trick is about using a laser sail for both acceleration and deceleration.
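One way to sanity-check the "0.1 g for 40 lightyears gets you near c" figure, as a rough sketch. It assumes constant proper acceleration from rest over the whole distance (a real laser sail wouldn't hold the thrust perfectly constant), in which case the Lorentz factor is 1 + a·d/c². The function name and constants are mine.

```python
import math

C = 299_792_458.0        # speed of light, m/s
G = 9.80665              # standard gravity, m/s^2
LIGHT_YEAR = 9.4607e15   # meters

def final_speed_fraction(accel_g, distance_ly):
    """v/c after constant proper acceleration from rest over `distance_ly`."""
    gamma = 1 + (accel_g * G) * (distance_ly * LIGHT_YEAR) / C**2
    return math.sqrt(1 - 1 / gamma**2)

beta = final_speed_fraction(0.1, 40)
print(f"{beta:.3f} c")
```

Under those assumptions the craft ends up at roughly 98% of lightspeed.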

**Diffractor**on So You Want to Colonize The Universe Part 4: Velocity Changes and Energy · 2019-03-02T02:13:19.223Z · LW · GW

Yeah, I think the original proposal for a solar sail involved deceleration by having the central part of the sail detach and receive the reflected beam from the outer "ring" of the sail. I didn't do this because IIRC the beam only maintains coherence over 40 lightyears or so, so that trick would be for nearby missions.

**Diffractor**on So You Want To Colonize The Universe Part 3: Dust · 2019-02-28T21:31:33.143Z · LW · GW

For 1, the mental model for non-relativistic but high speeds should be "a shallow crater is instantaneously vaporized out of the material going fast" and for relativistic speeds, it should be the same thing but with the vaporization directed in a deeper hole (energy doesn't spread out as much, it keeps in a narrow cone) instead of in all directions. However, your idea of having a spacecraft as a big flat sheet and being able to tolerate having a bunch of holes being shot in it is promising. The main issue that I see is that this approach is incompatible with a lot of things that (as far as we know) can only be done with solid chunks of matter, like antimatter energy capture, or having sideways-boosting rockets, and once you start armoring the solid chunks in the floaty sail, you're sort of back in the same situation. So it seems like an interesting approach and it'd be cool if it could work but I'm not quite sure it can (not entirely confident that it couldn't, just that it would require a bunch of weird solutions to stuff like "how does your sheet of tissue boost sideways at 0.1% of lightspeed").

For 2, the problem is that the particles which are highly penetrating are either unstable (muons, kaons, neutrons...) and will fall apart well before arrival (and that's completely dodging the issue of making bulk matter out of them), or they are stable (neutrinos, dark matter), and don't interact with anything, and since they don't really interact with anything, this means they *especially* don't interact with themselves (well, at least we know this for neutrinos), so they can't hold together any structure, nor can they interact with matter at the destination. Making a craft out of neutrinos is **ridiculously** more difficult than making a craft out of room-temperature air. If they can go through a light-year of lead without issue, they aren't exactly going to stick to each other. Heck, I think you'd actually have better luck trying to make a spaceship out of pure light.

For 3, it's because in order to use ricocheting mass to power your starcraft, you need to already have some way of ramping the mass up to relativistic speeds so it can get to the rapidly retreating starcraft in the first place, and you need an awful lot of mass. Light already starts off at the most relativistic speed of all, and around a star you already have astronomical amounts of light available for free.

For 4, there sort of is, but mostly not. The gravity example has the problem of the speeding up of the craft when it has the two stars ahead of it perfectly counterbalancing the backwards deceleration when the two stars are behind it. For potentials like gravity or electrical fields or pretty much anything you'd want to use, there's an inverse-square law for them, which means that they aren't really relevant unless you're fairly close to a star. The one instance I can think of where something like your approach is the case is the electric sail design in the final part. In interstellar space, it brakes against the thin soup of protons as usual, but nearby a star, the "wind" of particles streaming out from the star acts as a more effective brake and it can sail on that (going out), or use it for better deceleration (coming in). Think of it as a sail slowing a boat down when the air is stationary, and slowing down even better when the wind is blowing against you.

**Diffractor**on So You Want to Colonize The Universe · 2019-02-28T21:10:48.895Z · LW · GW

Whoops, I guess I messed up on that setting. Yeah, it's ok.

**Diffractor**on So You Want to Colonize The Universe Part 4: Velocity Changes and Energy · 2019-02-28T01:07:23.941Z · LW · GW

Actually, no! The activation energy for the conversion of diamond to graphite is about 540 kJ/mol, and using the Arrhenius equation to get the rate constant for diamond-graphite conversion, with a radiator temperature of 1900 K, we get that after 10,000 years of continuous operation, 99.95% of the diamond will still be diamond. At room temperature, the diamond-to-graphite conversion rate is slow enough that protons will decay before any appreciable amount of graphite is made.

Even for a 100,000 year burn, 99.5% of the diamond will still be intact at 1900 K.

There isn't much room to ramp up the temperature, though. We can stick to around 99%+ of the diamond being intact up to around 2100 K, but 2200 K has 5% of the diamond converting, 2300 K has 15% converting, 2400 K has 45%, and it's 80% and 99% conversion of diamond into graphite over 10,000 years for 2500 K and 2600 K respectively.
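For anyone who wants to poke at these numbers, here's a minimal sketch of the Arrhenius calculation. Two things in it are assumptions on my part, not stated in the comment: first-order kinetics (so the intact fraction decays as exp(-kt)), and a pre-exponential factor of 1 per second, which I picked because it lands close to the percentages quoted above. A different prefactor would shift all the temperatures.

```python
import math

R = 8.314                    # gas constant, J/(mol K)
EA = 540_000.0               # activation energy quoted above, J/mol
A = 1.0                      # ASSUMED pre-exponential factor, 1/s (not given in the comment)
SECONDS_PER_YEAR = 3.156e7

def fraction_intact(temp_k, years):
    """Fraction of diamond left after `years` at `temp_k`, assuming first-order decay."""
    k = A * math.exp(-EA / (R * temp_k))           # Arrhenius rate constant, 1/s
    return math.exp(-k * years * SECONDS_PER_YEAR)

for temp_k in (1900, 2100, 2200, 2300, 2400, 2500, 2600):
    converted = 1 - fraction_intact(temp_k, 10_000)
    print(f"{temp_k} K: {100 * converted:.2f}% converted after 10,000 years")
```

The steep jump between 2200 K and 2600 K is just the exponential in 1/T doing its thing: a few hundred kelvin changes the rate constant by about two orders of magnitude.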

**Diffractor**on So You Want to Colonize The Universe · 2019-02-27T19:31:12.774Z · LW · GW

Agreed. Also, there's an incentive to keep thinking about how to go faster, instead of launching, until one more day of design work gets the rocket there less than one day sooner (otherwise you'll get overtaken), and an incentive to agree on a coordinated plan ahead of time (you get this galaxy, I get that galaxy, etc...) to avoid issues with lightspeed delays.

**Diffractor**on So You Want to Colonize The Universe · 2019-02-27T19:28:57.684Z · LW · GW

Or maybe accepting messages from home (in rocket form or not) of "whoops, we were wrong about X, here's the convincing moral argument" and acting accordingly. Then the only thing to be worried about would be irreversible acts done in the process of colonizing a galaxy, instead of having a bad "living off resources" endstate.

**Diffractor**on So You Want to Colonize the Universe Part 2: Deep Time Engineering · 2019-02-27T18:08:20.973Z · LW · GW

Edited. Thanks for that. I guess I managed to miss both of those, I was mainly going off of the indispensable and extremely thorough Atomic Rockets site having extremely little discussion of intergalactic missions as opposed to interstellar missions.

It looks like there are some spots where me and Armstrong converged on the same strategy (using lasers to launch probes), but we seem to disagree about how big of a deal dust shielding is, how hard deceleration is, and what strategy to use for deceleration.

**Diffractor**on So You Want to Colonize The Universe · 2019-02-27T17:57:20.690Z · LW · GW

Yeah, Atomic Rockets was an incredibly helpful resource for me, I definitely endorse it for others.

**Diffractor**on What makes people intellectually active? · 2019-01-15T01:16:04.921Z · LW · GW

This doesn't quite seem right, because just multiplying probabilities only works when all the quantities are independent. However, I'd put higher odds on someone having the ability to recognize a worthwhile result conditional on them having an ability to work on a problem than on them having the ability to recognize a worthwhile result unconditionally, so the product of the probabilities will be higher than it seems at first.

I'm unsure whether this consideration affects whether the distribution would be lognormal or not.
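A toy illustration of the independence point, with entirely made-up probabilities: the chance of having both abilities is P(work) × P(recognize | work), which only equals P(work) × P(recognize) if the two abilities are independent.

```python
# All three numbers are hypothetical, chosen only to show the size of the gap.
p_work = 0.1                   # P(can work on a problem)
p_recognize = 0.1              # P(can recognize a worthwhile result)
p_recognize_given_work = 0.5   # higher than p_recognize, since the skills are correlated

independent_estimate = p_work * p_recognize            # naive product
dependent_estimate = p_work * p_recognize_given_work   # chain rule with the conditional

print(independent_estimate, dependent_estimate)
```

With these numbers, the naive product understates the joint probability by a factor of five.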