Counterfactual Induction (Lemma 4) 2019-12-17T05:05:15.959Z · score: 4 (1 votes)
Counterfactual Induction (Algorithm Sketch, Fixpoint proof) 2019-12-17T05:04:25.054Z · score: 5 (1 votes)
Counterfactual Induction 2019-12-17T05:03:32.401Z · score: 23 (5 votes)
CO2 Stripper Postmortem Thoughts 2019-11-30T21:20:33.685Z · score: 121 (37 votes)
A Brief Intro to Domain Theory 2019-11-21T03:24:13.416Z · score: 19 (10 votes)
So You Want to Colonize The Universe Part 5: The Actual Design 2019-02-27T10:23:28.424Z · score: 17 (11 votes)
So You Want to Colonize The Universe Part 4: Velocity Changes and Energy 2019-02-27T10:22:46.371Z · score: 13 (8 votes)
So You Want To Colonize The Universe Part 3: Dust 2019-02-27T10:20:14.780Z · score: 17 (10 votes)
So You Want to Colonize the Universe Part 2: Deep Time Engineering 2019-02-27T10:18:18.209Z · score: 13 (8 votes)
So You Want to Colonize The Universe 2019-02-27T10:17:50.427Z · score: 16 (15 votes)
Failures of UDT-AIXI, Part 1: Improper Randomizing 2019-01-06T03:53:03.563Z · score: 15 (6 votes)
COEDT Equilibria in Games 2018-12-06T18:00:08.442Z · score: 15 (4 votes)
Oracle Induction Proofs 2018-11-28T08:12:38.306Z · score: 6 (2 votes)
Bounded Oracle Induction 2018-11-28T08:11:28.183Z · score: 30 (11 votes)
What are Universal Inductors, Again? 2018-11-07T22:32:57.364Z · score: 11 (5 votes)
When EDT=CDT, ADT Does Well 2018-10-25T05:03:40.366Z · score: 14 (4 votes)
Asymptotic Decision Theory (Improved Writeup) 2018-09-27T05:17:03.222Z · score: 29 (8 votes)
Reflective AIXI and Anthropics 2018-09-24T02:15:18.108Z · score: 19 (8 votes)
Cooperative Oracles 2018-09-01T08:05:55.899Z · score: 19 (11 votes)
VOI is Only Nonnegative When Information is Uncorrelated With Future Action 2018-08-31T05:13:11.916Z · score: 23 (10 votes)
Probabilistic Tiling (Preliminary Attempt) 2018-08-07T01:14:15.558Z · score: 15 (9 votes)
Conditioning, Counterfactuals, Exploration, and Gears 2018-07-10T22:11:52.473Z · score: 30 (6 votes)
Logical Inductor Tiling and Why it's Hard 2018-06-14T06:34:36.000Z · score: 2 (2 votes)
A Loophole for Self-Applicative Soundness 2018-06-11T07:57:26.000Z · score: 1 (1 votes)
Resource-Limited Reflective Oracles 2018-06-06T02:50:42.000Z · score: 4 (3 votes)
Logical Inductors Converge to Correlated Equilibria (Kinda) 2018-05-30T20:23:54.000Z · score: 2 (2 votes)
Logical Inductor Lemmas 2018-05-26T17:43:52.000Z · score: 0 (0 votes)
Two Notions of Best Response 2018-05-26T17:14:19.000Z · score: 0 (0 votes)
Doubts about Updatelessness 2018-05-03T05:44:12.000Z · score: 1 (1 votes)
Program Search and Incomplete Understanding 2018-04-29T04:32:22.125Z · score: 42 (14 votes)
No Constant Distribution Can be a Logical Inductor 2018-04-07T09:09:49.000Z · score: 13 (6 votes)
Musings on Exploration 2018-04-03T02:15:17.000Z · score: 1 (1 votes)
A Difficulty With Density-Zero Exploration 2018-03-27T01:03:03.000Z · score: 0 (0 votes)
Distributed Cooperation 2018-03-18T05:46:56.000Z · score: 2 (2 votes)
Passing Troll Bridge 2018-02-25T08:21:17.000Z · score: 1 (1 votes)
Further Progress on a Bayesian Version of Logical Uncertainty 2018-02-01T21:36:39.000Z · score: 3 (2 votes)
Strategy Nonconvexity Induced by a Choice of Potential Oracles 2018-01-27T00:41:04.000Z · score: 1 (1 votes)
Open Problems Regarding Counterfactuals: An Introduction For Beginners 2017-07-18T02:21:20.000Z · score: 20 (6 votes)


Comment by diffractor on Coronavirus: Justified Practical Advice Thread · 2020-03-01T23:35:21.435Z · score: 4 (5 votes) · LW · GW

If hospitals are overwhelmed, it's valuable to have a component of the hospital treatment plan for pneumonia on hand to treat either yourself or others who have it especially bad. One such component is an oxygen concentrator; these are not sold out yet and are ~$400 on Amazon. This doesn't deal with especially severe cases, but for cases which fall in the "shortness of breath, low blood oxygen" class without further medical complications, it'd probably be useful if you can't or don't want to go to a hospital due to overload. mentions oxygen treatment as the first thing to do for low blood oxygen levels.

Comment by diffractor on (A -> B) -> A · 2020-02-15T06:35:15.343Z · score: 2 (2 votes) · LW · GW

I found a paper about this exact sort of thing. Escardó and Oliva call that type signature a "selection functional", and the type signature is called a "quantification functional", and there are several interesting things you can do with them, like combining multiple selection functionals into one in a way that looks reminiscent of game theory. (ie, if has type signature , and has type signature , then has type signature .)
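To make the "combining selection functionals" remark concrete, here is a minimal Python sketch, not the paper's own code. It assumes a selection functional takes a payoff function and returns a chosen move, and that the matching quantification functional returns the payoff of that move. The names `argmax_over`, `argmin_over`, `quantifier`, `product`, and the payoff table are all hypothetical, chosen only for illustration.

```python
def argmax_over(options):
    """Selection functional (X -> R) -> X: pick the option with the highest payoff."""
    return lambda payoff: max(options, key=payoff)


def argmin_over(options):
    """Selection functional (X -> R) -> X: pick the option with the lowest payoff."""
    return lambda payoff: min(options, key=payoff)


def quantifier(select):
    """Quantification functional (X -> R) -> R induced by a selection functional."""
    return lambda payoff: payoff(select(payoff))


def product(eps, delta):
    """Combine two selection functionals into one over pairs: ((X, Y) -> R) -> (X, Y)."""
    def select(payoff):
        # The second player's choice once the first move x is fixed.
        def reply(x):
            return delta(lambda y: payoff((x, y)))
        # The first player chooses x anticipating that reply.
        first = eps(lambda x: payoff((x, reply(x))))
        return (first, reply(first))
    return select


# Hypothetical two-move game: the first player maximises, the second minimises.
table = {(0, 0): 3, (0, 1): 1, (1, 0): 2, (1, 1): 4}
play = product(argmax_over([0, 1]), argmin_over([0, 1]))
outcome = play(lambda moves: table[moves])
value = quantifier(play)(lambda moves: table[moves])
```

The game-theory flavour shows up in `product`: the first selection functional is evaluated against a payoff that already has the second player's choice folded in, which is backward induction on a two-move sequential game.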

Comment by diffractor on Counterfactual Induction · 2019-12-19T07:45:05.299Z · score: 1 (1 votes) · LW · GW

Oh, I see what the issue is. Propositional tautology given means , not . So yeah, when A is a boolean that is equivalent to via boolean logic alone, we can't use that A for the exact reason you said, but if A isn't equivalent to via boolean logic alone (although it may be possible to infer by other means), then the denominator isn't necessarily small.

Comment by diffractor on Counterfactual Induction · 2019-12-18T20:21:02.188Z · score: 1 (1 votes) · LW · GW

Yup, a monoid, because and , so it acts as an identity element, and we don't care about the order. Nice catch.

You're also correct about what propositional tautology given A means.

Comment by diffractor on Counterfactual Induction (Algorithm Sketch, Fixpoint proof) · 2019-12-18T08:17:42.580Z · score: 1 (1 votes) · LW · GW

Yup! The subscript is the counterfactual we're working in, so you can think of it as a sort of conditional pricing.

The prices aren't necessarily unique: we set them anew on each turn, and there may be multiple valid prices for each turn. Basically, the prices are just set so that the supertrader doesn't earn money in any of the "possible" worlds that we might be in. Monotonicity is just "the price of a set of possibilities is greater than the price of a subset of possibilities".

Comment by diffractor on Counterfactual Induction · 2019-12-18T06:58:05.392Z · score: 2 (2 votes) · LW · GW

If there's a short proof of from and a short proof of from and they both have relatively long disproofs, then counterfacting on , should have a high value, and counterfacting on , should have a high value.

The way to read is that the stuff on the left is your collection of axioms ( is a finite collection of axioms and just means we're using the stuff in as well as the statement as our axioms), and it proves some statement.

For the first formulation of the value of a statement, the value would be 1 if adding doesn't provide any help in deriving a contradiction from A. Or, put another way, the shortest way of proving , assuming A as your axioms, is to derive and use principle of explosion. It's "independent" of A, in a sense.

There's a technicality for "equivalent" statements. We're considering "equivalent" as "propositionally equivalent given A" (i.e., it's possible to prove an iff statement with only the statements in A and boolean algebra. For example, is a statement provable with boolean algebra alone. If you can prove the iff but you can't do it with boolean algebra alone, it doesn't count as equivalent). Unless is propositionally equivalent to , then is not equivalent to , (because maybe is false and is true) which renders the equality you wrote wrong, as well as making the last paragraph incoherent.

In classical probability theory, holds iff is 0. Ie, if it's impossible for both things to happen, the probability of "one of the two things happen" is the same as the sum of the probabilities for event 1 and event 2.

In our thing, we only guarantee equality for when (assuming A). This is because (first two = by propositionally equivalent statements getting the same value, the third = by being propositionally equivalent to assuming , fourth = by being propositionally equivalent to , final = by unitarity). Equality may hold in some other cases, but you don't have a guarantee of such, even if the two events are disjoint, which is a major difference from standard probability theory.

The last paragraph is confused, as previously stated. Also, there's a law of boolean algebra that is the same as . Also, the intuition is wrong, should be less than , because "probability of event 1 happens" is greater than "probability that event 1 and event 2 happens".

Highlighting something and pressing ctrl-4 turns it to LaTeX.

Comment by diffractor on CO2 Stripper Postmortem Thoughts · 2019-12-01T23:34:47.690Z · score: 11 (4 votes) · LW · GW

Yup, this turned out to be a crucial consideration that makes the whole project look a lot less worthwhile. If ventilation at a bad temperature is available, it's cheaper to just get a heat exchanger and ventilate away and eat the increased heating costs during winter than to do a CO2 stripper.

There's still a remaining use case for rooms without windows that aren't amenable to just feeding an air duct outside, but that's a lot more niche than my original expectations. Gonna edit the original post now.

Comment by diffractor on CO2 Stripper Postmortem Thoughts · 2019-12-01T00:46:05.951Z · score: 2 (2 votes) · LW · GW

Also, a paper on extremely high-density algal photobioreactors quotes algal concentration by volume as being as high as 6% under optimal conditions. The dry mass is about 1/8 of the wet mass of algae, so that's a 0.75% concentration by weight. If the algal inventory in your reactor is 9 kg dry mass (you'd need to waste about 3 kg/day of dry weight, or 24 kg/day of wet weight, to keep up with 2 people worth of CO2, or a third of the algae each day), that's 1200 kg of water in your reactor. Since a gallon is about 4 kg of water, that's... 300 gallons, or six 55-gallon drums, footprint 4 ft x 6 ft x 4 ft high, at a bare minimum (probably 3x that volume in practice), so we get the same general sort of result from a different direction.

I'd be quite surprised if you could do that in under a thousand dollars.
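As a sanity check on the arithmetic above, here is a small Python sketch using only the figures quoted in this comment. It treats the 6% volume fraction as if it were a mass fraction (that is, it assumes the wet algae is about as dense as water), and the function and parameter names are made up for illustration.

```python
import math


def reactor_size(dry_inventory_kg=9.0, dry_to_wet=1 / 8, algae_fraction=0.06,
                 kg_per_gallon=4.0, drum_gallons=55):
    """Return (wet algae kg, water kg, gallons, whole drums) for a given dry inventory."""
    wet_kg = dry_inventory_kg / dry_to_wet      # dry mass is ~1/8 of wet mass
    water_kg = wet_kg / algae_fraction          # algae is ~6% of the culture
    gallons = water_kg / kg_per_gallon          # ~4 kg of water per gallon (rounded)
    drums = math.ceil(gallons / drum_gallons)   # round up to whole drums
    return wet_kg, water_kg, gallons, drums


wet_kg, water_kg, gallons, drums = reactor_size()
print(f"{wet_kg:.0f} kg wet algae, {water_kg:.0f} kg water, {gallons:.0f} gal, {drums} drums")
```

Swapping in a less optimistic `algae_fraction` is the quickest way to see how the "probably 3x that volume in practice" caveat plays out.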

Comment by diffractor on CO2 Stripper Postmortem Thoughts · 2019-12-01T00:30:44.731Z · score: 2 (2 votes) · LW · GW

[EDIT: I see numbers as high as 4 g/L/day quoted for algae growth rates, I updated the reasoning accordingly]

The numbers don't quite add up on an algae bioreactor for personal use. The stated growth rate for chlorella algae is 0.6 g/L/day, and there are about 4 liters in a gallon, so 100 gallons of algae solution is 400 liters, which is 240 g of algae grown per day. Since about 2/3 of new biomass comes from CO2 via the 6CO2+6H2O->C6H12O6 reaction, that's 160 g of CO2 locked up per day, or... about 1/6 of a person worth of CO2 in a 24 hour period. [EDIT: 1 person worth of CO2 in a 24 hour period, looks more plausible]

Plants are inefficient at locking up CO2 relative to chemical reactions!

Also you wouldn't be able to just have the algae as a giant vat, because light has to penetrate in, so the resulting reactor to lock up 1/6 [EDIT: 1] of a person worth of CO2 would be substantially larger than the footprint of 2 55-gallon drums.
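The same estimate as a short Python sketch, so the original 0.6 g/L/day figure and the edited 4 g/L/day figure can be compared side by side. The ~1 kg of CO2 per person per day is an assumption inferred from the "160 g is about 1/6 of a person" step rather than a sourced number, and the function name is hypothetical.

```python
def people_offset(growth_g_per_l_day, volume_gallons=100.0, litres_per_gallon=4.0,
                  co2_fraction=2 / 3, co2_per_person_g_day=1000.0):
    """Fraction of one person's daily CO2 that the reactor locks up."""
    litres = volume_gallons * litres_per_gallon        # 100 gal is ~400 L
    biomass_g_day = growth_g_per_l_day * litres        # new algae per day
    co2_g_day = biomass_g_day * co2_fraction           # ~2/3 of new biomass comes from CO2
    return co2_g_day / co2_per_person_g_day


original_estimate = people_offset(0.6)   # the stated chlorella growth rate
edited_estimate = people_offset(4.0)     # the higher rate from the EDIT
print(f"{original_estimate:.2f} vs {edited_estimate:.2f} people per 100 gallons")
```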

Comment by diffractor on CO2 Stripper Postmortem Thoughts · 2019-12-01T00:21:36.060Z · score: 2 (2 votes) · LW · GW

I have the relevant air sensor, it'd be really hard to blind it because it makes noise, and the behavioral effects thing is a good idea, thank you.

It's not currently with me.

I think the next thing to do is build the 2.0 design, because it should perform better and will also be present with me, then test the empirical CO2 reduction and behavioral effects (although, again, blinding will be difficult), and reevaluate at that point.

Comment by diffractor on So You Want to Colonize The Universe Part 5: The Actual Design · 2019-08-23T05:59:57.495Z · score: 1 (1 votes) · LW · GW

Good point on phase 6. For phase 3, smaller changes in velocity further out are fine, but I still think that even with fewer velocity changes, you'll still have difficulty finding an engine that gets sufficient delta-V that isn't fission/fusion/antimatter based. (Also, in the meantime I realized that neutron damage over those sorts of timescales is going to be *really* bad.) For phase 5, I don't think a lightsail would provide enough deceleration, because you've got inverse-square losses. Maybe you could decelerate with a lightsail in the inner stellar system, but I think you'd just breeze right through since the radius of the "efficiently slow down" sphere is too small relative to how much you slow down, and in the outer stellar system, light pressure is too low to slow you down meaningfully.

Comment by diffractor on So You Want To Colonize The Universe Part 3: Dust · 2019-08-23T05:53:19.550Z · score: 1 (1 votes) · LW · GW

Very good point!

Comment by diffractor on 87,000 Hours or: Thoughts on Home Ownership · 2019-07-08T05:05:39.761Z · score: 5 (4 votes) · LW · GW

I'd be extremely interested in the quantitative analysis you've done so far.

Comment by diffractor on Open Problems Regarding Counterfactuals: An Introduction For Beginners · 2019-03-25T21:41:31.947Z · score: 12 (4 votes) · LW · GW

See if this works.

Comment by diffractor on So You Want to Colonize The Universe Part 4: Velocity Changes and Energy · 2019-03-02T18:17:30.135Z · score: 2 (2 votes) · LW · GW

I'm talking about using a laser sail to get up to near c (0.1 g acceleration for 40 lightyears is pretty strong) in the first place, and slowing down by other means.

This trick is about using a laser sail for both acceleration and deceleration.
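For a rough sense of what "0.1 g for 40 lightyears" buys, here is a Python sketch. It assumes constant proper acceleration starting from rest, in which case the Lorentz factor after distance d is 1 + a·d/c². That idealisation, and the function name, are my assumptions rather than anything specified in the posts.

```python
import math

C = 299_792_458.0        # speed of light, m/s
G0 = 9.80665             # standard gravity, m/s^2
LIGHT_YEAR_M = 9.4607e15 # metres per light-year


def final_speed_fraction(accel_g, distance_ly):
    """Speed (as a fraction of c) after constant proper acceleration from rest."""
    accel = accel_g * G0
    distance = distance_ly * LIGHT_YEAR_M
    gamma = 1.0 + accel * distance / C**2   # Lorentz factor after covering `distance`
    return math.sqrt(1.0 - 1.0 / gamma**2)


beta = final_speed_fraction(0.1, 40)
print(f"final speed is about {beta:.3f} c")
```

Under those assumptions the craft ends up at roughly 0.98 c, which is the sense in which 0.1 g over that distance is "pretty strong".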

Comment by diffractor on So You Want to Colonize The Universe Part 4: Velocity Changes and Energy · 2019-03-02T02:13:19.223Z · score: 2 (2 votes) · LW · GW

Yeah, I think the original proposal for a solar sail involved deceleration by having the central part of the sail detach and receive the reflected beam from the outer "ring" of the sail. I didn't do this because IIRC the beam only maintains coherence over 40 lightyears or so, so that trick would be for nearby missions.

Comment by diffractor on So You Want To Colonize The Universe Part 3: Dust · 2019-02-28T21:31:33.143Z · score: 6 (5 votes) · LW · GW

For 1, the mental model for non-relativistic but high speeds should be "a shallow crater is instantaneously vaporized out of the material going fast" and for relativistic speeds, it should be the same thing but with the vaporization directed in a deeper hole (energy doesn't spread out as much, it keeps in a narrow cone) instead of in all directions. However, your idea of having a spacecraft as a big flat sheet and being able to tolerate having a bunch of holes being shot in it is promising. The main issue that I see is that this approach is incompatible with a lot of things that (as far as we know) can only be done with solid chunks of matter, like antimatter energy capture, or having sideways boosting-rockets, and once you start armoring the solid chunks in the floaty sail, you're sort of back in the same situation. So it seems like an interesting approach and it'd be cool if it could work but I'm not quite sure it can (not entirely confident that it couldn't, just that it would require a bunch of weird solutions to stuff like "how does your sheet of tissue boost sideways at 0.1% of lightspeed").

For 2, the problem is that the particles which are highly penetrating are either unstable (muons, kaons, neutrons...) and will fall apart well before arrival (and that's completely dodging the issue of making bulk matter out of them), or they are stable (neutrinos, dark matter), and don't interact with anything, and since they don't really interact with anything, this means they especially don't interact with themselves (well, at least we know this for neutrinos), so they can't hold together any structure, nor can they interact with matter at the destination. Making a craft out of neutrinos is ridiculously more difficult than making a craft out of room-temperature air. If they can go through a light-year of lead without issue, they aren't exactly going to stick to each other. Heck, I think you'd actually have better luck trying to make a spaceship out of pure light.

For 3, it's because in order to use ricocheting mass to power your starcraft, you need to already have some way of ramping the mass up to relativistic speeds so it can get to the rapidly retreating starcraft in the first place, and you need an awful lot of mass. Light already starts off at the most relativistic speed of all, and around a star you already have astronomical amounts of light available for free.

For 4, there sort of is, but mostly not. The gravity example has the problem of the speeding up of the craft when it has the two stars ahead of it perfectly counterbalancing the backwards deceleration when the two stars are behind it. For potentials like gravity or electrical fields or pretty much anything you'd want to use, there's an inverse-square law for them, which means that they aren't really relevant unless you're fairly close to a star. The one instance I can think of where something like your approach is the case is the electric sail design in the final part. In interstellar space, it brakes against the thin soup of protons as usual, but nearby a star, the "wind" of particles streaming out from the star acts as a more effective brake and it can sail on that (going out), or use it for better deceleration (coming in). Think of it as a sail slowing a boat down when the air is stationary, and slowing down even better when the wind is blowing against you.

Comment by diffractor on So You Want to Colonize The Universe · 2019-02-28T21:10:48.895Z · score: 2 (2 votes) · LW · GW

Whoops, I guess I messed up on that setting. Yeah, it's ok.

Comment by diffractor on So You Want to Colonize The Universe Part 4: Velocity Changes and Energy · 2019-02-28T01:07:23.941Z · score: 3 (3 votes) · LW · GW

Actually, no! The activation energy for the conversion of diamond to graphite is about 540 kJ/mol, and using the Arrhenius equation to get the rate constant for diamond-graphite conversion, with a radiator temperature of 1900 K, we get that after 10,000 years of continuous operation, 99.95% of the diamond will still be diamond. At room temperature, the diamond-to-graphite conversion rate is slow enough that protons will decay before any appreciable amount of graphite is made.

Even for a 100,000 year burn, 99.5% of the diamond will still be intact at 1900 K.

There isn't much room to ramp up the temperature, though. We can stick to around 99%+ of the diamond being intact up to around 2100 K, but 2200 K has 5% of the diamond converting, 2300 K has 15% converting, 2400K has 45%, and it's 80 and 99% conversion of diamond into graphite over 10,000 years for 2500 K and 2600 K respectively.
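A Python sketch of that calculation, for anyone who wants to poke at the temperature sensitivity. The comment gives the activation energy but not the pre-exponential factor, so the value of 1 per second used here is purely an assumption, picked because with first-order kinetics it roughly reproduces the percentages quoted above. The function name is hypothetical too.

```python
import math

R_GAS = 8.314462618                      # gas constant, J/(mol*K)
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def fraction_diamond_left(temp_k, years, ea_j_per_mol=540e3, prefactor_per_s=1.0):
    """First-order decay with an Arrhenius rate constant k = A * exp(-Ea / (R * T))."""
    # prefactor_per_s is an assumed value, not one taken from the comment.
    rate_per_s = prefactor_per_s * math.exp(-ea_j_per_mol / (R_GAS * temp_k))
    return math.exp(-rate_per_s * years * SECONDS_PER_YEAR)


for temp_k in (1900, 2100, 2200, 2300, 2400, 2500, 2600):
    converted = 1.0 - fraction_diamond_left(temp_k, 10_000)
    print(f"{temp_k} K: {converted:.2%} converted after 10,000 years")
```

Because the temperature sits inside the exponential, each extra 100 K multiplies the rate by roughly 3 to 4 in this range, which is why there is so little headroom above about 2100 K.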

Comment by diffractor on So You Want to Colonize The Universe · 2019-02-27T19:31:12.774Z · score: 1 (4 votes) · LW · GW

Agreed. Also, there's an incentive to keep thinking about how to go faster, rather than launching, until one more day of design work speeds the rocket up by less than one day (otherwise you'll get overtaken), and an incentive to agree on a coordinated plan ahead of time (you get this galaxy, I get that galaxy, etc...) to avoid issues with lightspeed delays.

Comment by diffractor on So You Want to Colonize The Universe · 2019-02-27T19:28:57.684Z · score: 3 (3 votes) · LW · GW

Or maybe accepting messages from home (in rocket form or not) of "whoops, we were wrong about X, here's the convincing moral argument" and acting accordingly. Then the only thing to be worried about would be irreversible acts done in the process of colonizing a galaxy, instead of having a bad "living off resources" endstate.

Comment by diffractor on So You Want to Colonize the Universe Part 2: Deep Time Engineering · 2019-02-27T18:08:20.973Z · score: 5 (4 votes) · LW · GW

Edited. Thanks for that. I guess I managed to miss both of those, I was mainly going off of the indispensable and extremely thorough Atomic Rockets site having extremely little discussion of intergalactic missions as opposed to interstellar missions.

It looks like there are some spots where me and Armstrong converged on the same strategy (using lasers to launch probes), but we seem to disagree about how big of a deal dust shielding is, how hard deceleration is, and what strategy to use for deceleration.

Comment by diffractor on So You Want to Colonize The Universe · 2019-02-27T17:57:20.690Z · score: 2 (2 votes) · LW · GW

Yeah, Atomic Rockets was an incredibly helpful resource for me, I definitely endorse it for others.

Comment by diffractor on What makes people intellectually active? · 2019-01-15T01:16:04.921Z · score: 11 (6 votes) · LW · GW

This doesn't quite seem right, because just multiplying probabilities only works when all the quantities are independent. However, I'd put higher odds on someone having the ability to recognize a worthwhile result conditional on them having an ability to work on a problem than on them having the ability to recognize a worthwhile result unconditionally, so the multiplication of probabilities will be higher than it seems at first.

I'm unsure whether this consideration affects whether the distribution would be lognormal or not.
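A toy Python illustration of the point about independence. The probabilities below are made-up placeholders, not estimates of anything; the only claim is that replacing an unconditional probability with a larger conditional one raises the product.

```python
def chance_of_all(step_probabilities):
    """Multiply a chain of probabilities, each conditional on the previous steps holding."""
    result = 1.0
    for p in step_probabilities:
        result *= p
    return result


# Hypothetical numbers: P(can work on the problem) = 0.1,
# P(can recognize a worthwhile result) = 0.1 unconditionally,
# but 0.5 conditional on being able to work on the problem.
naive = chance_of_all([0.1, 0.1])       # treats the two abilities as independent
correlated = chance_of_all([0.1, 0.5])  # uses the conditional probability instead
print(f"naive: {naive:.3f}, with correlation: {correlated:.3f}")
```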

Comment by diffractor on Dutch-Booking CDT · 2019-01-14T01:29:23.101Z · score: 1 (1 votes) · LW · GW

(lightly edited restatement of email comment)

Let's see what happens when we adapt this to the canonical instance of "no, really, counterfactuals aren't conditionals and should have different probabilities". The cosmic ray problem, where the agent has the choice between two paths, it slightly prefers taking the left path, but its conditional on taking the right path is a tiny slice of probability mass that's mostly composed of stuff like "I took the suboptimal action because I got hit by a cosmic ray".

There will be 0 utility for taking left path, -10 utility for taking the right path, and -1000 utility for a cosmic ray hit. The CDT counterfactual says 0 utility for taking left path, -10 utility for taking the right path, while the conditional says 0 utility for left path, -1010 utility for right path (because conditional on taking the right path, you were hit by a cosmic ray).

In order to get the dutch book to go through, we need to get the agent to take the right path, to exploit P(cosmic ray) changing between the decision time and afterwards. So the initial bet could be something like -1 utility now, +12 utility upon taking the right path and not being hit by a cosmic ray. But now since the optimal action is "take the right path along with the bet", the problem setup has been changed, and we can't conclude that the agent's conditional on taking the right path places high probability on getting hit by a cosmic ray (because now the right path is the optimal action), so we can't money-pump with the "+0.5 utility, -12 utility upon taking a cosmic ray hit" bet.

So this seems to dutch-book Death-in-Damascus, not CDT≠EDT cases in general.
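The numbers in this example can be checked with a short Python sketch. The utilities and the first bet are the ones stated above; the prior probability of a cosmic ray hit (one in a million) is a made-up placeholder, and the function names are hypothetical.

```python
U_LEFT, U_RIGHT, U_RAY = 0.0, -10.0, -1000.0


def cdt_value(action, p_ray, bet=False):
    """Counterfactual value: choosing an action does not change P(cosmic ray)."""
    value = (U_LEFT if action == "left" else U_RIGHT) + p_ray * U_RAY
    if bet:
        value -= 1.0                          # price of the bet, paid up front
        if action == "right":
            value += (1.0 - p_ray) * 12.0     # payout: right path and no cosmic ray
    return value


def conditional_value_right(p_ray_given_right):
    """Value of the right path if you condition on having taken it."""
    return U_RIGHT + p_ray_given_right * U_RAY


P_RAY = 1e-6  # assumed prior, for illustration only
without_bet = {a: cdt_value(a, P_RAY) for a in ("left", "right")}
with_bet = {a: cdt_value(a, P_RAY, bet=True) for a in ("left", "right")}
print(without_bet, with_bet, conditional_value_right(1.0))
```

Without the bet, left beats right; with the bet, right beats left, so "right" is no longer evidence of a cosmic ray hit. That shift is what blocks the second bet.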

Comment by diffractor on Failures of UDT-AIXI, Part 1: Improper Randomizing · 2019-01-08T03:43:24.721Z · score: 3 (2 votes) · LW · GW

Yes, UDT means updateless decision theory, "the policy" is used as a placeholder for "whatever policy the agent ends up picking", much like a variable in an equation, and "the algorithm I wrote" is still unpublished because there were too many things wrong with it for me to be comfortable putting it up, as I can't even show it has any nice properties in particular. Although now that you mention it, I probably should put it up so future posts about what's wrong with it have a well-specified target to shoot holes in. >_>

Comment by diffractor on Cooperative Oracles · 2018-12-05T05:23:15.854Z · score: 1 (1 votes) · LW · GW

It actually is a weakening. Because all changes can be interpreted as making some player worse off if we just use standard Pareto optimality, the second condition means that more changes count as improvements, as you correctly state. The third condition cuts down on which changes count as improvements, but the combination of conditions 2 and 3 still has some changes being labeled as improvements that wouldn't be improvements under the old concept of Pareto optimality.

The definition of an almost stratified Pareto optimum was adapted from this post, and was developed specifically to address the infinite game in that post involving a non-well-founded chain of players, where nothing is a stratified Pareto optimum for all players. Something isn't stratified Pareto optimal in a vacuum, it's stratified Pareto optimal for a particular player. There's no oracle that's stratified Pareto optimal for all players, but if you take the closure of everyone's SPO sets first to produce a set of ASPO oracles for every player, and take the intersection of all those sets, there are points which are ASPO for everyone.

Comment by diffractor on Beliefs at different timescales · 2018-11-06T01:00:36.301Z · score: 3 (2 votes) · LW · GW

My initial inclination is to introduce as the space of events on turn , and define and then you can express it as .

Comment by diffractor on Beliefs at different timescales · 2018-11-05T00:14:47.198Z · score: 1 (1 votes) · LW · GW

The notation for the sum operator is unclear. I'd advise writing the sum as and using an subscript inside the sum so it's clearer what is being substituted where.

Comment by diffractor on Asymptotic Decision Theory (Improved Writeup) · 2018-10-29T21:50:53.040Z · score: 3 (2 votes) · LW · GW

Wasn't there a fairness/continuity condition in the original ADT paper that if there were two "agents" that converged to always taking the same action, then the embedder would assign them the same value? (more specifically, if , then ) This would mean that it'd be impossible to have be low while is high, so the argument still goes through.

Although, after this whole line of discussion, I'm realizing that there are enough substantial differences between the original formulation of ADT and the thing I wrote up that I should probably clean up this post a bit and clarify more about what's different in the two formulations. Thanks for that.

Comment by diffractor on Asymptotic Decision Theory (Improved Writeup) · 2018-10-25T20:54:42.492Z · score: 1 (1 votes) · LW · GW
in the ADT paper, the asymptotic dominance argument is about the limit of the agent's action as epsilon goes to 0. This limit is not necessarily computable, so the embedder can't contain the agent, since it doesn't know epsilon. So the evil problem doesn't work.

Agreed that the evil problem doesn't work for the original ADT paper. In the original ADT paper, the agents are allowed to output distributions over moves. I didn't like this because it implicitly assumes that it's possible for the agent to perfectly randomize, and I think randomization is better modeled by a (deterministic) action that consults an environmental random-number generator, which may be correlated with other things.

What I meant was that, in the version of argmax that I set up, if is the two constant policies "take blank box" and "take shiny box", then for the embedder where the opponent runs argmax to select which box to fill, the argmax agent will converge to deterministically randomizing between the two policies, by the logical inductor assigning very similar expected utility to both options such that the inductor can't predict which action will be chosen. And this occurs because the inductor outputting more of "take the blank box" will have converge to a higher expected value (so argmax will learn to copy that), and the inductor outputting more of "take the shiny box" will have converge to a higher expected value (so argmax will learn to copy that).

The optimality proof might be valid. I didn't understand which specific step you thought was wrong.

So, the original statement in the paper was

It must then be the case that for every . Let be the first element of in . Since every class will be seperated by at least in the limit, will eventually be a distribution over just . And since for every , , by the definition of it must be the case that .

The issue with this is the last sentence. It's basically saying "since the two actions and get equal expected utility in the limit, the total variation distance between a distribution over the two actions, and one of the actions, limits to zero", which is false.

And it is specifically disproved by the second counterexample, where there are two actions that both result in 1 utility, so they're both in the same equivalence class, but a probabilistic mixture between them (as converges to playing, for all ) gets less than 1 utility.

Consider the following embedder. According to this embedder, you will play chicken against ADT-epsilon who knows who you are. When ADT-epsilon considers this embedder, it will always pass the reality filter, since in fact ADT-epsilon is playing against ADT-epsilon. Furthermore, this embedder gives NeverSwerveBot a high utility. So ADT-epsilon expects a high utility from this embedder, through NeverSwerveBot, and it never swerves.

You'll have to be more specific about "who knows what you are". If it unpacks as "opponent only uses the embedder where it is up against [whatever policy you plugged in]", then NeverSwerveBot will have a high utility, but it will get knocked down by the reality filter, because if you converge to never swerving, will converge to 0, and the inductor will learn that so it will converge to assigning equal expected value to and, and converges to 1.

If it unpacks as "opponent is ADT-epsilon", and you converge to never swerving, then argmaxing will start duplicating the swerve strategy instead of going straight. In both cases, the argument fails.

Comment by diffractor on Asymptotic Decision Theory (Improved Writeup) · 2018-09-28T05:22:26.498Z · score: 3 (2 votes) · LW · GW

I got an improved reality-filter that blocks a certain class of environments that lead conjecture 1 to fail, although it isn't enough to deal with the provided chicken example and lead to a proof of conjecture 1. (the subscripts will be suppressed for clarity)

Instead of the reality-filter for being

it is now

This doesn't just check whether reality is recovered on average, it also checks whether all the "plausible conditionals" line up as well. Some of the conditionals may not be well-formed, as there may be conditioning on low-or-zero probability events, but these are then multiplied by a very small number, so no harm is done.

This has the nice property that for all "plausibly chosen embedders" that have a probability sufficiently far away from 0, all embedders and that pass this reality filter have the property that

So all embedders that pass the reality filter will agree on the expected utility of selecting a particular embedder that isn't very unlikely to be selected.

Comment by diffractor on Reflective AIXI and Anthropics · 2018-09-27T20:11:27.160Z · score: 1 (1 votes) · LW · GW

I figured out what feels slightly off about this solution. For events like "I have a long memory and accidentally dropped a magnet on it", it intuitively feels like describing your spot in the environment and the rules of your environment is much lower K-complexity than finding a Turing machine/environment that starts by giving you the exact (long) scrambled sequence of memories that you have, and then resumes normal operation.

Although this also feels like something nearby is actually desired behavior. If you rewrite the tape to be describing some other simple environment, you would intuitively expect the AIXI to act as if it's in the simple environment for a brief time before gaining enough information to conclude that things have changed and rederive the new rules of where it is.

Comment by diffractor on Reflective AIXI and Anthropics · 2018-09-27T20:01:04.519Z · score: 1 (1 votes) · LW · GW

Not quite. If taking bet 9 is a prerequisite to taking bet 10, then AIXI won't take bet 9, but if bet 10 gets offered whether or not bet 9 is accepted, then AIXI will be like "ah, future me will take the bet, and wind up with 10+ in the heads world and -20+2 in the tails world. This is just a given. I'll take this +15/-15 bet as it has net positive expected value, and the loss in the heads world is more than counterbalanced by the reduction in the magnitude of loss for the tails world"

Something else feels slightly off, but I can't quite pinpoint it at this point. Still, I guess this solves my question as originally stated, so I'll PM you for payout. Well done!

(btw, you can highlight a string of text and hit ctrl+4 to turn it into math-mode)

Comment by diffractor on Asymptotic Decision Theory (Improved Writeup) · 2018-09-27T16:01:20.403Z · score: 1 (1 votes) · LW · GW

Yup, I meant counterfactual mugging. Fixed.

Comment by diffractor on Asymptotic Decision Theory (Improved Writeup) · 2018-09-27T08:06:03.342Z · score: 1 (1 votes) · LW · GW

I think I remember the original ADT paper showing up on agent foundations forum before a writeup on logical EDT with exploration, and my impression of which came first was affected by that. Also, the "this is detailed in this post" was referring to logical EDT for exploration. I'll edit for clarity.

Comment by diffractor on Reflective AIXI and Anthropics · 2018-09-26T00:09:27.053Z · score: 1 (1 votes) · LW · GW

I actually hadn't read that post or seen the idea anywhere before writing this up. It's a pretty natural resolution, so I'd be unsurprised if it was independently discovered before. Sorry about being unable to assist.

The extra penalty to describe where you are in the universe corresponds to requiring sense data to pin down *which* star you are near, out of the many stars, even if you know the laws of physics, so it seems to recover desired behavior.

Comment by diffractor on Cooperative Oracles · 2018-09-05T03:54:17.930Z · score: 3 (2 votes) · LW · GW

Giles Edkins coded up a thing which lets you plug in numbers for a 2-player, 2-move game payoff matrix and it automatically displays possible outcomes in utility-space. It may be found here. The equilibrium points and strategy lines were added later in MS Paint.
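For readers who want to reproduce the basic idea without the tool, here is a minimal sketch (not the linked program) of how the outcomes of a 2-player, 2-move game map into utility-space. It assumes each player randomizes independently, and the Prisoner's-Dilemma-style payoff numbers and the names `expected_utilities` and `utility_space_points` are made up for illustration.

```python
def expected_utilities(payoffs, p, q):
    """Expected (u1, u2) when player 1 plays move 0 with probability p
    and player 2 plays move 0 with probability q, independently.

    payoffs[i][j] is the pair (u1, u2) when player 1 plays i and player 2 plays j.
    """
    probs = [[p * q, p * (1 - q)],
             [(1 - p) * q, (1 - p) * (1 - q)]]
    u1 = sum(probs[i][j] * payoffs[i][j][0] for i in range(2) for j in range(2))
    u2 = sum(probs[i][j] * payoffs[i][j][1] for i in range(2) for j in range(2))
    return (u1, u2)


def utility_space_points(payoffs, grid=11):
    """Sample the reachable region of utility-space on a grid of mixed strategies."""
    steps = [k / (grid - 1) for k in range(grid)]
    return [expected_utilities(payoffs, p, q) for p in steps for q in steps]


# Hypothetical payoff matrix (move 0 = cooperate, move 1 = defect).
PAYOFFS = [[(2, 2), (0, 3)],
           [(3, 0), (1, 1)]]

points = utility_space_points(PAYOFFS)
```

Plotting `points` as a scatter plot gives the kind of picture described above; equilibrium points and strategy lines would still have to be overlaid separately.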

Comment by diffractor on Cooperative Oracles · 2018-09-02T22:51:28.607Z · score: 1 (1 votes) · LW · GW

The basic reason for the dependency relation to care about oracle queries from strategies is that, when you have several players all calling the oracle on each other, there's no good way to swap out the oracle calls with the computation. The trick you describe does indeed work, and is a reason to not call any more Turing machines than you need to, but there are several things it doesn't solve. For instance, if you are player 1, and your strategy depends on oracle calls to players 2 and 3, and the same applies to the other two players, you may be able to swap out an oracle call to player 2 with player 2's actual code (which calls players 1 and 3), but you can't unpack any more oracle calls into their respective computations without hitting an infinite regress.

Comment by diffractor on Cooperative Oracles · 2018-09-01T16:24:54.548Z · score: 1 (1 votes) · LW · GW

I'm not sure what you mean by fixing the utility function occurring before fixing the strategy. In the problem setup of a game, you specify a utility function machine and a strategy machine for everyone, and there isn't any sort of time or order on this (there's just a set of pairs of probabilistic oracle machines) and you can freely consider things such as "what happens when we change some player's strategies/utility function machines"

Comment by diffractor on Probabilistic Tiling (Preliminary Attempt) · 2018-08-07T23:53:21.192Z · score: 1 (1 votes) · LW · GW

Ah, the formal statement was something like "if the policy A isn't the argmax policy, the successor policy B must be in the policy space of the future argmax, and the action selected by policy A is computed so the relevant equality holds"

Yeah, I am assuming fast feedback that it is resolved on day .

What I meant was that the computation isn't extremely long in the sense of description length, not in the sense of computation time. Also, we aren't doing policy search over the set of all Turing machines, we're doing policy search over some smaller set of policies that can be guaranteed to halt in a reasonable time (and more can be added as time goes on).

Also I'm less confident in conditional future-trust for all conditionals than I used to be, I'll try to crystallize where I think it goes wrong.

Comment by diffractor on Probabilistic Tiling (Preliminary Attempt) · 2018-08-07T23:19:46.375Z · score: 1 (1 votes) · LW · GW

First: That notation seems helpful. Fairness of the environment isn't present by default, it still needs to be assumed even if the environment is purely action-determined, as you can consider an agent in the environment that is using a hardwired predictor of what the argmax agent would do. It is just a piece of the environment, and feeding a different sequence of actions into the environment as input gets a different score, so the environment is purely action-determined, but it's still unfair in the sense that the expected utility of feeding action into the function drops sharply if you condition on the argmax agent selecting action . The third condition was necessary to carry out this step. . The intuitive interpretation of the third condition is that, if you know that policy B selects action 4, then you can step from "action 4 is taken" to "policy B takes the actions it takes", and if you have a policy where you don't know what action it takes (third condition is violated), then "policy B does its thing" may have a higher expected utility than any particular action being taken, even in a fair environment that only cares about action sequences, as the hamster dance example shows.

Second: I think you misunderstood what I was claiming. I wasn't claiming that logical inductors attain the conditional future-trust property, even in the limit, for all sentences or all true sentences. What I was claiming was: The fact that is provable or disprovable in the future (in this case, is ), makes the conditional future-trust property hold (I'm fairly sure), and for statements where there isn't guaranteed feedback, the conditional future-trust property may fail. The double-expectation property that you state does not work to carry the proof through, because the proof (from the perspective of the first agent), takes as an assumption, so the "conditional on " part has to be outside of the future expectation, when you go back to what the first agent believes.

Third: the sense I meant for "agent is able to reason about this computation" is that the computation is not extremely long, so logical inductor traders can bet on it.

Comment by diffractor on Complete Class: Consequentialist Foundations · 2018-07-12T03:54:22.837Z · score: 3 (2 votes) · LW · GW

Pretty much that, actually. It doesn't seem too irrational, though. Upon looking at a mathematical universe where torture was decided upon as a good thing, it isn't an obvious failure of rationality to hope that a cosmic ray flips the sign bit of the utility function of an agent in there.

The practical problem with values that care about other mathematical worlds, however, is that if the agent you built has a UDT prior over values, it's an improvement (from the perspective of the prior) for the nosy neighbors/values that care about other worlds to dictate some of what happens in your world (since the marginal contribution of your world to the prior expected utility looks like some linear combination of the various utility functions, weighted by how much they care about your world). So, in practice, it'd be a bad idea to build a UDT value learning prior containing utility functions that have preferences over all worlds, since it'd add a bunch of extra junk from different utility functions to our world if run.

Comment by diffractor on An environment for studying counterfactuals · 2018-07-11T03:48:02.722Z · score: 2 (2 votes) · LW · GW

If exploration is a hack, then why do pretty much all multi-armed bandit algorithms rely on exploration into suboptimal outcomes to prevent spurious underestimates of the value associated with a lever?
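A minimal sketch of the point, under simplifying assumptions: two levers with fixed (noise-free) payoffs, where the agent starts with a spuriously low estimate for the better lever, as if from one unlucky early sample. The function name `run_bandit`, the payoff values, and the epsilon-greedy rule are illustrative choices, not taken from any particular bandit paper.

```python
import random


def run_bandit(epsilon, steps=1000, seed=0):
    """Epsilon-greedy on a two-armed bandit; epsilon=0 means no exploration."""
    rng = random.Random(seed)
    true_reward = [0.5, 1.0]   # arm 1 is actually better
    estimates = [0.5, 0.0]     # but arm 1 starts out spuriously underestimated
    counts = [1, 1]            # each estimate is treated as one prior sample
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(2)  # explore: pull a uniformly random arm
        else:
            arm = max(range(2), key=lambda a: estimates[a])  # exploit
        reward = true_reward[arm]
        counts[arm] += 1
        # Incremental sample-average update of the pulled arm's estimate.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts


greedy_estimates, greedy_counts = run_bandit(epsilon=0.0)
explore_estimates, explore_counts = run_bandit(epsilon=0.1)
```

With `epsilon=0.0` the agent never pulls arm 1 again, so the underestimate is never corrected. With `epsilon=0.1` the occasional suboptimal-looking pull repairs the estimate, after which arm 1 is preferred.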

Comment by diffractor on Optimal and Causal Counterfactual Worlds · 2018-07-05T05:40:27.000Z · score: 0 (0 votes) · LW · GW

Yeah, when I went back and patched up the framework of this post to be less logical-omniscience-y, I was able to get , but 2 is a bit too strong to be proved from 1, because my framing of 2 is just about probability disagreements in general, while 1 requires to assign probability 1 to .

Comment by diffractor on Poker example: (not) deducing someone's preferences · 2018-06-23T00:37:21.418Z · score: 4 (2 votes) · LW · GW

Since beliefs/values combinations can be ruled out, would it then be possible to learn values by asking the human about their own beliefs?

Comment by diffractor on A Loophole for Self-Applicative Soundness · 2018-06-11T15:01:10.000Z · score: 0 (0 votes) · LW · GW

I found an improved version by Pavel, that gives a way to construct a proof of from that has a length of . The improved version is here.

There are restrictions to this result, though. One is that the C-rule must apply to the logic. This is just the ability to go from to instantiating a such that . Pretty much all reasonable theorem provers have this.

The second restriction is that the theory must be finitely axiomatizable. No axiom schemas allowed. Again, this isn't much of a restriction in practice, because NBG set theory, which proves the consistency of ZFC, is finitely axiomatizable.

The proof strategy is basically as follows. It's shown that the shortest proof of a statement with quantifier depth n must have a length of , if the maximum quantifier depth in the proof is or greater.

This can be flipped around to conclude that if there's a length-n proof of , the maximum quantifier depth in the proof can be at most .

The second part of the proof involves constructing a bounded-quantifier version of a truth predicate. By Tarski's undefinability of truth, a full truth predicate cannot be constructed, but it's possible to exhibit a formula for which it's provable that ( is the formula laying out Tarski's conditions for something to be a truth predicate). Also, if quantifier depth of , there's a proof of ( is the sentence with its free variables substituted for the elements enumerated in the list ) Also, there's a proof that is preserved under inference rules and logical axioms, as long as everything stays below a quantifier depth of .

All these proofs can be done in lines. One factor of comes from the formula abbreviated as getting longer at a linear rate, and the other factor comes from having to prove for each separately as an ingredient for the next proof.

Combining the two parts, the bound on the quantifier depth and the bound on how long it takes to prove stuff about the truth predicate, makes it take steps to prove all the relevant theorems about a sufficiently large bounded quantifier depth truth predicate, and then you can just go "the statement that we are claiming to have been proved must have apply to it, and we've proved this is equivalent to the statement itself".

As a further bonus, a single -length proof can establish the consistency of the theory itself for all -length proofs.

It seems like a useful project to develop a program that will automatically write a proof of this form, to assess whether abstract unpacking of bounded proofs is usable in practice, but it will require digging into a bunch of finicky details of exactly how to encode a math theory inside itself.

Comment by diffractor on A Loophole for Self-Applicative Soundness · 2018-06-04T22:25:12.000Z · score: 0 (0 votes) · LW · GW

Caught a flaw with this proposal in the currently stated form, though it is probably patchable.

When unpacking a proof, at some point the sentence will be reached as a conclusion, which is a false statement.

Comment by diffractor on Does Thinking Hard Hurt Your Brain? · 2018-04-29T21:56:04.384Z · score: 12 (3 votes) · LW · GW

It doesn't hurt my brain, but there's a brain fog that kicks in eventually, that's kind of like a blankness with no new ideas coming, an aversion to further work, and a reduction in working memory, so I can stare at some piece of math for a while, and not comprehend it, because I can't load all the concepts into my mind at once. It's kind of like a hard limit for any cognition-intensive task.

This kicks in around the 2 hour mark for really intensive work/studying, although for less intensive work/studying, it can vary all the way up to 8 hours. As a general rule of thumb, the -afinil class of drugs triples my time limit until the brain fog kicks in, at a cost of less creative and lateral thinking.

Because of this, my study habits for school consisted of alternating 2-hour study blocks and naps.

Comment by diffractor on Smoking Lesion Steelman · 2018-04-19T22:24:29.000Z · score: 1 (1 votes) · LW · GW

I think that in that case, the agent shouldn't smoke, and CDT is right, although there is side-channel information that can be used to come to the conclusion that the agent should smoke. Here's a reframing of the provided payoff matrix that makes this argument clearer. (also, your problem as stated should have 0 utility for a nonsmoker imagining the situation where they smoke and get killed)

Let's say that there is a kingdom which contains two types of people, good people and evil people, and a person doesn't necessarily know which type they are. There is a magical sword enchanted with a heavenly aura, and if a good person wields the sword, it will guide them to do heroic things, for +10 utility (according to a good person) and 0 utility (according to an evil person). However, if an evil person wields the sword, it will afflict them for the rest of their life with extreme itchiness, for -100 utility (according to everyone).

good person's utility estimates:

  • takes sword

    • I'm good: 10

    • I'm evil: -90

  • don't take sword: 0

evil person's utility estimates:

  • takes sword

    • I'm good: 0

    • I'm evil: -100

  • don't take sword: 0

As you can clearly see, this is the exact same payoff matrix as the previous example. However, now it's clear that if a (secretly good) CDT agent believes that most of society is evil, then it's a bad idea to pick up the sword, because the agent is probably evil (according to the info they have) and will be tormented with itchiness for the rest of their life, and if it believes that most of society is good, then it's a good idea to pick up the sword. Further, this situation is intuitively clear enough to argue that CDT just straight-up gets the right answer in this case.
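To make the dependence on the agent's beliefs concrete, here is a small sketch that plugs the utility estimates listed above into a plain expected-utility calculation. The variable `p_good` (the agent's credence that it is good) and the function names are introduced here for illustration; the calculation assumes the agent's choice gives it no evidence about its own type.

```python
# Utility estimates copied from the two lists above.
GOOD_PERSON_UTILITY = {
    "take": {"good": 10, "evil": -90},
    "leave": {"good": 0, "evil": 0},
}
EVIL_PERSON_UTILITY = {
    "take": {"good": 0, "evil": -100},
    "leave": {"good": 0, "evil": 0},
}


def expected_utility(utilities, action, p_good):
    """Expected utility of an action, given credence p_good of being good."""
    u = utilities[action]
    return p_good * u["good"] + (1 - p_good) * u["evil"]


def best_action(utilities, p_good):
    """Pick the action with higher expected utility; ties go to 'leave'."""
    return max(("leave", "take"),
               key=lambda a: expected_utility(utilities, a, p_good))


mostly_good_society = best_action(GOOD_PERSON_UTILITY, p_good=0.95)
mostly_evil_society = best_action(GOOD_PERSON_UTILITY, p_good=0.5)
```

For the good person's utilities, taking the sword has expected utility 10·p_good − 90·(1 − p_good) = 100·p_good − 90, so under these numbers it only beats leaving the sword when p_good exceeds 0.9. For the evil person's utilities, taking the sword is never better than leaving it.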

A human (with some degree of introspective power) in this case, could correctly reason "oh hey I just got a little warm fuzzy feeling upon thinking of the hypothetical where I wield the sword and it doesn't curse me. This is evidence that I'm good, because an evil person would not have that response, so I can safely wield the sword and will do so".

However, what the human is doing in this case is using side-channel information that isn't present in the problem description. They're directly experiencing sense data as a result of the utility calculation outputting 10 in that hypothetical, and updating on that. In a society where everyone was really terrible at introspection, so the only access they had to their decision algorithm was seeing their actual decision (and assuming no previous decision problems that good and evil people decide differently on, so the good person could not have learned that they were good by their actions), it seems to me like there's a very intuitively strong case for not picking up the sword/not smoking.