## Posts

Noise on the Channel 2020-07-02T01:58:18.128Z · score: 29 (6 votes)
Betting with Mandatory Post-Mortem 2020-06-24T20:04:34.177Z · score: 80 (33 votes)
Relating HCH and Logical Induction 2020-06-16T22:08:10.023Z · score: 47 (9 votes)
Dutch-Booking CDT: Revised Argument 2020-06-11T23:34:22.699Z · score: 45 (10 votes)
An Orthodox Case Against Utility Functions 2020-04-07T19:18:12.043Z · score: 114 (37 votes)
Thinking About Filtered Evidence Is (Very!) Hard 2020-03-19T23:20:05.562Z · score: 85 (29 votes)
Bayesian Evolving-to-Extinction 2020-02-14T23:55:27.391Z · score: 39 (15 votes)
A 'Practice of Rationality' Sequence? 2020-02-14T22:56:13.537Z · score: 75 (24 votes)
Instrumental Occam? 2020-01-31T19:27:10.845Z · score: 31 (11 votes)
Becoming Unusually Truth-Oriented 2020-01-03T01:27:06.677Z · score: 99 (39 votes)
The Credit Assignment Problem 2019-11-08T02:50:30.412Z · score: 67 (21 votes)
Defining Myopia 2019-10-19T21:32:48.810Z · score: 28 (6 votes)
Random Thoughts on Predict-O-Matic 2019-10-17T23:39:33.078Z · score: 27 (11 votes)
The Parable of Predict-O-Matic 2019-10-15T00:49:20.167Z · score: 194 (69 votes)
Partial Agency 2019-09-27T22:04:46.754Z · score: 53 (15 votes)
The Zettelkasten Method 2019-09-20T13:15:10.131Z · score: 175 (88 votes)
Do Sufficiently Advanced Agents Use Logic? 2019-09-13T19:53:36.152Z · score: 41 (16 votes)
Troll Bridge 2019-08-23T18:36:39.584Z · score: 73 (42 votes)
Conceptual Problems with UDT and Policy Selection 2019-06-28T23:50:22.807Z · score: 52 (13 votes)
What's up with self-esteem? 2019-06-25T03:38:15.991Z · score: 41 (20 votes)
How hard is it for altruists to discuss going against bad equilibria? 2019-06-22T03:42:24.416Z · score: 52 (15 votes)
Paternal Formats 2019-06-09T01:26:27.911Z · score: 60 (27 votes)
Mistakes with Conservation of Expected Evidence 2019-06-08T23:07:53.719Z · score: 148 (47 votes)
Does Bayes Beat Goodhart? 2019-06-03T02:31:23.417Z · score: 46 (15 votes)
Selection vs Control 2019-06-02T07:01:39.626Z · score: 114 (32 votes)
Separation of Concerns 2019-05-23T21:47:23.802Z · score: 70 (22 votes)
Alignment Research Field Guide 2019-03-08T19:57:05.658Z · score: 207 (77 votes)
Pavlov Generalizes 2019-02-20T09:03:11.437Z · score: 68 (20 votes)
What are the components of intellectual honesty? 2019-01-15T20:00:09.144Z · score: 32 (8 votes)
CDT=EDT=UDT 2019-01-13T23:46:10.866Z · score: 42 (11 votes)
When is CDT Dutch-Bookable? 2019-01-13T18:54:12.070Z · score: 25 (4 votes)
Dutch-Booking CDT 2019-01-13T00:10:07.941Z · score: 27 (8 votes)
Non-Consequentialist Cooperation? 2019-01-11T09:15:36.875Z · score: 46 (15 votes)
Combat vs Nurture & Meta-Contrarianism 2019-01-10T23:17:58.703Z · score: 61 (16 votes)
What makes people intellectually active? 2018-12-29T22:29:33.943Z · score: 92 (45 votes)
Embedded Agency (full-text version) 2018-11-15T19:49:29.455Z · score: 96 (39 votes)
Embedded Curiosities 2018-11-08T14:19:32.546Z · score: 86 (34 votes)
Subsystem Alignment 2018-11-06T16:16:45.656Z · score: 121 (39 votes)
Robust Delegation 2018-11-04T16:38:38.750Z · score: 120 (39 votes)
Embedded World-Models 2018-11-02T16:07:20.946Z · score: 91 (28 votes)
Decision Theory 2018-10-31T18:41:58.230Z · score: 101 (37 votes)
Embedded Agents 2018-10-29T19:53:02.064Z · score: 195 (85 votes)
A Rationality Condition for CDT Is That It Equal EDT (Part 2) 2018-10-09T05:41:25.282Z · score: 17 (6 votes)
A Rationality Condition for CDT Is That It Equal EDT (Part 1) 2018-10-04T04:32:49.483Z · score: 21 (7 votes)
In Logical Time, All Games are Iterated Games 2018-09-20T02:01:07.205Z · score: 84 (27 votes)
Track-Back Meditation 2018-09-11T10:31:53.354Z · score: 63 (27 votes)
Exorcizing the Speed Prior? 2018-07-22T06:45:34.980Z · score: 11 (4 votes)
Stable Pointers to Value III: Recursive Quantilization 2018-07-21T08:06:32.287Z · score: 20 (9 votes)
Probability is Real, and Value is Complex 2018-07-20T05:24:49.996Z · score: 51 (21 votes)

Comment by abramdemski on Noise on the Channel · 2020-07-03T16:35:01.231Z · score: 2 (1 votes) · LW · GW

I think it’s worth making a distinction between “noise” and “low bandwidth channel”. Your first examples of “a literal noisy room” or “people getting distracted by shiny objects passing by” fit the idea of “noise” well. Your last two examples of “inferential distance” and “land mines” don’t, IMO.

“Noise” is when the useful information is getting crowded out by random information in the channel, but land mines aren’t random. If you tell someone their idea is stupid and then you can’t continue telling them why because they’re flipping out at you, that’s not a random occurrence. Even if such things aren’t trivially predictable in more subtle cases, it’s still a predictable possibility and you can generally feel out when such things are safe to say or when you must tread a bit more carefully.

I edited my post to insert this distinction. You're totally right that I'm really focusing on bandwidth and calling it low-noise. But I disagree about the degree of the distinction you're making. In the case of the already-standard usage of "signal/noise ratio", there's no worry over whether the "noise" is really random. Twitter injects advertisements regularly, not randomly, but they still dilute the quality of the feed in the same way. Similarly, conversational land mines are functionally similar to distractions. First, because they tend to derail lines of thought. But second, and more frequently, in the way they influence conversation when they're merely a threat looming on the border of the conversation rather than a certainty. We avoid deep topics both because they're more likely to trigger defensiveness and because they aren't so valuable (and indeed may even be harmful) if they're interrupted. Indeed, I'm clustering them together because the two are somewhat exchangeable: a touchy subject can become quite approachable if you have a lot of quality time to feel it out and deal with misunderstandings/defensiveness (or any of the other helpful variables I mentioned).

Comment by abramdemski on Betting with Mandatory Post-Mortem · 2020-07-02T18:54:46.213Z · score: 4 (2 votes) · LW · GW

Yeah, pre-mortem is another name for pre-hindsight, and murphyjitsu is just the idea of alternating between making pre-mortems and fixing your plans to prevent whatever problem you envisioned in the pre-mortem.

Comment by abramdemski on Noise on the Channel · 2020-07-02T15:59:01.423Z · score: 2 (1 votes) · LW · GW

This seems to also apply to other cases where some amount of distraction actually benefits the conversation.

Comment by abramdemski on The Parable of Predict-O-Matic · 2020-07-01T17:05:59.088Z · score: 7 (3 votes) · LW · GW

I agree that it's broadly relevant to the partial-agency sequence, but I'm curious what particular reminiscence you're seeing here.

I would say that "if the free energy principle is a good way of looking at things" then it is the solution to at least one of the riddles I'm thinking about here. However, I haven't so far been very convinced.

I haven't looked into the technical details of Friston's work very much myself. However, a very mathematically sophisticated friend has tried going through Friston papers on a couple of occasions and found them riddled with mathematical errors. This does not, of course, mean that every version of free-energy/predictive-processing ideas is wrong, but it does make me hesitant to take purported results at face value.

Comment by abramdemski on Betting with Mandatory Post-Mortem · 2020-07-01T16:35:22.911Z · score: 4 (2 votes) · LW · GW

Sorta, but you might have 50:50 odds with a very large spread (both people are very confident in their side) or with a very small spread. So it might be helpful to record that.

Comment by abramdemski on Betting with Mandatory Post-Mortem · 2020-06-29T15:41:37.817Z · score: 6 (5 votes) · LW · GW

Another idea on this: both sides could do pre-mortems, "if I lose, ...". They could look back at this when doing post-mortems. Obviously this increases the effort involved.

Comment by abramdemski on Radical Probabilism [Transcript] · 2020-06-29T15:06:38.989Z · score: 6 (3 votes) · LW · GW

Yeah, the position in academic philosophy as I understand it is: Dutch book arguments aren't really about betting. It's not actually that we're so concerned about bets. Rather, it's a way to illustrate a kind of inconsistency. At first when I heard this I was kind of miffed about it, but now, I think it's the right idea. I suggest reading the SEP article on Dutch Book arguments, especially Section 1.4 (which voices your concerns) and Section 2.1 or section 2 as a whole (which addresses your concerns in the way I've outlined).

Note, however, that we might insist that the meaning of probability is as a guide for actions, and hence, "by definition" we should take bets when they have positive expectation according to our probabilities. If we buy this, then either (1) you're being irrational in rejecting those bets, or (2) you aren't really reporting your probabilities in the technical sense of what-guides-your-actions, but rather some subjective assessments which may somehow be related to your true probabilities.

But if you want this kind of "fully pragmatic" notion of probability, a better place to start might be the Complete Class Theorem, which really is a consequentialist argument for having a probability distribution, unlike Dutch Books.

Comment by abramdemski on Betting with Mandatory Post-Mortem · 2020-06-29T01:05:15.404Z · score: 5 (3 votes) · LW · GW

Thinking about this makes me think people should record not just their bets, but the probabilities. If I think the probability is 1% and you think it's 99%, then one of us is going to make a fairly big update. If you think it's 60% and I think it's 50%, yeah, not so much. As a rough rule of thumb, anyway. (Obviously I could be super confident in a 1% estimate in a similar way to how you describe being super confident in a 40%.)
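A minimal sketch of what recording the probabilities alongside the bet could look like. The `BetRecord` class, its field names, and the use of the absolute gap as the "spread" are illustrative assumptions, not an established format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetRecord:
    """Hypothetical record of a bet plus each side's stated probability."""
    claim: str
    p_yes_side: float  # probability of the claim according to the person betting "yes"
    p_no_side: float   # probability of the claim according to the person betting "no"

    @property
    def spread(self) -> float:
        # Assumed measure of disagreement: the gap between the two probabilities.
        return abs(self.p_yes_side - self.p_no_side)


# 99% vs 1%: whoever loses has a lot to explain in the post-mortem.
wide = BetRecord("hypothetical claim", 0.99, 0.01)
# 60% vs 50%: the outcome is only weak evidence between the two views.
narrow = BetRecord("hypothetical claim", 0.60, 0.50)
```

The spread is only the rough rule of thumb described above. It does not capture how confident each person is in their own estimate.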

But OTOH I think in many cases, by the time the bet is resolved there will also be a lot of other relevant evidence which determines questions related to a bet. So the warranted update will actually be much larger than would be justified by just the one piece of information. In other words, if two Bayesians have different world-models and make a bet about something much into the future, by the time the actual bet is resolved they'll often have seen much more decisive evidence deciding between the two models (not necessarily in the same direction as the bet gets decided).

Still, yeah, I agree with your concern.

Comment by abramdemski on Radical Probabilism [Transcript] · 2020-06-28T15:40:34.425Z · score: 5 (3 votes) · LW · GW

Richard Bradley gives an example of a non-Bayes non-Jeffrey update in Radical Probabilism and Bayesian Conditioning. He calls his third type of update Adams conditioning. But he goes even further, giving an example which is not Bayes, Jeffrey, or Adams (the example with the pipes toward the end; figure 1 and accompanying text). To be honest I still find the example a bit baffling, because I'm not clear on why we're allowed to predictably violate the rigidity constraint in the case he considers.

I think what’s a little confusing is that I imagined these kinds of adjustments were already incorporated into ‘Bayesian reasoning’. Like, for the canonical ‘cancer test result’ example, we could easily adjust our understanding of ‘receives a positive test result’ to include uncertainty about the evidence itself, e.g. maybe the test was performed incorrectly or the result was misreported by the lab.

We can always invent a classically-bayesian scenario where we're uncertain about some particular X, by making it so we can't directly observe X, but rather get some other observations. EG, if we can't directly observe the test results but we're told about it through a fallible line of communication. What's radical about Jeffrey's view is to allow the observations themselves to be uncertain. So if you look at e.g. a color but aren't sure what you're looking at, you don't have to contrive a color-like proposition which you do observe in order to record your imperfect observation of color.

You can think of radical probabilism as "Bayesianism at a distance": like if you were watching a Bayesian agent, but couldn't bother to record every single little sense-datum. You want to record that the test results are probably positive, without recording your actual observations that make you think that. We can always posit underlying observations which make the radical-probabilist agent classically Bayesian. Think of Jeffrey as pointing out that it's often easier to work "at a distance" instead, and that once you start thinking this way, you can see it's closer to your conscious experience anyway -- so why posit underlying propositions which make all your updates into Bayes updates?

As for me, I have no problem with supposing the existence of such underlying propositions (I'll be making a post elaborating on that at some point...) but find radical probabilism to nonetheless be a very philosophically significant point.

Comment by abramdemski on Radical Probabilism [Transcript] · 2020-06-28T15:24:26.046Z · score: 2 (1 votes) · LW · GW

Ah, yep! Corrected.

Comment by abramdemski on Radical Probabilism [Transcript] · 2020-06-27T19:37:19.622Z · score: 15 (5 votes) · LW · GW

Understandable questions. I hope to expand this talk into a post which will explain things more properly.

Think of the two requirements for Bayes updates as forming a 2x2 matrix. If you have both (1) all information you learned can be summarised into one proposition which you learn with 100% confidence, and (2) you know ahead of time how you would respond to that information, then you must perform a Bayesian update. If you have (2) but not (1), ie you update some X to less than 100% confidence but you knew ahead of time how you would update to changed beliefs about X, then you are required to do a Jeffrey update. But if you don't have (2), updates are not very constrained by Dutch-book type rationality. So in general, Jeffrey argued that there are many valid updates beyond Bayes and Jeffrey updates.

Jeffrey updates are a simple generalization of Bayes updates. When a Bayesian learns X, they update it to 100%, and take P(Y|X) to be the new P(Y) for all Y. (More formally, we want to update P to get a new probability measure Q. We do so by setting Q(Y)=P(Y|X) for all Y.) Jeffrey wanted to handle the case where you somehow become 90% confident of X, instead of fully confident. He thought this was more true to human experience. A Jeffrey update is just the weighted average of the two possible Bayesian updates. (More formally, we want to update P to get Q where Q(X)=c for some chosen c. We set Q(Y) = cP(Y|X) + (1-c)P(Y|~X).)
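The two formulas above can be sketched on a small joint distribution. The "rain"/"dry" labels and the numbers are made up for illustration; exact fractions are used so the arithmetic can be checked by hand.

```python
from fractions import Fraction


def condition(joint, x_value):
    """Bayes update: Q(x, y) = P(x, y | X == x_value)."""
    mass = sum(p for (x, _), p in joint.items() if x == x_value)
    if mass == 0:
        raise ValueError("cannot condition on a probability-zero event")
    return {
        (x, y): (p / mass if x == x_value else Fraction(0))
        for (x, y), p in joint.items()
    }


def jeffrey_update(joint, c):
    """Jeffrey update to Q(X) = c: Q(Y) = c*P(Y|X) + (1-c)*P(Y|~X)."""
    given_x = condition(joint, True)
    given_not_x = condition(joint, False)
    return {k: c * given_x[k] + (1 - c) * given_not_x[k] for k in joint}


def prob_y(joint, y_value):
    """Marginal probability that Y == y_value."""
    return sum(p for (_, y), p in joint.items() if y == y_value)


# Hypothetical prior over (X, Y), keyed by (x, y); X is True/False.
prior = {
    (True, "rain"): Fraction(3, 10),
    (True, "dry"): Fraction(1, 10),
    (False, "rain"): Fraction(1, 10),
    (False, "dry"): Fraction(5, 10),
}

# Become 90% confident of X instead of fully confident.
posterior = jeffrey_update(prior, Fraction(9, 10))
```

Here P(rain|X) = 3/4 and P(rain|~X) = 1/6, so the updated probability of rain is (9/10)(3/4) + (1/10)(1/6) = 83/120. Setting c = 1 recovers the ordinary Bayes update on X.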

A natural response for a classical Bayesian is: where does 90% come from? (Where does c come from?) But the Radical Probabilism retort is: where do observations come from? The Bayesian already works in a framework where information comes in from "outside" somehow. The radical probabilist is just working in a more general framework where more general types of evidence can come in from outside.

Pearl argued against this practice in his book introducing Bayesian networks. But he introduced an equivalent -- but more practical -- concept which he calls virtual evidence. The Bayesian intuition freaks out at somehow updating X to 90% without any explanation. But the virtual evidence version is much more intuitive. (Look it up; I think you'll like it better.) I don't think virtual evidence goes against the spirit of Radical Probabilism at all, and in fact if you look at Jeffrey's writing he appears to embrace it. So I hope to give that version in my forthcoming post, and explain why it's nicer than Jeffrey updates in practice.
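As a rough sketch of the virtual evidence idea, as I understand it: instead of declaring a new probability for X, you posit an unrecorded observation and specify only its likelihood ratio, P(e|X) / P(e|~X). You then do an ordinary Bayes update on it. The function names and numbers below are illustrative assumptions, and the sketch assumes the virtual observation bears on Y only through X.

```python
from fractions import Fraction


def virtual_evidence_update(joint, likelihood_ratio):
    """Reweight X-worlds by the likelihood ratio P(e|X)/P(e|~X), then renormalize."""
    weighted = {
        (x, y): p * (likelihood_ratio if x else 1)
        for (x, y), p in joint.items()
    }
    total = sum(weighted.values())
    return {k: w / total for k, w in weighted.items()}


def ratio_for_target(joint, c):
    """Likelihood ratio that moves P(X) to c (requires 0 < c < 1 and 0 < P(X) < 1)."""
    prior_x = sum(p for (x, _), p in joint.items() if x)
    prior_odds = prior_x / (1 - prior_x)
    target_odds = c / (1 - c)
    return target_odds / prior_odds


# Hypothetical prior over (X, Y), keyed by (x, y); X is True/False.
prior = {
    (True, "rain"): Fraction(3, 10),
    (True, "dry"): Fraction(1, 10),
    (False, "rain"): Fraction(1, 10),
    (False, "dry"): Fraction(5, 10),
}

ratio = ratio_for_target(prior, Fraction(9, 10))
posterior = virtual_evidence_update(prior, ratio)
q_x = sum(p for (x, _), p in posterior.items() if x)
q_rain = sum(p for (_, y), p in posterior.items() if y == "rain")

# The weighted-average (Jeffrey) formula with c = 9/10 on the same prior:
# c * P(rain|X) + (1 - c) * P(rain|~X), with P(rain|X) = 3/4, P(rain|~X) = 1/6.
jeffrey_rain = Fraction(9, 10) * Fraction(3, 4) + Fraction(1, 10) * Fraction(1, 6)
```

In this example the two descriptions give the same posterior. The practical difference is that a likelihood ratio describes the evidence itself, independent of the prior, whereas "update X to 90%" bakes the prior into the report.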

Comment by abramdemski on Don't punish yourself for bad luck · 2020-06-26T17:58:13.647Z · score: 3 (2 votes) · LW · GW

Ah yeah, I didn't mean to be pointing that out, but that's an excellent point -- "effort" doesn't necessarily have anything to do with it. You were using "effort" as a handle for whether or not the agent is really trying, which under a perfect rationality assumption (plus an assumption of sufficient knowledge of the situation) would entail employing the best strategy. But in real life conflating effort with credit-worthiness could be a big mistake.

Comment by abramdemski on Don't punish yourself for bad luck · 2020-06-26T16:07:24.597Z · score: 9 (6 votes) · LW · GW

I just want to mention that this is an example of the credit assignment problem. Broadly punishing/rewarding every thought process when something happens is policy-gradient learning, which is going to be relatively slow because (1) you get irrelevant punishments and rewards due to noise, so you're "learning" when you shouldn't be; (2) you can't zero in on the source of problems/successes, so you have to learn through the accumulation of the weak and noisy signal.

So, model-based learning is extremely important. In practice, if you lose a game of magic (or any game with hidden information and/or randomness), I think you should rely almost entirely on model-based updates. Don't denigrate strategies only because you lost; check only whether you could have done something better given the information you had. Plan at the policy level.

OTOH, model-based learning is full of problems, too. If your models are wrong, you'll identify the wrong sub-systems to reward/punish. I've also argued that if your model-based learning is applied to itself, IE, applied to the problem of correcting the models themselves, then you get loopy self-reinforcing memes which take over the credit-assignment system and employ rent-seeking strategies.

I currently see two opposite ways out of this dilemma.

1. Always use model-free learning as a backstop for model-based learning. No matter how true a model seems, ditch it if you keep losing when you use it.

2. Keep your epistemics uncontaminated by instrumental concerns. Only ever do model-based learning; but don't let your instrumental credit-assignment system touch your beliefs. Keep your beliefs subservient entirely to predictive accuracy.

Both of these have some distasteful aspects for rationalists. Maybe there is a third way which puts instrumental and epistemic rationality in perfect harmony.

PS: I really like this post for relating a simple (but important) result in mechanism design (/theory-of-the-firm) with a simple (but important) introspective rationality problem.

Comment by abramdemski on Dutch-Booking CDT: Revised Argument · 2020-06-17T15:11:15.411Z · score: 3 (2 votes) · LW · GW

Oh right, OK. That's because of the general assumption that rational agents bet according to their beliefs. If a CDT agent doesn't think of a bet as intervening on a situation, then when betting ahead of time, it'll just bet according to its probabilities. But during the decision, it is using the modified (interventional) probabilities. That's how CDT makes decisions. So any bets which have to be made simultaneously, as part of the decision, will be evaluated according to those modified beliefs.

Comment by abramdemski on Relating HCH and Logical Induction · 2020-06-17T14:50:52.913Z · score: 4 (2 votes) · LW · GW

HCH is about deliberation, and logical inductors are about trial and error.

I think that's true of the way I describe the relationship in the OP, but not quite true in reality. I think there's also an aspect of deliberation that's present in logical induction and not in HCH. If we think of HCH as a snapshot of a logical inductor, the logical inductor is "improving over time as a result of thinking longer". This is partly due to trial-and-error, but there's also a deliberative aspect to it.

I mean, partly what I'm saying is that it's hard to draw a sharp line between deliberation and trial-and-error. If you try to draw that line such that logical induction lands to one side, you're putting Bayes' Law on multiple hypotheses on the "trial-and-error" side. But it's obvious that one would want it to be on both sides. It's definitely sort of about trial-and-error, but we also definitely want to apply Bayes' Law in deliberation. Similarly, it might turn out that we want to apply the more general logical-induction updates within deliberation.

But part of what I'm saying is that LIC (logical induction criterion) is a theory of rational deliberation in the sense of revising beliefs over time. The LIA (logical induction algorithm) captures the trial-and-error aspect, running lots of programs without knowing which ones are actually needed to satisfy LIC. But the LIC is a normative theory of deliberation, saying that what it means for belief revisions over time to be rational is that they not be too exploitable.

The cost is that it doesn't optimize what you want (unless what you want is the logical induction criterion) and that it will generally get taken over by consequentialists who can exercise malicious influence a constant number of times before the asymptotics assert themselves.

Yeah, if you take the LIA as a design proposal, it's pretty unhelpful. But if you take the LIC as a model of rational deliberation, you get potentially useful ideas.

The benefit of deliberation is that its preferences are potentially specified indirectly by the original deliberator (rather than externally by the criterion for trial and error), and that if the original deliberator is strong enough they may suppress internal selection pressures.

For example, the LIC is a context in which we can formally establish a version of "if the deliberator is strong enough they can suppress internal selection pressures".

Comment by abramdemski on Dutch-Booking CDT: Revised Argument · 2020-06-15T15:18:45.207Z · score: 2 (1 votes) · LW · GW

I guess here you wanted to say something interesting about free will, but it was probably lost from the draft to the final version of the post.

Ah whoops. Fixed.

I think developing this two points would be useful to readers since, usually, the pivotal concepts behind EDT and CDT are considered to be “conditional probabilities” and “(physical) causation” respectively, while here you seem to point at something different about the times at which decisions are made.

I'm not sure what you mean here. The "two different times" are (1) just before CDT makes the decision, and (2) right when CDT makes the decision. So the two times aren't about differentiating CDT and EDT.

Comment by abramdemski on Dutch-Booking CDT: Revised Argument · 2020-06-12T20:15:38.528Z · score: 3 (2 votes) · LW · GW

Ah, yeah, I'll think about how to clear this up. The short answer is that, yes, I slipped up and used CDT in the usual way rather than the broader definition I had set up for the purpose of this post.

On the other hand, I also want to emphasize that EDT two-boxes (and defects in twin PD) much more easily than I see commonly supposed. And, thus, to the extent one wants to apply the arguments of this post to TDT, TDT would also. Specifically, an EDT agent can only see something as correlated with its action if that thing has more information about the action than the EDT agent itself. Otherwise, the EDT agent's own knowledge about its action screens off any correlation.

This means that in Newcomb with a perfect predictor, EDT one-boxes. But in Newcomb where the predictor is only moderately good, in particular knows as much or less than the agent, EDT two-boxes. So, similarly, TDT must two-box in these situations, or be vulnerable to the Dutch Book argument of this post.

Comment by abramdemski on When is CDT Dutch-Bookable? · 2020-06-12T16:48:03.165Z · score: 7 (4 votes) · LW · GW

A much improved Dutch Book argument is now here.

Comment by abramdemski on The Zettelkasten Method · 2020-06-11T20:36:07.593Z · score: 2 (1 votes) · LW · GW

Oh I absolutely give the card an index as soon as it's created. That's always, always the first thing I write on a card. So there's no trouble linking things before they're sorted.

The thing about creating cards without sorting them is,

1. They end up in order of recency. Recency is a good heuristic for how likely I am to want to look at a card.
2. They're "mostly sorted" anyway. I always create, for example, card 2a after card 2, and card 2b after card 2a, etc. So I usually know it'll be later in the stack. I just can't find the exact location as deterministically as I otherwise could.

Comment by abramdemski on Modeling naturalized decision problems in linear logic · 2020-05-17T06:16:21.998Z · score: 2 (1 votes) · LW · GW

I haven't tried as hard as I could have to understand, so, sorry if this comment is low quality.

But I currently don't see the point of employing linear logic in the way you are doing it.

The appendix suggests that the solution to spurious counterfactuals here is the same as known ideas for resolving that problem. Which seems right to me. So solving spurious counterfactuals isn't the novel aspect here.

But then I'm confused why you focus on 5&10 in the main body of the post, since that's the main point of the 5&10 problem.

Maybe 5&10 is just a super simple example to illustrate things. But then, I don't know what it is you are illustrating. What is linear logic doing for you that you could not do some other way?

I have heard the suggestion that linear logic should possibly be used to aid in the difficulties of logical counterfactuals, before. But (somehow) those suggestions seemed to be doing something more radical. Spurious counterfactuals were supposed to be blocked by something about the restrictive logic. By allowing the chosen action to be used only once (since it gets consumed when used), something nicer is supposed to happen, perhaps avoiding problematic self-referential chains of reasoning.

(As I see it at the moment, linear logic seems to -- if anything -- work against the kind of thing we typically want to achieve. If you can use "my program, when executed, outputs 'one-box'" only once, you can't re-use the result both within Omega's thinking and within the physical choice of box. So linear logic would seem to make it hard to respect logical correlations. Of course this doesn't happen for your proposal here, since you treat the program output as classical.)

But your use (iiuc!) seems less radical. You are kind of just using linear logic as a way to specify a world model. But I don't see what this does for you. What am I missing?

Comment by abramdemski on The Zettelkasten Method · 2020-05-17T05:56:44.559Z · score: 4 (2 votes) · LW · GW

Hah, yep, I also tried binders which look very much like these! I didn't end up using them so much as the index cards, but, they did seem like a decent solution.

Comment by abramdemski on It's Not About The Nail · 2020-05-01T19:17:44.938Z · score: 4 (2 votes) · LW · GW

Ah, I see. But to what extent is recency bias irrational vs just a good prior for the world we live in?

Comment by abramdemski on It's Not About The Nail · 2020-04-29T01:45:49.033Z · score: 2 (1 votes) · LW · GW

But it also has secondary effects on your model of the world. Your mind (consciously or subconsciously) now has new information that the world is slightly less safe or slightly less predictable than it thought before.

Are you saying that there's a bias to over-update in favor of the world being bad? And that talking it out helps correct for that?

I would guess:

• Sometimes people over-update, but people under-update too; not clear which direction the overall bias would be if any.
• Over-updating might cause one to run to one's allies for more support, but doesn't usually cause one to seek reassurance of the kind that corrects for the bias; e.g. someone doing this would find reassuring words like "you'll make it through this" reassuring, but wouldn't be explicitly seeking them out -- there's no reason to specifically seek evidence in one direction, that doesn't make sense
• On the other hand, people might play at over-updating in order to get sympathy and reassurances. This (not-necessarily-conscious) tactic can put one in a better position in a group dynamic, as others attempt to make you feel better.

Talking about a problem to somebody you have a close relationship with addresses these second-order effects in a pretty concrete way: it reaffirms the reliability of your relationship in a way that makes the world feel more safe and predictable,

Is it reaffirming something that already should/could be known (so perhaps helping mitigate a bias)? Or is it really gathering important new information?

Gathering new information can make sense: even long-established partnerships can go sour, so it totally makes sense to gather information on how strong your partnerships are. And it also might especially make sense when you've discovered a new problem or updated toward the world being a bit harder to deal with in general.

And it also makes sense that this would end up being a weird indirect kind of conversation to have, since just asking "is our partnership strong?" is not a very good signalling equilibrium -- too easy to just say "yes".

(Not saying that's the actual answer, though. I think perhaps there are yet more complexities here.)

Comment by abramdemski on Thinking About Filtered Evidence Is (Very!) Hard · 2020-04-15T20:08:45.772Z · score: 2 (1 votes) · LW · GW

Adding to the axioms of PA, the statement "a proof of X from the axioms of PA implies X", does not create any contradictions. This is just the belief that PA is sound.

What would be contradictory would be for PA itself to believe that PA is sound. It is fine for an agent to have the belief that PA is sound.
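One standard way to see the difference is Löb's theorem. The notation below (a box for "provable in PA", and T for the extended theory) is added here for illustration, and this is a sketch of the relevant statements rather than a proof.

```latex
% Löb's theorem, for any sentence X in the language of PA:
\mathrm{PA} \vdash (\Box_{\mathrm{PA}} X \rightarrow X)
\quad\Longrightarrow\quad
\mathrm{PA} \vdash X

% Taking X = \bot: if PA proved its own soundness even for \bot, it would prove \bot.
\mathrm{PA} \vdash (\Box_{\mathrm{PA}} \bot \rightarrow \bot)
\quad\Longrightarrow\quad
\mathrm{PA} \vdash \bot

% The extended theory asserts soundness of PA only, not of itself:
T = \mathrm{PA} + \{\, \Box_{\mathrm{PA}} X \rightarrow X \;:\; X \text{ a sentence} \,\}
```

The added axioms in T talk about provability in PA, not provability in T, so Löb's theorem for T does not turn them into a contradiction. Every added axiom is true if PA is sound, which is why T is consistent under that assumption.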

Comment by abramdemski on An Orthodox Case Against Utility Functions · 2020-04-15T20:00:19.707Z · score: 2 (1 votes) · LW · GW

The mechanism is the same in both cases:

• Shares in the event are bought and sold on the market. The share will pay out $1 if the event is true. The share can also be shorted, in which case the shorter gets $1 if the event turns out false. The overall price equilibrates to a probability for the event.
• There are several ways to handle utility. One way is to make bets about whether the utility will fall in particular ranges. Another way is for the market to directly contain shares of utility which can be purchased (and shorted). These pay out $U, whatever the utility actually turns out to be -- traders give it an actual price by speculating on what the eventual value will be. In either case, we would then assign expected utility to events via conditional betting.
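A minimal sketch of how market prices could be read as a conditional expected utility. It assumes one particular form of conditional bet, a contract that pays out U if the event happens and nothing otherwise; the function name, contract design, and prices are hypothetical and may differ from the setup intended here.

```python
def conditional_expected_utility(price_u_if_event, price_event):
    """Read E[U | event] off two market prices.

    price_u_if_event: price of a contract paying U if the event occurs, else 0.
    price_event:      price of a share paying 1 if the event occurs, else 0.

    If prices behave like expectations, the first is E[U * 1_event] and the
    second is P(event), so their ratio is E[U | event].
    """
    if price_event <= 0:
        raise ValueError("event share must have a positive price")
    return price_u_if_event / price_event


# Hypothetical prices: the event trades at 0.25, the conditional U-contract at 0.5.
implied = conditional_expected_utility(0.5, 0.25)
```

With these made-up prices the market implies an expected utility of 2 given the event. Comparing that number across events is what assigns expected utility to events.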

If we want to do reward-learning in a setup like this, the (discounted) rewards can be incremental payouts of the U shares. But note that even if there is no feedback of any kind (IE, the shares of U never actually pay out), the shares equilibrate to a subjective value on the market -- like collector's items. But the market still forces the changes in value over time to be increasingly coherent, and the conditional beliefs about it to be increasingly coherent. This corresponds to fully subjective utility with no outside feedback.

If two traders disagree about whether to pull the lever, how is it determined which one gets the currency?

They make bets about what happens if the lever is or isn't pulled (including conditional buys/sells of shares of utility). These bets will be evaluated as normal. In this setup we only get feedback on whichever action actually happens -- but, this may still be enough data to learn under certain assumptions (which I hope to discuss in a future post). We can also consider more exotic settings in which we do get feedback on both cases even though only one happens; this could be feasible through human feedback about counterfactuals. (I also hope to discuss this alternative in a future post.)

Comment by abramdemski on An Orthodox Case Against Utility Functions · 2020-04-15T19:34:32.296Z · score: 2 (1 votes) · LW · GW

Of course, this interpretation requires a fair amount of reading between the lines, since the Jeffrey-Bolker axioms make no explicit mention of any probability distribution, but I don’t see any other reasonable way to interpret them,

Part of the point of the JB axioms is that probability is constructed together with utility in the representation theorem, in contrast to VNM, which constructs utility via the representation theorem, but takes probability as basic.

This makes Savage a better comparison point, since the Savage axioms are more similar to the VNM framework while also trying to construct probability and utility together with one representation theorem.

VNM does not start out with a prior, and allows any probability distribution over outcomes to be compared to any other, and Jeffrey-Bolker only allows comparison of probability distributions obtained by conditioning the prior on an event.

As a representation theorem, this makes VNM weaker and JB stronger: VNM requires stronger assumptions (it requires that the preference structure include information about all these probability-distribution comparisons), where JB only requires preference comparison of events which the agent sees as real possibilities. A similar remark can be made of Savage.

Starting with a prior that gets conditioned on events that correspond to the agent’s actions seems to build in evidential decision theory as an assumption, which makes me suspicious of it.

Right, that's fair. Although: James Joyce, the big CDT advocate, is quite the Jeffrey-Bolker fan! See Why We Still Need the Logic of Decision for his reasons.

I don’t think the motivation for this is quite the same as the motivation for pointless topology, which is designed to mimic classical topology in a way that Jeffrey-Bolker-style decision theory does not mimic VNM-style decision theory. [...] So a similar thing here would be to treat a utility function as a function from some lattice of subsets of ℝ (the Borel subsets, for instance) to the lattice of events.

Doesn't pointless topology allow for some distinctions which aren't meaningful in pointful topology, though? (I'm not really very familiar, I'm just going off of something I've heard.)

Isn't the approach you mention pretty close to JB? You're not modeling the VNM/Savage thing of arbitrary gambles; you're just assigning values (and probabilities) to events, like in JB.

Setting aside VNM and Savage and JB, and considering the most common approach in practice -- use the Kolmogorov axioms of probability, and treat utility as a random variable -- it seems like the pointless analogue would be close to what you say.

This can be resolved by defining worlds to be minimal non-zero elements of the completion of the Boolean algebra of events, rather than a minimal non-zero event.

Yeah. The question remains, though: should we think of utility as a function of these minimal elements of the completion? Or not? The computability issue I raise is, to me, suggestive of the negative.

Comment by abramdemski on An Orthodox Case Against Utility Functions · 2020-04-15T18:56:34.241Z · score: 4 (2 votes) · LW · GW

First, I really like this shift in thinking, partly because it moves the needle toward an anti-realist position, where you don’t even need to postulate an external world (you probably don’t see it that way, despite saying “Everything is a subjective preference evaluation”).

I definitely see it as a shift in that direction, although I'm not ready to really bite the bullets -- I'm still feeling out what I personally see as the implications. Like, I want a realist-but-anti-realist view ;p

Second, I wonder if you need an even stronger restriction, not just computable, but efficiently computable, given that it’s the agent that is doing the computation, not some theoretical AIXI. This would probably also change “too easily” in “those expectations aren’t (too easily) exploitable to Dutch-book.” to efficiently.

Right, that's very much what I'm thinking.

Comment by abramdemski on An Orthodox Case Against Utility Functions · 2020-04-15T18:49:53.425Z · score: 6 (3 votes) · LW · GW

First, it seems to me rather clear what macroscopic physics I attach utility to. If I care about people, this means my utility function comes with some model of what a “person” is (that has many free parameters), and if something falls within the parameters of this model then it’s a person,

This does not strike me as the sort of thing which will be easy to write out. But there are other examples. What if humans value something like observer-independent beauty? EG, valuing beautiful things existing regardless of whether anyone observes their beauty. Then it seems pretty unclear what ontological objects it gets predicated on.

Second, what does it mean for a hypothesis to be “individual”? If we have a prior over a family of hypotheses, we can take their convex combination and get a new individual hypothesis. So I’m not sure what sort of “fluidity” you imagine that is not supported by this.

What I have in mind is complicated interactions between different ontologies. Suppose that we have one ontology -- the ontology of classical economics -- in which:

• Utility is predicated on individuals alone.
• Individuals always and only value their own hedons; any apparent revealed preference for something else is actually an indication that observing that thing makes the person happy, or that behaving as if they value that other thing makes them happy. (I don't know why this is part of classical economics, but it seems at least highly correlated with classical-econ views.)
• Aggregate utility (across many individuals) can only be defined by giving an exchange rate, since utility functions of different individuals are incomparable. However, an exchange rate is implicitly determined by the market.

And we have another ontology -- the hippie ontology -- in which:

• Energy, aka vibrations, is an essential part of social interactions and other things.
• People and things can have good energy and bad energy.
• People can be on the same wavelength.
• Etc.

And suppose what we want to do is try to reconcile the value-content of these two different perspectives. This isn't going to be a mixture between two partial hypotheses. It might actually be closer to an intersection between two partial hypotheses -- since the different hypotheses largely talk about different entities. But that won't be right either. Rather, there is philosophical work to be done, figuring out how to appropriately mix the values which are represented in the two ontologies.

My intuition behind allowing preference structures which are "uncomputable" as functions of fully specified worlds is, in part, that one might continue doing this kind of philosophical work in an unbounded way -- IE there is no reason to assume there's a point at which this philosophical work is finished and you now have something which can be conveniently represented as a function of some specific set of entities. Much like logical induction never finishes and gives you a Bayesian probability function, even if it gets closer over time.

The agent doesn’t have full Knightian uncertainty over all microscopic possibilities. The prior is composed of refinements of an “ontological belief” that has this uncertainty. You can even consider a version of this formalism that is entirely Bayesian (i.e. each refinement has to be maximal),

OK, that makes sense!

but then you lose the ability to retain an “objective” macroscopic reality in which the agent’s point of view is “unspecial”, because if the agent’s beliefs about this reality have no Knightian uncertainty then it’s inconsistent with the agent’s free will (you could “avoid” this problem using an EDT or CDT agent but this would be bad for the usual reasons EDT and CDT are bad, and ofc you need Knightian uncertainty anyway because of non-realizability).

Right.

Comment by abramdemski on An Orthodox Case Against Utility Functions · 2020-04-14T20:14:03.826Z · score: 6 (3 votes) · LW · GW

I don't want to make a strong argument against your position here. Your position can be seen as one example of "don't make utility a function of the microscopic".

But let's pretend for a minute that I do want to make a case for my way of thinking about it as opposed to yours.

• Humans are not clear on what macroscopic physics we attach utility to. It is possible that we can emulate human judgement sufficiently well by learning over macroscopic-utility hypotheses (ie, partial hypotheses in your framework). But perhaps no individual hypothesis will successfully capture the way human value judgements fluidly switch between macroscopic ontologies -- perhaps human reasoning of this kind can only be accurately captured by a dynamic LI-style "trader" who reacts flexibly to an observed situation, rather than a fixed partial hypothesis. In other words, perhaps we need to capture something about how humans reason, rather than any fixed ontology (even of the flexible macroscopic kind).
• Your way of handling macroscopic ontologies entails Knightian uncertainty over the microscopic possibilities. Isn't that going to lack a lot of optimization power? EG, if humans reasoned this way using intuitive physics, we'd be afraid that any science experiment creating weird conditions might destroy the world, and try to minimize chances of those situations being set up, or something along those lines? I'm guessing you have some way to mitigate this, but I don't know how it works.

As for discontinuous utility:

For example, since the utility functions you consider are discontinuous, it is no longer guaranteed an optimal policy exists at all. Personally, I think discontinuous utility functions are strange and poorly motivated.

My main motivating force here is to capture the maximal breadth of what rational (ie coherent, ie non-exploitable) preferences can be, in order to avoid ruling out some human preferences. I have an intuition that this can ultimately help get the right learning-theoretic guarantees as opposed to hurt, but, I have not done anything to validate that intuition yet.

With respect to procrastination-like problems, optimality has to be subjective, since there is no foolproof way to tell when an agent will procrastinate forever. If humans have any preferences like this, then alignment means alignment with human subjective evaluations of this matter -- if the human (or some extrapolated human volition, like HCH) looks at the system's behavior and says "NO!! Push the button now, you fool!!" then the system is misaligned. The value-learning should account for this sort of feedback in order to avoid this. But this does not attempt to minimize loss in an objective sense -- we export that concern to the (extrapolated?) human evaluation which we are bounding loss with respect to.

With respect to the problem of no-optimal-policy, my intuition is that you try for bounded loss instead; so (as with logical induction) you are never perfect but you have some kind of mistake bound. Of course this is more difficult with utility than it is with pure epistemics.

Comment by abramdemski on An Orthodox Case Against Utility Functions · 2020-04-14T19:39:55.185Z · score: 2 (1 votes) · LW · GW

What happens if the author/definer of U(E) is wrong about the probabilities? If U(E) is not defined from, nor defined by, the value of its sums, what bad stuff happens if they aren’t equal?

Ultimately, I am advocating a logical-induction like treatment of this kind of thing.

• Initial values are based on a kind of "prior" -- a distribution of money across traders.
• Values are initially inconsistent (indeed, they're always somewhat inconsistent), but, become more consistent over time as a result of traders correcting inconsistencies. The traders who are better at this get more money, while the chronically inconsistent traders lose money and eventually don't have influence any more.
• Evidence of all sorts can come into the system, at any time. The system might suddenly get information about the utility of some hypothetical example, or a logical proposition about utility, whatever. It can be arbitrarily difficult to connect this evidence to practical cases. However, the traders work to reduce inconsistencies throughout the whole system, and therefore, evidence gets propagated more or less as well as it can be.
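
The money-flow part of this picture can be sketched in a few lines. This is a drastic simplification and not logical induction itself: I assume each trader holds one fixed belief about a single yes/no question, the market price is the wealth-weighted average belief, and traders bet Kelly-style against that price. All names and numbers are hypothetical.

```python
def market_price(wealth, beliefs):
    """Wealth-weighted average of the traders' probabilities."""
    total = sum(wealth)
    return sum(w * b for w, b in zip(wealth, beliefs)) / total


def settle(wealth, beliefs, outcome):
    """Redistribute wealth after the question resolves.

    A Kelly bettor with belief b facing price p multiplies its wealth by
    b / p if the answer is yes and by (1 - b) / (1 - p) if it is no.
    Assumes the price is strictly between 0 and 1. Total wealth is conserved.
    """
    price = market_price(wealth, beliefs)
    if outcome:
        return [w * b / price for w, b in zip(wealth, beliefs)]
    return [w * (1 - b) / (1 - price) for w, b in zip(wealth, beliefs)]


# The "prior": equal money for two traders who disagree.
wealth = [1.0, 1.0]
beliefs = [0.75, 0.25]

print(market_price(wealth, beliefs))  # price before any evidence
wealth = settle(wealth, beliefs, outcome=True)
print(wealth, market_price(wealth, beliefs))  # the better trader now has more say
```

The trader who was closer to the truth gains wealth and therefore moves later prices more, which is the sense in which chronically wrong traders lose influence.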

Comment by abramdemski on An Orthodox Case Against Utility Functions · 2020-04-14T19:26:39.170Z · score: 2 (1 votes) · LW · GW

What does it mean for the all-zero universe to be infinite, as opposed to not being infinite? Finite universes have a finite number of bits of information describing them (This doesn’t actually negate the point that uncomputable utility functions exist, merely that utility functions that care whether they are in a mostly-empty vs perfectly empty universe are a weak example.

What it means here is precisely that it is described by an infinite number of bits -- specifically, an infinite number of zeros!

Granted, we could try to reorganize the way we describe the universe so that we have a short code for that world, rather than an infinitely long one. This becomes a fairly subtle issue. I will say a couple of things:

First, it seems to me like the reductionist may want to object to such a reorganization. In the reductive view, it is important that there is a special description of the universe, in which we have isolated the actual basic facts of reality -- things resembling particle position and momentum, or what-have-you.

Second, I challenge you to propose a description language which (a) makes the procrastination example computable, (b) maps all worlds onto a description, and (c) does not create any invalid input tapes.

For example, I can make a modified universe-description in which the first bit is '1' if the button ever gets pressed. The rest of the description remains as before, placing a '1' at time-steps when the button is pressed (but offset by one place, to allow for the extra initial bit). So seeing '0' right away tells me I'm in the button-never-pressed world; it now has a 1-bit description, rather than an infinite-bit description. HOWEVER, this description language includes a description which does not correspond to any world, and is therefore invalid: the string which starts with '1' but then contains only zeros forever.
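Here is a small sketch of why that invalid string is awkward. It assumes we can only ever inspect finite prefixes of a description; the function name and return labels are mine, purely for illustration.

```python
def classify_prefix(prefix):
    """Classify a finite prefix of a modified universe-description.

    prefix[0] is the extra summary bit ('1' if the button is ever pressed);
    prefix[1:] are the per-time-step button bits, offset by one place.
    """
    first, rest = prefix[0], prefix[1:]
    if first == 0:
        return "never pressed"   # known after reading a single bit
    if 1 in rest:
        return "pressed"         # the leading 1 has been vindicated
    return "undetermined"        # leading 1, but only zeros so far


print(classify_prefix([0]))
print(classify_prefix([1, 0, 0, 1]))

# Every finite prefix of the invalid string 1000... looks like a world
# where the button simply has not been pressed *yet*.
print({classify_prefix([1] + [0] * n) for n in range(50)})
```

So no finite amount of reading ever reveals that the description is invalid: the uncomputability has been moved into the question of which input tapes are legitimate.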

This issue has a variety of potential replies/implications -- I'm not saying the situation is clear. I didn't get into this kind of thing in the post because it seems like there are just too many things to say about it, with no totally clear path.

Comment by abramdemski on An Orthodox Case Against Utility Functions · 2020-04-13T19:20:38.122Z · score: 4 (2 votes) · LW · GW

Perhaps it goes without saying, but obviously, both frameworks are flexible enough to allow for most phenomena -- the question here is what is more natural in one framework or another.

My main argument is that the procrastination paradox is not natural at all in a Savage framework, as it suggests an uncomputable utility function. I think this plausibly outweighs the issue you're pointing at.

But with respect to the issue you are pointing at:

I try to think about what I expect to happen if I take that action (ie the outcome), and I think about how likely that outcome is to have various properties that I care about,

In the Savage framework, an outcome already encodes everything you care about. So the computation which seems to be suggested by Savage is to think of these maximally-specified outcomes, assigning them probability and utility, and then combining those to get expected utility. This seems to be very demanding: it requires imagining these very detailed scenarios.

Alternately, we might say (as Savage said) that the Savage axioms apply to "small worlds" -- small scenarios which the agent abstracts from its experience, such as the decision of whether to break an egg for an omelette. These can be easily considered by the agent, if it can assign values "from outside the problem" in an appropriate way.

But then, to account for the breadth of human reasoning, it seems to me we also want an account of things like extending a small world when we find that it isn't sufficient, and coherence between different small-world frames for related decisions.

This gives a picture very much like the Jeffrey-Bolker picture, in that we don't really work with outcomes which completely specify everything we care about, but rather, work with a variety of simplified outcomes with coherence requirements between simpler and more complex views.

So overall I think it is better to have some picture where you can break things up in a more tractable way, rather than having full outcomes which you need to pass through to get values.

In the Jeffrey-Bolker framework, you can re-estimate the value of an event by breaking it up into pieces, estimating the value and probability of each piece, and combining them back together. This process could be iterated in a manner similar to dynamic programming in RL, to improve value estimates for actions -- although one needs to settle on a story about where the information originally comes from. I currently like the logical-induction-like picture where you get information coming in "somehow" (a broad variety of feedback is possible, including abstract judgements about utility which are hard to cash out in specific cases) and you try to make everything as coherent as possible in the meanwhile.
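The break-up-and-recombine step can be sketched as follows. It uses the standard Jeffrey-Bolker averaging rule for disjoint pieces, where the value of the whole is the probability-weighted average of the values of the parts. The function name and the numbers are hypothetical.

```python
def combine(pieces):
    """Recombine disjoint pieces of an event E.

    pieces: list of (probability, value) pairs for mutually exclusive events
    whose union is E. Returns (P(E), value of E).
    """
    total_p = sum(p for p, _ in pieces)
    if total_p <= 0:
        raise ValueError("the pieces must have positive total probability")
    value = sum(p * v for p, v in pieces) / total_p
    return total_p, value


# Two disjoint ways an action could go, with made-up estimates.
goes_well = (0.25, 4.0)
goes_badly = (0.25, 0.0)
action = combine([goes_well, goes_badly])
print(action)

# Results can be fed back in, so estimates can be refined piece by piece.
other_case = (0.5, 1.0)
print(combine([action, other_case]))
```

Nothing here requires the pieces to be fully specified outcomes; each piece is just another event with its own estimated probability and value.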

Comment by abramdemski on An Orthodox Case Against Utility Functions · 2020-04-08T01:47:26.149Z · score: 4 (2 votes) · LW · GW

Yeah, a didactic problem with this post is that when I write everything out, the "reductive utility" position does not sound that tempting.

I still think it's a really easy trap to fall into, though, because before thinking too much the assumption of a computable utility function sounds extremely reasonable.

Suppose I'm running a company, trying to maximize profits. I don't make decisions by looking at the available options, and then estimating how profitable I expect the company to be under each choice. Rather, I reason locally: at a cost of X I can gain Y, I've cached an intuitive valuation of X and Y based on their first-order effects, and I make the choice based on that without reasoning through all the second-, third-, and higher-order effects of the choice. I don't calculate all the way through to an expected utility or anything comparable to it.

With dynamic-programming inspired algorithms such as AlphaGo, "cached an intuitive valuation of X and Y" is modeled as a kind of approximate evaluation which is learned based on feedback -- but feedback requires the ability to compute U() at some point. (So you don't start out knowing how to evaluate uncertain situations, but you do start out knowing how to evaluate utility on completely specified worlds.)

So one might still reasonably assume you need to be able to compute U() despite this.

Comment by abramdemski on Two Alternatives to Logical Counterfactuals · 2020-04-07T19:37:51.197Z · score: 4 (2 votes) · LW · GW

OK, all of that made sense to me. I find the direction more plausible than when I first read your post, although it still seems like it'll fall to the problem I sketched.

I both like and hate that it treats logical uncertainty in a radically different way from empirical uncertainty -- like, because we have so far failed to find any way to treat the two uniformly (besides being entirely updateful that is); and hate, because it still feels so wrong for the two to be very different.

Comment by abramdemski on Two Alternatives to Logical Counterfactuals · 2020-04-05T03:30:23.185Z · score: 2 (1 votes) · LW · GW

Ahhh ok.

Comment by abramdemski on Two Alternatives to Logical Counterfactuals · 2020-04-05T03:29:17.850Z · score: 5 (3 votes) · LW · GW

I'm left with the feeling that you don't see the problem I'm pointing at.

My concern is that the most plausible world where you aren't a pure optimizer might look very very different, and whether this very very different world looks better or worse than the normal-looking world does not seem very relevant to the current decision.

Consider the "special exception selves" you mention -- the Nth exception-self has a hard-coded exception "go right if it's been at least N turns and you've gone right at most 1/N of the time".

Now let's suppose that the worlds which give rise to exception-selves are a bit wild. That is to say, the rewards in those worlds have pretty high variance. So a significant fraction of them have quite high reward -- let's just say 10% of them have value much higher than is achievable in the real world.

So we expect that by around N=10, there will be an exception-self living in a world that looks really good.

This suggests to me that the policy-dependent-source agent cannot learn to go left > 90% of the time, because once it crosses that threshold, the exception-self in the really good looking world is ready to trigger its exception -- so going right starts to appear really good. The agent goes right until it is under the threshold again.
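A toy simulation of the worry, under a strong assumption that is mine rather than anything established: the real agent goes left by default, but copies the Nth exception-self whenever that self's hard-coded condition triggers. The function name is hypothetical.

```python
def simulate(n_exception: int, turns: int):
    """Agent that goes left unless the Nth exception-self's rule fires.

    The rule: go right if it has been at least N turns and the agent has
    gone right at most 1/N of the time so far.
    """
    rights = 0
    history = []
    for t in range(turns):
        # Integer form of "rights / t <= 1 / N", avoiding division.
        if t >= n_exception and rights * n_exception <= t:
            history.append("right")
            rights += 1
        else:
            history.append("left")
    return history


history = simulate(n_exception=10, turns=1000)
print(history.count("left") / len(history))
```

Under that assumption the fraction of left moves gets stuck at about 90% rather than approaching 100%, even though left is better on every turn.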

If that's true, then it seems to me rather bad: the agent ends up repeatedly going right in a situation where it should be able to learn to go left easily. Its reason for repeatedly going right? There is one enticing world, which looks much like the real world, except that in that world the agent definitely goes right. Because that agent is a lucky agent who gets a lot of utility, the actual agent has decided to copy its behavior exactly -- anything else would prove the real agent unlucky, which would be sad.

Of course, this outcome is far from obvious; I'm playing fast and loose with how this sort of agent might reason.

Comment by abramdemski on Two Alternatives to Logical Counterfactuals · 2020-04-03T20:00:21.583Z · score: 7 (4 votes) · LW · GW

If you see your source code is B instead of A, you should anticipate learning that the programmers programmed B instead of A, which means something was different in the process. So the counterfactual has implications backwards in physical time.

At some point it will ground out in: different indexical facts, different laws of physics, different initial conditions, different random events...

I'm not sure how you are thinking about this. It seems to me like this will imply really radical changes to the universe. Suppose the agent is choosing between a left path and a right path. Its actual programming will go left. It has to come up with alternate programming which would make it go right, in order to consider that scenario. The most probable universe in which its programming would make it go right is potentially really different from our own. In particular, it is a universe where it would go right despite everything it has observed, a lifetime of (updateless) learning, which in the real universe, has taught it that it should go left in situations like this.

EG, perhaps it has faced an iterated 5&10 problem, where left always yields 10. It has to consider alternate selves who, faced with that history, go right.

It just seems implausible that thinking about universes like that will result in systematically good decisions. In the iterated 5&10 example, perhaps universes where its programming fails iterated 5&10 are universes where iterated 5&10 is an exceedingly unlikely situation; so in fact, the reward for going right is quite unlikely to be 5, and very likely to be 100. Then the AI would choose to go right.

Obviously, this is not necessarily how you are thinking about it at all -- as you said, you haven't given an actual decision procedure. But the idea of considering only really consistent counterfactual worlds seems quite problematic.

Comment by abramdemski on Two Alternatives to Logical Counterfactuals · 2020-04-03T19:44:04.503Z · score: 2 (1 votes) · LW · GW

Conditioning on ‘A(obs) = act’ is still a conditional, not a counterfactual. The difference between conditionals and counterfactuals is the difference between “If Oswald didn’t kill Kennedy, then someone else did” and “If Oswald didn’t kill Kennedy, then someone else would have”.

I still disagree. We need a counterfactual structure in order to consider the agent as a function A(obs). EG, if the agent is a computer program, the function would contain all the counterfactual information about what the agent would do if it observed different things. Hence, considering the agent's computer program as such a function leverages an ontological commitment to those counterfactuals.

To illustrate this, consider counterfactual mugging where we already see that the coin is heads -- so, there is nothing we can do, we are at the mercy of our counterfactual partner. But suppose we haven't yet observed whether Omega gives us the money.

A "real counterfactual" is one which can be true or false independently of whether its condition is met. In this case, if we believe in real counterfactuals, we believe that there is a fact of the matter about what we do in the case, even though the coin came up heads. If we don't believe in real counterfactuals, we instead think only that there is a fact of how Omega is computing "what I would have done if the coin had been tails" -- but we do not believe there is any "correct" way for Omega to compute that.

The material-conditional representation (obs → act) and the conditional-probability representation P(act | obs) both appear to satisfy this test of non-realism. The first is always true if the observation is false, so, lacks the ability to vary independently of the observation. The second is undefined when the observation is false, which is perhaps even more appealing for the non-realist.

Now consider the A(obs) = act representation. A(tails) can still vary even when we know obs = heads. So, it fails this test -- it is a realist representation!

Putting something into functional form imputes a causal/counterfactual structure.
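The test can be spelled out in code for the three candidate representations discussed above (a material conditional, a conditional probability under the ratio analysis, and the agent as a function of observations). This is only a sketch; the policies and action labels are hypothetical.

```python
def material_conditional(obs_true: bool, act_true: bool) -> bool:
    """'obs implies act' as a truth-function."""
    return (not obs_true) or act_true


def conditional_probability(p_obs_and_act: float, p_obs: float):
    """Ratio analysis: returns None (undefined) when the condition has probability zero."""
    if p_obs == 0:
        return None
    return p_obs_and_act / p_obs


# The coin came up heads, so the observation "tails" is false.
# 1. The material conditional is true whatever the agent "would have" done.
print(material_conditional(False, True), material_conditional(False, False))

# 2. After updating on heads, P(tails) = 0, so the ratio is undefined.
print(conditional_probability(0.0, 0.0))

# 3. Two agents-as-functions that agree on the actual observation
#    but differ on the counterfactual one.
policy_a = {"heads": "wait", "tails": "pay"}
policy_b = {"heads": "wait", "tails": "refuse"}
print(policy_a["heads"] == policy_b["heads"], policy_a["tails"] == policy_b["tails"])
```

Only the third representation carries information that can vary once we know the coin came up heads, which is what makes it the realist one.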

Comment by abramdemski on Two Alternatives to Logical Counterfactuals · 2020-04-03T19:36:00.626Z · score: 2 (1 votes) · LW · GW

In the happy dance problem, when the agent is considering doing a happy dance, the agent should have already updated on M. This is more like timeless decision theory than updateless decision theory.

I agree that this gets around the problem, but to me the happy dance problem is still suggestive -- it looks like the material conditional is the wrong representation of the thing we want to condition on.

Also -- if the agent has already updated on observations, then updating on the material conditional (observation implies action) is just the same as updating on the action itself. So this difference only matters in the updateless case, where it seems to cause us trouble.

Comment by abramdemski on Thinking About Filtered Evidence Is (Very!) Hard · 2020-04-03T07:23:51.561Z · score: 4 (2 votes) · LW · GW

Sure, that seems reasonable. I guess I saw this as the point of a lot of MIRI’s past work, and was expecting this to be about honesty / filtered evidence somehow.

Yeah, ok. This post as written is really less the kind of thing somebody who has followed all the MIRI thinking needs to hear and more the kind of thing one might bug an orthodox Bayesian with. I framed it in terms of filtered evidence because I came up with it by thinking about some confusion I was having about filtered evidence. And it does problematize the Bayesian treatment. But in terms of actual research progress it would be better framed as a negative result about whether Sam's untrollable prior can be modified to have richer learning.

I think we mean different things by “perfect model”. What if [...]

Yep, I agree with everything you say here.

Comment by abramdemski on The absurdity of un-referenceable entities · 2020-04-02T22:27:19.800Z · score: 6 (3 votes) · LW · GW

Also -- it may not come across in my other comments -- the argument in the OP was novel to me (at least, if I had heard it before, I thought it was wrong at that time and didn't update on it) and feels like a nontrivial observation about how reference has to work.

Comment by abramdemski on The absurdity of un-referenceable entities · 2020-04-02T22:16:45.379Z · score: 8 (4 votes) · LW · GW

Alright, cool. 👌 In general I think reference needs to be treated as a vague object to handle paradoxes (something along the lines of Hartry Field's theory of vague semantics, although I may prefer something closer to linear logic rather than his non-classical logic) -- and also just to be more true to actual use.

I am not able to think of any argument why the set of un-referenceable entities should be paradoxical rather than empty, at the moment. But it seems somehow appropriate that the domain of quantification for our language be vague, and, further, it could be that we don't assert that nothing lies outside of it. (Only that there is not some thing definitely outside of it.)

Comment by abramdemski on Two Alternatives to Logical Counterfactuals · 2020-04-02T21:52:17.099Z · score: 17 (6 votes) · LW · GW

I too have recently updated (somewhat) away from counterfactual non-realism. I have a lot of stuff I need to work out and write about it.

I seem to have a lot of disagreements with your post.

Given this uncertainty, you may consider material conditionals: if I take action X, will consequence Q necessarily follow? An action may be selected on the basis of these conditionals, such as by determining which action results in the highest guaranteed expected utility if that action is taken.

I don't think material conditionals are the best way to cash out counterfactual non-realism.

• The basic reason I think it's bad is the happy dance problem. This makes it seem clear that the sentence to condition on should not be the material conditional obs → act.
• If the action can be viewed as a function of observations, conditioning on A(obs) = act makes sense. But this is sort of like already having counterfactuals, or at least, being realist that there are counterfactuals about what A would do if the agent observed different things. So this response can be seen as abandoning counterfactual non-realism.
• A different approach is to consider conditional beliefs rather than material implications. I think this is more true to counterfactual non-realism. In the simplest form, this means you just condition on actions (rather than trying to condition on something like obs → act or A(obs) = act). However, in order to reason updatelessly, you need something like conditioning on conditionals, which complicates matters.
• Another reason to think it's bad is Troll Bridge.
• Again if the agent thinks there are basic counterfactual facts, (required to respect but little else -- ie entirely determined by subjective beliefs), then the agent can escape Troll Bridge by disagreeing with the relevant inference. But this, of course, rejects the kind of counterfactual non-realism you intend.
• To be more in line with counterfactual non-realism, we would like to use conditional probabilities instead. However, conditional probability behaves too much like material implication to block the Troll Bridge argument. However, I believe that there is an account of conditional probability which avoids this by rejecting the ratio analysis of conditional probability -- ie Bayes' definition -- and instead regards conditional probability as a basic entity. (Along the lines of what Alan Hájek goes on and on about.) Thus an EDT-like procedure can be immune to both 5-and-10 and Troll Bridge. (I claim.)

As for policy-dependent source code, I find myself quite unsympathetic to this view.

• If the agent is updateful, this is just saying that in counterfactuals where the agent does something else, it might have different source code. Which seems fine, but does it really solve anything? Why is this much better than counterfactuals which keep the source code fixed but imagine the execution trace being different? This seems to only push the rough spots further back -- there can still be contradictions, e.g. between the source code and the process by which programmers wrote the source code. Do you imagine it is possible to entirely remove such rough spots from the counterfactuals?
• So it seems you intend the agent to be updateless instead. But then we have all the usual issues with logical updatelessness. If the agent is logically updateless, there is absolutely no reason to think that its beliefs about the connections between source code and actual policy behavior are any good. Making those connections requires actual reasoning, not simply a good enough prior -- which means being logically updateful. So it's unclear what to do.
• Perhaps logically-updateful policy-dependent-source-code is the most reasonable version of the idea. But then we are faced with the usual questions about spurious counterfactuals, chicken rule, exploration, and Troll Bridge. So we still have to make choices about those things.

Comment by abramdemski on The absurdity of un-referenceable entities · 2020-04-02T20:31:15.112Z · score: 4 (2 votes) · LW · GW

Yeah, I'm describing a confusion between views from nowhere and 3rd person perspectives.

Do we disagree about something? It seems possible that you think "ontologizing the by-definition-not-ontologizable" is a bad thing, whereas I'm arguing it's important to have that in one's ontology (even if it's an empty set).

I could see becoming convinced that "the non-ontologizable" is an inherently vague set, IE, achieves a paradoxical status of not being definitely empty, but definitely not being definitely populated.

Comment by abramdemski on The absurdity of un-referenceable entities · 2020-04-01T22:10:13.974Z · score: 9 (2 votes) · LW · GW

Another reason why unreferenceable entities may be intuitively appealing is that if we take a third person perspective, we can easily imagine an abstract agent being unable to reference some entity.

In map/territory thinking, we could imagine things beyond the curvature of the earth being impossible to illustrate on a 2d map. In pure logic, we imagine a Tarskian truth predicate for a logic.

You, sitting outside the thought experiment, cannot be referenced by the agent you imagine. (That is, one easily neglects the possibility.) So the agent saying "the stuff someone else might think of" appears to be no help.

So, I note that the absurdity of the unreferenceable entity is not quite trivial. You are assuming that "unreferenceable" is a concept within the ontology, in order to prove that no such thing can be.

It is perfectly consistent to imagine an entity and an object which cannot be referenced by our imagined entity. We need only suppose that our entity lacks a concept of the unreferenceable.

So despite the absurdity of unreferenceable objects, it seems we need them in our ontology in order to avoid them. ;)

Comment by abramdemski on Thinking About Filtered Evidence Is (Very!) Hard · 2020-04-01T20:23:41.882Z · score: 8 (4 votes) · LW · GW

I agree with your first sentence, but I worry you may still be missing my point here, namely that the Bayesian notion of belief doesn't allow us to make the distinction you are pointing to. If a hypothesis implies something, it implies it "now"; there is no "the conditional probability is 1 but that isn't accessible to me yet".

I also think this result has nothing to do with "you can't have a perfect model of Carol". Part of the point of my assumptions is that they are, individually, quite compatible with having a perfect model of Carol amongst the hypotheses.

Comment by abramdemski on Thinking About Filtered Evidence Is (Very!) Hard · 2020-04-01T18:14:43.759Z · score: 6 (3 votes) · LW · GW

I'm not sure exactly what the source of your confusion is, but:

I don't see how this follows. At the point where the confidence in PA rises above 50%, why can't the agent be mistaken about what the theorems of PA are?

The confidence in PA as a hypothesis about what the speaker is saying is what rises above 50%. Specifically, an efficiently computable hypothesis eventually enumerating all and only the theorems of PA rises above 50%.

For example, let T be a theorem of PA that hasn't been claimed yet. Why can't the agent believe P(claims-T) = 0.01 and P(claims-not-T) = 0.99? It doesn't seem like this violates any of your assumptions.

This violates the assumption of honesty that you quote, because the agent simultaneously has P(H) > 0.5 for a hypothesis H such that P(obs_n-T | H) = 1, for some (possibly very large) n, and yet also believes P(T) < 0.5. This is impossible since it must be that P(obs_n-T) > 0.5, due to P(H) > 0.5, and therefore must be that P(T) > 0.5, by honesty.
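
The argument can be spelled out as a chain of inequalities. This is a sketch that assumes honesty is formalized as $P(T \mid \mathrm{obs}_n\text{-}T) = 1$; that formalization is an assumption made here for illustration and may differ in detail from the statement in the post.

```latex
\begin{align*}
P(\mathrm{obs}_n\text{-}T) &\ge P(\mathrm{obs}_n\text{-}T \mid H)\,P(H) = P(H) > 0.5 \\
P(T) &\ge P(T \wedge \mathrm{obs}_n\text{-}T)
      = P(T \mid \mathrm{obs}_n\text{-}T)\,P(\mathrm{obs}_n\text{-}T)
      = P(\mathrm{obs}_n\text{-}T) > 0.5
\end{align*}
```

The first line uses only $P(\mathrm{obs}_n\text{-}T \mid H) = 1$ and $P(H) > 0.5$; the second uses the assumed honesty condition. Together they contradict $P(T) < 0.5$.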

Comment by abramdemski on Thinking About Filtered Evidence Is (Very!) Hard · 2020-04-01T18:07:20.843Z · score: 4 (2 votes) · LW · GW

Here's one way to extend a result like this to lying. Rather than assume honesty, we could assume observations carry enough information about the truth. This is like saying that sensory perception may be fooled, but in the long run, bears a strong enough connection to reality for us to infer a great deal. Something like this should imply the same computational difficulties.

I'm not sure exactly how this assumption should be spelled out, though.

Comment by abramdemski on Thinking About Filtered Evidence Is (Very!) Hard · 2020-04-01T17:58:21.886Z · score: 2 (1 votes) · LW · GW

It's sufficient to allow an adversarial (and dishonest) speaker to force a contradiction, sure. But the theorem is completely subjective. It says that even from the agent's perspective there is a problem. IE, even if we think the speaker to be completely honest, we can't (computably) have (even minimally) consistent beliefs. So it's more surprising than simply saying that if we believe a speaker to be honest then that speaker can create a contradiction by lying to us. (At least, more surprising to me!)

Comment by abramdemski on Thinking About Filtered Evidence Is (Very!) Hard · 2020-04-01T17:51:41.885Z · score: 6 (3 votes) · LW · GW

It's absurd (in a good way) how much you are getting out of incomplete hypotheses. :)