Posts

Many-worlds versus discrete knowledge 2020-08-13T18:35:53.442Z · score: 21 (11 votes)
Modeling naturalized decision problems in linear logic 2020-05-06T00:15:15.400Z · score: 15 (5 votes)
Topological metaphysics: relating point-set topology and locale theory 2020-05-01T03:57:11.899Z · score: 13 (4 votes)
Two Alternatives to Logical Counterfactuals 2020-04-01T09:48:29.619Z · score: 30 (10 votes)
The absurdity of un-referenceable entities 2020-03-14T17:40:37.750Z · score: 22 (8 votes)
Puzzles for Physicalists 2020-03-12T01:37:13.353Z · score: 44 (19 votes)
A conversation on theory of mind, subjectivity, and objectivity 2020-03-10T04:59:23.266Z · score: 13 (4 votes)
Subjective implication decision theory in critical agentialism 2020-03-05T23:30:42.694Z · score: 16 (5 votes)
A critical agential account of free will, causation, and physics 2020-03-05T07:57:38.193Z · score: 25 (9 votes)
On the falsifiability of hypercomputation, part 2: finite input streams 2020-02-17T03:51:57.238Z · score: 21 (5 votes)
On the falsifiability of hypercomputation 2020-02-07T08:16:07.268Z · score: 26 (5 votes)
Philosophical self-ratification 2020-02-03T22:48:46.985Z · score: 25 (6 votes)
High-precision claims may be refuted without being replaced with other high-precision claims 2020-01-30T23:08:33.792Z · score: 63 (26 votes)
On hiding the source of knowledge 2020-01-26T02:48:51.310Z · score: 109 (33 votes)
On the ontological development of consciousness 2020-01-25T05:56:43.244Z · score: 37 (13 votes)
Is requires ought 2019-10-28T02:36:43.196Z · score: 23 (10 votes)
Metaphorical extensions and conceptual figure-ground inversions 2019-07-24T06:21:54.487Z · score: 42 (10 votes)
Dialogue on Appeals to Consequences 2019-07-18T02:34:52.497Z · score: 36 (22 votes)
Why artificial optimism? 2019-07-15T21:41:24.223Z · score: 64 (21 votes)
The AI Timelines Scam 2019-07-11T02:52:58.917Z · score: 45 (80 votes)
Self-consciousness wants to make everything about itself 2019-07-03T01:44:41.204Z · score: 43 (30 votes)
Writing children's picture books 2019-06-25T21:43:45.578Z · score: 114 (37 votes)
Conditional revealed preference 2019-04-16T19:16:55.396Z · score: 18 (7 votes)
Boundaries enable positive material-informational feedback loops 2018-12-22T02:46:48.938Z · score: 30 (12 votes)
Act of Charity 2018-11-17T05:19:20.786Z · score: 184 (71 votes)
EDT solves 5 and 10 with conditional oracles 2018-09-30T07:57:35.136Z · score: 62 (19 votes)
Reducing collective rationality to individual optimization in common-payoff games using MCMC 2018-08-20T00:51:29.499Z · score: 58 (18 votes)
Buridan's ass in coordination games 2018-07-16T02:51:30.561Z · score: 55 (19 votes)
Decision theory and zero-sum game theory, NP and PSPACE 2018-05-24T08:03:18.721Z · score: 111 (37 votes)
In the presence of disinformation, collective epistemology requires local modeling 2017-12-15T09:54:09.543Z · score: 128 (45 votes)
Autopoietic systems and difficulty of AGI alignment 2017-08-20T01:05:10.000Z · score: 9 (4 votes)
Current thoughts on Paul Christiano's research agenda 2017-07-16T21:08:47.000Z · score: 19 (9 votes)
Why I am not currently working on the AAMLS agenda 2017-06-01T17:57:24.000Z · score: 27 (11 votes)
A correlated analogue of reflective oracles 2017-05-07T07:00:38.000Z · score: 4 (4 votes)
Finding reflective oracle distributions using a Kakutani map 2017-05-02T02:12:06.000Z · score: 1 (1 votes)
Some problems with making induction benign, and approaches to them 2017-03-27T06:49:54.000Z · score: 3 (3 votes)
Maximally efficient agents will probably have an anti-daemon immune system 2017-02-23T00:40:47.000Z · score: 6 (5 votes)
Are daemons a problem for ideal agents? 2017-02-11T08:29:26.000Z · score: 5 (2 votes)
How likely is a random AGI to be honest? 2017-02-11T03:32:22.000Z · score: 1 (1 votes)
My current take on the Paul-MIRI disagreement on alignability of messy AI 2017-01-29T20:52:12.000Z · score: 17 (9 votes)
On motivations for MIRI's highly reliable agent design research 2017-01-29T19:34:37.000Z · score: 19 (10 votes)
Strategies for coalitions in unit-sum games 2017-01-23T04:20:31.000Z · score: 3 (3 votes)
An impossibility result for doing without good priors 2017-01-20T05:44:26.000Z · score: 1 (1 votes)
Pursuing convergent instrumental subgoals on the user's behalf doesn't always require good priors 2016-12-30T02:36:48.000Z · score: 15 (6 votes)
Predicting HCH using expert advice 2016-11-28T03:38:05.000Z · score: 5 (4 votes)
ALBA requires incremental design of good long-term memory systems 2016-11-28T02:10:53.000Z · score: 1 (1 votes)
Modeling the capabilities of advanced AI systems as episodic reinforcement learning 2016-08-19T02:52:13.000Z · score: 4 (2 votes)
Generative adversarial models, informed by arguments 2016-06-27T19:28:27.000Z · score: 0 (0 votes)
In memoryless Cartesian environments, every UDT policy is a CDT+SIA policy 2016-06-11T04:05:47.000Z · score: 24 (6 votes)
Two problems with causal-counterfactual utility indifference 2016-05-26T06:21:07.000Z · score: 3 (3 votes)

Comments

Comment by jessica-liu-taylor on The Bayesian Tyrant · 2020-08-21T03:03:25.904Z · score: 3 (3 votes) · LW · GW

The basic point here is that Bayesians lose zero-sum games in the long term. That is to be expected, because Bayesianism is a non-adversarial epistemology. (Adversarial Bayesianism is simply game theory.)

This sentence is surprising, though: "It is a truth more fundamental than Bayes’ Law that money will flow from the unclever to the clever".

Clearly, what wins zero-sum games wins zero-sum games, but what wins zero-sum games need not correspond to collective epistemology.

As a foundation for epistemology, many things are superior to "might makes right", including Bayes' rule (despite its limitations).

Legislating Bayesianism in an adversarial context is futile; mechanism design is what is needed.

Comment by jessica-liu-taylor on Many-worlds versus discrete knowledge · 2020-08-14T18:24:34.414Z · score: 6 (3 votes) · LW · GW

Thanks! To the extent that discrete branches can be identified this way, that solves the problem. This is pushing the limits of my knowledge of QM at this point so I'll tag this as something to research further at a later point.

Comment by jessica-liu-taylor on Many-worlds versus discrete knowledge · 2020-08-14T18:21:35.481Z · score: 2 (1 votes) · LW · GW

I'm not asking for there to be a function to the entire world state, just a function to observations. Otherwise the theory does not explain observations!

(aside: I think Bohm does say there is a definite answer in the cat case, as there is a definite configuration that is the true one; it's Copenhagen that fails to say it is one way or the other)

Comment by jessica-liu-taylor on Many-worlds versus discrete knowledge · 2020-08-14T17:17:27.545Z · score: 2 (1 votes) · LW · GW

Then you need a theory of how the continuous microstate determines the discrete macrostate. E.g. as a function from reals to booleans. What is that theory in the case of the wave function determining photon measurements?
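For concreteness, a minimal sketch of the kind of bridge law being asked for, with a made-up threshold detector standing in for the real physics:

```python
# Toy bridge law: an explicit function from a continuous microstate (here a
# made-up real-valued detector variable) to a discrete macrostate (a boolean
# "photon detected" record). The open question is what plays this role when
# the microstate is a wave function rather than a single real number.
def macrostate(detector_voltage: float, threshold: float = 0.5) -> bool:
    """Map a real-valued microstate to a discrete measurement outcome."""
    return detector_voltage > threshold

print(macrostate(0.7))  # True: recorded as "photon detected"
print(macrostate(0.1))  # False: recorded as "no photon detected"
```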

Comment by jessica-liu-taylor on Many-worlds versus discrete knowledge · 2020-08-14T16:25:56.375Z · score: 2 (1 votes) · LW · GW

I'm saying that our microphysical theories should explain our macrophysical observations. If they don't then we toss out the theory (Occam's razor).

Macrophysical observations are discrete.

Comment by jessica-liu-taylor on Many-worlds versus discrete knowledge · 2020-08-14T16:23:50.379Z · score: 2 (1 votes) · LW · GW

Let me know if anyone succeeds at that. I've thought in this direction and found it very difficult.

Comment by jessica-liu-taylor on Many-worlds versus discrete knowledge · 2020-08-14T16:21:11.766Z · score: 2 (1 votes) · LW · GW

See my reply here.

Consistent histories may actually solve the problem I'm talking about, because it discusses evolving configurations, not just an evolving wave function.

Comment by jessica-liu-taylor on Many-worlds versus discrete knowledge · 2020-08-14T16:20:09.332Z · score: 3 (2 votes) · LW · GW

The wave function is a fluid in configuration space that evolves over time. You need more theory than that to talk about discrete branches of it (configurations) evolving over time.

I agree that once you have this, you can say the knowledge gained is indexical.

Comment by jessica-liu-taylor on Many-worlds versus discrete knowledge · 2020-08-13T20:12:37.423Z · score: 3 (2 votes) · LW · GW

It's rather nonstandard to consider things like photon measurements to be nonphysical facts. Presumably, these come within the domain of physical theories.

Suppose we go with Solomonoff induction. Then we only adopt physical theories that explain observations happening over subjective time. These observations include discrete physical measurements.

It's not hard to see how Bohm explains these measurements: they are facts about the true configuration history.

It is hard to see how many-worlds explains these measurements. Some sort of bridge law is required. The straightforward way of specifying the bridge law is the Bohm interpretation.

Comment by jessica-liu-taylor on Many-worlds versus discrete knowledge · 2020-08-13T20:09:25.067Z · score: 3 (2 votes) · LW · GW

Yes, the argument has to be changed, but that's mostly an issue of wording. Just replace "discrete knowledge" with "discrete factual evidence".

If a Bayesian sees that the detector has detected a photon, how is that evidence about the wave function?

Comment by jessica-liu-taylor on Many-worlds versus discrete knowledge · 2020-08-13T20:07:06.647Z · score: 5 (3 votes) · LW · GW

Many-worlds plus a location tag is the Bohm interpretation. You need a theory of how locations evolve into other locations (in order to talk about multiple events happening in observed time), hence the nontriviality of the Bohm interpretation.

Comment by jessica-liu-taylor on Many-worlds versus discrete knowledge · 2020-08-13T19:02:40.076Z · score: 3 (2 votes) · LW · GW

I believe there are physical theories and physical facts, but that not all facts are straightforwardly physical (although, perhaps these are indirectly physical in a way that requires significant philosophical and conceptual work to determine, and which has degrees of freedom).

The issue in this post is about physical facts, e.g. measurements, needing to be interpreted in terms of a physical reality. These interpretations are required to have explanatory physical theories even if there are also non-physical facts.

Comment by jessica-liu-taylor on Many-worlds versus discrete knowledge · 2020-08-12T23:21:22.052Z · score: 3 (2 votes) · LW · GW

Bayesianism still believes in events, which are facts about the world. So the same problem comes up there, even if no fact can be known with certainty.

(in other words: the same problems that apply to 100% justification of belief apply to 99% justification of belief)

Comment by jessica-liu-taylor on Why artificial optimism? · 2020-06-13T06:49:05.078Z · score: 4 (2 votes) · LW · GW

I don't have a great theory here, but some pointers at non-hedonic values are:

  • "Wanting" as a separate thing from "liking"; what is planned/steered towards, versus what affective states are generated? See this. In a literal sense, people don't very much want to be happy.
  • It's common to speak in terms of "mental functions", e.g. perception and planning. The mind has a sort of "telos"/direction, which is not primarily towards maximizing happiness (if it were, we'd be happier); rather, the happiness signal has a function as part of the mind's functioning.
  • The desire to not be deceived, or to be correct, requires a correspondence between states of mind and objective states. To be deceived about, say, which mathematical results are true/interesting means exploring a much more impoverished space of mathematical reasoning than one could with intact mathematical judgment.
  • Related to deception, social emotions are referential: they refer to other beings. The emotion can be present without the other beings existing, but this is a case of deception. Living in a simulation in which all apparent intelligent beings are actually (convincing) nonsentient robots seems undesirable.
  • Desire for variety. Having the same happy mind replicated everywhere is unsatisfying compared to having a diversity of mental states being explored. Perhaps you could erase your memory so you could re-experience the same great movie/art/whatever repeatedly, but would you want to?
  • Relatedly, the best art integrates positive and negative emotions. Having only positive emotions is like painting using only warm colors.

In epistemic matters we accept that beliefs about what is true may be wrong, in the sense that they may be incoherent, incompatible with other information, fail to take into account certain hypotheses, etc. Similarly, we may accept that beliefs about the quality of one's experience may be wrong, in that they may be incoherent, incompatible with other information, fail to take into account certain hypotheses, etc. There has to be a starting point for investigation (as there is in epistemic matters), which might or might not be hedonic, but coherence criteria and so on will modify the starting point.

I suspect that some of my opinions here are influenced by certain meditative experiences that reduce the degree to which experiential valence seems important, in comparison to variety, coherence, and functionality.

Comment by jessica-liu-taylor on Why artificial optimism? · 2020-06-13T05:25:06.864Z · score: 6 (3 votes) · LW · GW

Experiences of sentient beings are valuable, but have to be "about" something to properly be experiences, rather than, say, imagination.

I would rather that conditions in the universe are good for the lifeforms, and that the lifeforms' emotions track the situation, such that the lifeforms are happy. But if the universe is bad, then it's better (IMO) for the lifeforms to be sad about that.

The issue with evolution is that it's a puzzle that evolution would create animals that try to wirehead themselves; it's not a moral argument against wireheading.

Comment by jessica-liu-taylor on Why artificial optimism? · 2020-06-13T00:06:18.689Z · score: 4 (2 votes) · LW · GW

"Isn't the score I get in the game I'm playing one of the most important part of the 'actual state of affairs'? How would you measure the value of the actual state of affairs other than according to how it affects your (or others') scores?"

I'm not sure if this analogy is, by itself, convincing. But it's suggestive, in that happiness is a simple, scalar-like thing, and it would be strange for such a simple thing to have a high degree of intrinsic value. Rather, on a broad perspective, it would seem that the things of most intrinsic value are those that are computationally interesting, which can explore and cohere different sources of information, etc., rather than very simple scalars. (Of course, scalars can offer information about other things.)

On an evolutionary account, why would it be fit for an organism to care about a scalar quantity, except insofar as that quantity is correlated with the organism's fitness? It would seem that wireheading is a bug, from a design perspective.

Comment by jessica-liu-taylor on Jimrandomh's Shortform · 2020-06-10T15:45:15.495Z · score: 10 (4 votes) · LW · GW

It's been over 72 hours and the case count is under 110, as would be expected from linear extrapolation.

Comment by jessica-liu-taylor on Jimrandomh's Shortform · 2020-06-10T15:44:49.912Z · score: 2 (1 votes) · LW · GW

It's been over 72 hours and the case count is under 110.

Comment by jessica-liu-taylor on Estimating COVID-19 Mortality Rates · 2020-06-07T22:50:07.470Z · score: 6 (3 votes) · LW · GW

The intro paragraph seems to be talking about IFR ("around 2% of people who got COVID-19 would die") and suggesting that "we have enough data to check", i.e. that you're estimating IFR and have good data on it.

Comment by jessica-liu-taylor on The Presumptuous Philosopher, self-locating information, and Solomonoff induction · 2020-06-01T15:45:47.437Z · score: 2 (1 votes) · LW · GW

I mean efficiently in terms of number of bits, not computation time; it's the number of bits that contributes to posterior probability.

Comment by jessica-liu-taylor on The Presumptuous Philosopher, self-locating information, and Solomonoff induction · 2020-06-01T15:10:04.132Z · score: 4 (2 votes) · LW · GW

Yes, I agree. "Reference class" is a property of some models, not all models.

Comment by jessica-liu-taylor on The Presumptuous Philosopher, self-locating information, and Solomonoff induction · 2020-06-01T14:48:19.817Z · score: 4 (2 votes) · LW · GW

At this point it seems simplest to construct your reference class so as to only contain agents that can be found using the same procedure as yourself. Since you have to be decidable for the hypothesis to predict your observations, all others in your reference class are also decidable.

Comment by jessica-liu-taylor on The Presumptuous Philosopher, self-locating information, and Solomonoff induction · 2020-06-01T03:39:32.762Z · score: 7 (4 votes) · LW · GW

If there's a constant-length function mapping the universe description to the number of agents in that universe, doesn't that mean K(n) can't be more than the Kolmogorov complexity of the universe by more than that constant length?

If it isn't constant-length, then it seems strange to assume Solomonoff induction would posit a large objective universe, given that such positing wouldn't help it predict its inputs efficiently (since such prediction requires locating agents).

This still leads to the behavior I'm talking about in the limit; the sum of 1/2^K(n) over all n can be at most 1 so the probabilities on any particular n have to go arbitrarily small in the limit.
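Spelled out (this is just the standard Kraft-inequality argument for prefix complexity):

```latex
\sum_{n=1}^{\infty} 2^{-K(n)} \;\le\; 1
\quad\Longrightarrow\quad
2^{-K(n)} \to 0 \ \text{ as } n \to \infty
```

so a prior assigning probability proportional to 2^(-K(n)) to "there are n agents" must make sufficiently large n arbitrarily improbable.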

Comment by jessica-liu-taylor on The Presumptuous Philosopher, self-locating information, and Solomonoff induction · 2020-05-31T20:12:40.757Z · score: 4 (2 votes) · LW · GW

My understanding is that Solomonoff induction leads to more SSA-like behavior than SIA-like, at least in the limit, so will reject the presumptuous philosopher's argument.

Asserting that there are n people takes at least K(n) bits, so large universe sizes have to get less likely at some point.

Comment by jessica-liu-taylor on Nihilism doesn't matter · 2020-05-21T19:09:35.365Z · score: 2 (1 votes) · LW · GW

The active nihilism described in the paragraph definitely includes, but is not limited to, the negation of values. The active nihilists of a moral parliament may paralyze the parliament as a means to an end: perhaps to cause systems other than the moral parliament to be the primary determinants of action.

Comment by jessica-liu-taylor on Nihilism doesn't matter · 2020-05-21T18:39:04.151Z · score: 2 (1 votes) · LW · GW

What you are describing is a passive sort of nihilism. Active nihilism, on the other hand, would actively try to negate the other values. Imagine a parliament where whenever a non-nihilist votes in favor of X, a nihilist votes against X, such that these votes exactly cancel out. Now, if (active) nihilists are a majority, they will ensure that the parliament as a whole has no aggregate preferences.

Comment by jessica-liu-taylor on Modeling naturalized decision problems in linear logic · 2020-05-17T15:23:00.849Z · score: 2 (1 votes) · LW · GW

CDT and EDT have known problems on 5 and 10. TDT/UDT are insufficiently formalized, and seem like they might rely on known-to-be-unformalizable logical counterfactuals.

So 5 and 10 isn't trivial even without spurious counterfactuals.

What does this add over modal UDT?

  • No requirement to do infinite proof search
  • More elegant handling of multi-step decision problems
  • Also works on problems where the agent doesn't know its source code (of course, this prevents logical dependencies due to source code from being taken into account)

Philosophically, it works as a nice derivation of similar conclusions to modal UDT. The modal UDT algorithm doesn't by itself seem entirely well-motivated; why would material implication be what to search for? On the other hand, every step in the linear logic derivation is quite natural, building action into the logic, and encoding facts about what the agent can be assured of upon taking different actions. This makes it easier to think clearly about what the solution says about counterfactuals, e.g. in a section of this post.

Comment by jessica-liu-taylor on Consistent Glomarization should be feasible · 2020-05-04T20:59:41.157Z · score: 4 (2 votes) · LW · GW

Why lie when the d100 comes up 1, instead of saying "can neither confirm nor deny"?

Comment by jessica-liu-taylor on "Don't even think about hell" · 2020-05-03T02:10:13.458Z · score: 5 (3 votes) · LW · GW

Note: the provided utility function is incredibly insecure; even a not-very-powerful individual can manipulate the AI by writing down that hash code under certain conditions.

Also, the best way to minimize V + W is to minimize both V and W (i.e. write the hash code and create hell). If we replace this with min(V, W), then the AI becomes nihilistic if someone writes down the hash code, which is also a significant security vulnerability.

Comment by jessica-liu-taylor on Topological metaphysics: relating point-set topology and locale theory · 2020-05-01T19:32:32.726Z · score: 3 (2 votes) · LW · GW

Reals are still defined as sets of (a, b) rational intervals. The locale contains countable unions of these, but all these are determined by which (a, b) intervals contain the real number.

Comment by jessica-liu-taylor on Topological metaphysics: relating point-set topology and locale theory · 2020-05-01T17:07:56.148Z · score: 5 (3 votes) · LW · GW

Good point; I've changed the wording to make it clear that the rational-delimited open intervals are the basis, not all the locale elements. Luckily, points can be defined as sets of basis elements containing them, since all other properties follow. (Making the locale itself countable requires weakening the definition so that the collections over which unions are taken are countable, e.g. by requiring them to be recursively enumerable.)
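As a minimal illustration of representing a point by its basis elements (toy code, using sqrt(2) and exact rational arithmetic):

```python
from fractions import Fraction

# A point of the real line represented only by the predicate saying which
# rational-delimited open intervals (a, b) contain it. Here the point is
# sqrt(2), decided by exact rational comparisons (no floating point).
def sqrt2_in_interval(a: Fraction, b: Fraction) -> bool:
    """Return True iff the open interval (a, b) contains sqrt(2)."""
    below = a < 0 or a * a < 2   # a < sqrt(2)
    above = b > 0 and b * b > 2  # sqrt(2) < b
    return below and above

# Any two distinct reals are separated by some rational interval containing
# one but not the other, so the point is recoverable from this predicate alone.
print(sqrt2_in_interval(Fraction(1), Fraction(3, 2)))  # True:  1 < sqrt(2) < 3/2
print(sqrt2_in_interval(Fraction(3, 2), Fraction(2)))  # False: sqrt(2) < 3/2
```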

Comment by jessica-liu-taylor on Motivating Abstraction-First Decision Theory · 2020-04-29T20:36:42.574Z · score: 12 (6 votes) · LW · GW

I've also been thinking about the application of agency abstractions to decision theory, from a somewhat different angle.

It seems like what you're doing is considering relations between high-level third-person abstractions and low-level third-person abstractions. In contrast, I'm primarily considering relations between high-level first-person abstractions and low-level first-person abstractions.

The VNM abstraction itself assumes that "you" are deciding between different options, each of which has different (stochastic) consequences; thus, it is inherently first-personal. (Applying it to some other agent requires conjecturing things about that agent's first-person perspective: the consequences it expects from different actions)

In general, conditions of rationality are first-personal, in the sense that they tell a given perspective what they must believe in order to be consistent.

The determinism vs. free will paradox comes about when trying to determine when a VNM-like choice abstraction is valid of a third-personal physical world.

My present view of physics is that it is also first-personal, in the sense that:

  1. If physical entities are considered perceptible, then there is an assumed relation between them and first-personal observations.
  2. If physical entities are causal in a Pearlian sense, then there is an assumed relation between them and metaphysically-real interventions, which are produced through first-personal actions.

Decision theory problems, considered linguistically, are also first-personal. In the five and ten problem, things are said about "you" being in a given room, choosing between two items on "the" table, presumably the one in front of "you". If the ability to choose different dollar bills is, linguistically, considered a part of the decision problem, then the decision problem already contains in it a first-personal VNM-like choice abstraction.

The naturalization problem is to show how such high-level, first-personal decision theory problems could be compatible with physics. Such naturalization is hard, perhaps impossible, if physics is assumed to be third-personal, but may be possible if physics is assumed to be first-personal.

Comment by jessica-liu-taylor on Subjective implication decision theory in critical agentialism · 2020-04-28T21:54:24.444Z · score: 6 (3 votes) · LW · GW

Looking back on this, it does seem quite similar to EDT. I'm actually, at this point, not clear on how EDT and TDT differ, except in that EDT has potential problems in cases where it's sure about its own action. I'll change the text so it notes the similarity to EDT.

On XOR blackmail, SIDT will indeed pay up.

Comment by jessica-liu-taylor on Two Alternatives to Logical Counterfactuals · 2020-04-12T05:52:57.863Z · score: 2 (1 votes) · LW · GW

Yes, it's about no backwards assumption. "Linear" has lots of meanings; I'm not concerned about this getting confused with linear algebra, but you can suggest a better term if you have one.

Comment by jessica-liu-taylor on Seemingly Popular Covid-19 Model is Obvious Nonsense · 2020-04-12T00:23:35.484Z · score: 12 (11 votes) · LW · GW

Epistemic Status: Something Is Wrong On The Internet.

If you think this applies, it would seem that "The Internet" is being construed so broadly that it includes the mainstream media, policymaking, and a substantial fraction of people, such that the "Something Is Wrong On The Internet" heuristic points against correction of public disinformation in general.

This is a post that is especially informative, aligned with justice, and likely to save lives, and so it would be a shame if this heuristic were to dissuade you from writing it.

Comment by jessica-liu-taylor on In Defense of Politics · 2020-04-10T22:03:00.977Z · score: 21 (6 votes) · LW · GW

The presumption with conspiracies is that they are engaged in for some local benefit to the conspiracy at the detriment of the broader society. Hence, the "unilateralist's curse" is a blessing in this case, as the overestimation by one member of a conspiracy of their own utility in having the secret exposed brings their estimation more in line with the estimation of the broader society, whose interests differ from those of the conspirators.

If differences between the interests of different groups were not a problem, then there would be no motive to form a conspiracy.

In general, I am quite annoyed at the idea of the unilateralist's curse being used as a general argument against the revelation of the truth, without careful checking of the correspondence between the decision theoretic model of the unilateralist's curse and the actual situation, which includes crime and conflict.

Comment by jessica-liu-taylor on Solipsism is Underrated · 2020-04-10T20:44:25.573Z · score: 3 (2 votes) · LW · GW

A major problem with physicalist dismissal of experiential evidence (as I've discussed previously) is that the conventional case for believing in physics is that it explains experiential evidence, e.g. experimental results. Solomonoff induction, among the best formalizations of Occam's razor, believes in "my observations".

If basic facts like "I have observations" are being doubted, then any case for belief in physics has to go through something independent of its explanations of experiential evidence. This looks to be a difficult problem.

You could potentially resolve the problem by saying that only some observations, such as those of mechanical measuring devices, count; however, this still leads to a problem analogous to the hard problem of consciousness, namely: what is the mapping between physics and the outputs of the mechanical measuring devices that are being explained by theories? (The same problem of "what data is the theorizing trying to explain" comes up whether the theorizing happens in a single brain or in a distributed intelligence, e.g. a collection of people using the scientific method.)

Comment by jessica-liu-taylor on Two Alternatives to Logical Counterfactuals · 2020-04-08T23:52:37.208Z · score: 2 (1 votes) · LW · GW

Basically, the assumption that you're participating in a POMDP. The idea is that there's some hidden state that your actions interact with in a temporally linear fashion (i.e. action 1 affects state 2), such that your late actions can't affect early states/observations.
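A minimal sketch of that assumption, with dummy components purely for illustration:

```python
# The POMDP picture: a hidden state threaded forward in time, so the action at
# step t can influence state t+1 onward but never earlier states/observations.
def rollout(policy, transition, observe, state, horizon):
    history = []
    for _ in range(horizon):
        obs = observe(state)            # observation of the current hidden state
        act = policy(history + [obs])   # action may depend on all past observations
        history.append((obs, act))
        state = transition(state, act)  # ...but only affects *future* states
    return history

# Dummy components, just to make the sketch executable.
print(rollout(
    policy=lambda h: 1,
    transition=lambda s, a: s + a,
    observe=lambda s: s % 2,
    state=0,
    horizon=3,
))  # [(0, 1), (1, 1), (0, 1)]
```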

Comment by jessica-liu-taylor on Two Alternatives to Logical Counterfactuals · 2020-04-07T03:49:35.526Z · score: 2 (1 votes) · LW · GW

"The way you are using it doesn’t necessarily imply real control, it may be imaginary control."

I'm discussing a hypothetical agent who believes itself to have control. So its beliefs include "I have free will". Its belief isn't "I believe that I have free will".

"It’s a 'para-consistent material conditional' by which I mean the algorithm is limited in such a way as to prevent this explosion."

Yes, that makes sense.

"However, were you flowing this all the way back in time?"

Yes (see thread with Abram Demski).

"What do you mean by dualistic?"

Already factorized as an agent interacting with an environment.

Comment by jessica-liu-taylor on Two Alternatives to Logical Counterfactuals · 2020-04-06T18:55:42.139Z · score: 4 (2 votes) · LW · GW

"Secondly, 'free will' is such a loaded word that using it in a non-standard fashion simply obscures and confuses the discussion."

Wikipedia says "Free will is the ability to choose between different possible courses of action unimpeded." SEP says "The term “free will” has emerged over the past two millennia as the canonical designator for a significant kind of control over one’s actions." So my usage seems pretty standard.

"For example, recently I’ve been arguing in favour of what counts as a valid counterfactual being at least partially a matter of social convention."

All word definitions are determined in large part by social convention. The question is whether the social convention corresponds to a definition (e.g. with truth conditions) or not. If it does, then the social convention is realist, if not, it's nonrealist (perhaps emotivist, etc).

"Material conditions only provide the outcome when we have a consistent counterfactual."

Not necessarily. An agent may be uncertain over its own action, and thus have uncertainty about material conditionals involving its action. The "possible worlds" represented by this uncertainty may be logically inconsistent, in ways the agent can't determine before making the decision.
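As a toy illustration (made-up numbers): suppose the agent thinks it might take action A (utility 10) or action B (utility 0), without knowing which.

```python
# The agent's epistemically possible worlds, one per action it might take.
worlds = [
    {'action': 'A', 'utility': 10},
    {'action': 'B', 'utility': 0},
]

def material_conditional(world, action, u):
    # "I take `action` -> I get at least u": vacuously true if the antecedent fails.
    return world['action'] != action or world['utility'] >= u

# "If I take A, I get at least 10" holds in every possible world...
print(all(material_conditional(w, 'A', 10) for w in worlds))  # True
# ...while the agent is genuinely uncertain about "If I take B, I get at least 10",
# even though only one of these worlds is logically consistent with its source code.
print([material_conditional(w, 'B', 10) for w in worlds])     # [True, False]
```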

"Proof-based UDT doesn’t quite use material conditionals, it uses a paraconsistent version of them instead."

I don't understand this? I thought it searched for proofs of the form "if I take this action, then I get at least this much utility", which is a material conditional.

"So, to imagine counterfactually taking action Y we replace the agent doing X with another agent doing Y and flow causation both forwards and backwards."

Policy-dependent source code does this; one's source code depends on one's policy.

"I guess from a philosophical perspective it makes sense to first consider whether policy-dependent source code makes sense and then if it does further ask whether UDT makes sense."

I think UDT makes sense in "dualistic" decision problems that are already factorized as "this policy leads to these consequences". Extending it to a nondualist case brings up difficulties, including the free will / determinism issue. Policy-dependent source code is a way of interpreting UDT in a setting with deterministic, knowable physics.

Comment by jessica-liu-taylor on Two Alternatives to Logical Counterfactuals · 2020-04-05T22:03:54.269Z · score: 6 (3 votes) · LW · GW

I think it's worth examining more closely what it means to be "not a pure optimizer". Formally, a VNM utility function is a rationalization of a coherent policy. Say that you have some idea about what your utility function is, U. Suppose you then decide to follow a policy that does not maximize U. Logically, it follows that U is not really your utility function; either your policy doesn't coherently maximize any utility function, or it maximizes some other utility function. (Because the utility function is, by definition, a rationalization of the policy)

Failing to disambiguate these two notions of "the agent's utility function" is a map-territory error.

Decision theories require, as input, a utility function to maximize, and output a policy. If a decision theory is adopted by an agent who is using it to determine their policy (rather than already knowing their policy), then they are operating on some preliminary idea about what their utility function is. Their "actual" utility function is dependent on their policy; it need not match up with their idea.

So, it is very much possible for an agent who is operating on an idea U of their utility function, to evaluate counterfactuals in which their true behavioral utility function is not U. Indeed, this is implied by the fact that utility functions are rationalizations of policies.

Let's look at the "turn left/right" example. The agent is operating on a utility function idea U, which is higher the more the agent turns left. When they evaluate the policy of turning "right" on the 10th time, they must conclude that, in this hypothetical, either (a) "right" maximizes U, (b) they are maximizing some utility function other than U, or (c) they aren't a maximizer at all.

The logical counterfactual framework says the answer is (a): that the fixed computation of U-maximization results in turning right, not left. But, this is actually the weirdest of the three worlds. It is hard to imagine ways that "right" maximizes U, whereas it is easy to imagine that the agent is maximizing a utility function other than U, or is not a maximizer.
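As a toy version of this (illustrative numbers only): take ten turns, with U counting left turns.

```python
# A policy is a choice of 'L' or 'R' at each of 10 turns. U is the agent's
# *idea* of its utility function: the number of left turns.
def U(policy):
    return policy.count('L')

always_left = ['L'] * 10
deviating   = ['L'] * 9 + ['R']   # turns right on the 10th turn

assert U(always_left) == 10 and U(deviating) == 9  # the deviation doesn't maximize U

# Option (b): the deviating policy *is* rationalized by some other utility
# function, e.g. one that additionally rewards a final right turn.
def U_other(policy):
    return policy.count('L') + (2 if policy[-1] == 'R' else 0)

assert U_other(deviating) > U_other(always_left)   # 11 > 10
```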

Yes, the (b) and (c) worlds may be weird in a problematic way. However, it is hard to imagine these being nearly as weird as (a).

One way they could be weird is that an agent having a complex utility function is likely to have been produced by a different process than an agent with a simple utility function. So the more weird exceptional decisions you make, the greater the evidence is that you were produced by the sort of process that produces complex utility functions.

This is pretty similar to the smoking lesion problem, then. I expect that policy-dependent source code will have a lot in common with EDT, as they both consider "what sort of agent I am" to be a consequence of one's policy. (However, as you've pointed out, there are important complications with the framing of the smoking lesion problem)

I think further disambiguation on this could benefit from re-analyzing the smoking lesion problem (or a similar problem), but I'm not sure if I have the right set of concepts for this yet.

Comment by jessica-liu-taylor on Referencing the Unreferencable · 2020-04-04T18:54:37.634Z · score: 4 (2 votes) · LW · GW

If you fix a notion of referenceability rather than equivocating, then the point that talking of unreferenceable entities is absurd will stand.

If you equivocate, then very little can be said in general about referenceability.

(I would say that "our universe's simulators" is referenceable, since it's positing something that causes sensory inputs)

Comment by jessica-liu-taylor on Two Alternatives to Logical Counterfactuals · 2020-04-04T18:46:10.569Z · score: 3 (2 votes) · LW · GW

It seems the approaches we're using are similar, in that they both are starting from observation/action history with posited falsifiable laws, with the agent's source code not known a priori, and the agent considering different policies.

Learning "my source code is A" is quite similar to learning "Omega predicts my action is equal to A()", so these would lead to similar results.

Policy-dependent source code, then, corresponds to Omega making different predictions depending on the agent's intended policy, such that when comparing policies, the agent has to imagine Omega predicting differently (as it would imagine learning different source code under policy-dependent source code).

Comment by jessica-liu-taylor on Two Alternatives to Logical Counterfactuals · 2020-04-03T21:48:37.739Z · score: 5 (3 votes) · LW · GW

I agree this is a problem, but isn't this a problem for logical counterfactual approaches as well? Isn't it also weird for a known fixed optimizer source code to produce a different result on this decision where it's obvious that 'left' is the best decision?

If you assume that the agent chose 'right', it's more reasonable to think it's because it's not a pure optimizer than that a pure optimizer would have chosen 'right', in my view.

If you form the intent to, as a policy, go 'right' on the 100th turn, you should anticipate learning that your source code is not the code of a pure optimizer.

Comment by jessica-liu-taylor on Two Alternatives to Logical Counterfactuals · 2020-04-03T21:43:43.291Z · score: 2 (1 votes) · LW · GW

This indeed makes sense when "obs" is itself a logical fact. If obs is a sensory input, though, 'A(obs) = act' is a logical fact, not a logical counterfactual. (I'm not trying to avoid causal interpretations of source code interpreters here, just logical counterfactuals)

Comment by jessica-liu-taylor on Two Alternatives to Logical Counterfactuals · 2020-04-02T22:20:08.537Z · score: 9 (5 votes) · LW · GW

In the happy dance problem, when the agent is considering doing a happy dance, the agent should have already updated on M. This is more like timeless decision theory than updateless decision theory.

Conditioning on 'A(obs) = act' is still a conditional, not a counterfactual. The difference between conditionals and counterfactuals is the difference between "If Oswald didn't kill Kennedy, then someone else did" and "If Oswald didn't kill Kennedy, then someone else would have".

Indeed, troll bridge will present a problem for "playing chicken" approaches, which are probably necessary in counterfactual nonrealism.

For policy-dependent source code, I intend for the agent to be logically updateful, while updateless about observations.

"Why is this much better than counterfactuals which keep the source code fixed but imagine the execution trace being different?"

Because it doesn't lead to logical incoherence, so reasoning about counterfactuals doesn't have to be limited.

"This seems to only push the rough spots further back—there can still be contradictions, e.g. between the source code and the process by which programmers wrote the source code."

If you see your source code is B instead of A, you should anticipate learning that the programmers programmed B instead of A, which means something was different in the process. So the counterfactual has implications backwards in physical time.

At some point it will ground out in: different indexical facts, different laws of physics, different initial conditions, different random events...

This theory isn't worked out yet but it doesn't yet seem that it will run into logical incoherence, the way logical counterfactuals do.

"But then we are faced with the usual questions about spurious counterfactuals, chicken rule, exploration, and Troll Bridge."

Maybe some of these.

Spurious counterfactuals require getting a proof of "I will take action X". The proof proceeds by showing "source code A outputs action X". But an agent who accepts policy-dependent source code will believe they have source code other than A if they don't take action X. So the spurious proof doesn't prevent the counterfactual from being evaluated.

Chicken rule is hence unnecessary.

Exploration is a matter of whether the world model is any good; the world model may, for example, map a policy to a distribution of expected observations. (That is, the world model already has policy counterfactuals as part of it; theories such as physics provide constraints on the world model rather than fully determining it). Learning a good world model is of course a problem in any approach.
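A sketch of that kind of world model (toy dynamics, purely illustrative): a map from a whole policy to sampled observation histories, which can be queried with policies the agent never actually runs.

```python
import random

# A world model as a map from a policy to samples from a distribution over
# observation histories. Policy counterfactuals are built in: physics-style
# theories constrain this map rather than fully determining it.
def world_model(policy, n_samples=3, horizon=4, seed=0):
    rng = random.Random(seed)
    histories = []
    for _ in range(n_samples):
        history, state = [], 0
        for _ in range(horizon):
            obs = state % 2
            act = policy(obs)
            history.append((obs, act))
            state += act + rng.choice([0, 1])  # toy stochastic dynamics
        histories.append(history)
    return histories

# Comparing two policies by querying the same model with each of them:
print(world_model(lambda obs: 1)[0])
print(world_model(lambda obs: 0)[0])
```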

Whether troll bridge is a problem depends on how the source code counterfactual is evaluated. Indeed, many ways of running this counterfactual (e.g. inserting special cases into the source code) are "stupid" and could be punished in a troll bridge problem.

I by no means think "policy-dependent source code" is presently a well worked-out theory; the advantage relative to logical counterfactuals is that in the latter case, there is a strong theoretical obstacle to ever having a well worked-out theory, namely logical incoherence of the counterfactuals. Hence, coming up with a theory of policy-dependent source code seems more likely to succeed than coming up with a theory of logical counterfactuals.

Comment by jessica-liu-taylor on The absurdity of un-referenceable entities · 2020-04-02T20:39:16.322Z · score: 6 (3 votes) · LW · GW

It seems fine to have categories that are necessarily empty. Such as "numbers that are both odd and even". "Non-ontologizable thing" may be such a set. Or it may be more vague than that, I'm not sure.

Comment by jessica-liu-taylor on Two Alternatives to Logical Counterfactuals · 2020-04-02T18:10:55.234Z · score: 2 (1 votes) · LW · GW

I'm not using "free will" to mean something distinct from "the ability of an agent, from its perspective, to choose one of multiple possible actions". Maybe this usage is nonstandard but find/replace yields the right meaning.

Comment by jessica-liu-taylor on Two Alternatives to Logical Counterfactuals · 2020-04-02T00:50:02.362Z · score: 4 (2 votes) · LW · GW

For counterfactual nonrealism, it's simply the uncertainty an agent has about their own action, while believing themselves to control their action.

For policy-dependent source code, the "different possibilities" correspond to different source code. An agent with fixed source code can only take one possible action (from a logically omniscent perspective), but the counterfactuals change the agent's source code, getting around this constraint.

Comment by jessica-liu-taylor on The absurdity of un-referenceable entities · 2020-04-01T22:33:26.774Z · score: 4 (2 votes) · LW · GW

The absurdity comes not from believing that some agent lacks the ability to reference some entity that you can reference, but from believing that you lack the ability to reference some entity that you are nonetheless talking about.

In the second case, you are ontologizing something that is by definition not ontologizable.

If there's a particular agent thinking about me, then I can refer to that agent ("the one thinking about me"), hence referring to whatever they can refer to. It is indeed easy to neglect the possibility that someone is thinking about me, but that differs from in-principle unreferenceability.

I don't believe in views from nowhere; I don't think the concept holds up to scrutiny. In contrast, particular directions of zoom-out lead to views from particular referenceable places.