Many-worlds versus discrete knowledge 2020-08-13T18:35:53.442Z
Modeling naturalized decision problems in linear logic 2020-05-06T00:15:15.400Z
Topological metaphysics: relating point-set topology and locale theory 2020-05-01T03:57:11.899Z
Two Alternatives to Logical Counterfactuals 2020-04-01T09:48:29.619Z
The absurdity of un-referenceable entities 2020-03-14T17:40:37.750Z
Puzzles for Physicalists 2020-03-12T01:37:13.353Z
A conversation on theory of mind, subjectivity, and objectivity 2020-03-10T04:59:23.266Z
Subjective implication decision theory in critical agentialism 2020-03-05T23:30:42.694Z
A critical agential account of free will, causation, and physics 2020-03-05T07:57:38.193Z
On the falsifiability of hypercomputation, part 2: finite input streams 2020-02-17T03:51:57.238Z
On the falsifiability of hypercomputation 2020-02-07T08:16:07.268Z
Philosophical self-ratification 2020-02-03T22:48:46.985Z
High-precision claims may be refuted without being replaced with other high-precision claims 2020-01-30T23:08:33.792Z
On hiding the source of knowledge 2020-01-26T02:48:51.310Z
On the ontological development of consciousness 2020-01-25T05:56:43.244Z
Is requires ought 2019-10-28T02:36:43.196Z
Metaphorical extensions and conceptual figure-ground inversions 2019-07-24T06:21:54.487Z
Dialogue on Appeals to Consequences 2019-07-18T02:34:52.497Z
Why artificial optimism? 2019-07-15T21:41:24.223Z
The AI Timelines Scam 2019-07-11T02:52:58.917Z
Self-consciousness wants to make everything about itself 2019-07-03T01:44:41.204Z
Writing children's picture books 2019-06-25T21:43:45.578Z
Conditional revealed preference 2019-04-16T19:16:55.396Z
Boundaries enable positive material-informational feedback loops 2018-12-22T02:46:48.938Z
Act of Charity 2018-11-17T05:19:20.786Z
EDT solves 5 and 10 with conditional oracles 2018-09-30T07:57:35.136Z
Reducing collective rationality to individual optimization in common-payoff games using MCMC 2018-08-20T00:51:29.499Z
Buridan's ass in coordination games 2018-07-16T02:51:30.561Z
Decision theory and zero-sum game theory, NP and PSPACE 2018-05-24T08:03:18.721Z
In the presence of disinformation, collective epistemology requires local modeling 2017-12-15T09:54:09.543Z
Autopoietic systems and difficulty of AGI alignment 2017-08-20T01:05:10.000Z
Current thoughts on Paul Christiano's research agenda 2017-07-16T21:08:47.000Z
Why I am not currently working on the AAMLS agenda 2017-06-01T17:57:24.000Z
A correlated analogue of reflective oracles 2017-05-07T07:00:38.000Z
Finding reflective oracle distributions using a Kakutani map 2017-05-02T02:12:06.000Z
Some problems with making induction benign, and approaches to them 2017-03-27T06:49:54.000Z
Maximally efficient agents will probably have an anti-daemon immune system 2017-02-23T00:40:47.000Z
Are daemons a problem for ideal agents? 2017-02-11T08:29:26.000Z
How likely is a random AGI to be honest? 2017-02-11T03:32:22.000Z
My current take on the Paul-MIRI disagreement on alignability of messy AI 2017-01-29T20:52:12.000Z
On motivations for MIRI's highly reliable agent design research 2017-01-29T19:34:37.000Z
Strategies for coalitions in unit-sum games 2017-01-23T04:20:31.000Z
An impossibility result for doing without good priors 2017-01-20T05:44:26.000Z
Pursuing convergent instrumental subgoals on the user's behalf doesn't always require good priors 2016-12-30T02:36:48.000Z
Predicting HCH using expert advice 2016-11-28T03:38:05.000Z
ALBA requires incremental design of good long-term memory systems 2016-11-28T02:10:53.000Z
Modeling the capabilities of advanced AI systems as episodic reinforcement learning 2016-08-19T02:52:13.000Z
Generative adversarial models, informed by arguments 2016-06-27T19:28:27.000Z
In memoryless Cartesian environments, every UDT policy is a CDT+SIA policy 2016-06-11T04:05:47.000Z
Two problems with causal-counterfactual utility indifference 2016-05-26T06:21:07.000Z


Comment by jessicata (jessica.liu.taylor) on What 2026 looks like (Daniel's Median Future) · 2021-08-06T21:49:09.287Z · LW · GW

This is quite good concrete AI forecasting compared to what I've seen elsewhere, thanks for doing it! It seems really plausible based on how fast AI progress has been going over the past decade and which problems are most tractable.

Comment by jessicata (jessica.liu.taylor) on Thoughts on Voting Methods · 2020-11-27T19:18:44.932Z · LW · GW

I agree that moving to distributions and scalar utility is a good way of avoiding Pareto suboptimal outcomes.

Comment by jessicata (jessica.liu.taylor) on Thoughts on Voting Methods · 2020-11-27T19:17:15.350Z · LW · GW

How do you think specifying a distribution rather than a single candidate changes the game? E.g. is there an example where the Nash equilibria differ?

It seems like, in the game you specify, flattening either player's distribution (converting a distribution over distributions over candidates into a distribution over candidates in the obvious way) doesn't change either player's expected utility.
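For concreteness, here's a minimal numerical check of that flattening claim; the candidate count, utilities, and probabilities are all made up for illustration:

```python
# Toy check that flattening a distribution over distributions preserves
# expected utility (all numbers are made up for illustration).
u = [1.0, 3.0, -2.0]  # utility of each of three candidates

# Outer distribution: probability 0.5 on each of two inner lotteries.
outer = [(0.5, [0.2, 0.8, 0.0]),
         (0.5, [0.0, 0.4, 0.6])]

# Expected utility computed in two stages.
two_stage = sum(p_out * sum(p * ui for p, ui in zip(inner, u))
                for p_out, inner in outer)

# Flatten to a single distribution over candidates, then compute once.
flat = [sum(p_out * inner[i] for p_out, inner in outer) for i in range(3)]
one_stage = sum(p * ui for p, ui in zip(flat, u))

assert abs(two_stage - one_stage) < 1e-12
print(round(two_stage, 6), round(one_stage, 6))  # 1.3 1.3
```

This is just linearity of expectation, which is why the Nash equilibria shouldn't differ.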

Comment by jessicata (jessica.liu.taylor) on Thoughts on Voting Methods · 2020-11-25T17:10:28.801Z · LW · GW

This would be FPTP if there weren't more points awarded for winning by more votes.

Here is an example of an election.

3 prefer A > B > C

4 prefer B > C > A

5 prefer C > A > B

(note, this is a Condorcet cycle)

Now we construct the following payoff matrix for a zero sum game, where the number given is for the utility of the row player:

     A   B   C
A    0   4  -6
B   -4   0   2
C    6  -2   0

This is basically rock-paper-scissors, except that the wins are worth different amounts: A's win over B is worth 4 points, B's win over C is worth 2, and C's win over A is worth 6.

This game's unique Nash equilibrium picks A 1/6 of the time, B 1/2 of the time, C 1/3 of the time. So this is the probability of the candidates being elected.
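For what it's worth, the equilibrium can be checked mechanically. A sketch in Python with exact rational arithmetic (the solver here is a hand-rolled elimination, not any particular library's):

```python
from fractions import Fraction as F

# Majority-margin payoff matrix for the row player (candidates A, B, C).
M = [[F(0), F(4), F(-6)],
     [F(-4), F(0), F(2)],
     [F(6), F(-2), F(0)]]

def solve(rows, rhs):
    """Gauss-Jordan elimination with exact rational arithmetic."""
    n = len(rows)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(rows)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]

# At a fully mixed equilibrium p of a symmetric zero-sum game, every pure
# strategy earns the game value 0 against p. Two indifference equations
# plus sum(p) = 1 pin it down (the third indifference is redundant).
p = solve([M[0], M[1], [F(1), F(1), F(1)]], [F(0), F(0), F(1)])
print(p)  # [Fraction(1, 6), Fraction(1, 2), Fraction(1, 3)]
```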

Comment by jessicata (jessica.liu.taylor) on Thoughts on Voting Methods · 2020-11-23T10:54:27.131Z · LW · GW

I don't know what you are saying is the case.

Comment by jessicata (jessica.liu.taylor) on Thoughts on Voting Methods · 2020-11-18T20:11:37.082Z · LW · GW

Curious what you think of Consistent Probabilistic Social Choice.

My summary:

There is a unique consistent voting system in cases where the system may return a stochastic distribution of candidates!

(where consistent means: grouping together populations that agree doesn't change the result, and neither does duplicating candidates)

What is the rule? Take a symmetric zero-sum game where each player picks a candidate, and someone wins if their candidate is preferred by the majority to the other, winning more points if they are preferred by a larger majority. This game's Nash equilibrium is the distribution.
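To make the construction concrete, here is a Python sketch of building that zero-sum game's payoff matrix from ranked ballots. The ballot profile is made up for illustration (it's the Condorcet cycle from the sibling comment):

```python
# Building the symmetric zero-sum game's payoff matrix from ranked
# ballots. Each entry is (number of voters, ranking best-to-worst).
ballots = [
    (3, ['A', 'B', 'C']),  # 3 voters prefer A > B > C
    (4, ['B', 'C', 'A']),
    (5, ['C', 'A', 'B']),
]
candidates = ['A', 'B', 'C']

def margin(x, y):
    """Majority margin of x over y: (# preferring x) - (# preferring y)."""
    return sum(count if ranking.index(x) < ranking.index(y) else -count
               for count, ranking in ballots)

# Entry (x, y) is the payoff to a player picking x against a player
# picking y; the Nash equilibrium of this game is the elected lottery.
M = [[margin(x, y) if x != y else 0 for y in candidates] for x in candidates]
print(M)  # [[0, 4, -6], [-4, 0, 2], [6, -2, 0]]
```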

Comment by jessicata (jessica.liu.taylor) on The Bayesian Tyrant · 2020-08-21T03:03:25.904Z · LW · GW

The basic point here is that Bayesians lose zero-sum games in the long term. Which is to be expected, because Bayesianism is a non-adversarial epistemology. (Adversarial Bayesianism is simply game theory.)

This sentence is surprising, though: "It is a truth more fundamental than Bayes’ Law that money will flow from the unclever to the clever".

Clearly, what wins zero-sum games wins zero-sum games, but what wins zero-sum games need not correspond to collective epistemology.

As a foundation for epistemology, many things are superior to "might makes right", including Bayes' rule (despite its limitations).

Legislating Bayesianism in an adversarial context is futile; mechanism design is what is needed.

Comment by jessicata (jessica.liu.taylor) on Many-worlds versus discrete knowledge · 2020-08-14T18:24:34.414Z · LW · GW

Thanks! To the extent that discrete branches can be identified this way, that solves the problem. This is pushing the limits of my knowledge of QM at this point so I'll tag this as something to research further at a later point.

Comment by jessicata (jessica.liu.taylor) on Many-worlds versus discrete knowledge · 2020-08-14T18:21:35.481Z · LW · GW

I'm not asking for there to be a function to the entire world state, just a function to observations. Otherwise the theory does not explain observations!

(aside: I think Bohm does say there is a definite answer in the cat case, as there is a definite configuration that is the true one; it's Copenhagen that fails to say it is one way or the other)

Comment by jessicata (jessica.liu.taylor) on Many-worlds versus discrete knowledge · 2020-08-14T17:17:27.545Z · LW · GW

Then you need a theory of how the continuous microstate determines the discrete macrostate. E.g. as a function from reals to booleans. What is that theory in the case of the wave function determining photon measurements?

Comment by jessicata (jessica.liu.taylor) on Many-worlds versus discrete knowledge · 2020-08-14T16:25:56.375Z · LW · GW

I'm saying that our microphysical theories should explain our macrophysical observations. If they don't then we toss out the theory (Occam's razor).

Macrophysical observations are discrete.

Comment by jessicata (jessica.liu.taylor) on Many-worlds versus discrete knowledge · 2020-08-14T16:23:50.379Z · LW · GW

Let me know if anyone succeeds at that. I've thought in this direction and found it very difficult.

Comment by jessicata (jessica.liu.taylor) on Many-worlds versus discrete knowledge · 2020-08-14T16:21:11.766Z · LW · GW

See my reply here.

Consistent histories may actually solve the problem I'm talking about, because it discusses evolving configurations, not just an evolving wave function.

Comment by jessicata (jessica.liu.taylor) on Many-worlds versus discrete knowledge · 2020-08-14T16:20:09.332Z · LW · GW

The wave function is a fluid in configuration space that evolves over time. You need more theory than that to talk about discrete branches of it (configurations) evolving over time.

I agree that once you have this, you can say the knowledge gained is indexical.

Comment by jessicata (jessica.liu.taylor) on Many-worlds versus discrete knowledge · 2020-08-13T20:12:37.423Z · LW · GW

It's rather nonstandard to consider things like photon measurements to be nonphysical facts. Presumably, these come within the domain of physical theories.

Suppose we go with Solomonoff induction. Then we only adopt physical theories that explain observations happening over subjective time. These observations include discrete physical measurements.

It's not hard to see how Bohm explains these measurements: they are facts about the true configuration history.

It is hard to see how many worlds explains these measurements. Some sort of bridge law is required. The straightforward way of specifying the bridge law is the Bohm interpretation.

Comment by jessicata (jessica.liu.taylor) on Many-worlds versus discrete knowledge · 2020-08-13T20:09:25.067Z · LW · GW

Yes the argument has to be changed but that's mostly an issue of wording. Just replace discrete knowledge with discrete factual evidence.

If a Bayesian sees that the detector has detected a photon, how is that evidence about the wave function?

Comment by jessicata (jessica.liu.taylor) on Many-worlds versus discrete knowledge · 2020-08-13T20:07:06.647Z · LW · GW

Many worlds plus a location tag is the Bohm interpretation. You need theory for how locations evolve into other locations (in order to talk about multiple events happening in observed time), hence the nontriviality of the Bohm interpretation.

Comment by jessicata (jessica.liu.taylor) on Many-worlds versus discrete knowledge · 2020-08-13T19:02:40.076Z · LW · GW

I believe there are physical theories and physical facts, but that not all facts are straightforwardly physical (although, perhaps these are indirectly physical in a way that requires significant philosophical and conceptual work to determine, and which has degrees of freedom).

The issue in this post is about physical facts, e.g. measurements, needing to be interpreted in terms of a physical reality. These interpretations are required to have explanatory physical theories even if there are also non-physical facts.

Comment by jessicata (jessica.liu.taylor) on Many-worlds versus discrete knowledge · 2020-08-12T23:21:22.052Z · LW · GW

Bayesianism still believes in events, which are facts about the world. So the same problem comes up there, even if no fact can be known with certainty.

(in other words: the same problems that apply to 100% justification of belief apply to 99% justification of belief)

Comment by jessicata (jessica.liu.taylor) on Why artificial optimism? · 2020-06-13T06:49:05.078Z · LW · GW

I don't have a great theory here, but some pointers at non-hedonic values are:

  • "Wanting" as a separate thing from "liking"; what is planned/steered towards, versus what affective states are generated? See this. In a literal sense, people don't very much want to be happy.
  • It's common to speak in terms of "mental functions", e.g. perception and planning. The mind has a sort of "telos"/direction, which is not primarily towards maximizing happiness (if it were, we'd be happier); rather, the happiness signal has a function as part of the mind's functioning.
  • The desire to not be deceived, or to be correct, requires a correspondence between states of mind and objective states. To be deceived about, say, which mathematical results are true/interesting, means to explore a much more impoverished space of mathematical reasoning, than one could by having intact mathematical judgment.
  • Related to deception, social emotions are referential: they refer to other beings. The emotion can be present without the other beings existing, but this is a case of deception. Living in a simulation in which all apparent intelligent beings are actually (convincing) nonsentient robots seems undesirable.
  • Desire for variety. Having the same happy mind replicated everywhere is unsatisfying compared to having a diversity of mental states being explored. Perhaps you could erase your memory so you could re-experience the same great movie/art/whatever repeatedly, but would you want to?
  • Relatedly, the best art integrates positive and negative emotions. Having only positive emotions is like painting using only warm colors.

In epistemic matters we accept that beliefs about what is true may be wrong, in the sense that they may be incoherent, incompatible with other information, fail to take into account certain hypotheses, etc. Similarly, we may accept that beliefs about the quality of one's experience may be wrong, in that they may be incoherent, incompatible with other information, fail to take into account certain hypotheses, etc. There has to be a starting point for investigation (as there is in epistemic matters), which might or might not be hedonic, but coherence criteria and so on will modify the starting point.

I suspect that some of my opinions here are influenced by certain meditative experiences that reduce the degree to which experiential valence seems important, in comparison to variety, coherence, and functionality.

Comment by jessicata (jessica.liu.taylor) on Why artificial optimism? · 2020-06-13T05:25:06.864Z · LW · GW

Experiences of sentient beings are valuable, but have to be "about" something to properly be experiences, rather than, say, imagination.

I would rather that conditions in the universe are good for the lifeforms, and that the lifeforms' emotions track the situation, such that the lifeforms are happy. But if the universe is bad, then it's better (IMO) for the lifeforms to be sad about that.

The issue with evolution is that it's a puzzle that evolution would create animals that try to wirehead themselves, it's not a moral argument against wireheading.

Comment by jessicata (jessica.liu.taylor) on Why artificial optimism? · 2020-06-13T00:06:18.689Z · LW · GW

"Isn't the score I get in the game I'm playing one of the most important part of the 'actual state of affairs'? How would you measure the value of the actual state of affairs other than according to how it affects your (or others') scores?"

I'm not sure if this analogy is, by itself, convincing. But, it's suggestive, in that happiness is a simple, scalar-like thing, and it would be strange for such a simple thing to have a high degree of intrinsic value. Rather, on a broad perspective, it would seem that those things of most intrinsic value are those things that are computationally interesting, which can explore and cohere different sources of information, etc, rather than very simple scalars. (Of course, scalars can offer information about other things)

On an evolutionary account, why would it be fit for an organism to care about a scalar quantity, except insofar as that quantity is correlated with the organism's fitness? It would seem that wireheading is a bug, from a design perspective.

Comment by jessicata (jessica.liu.taylor) on Jimrandomh's Shortform · 2020-06-10T15:45:15.495Z · LW · GW

It's been over 72 hours and the case count is under 110, as would be expected from linear extrapolation.

Comment by jessicata (jessica.liu.taylor) on Jimrandomh's Shortform · 2020-06-10T15:44:49.912Z · LW · GW

It's been over 72 hours and the case count is under 110.

Comment by jessicata (jessica.liu.taylor) on Estimating COVID-19 Mortality Rates · 2020-06-07T22:50:07.470Z · LW · GW

The intro paragraph seems to be talking about IFR ("around 2% of people who got COVID-19 would die") and suggesting that "we have enough data to check", i.e. that you're estimating IFR and have good data on it.

Comment by jessicata (jessica.liu.taylor) on The Presumptuous Philosopher, self-locating information, and Solomonoff induction · 2020-06-01T15:45:47.437Z · LW · GW

I mean efficiently in terms of number of bits, not computation time. Which contributes to posterior probability.

Comment by jessicata (jessica.liu.taylor) on The Presumptuous Philosopher, self-locating information, and Solomonoff induction · 2020-06-01T15:10:04.132Z · LW · GW

Yes, I agree. "Reference class" is a property of some models, not all models.

Comment by jessicata (jessica.liu.taylor) on The Presumptuous Philosopher, self-locating information, and Solomonoff induction · 2020-06-01T14:48:19.817Z · LW · GW

At this point it seems simplest to construct your reference class so as to only contain agents that can be found using the same procedure as yourself. Since you have to be decidable for the hypothesis to predict your observations, all others in your reference class are also decidable.

Comment by jessicata (jessica.liu.taylor) on The Presumptuous Philosopher, self-locating information, and Solomonoff induction · 2020-06-01T03:39:32.762Z · LW · GW

If there's a constant-length function mapping the universe description to the number of agents in that universe, doesn't that mean K(n) can't be more than the Kolmogorov complexity of the universe by more than that constant length?

If it isn't constant-length, then it seems strange to assume Solomonoff induction would posit a large objective universe, given that such positing wouldn't help it predict its inputs efficiently (since such prediction requires locating agents).

This still leads to the behavior I'm talking about in the limit; the sum of 1/2^K(n) over all n can be at most 1 so the probabilities on any particular n have to go arbitrarily small in the limit.
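In symbols, this is just the Kraft inequality for prefix-free codes:

```latex
\sum_{n} 2^{-K(n)} \le 1
\quad\Longrightarrow\quad
\forall \epsilon > 0:\ \bigl|\{\, n : 2^{-K(n)} > \epsilon \,\}\bigr| < 1/\epsilon,
```

so only finitely many n can receive prior weight above any fixed threshold, and the weights must become arbitrarily small along any enumeration of n.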

Comment by jessicata (jessica.liu.taylor) on The Presumptuous Philosopher, self-locating information, and Solomonoff induction · 2020-05-31T20:12:40.757Z · LW · GW

My understanding is that Solomonoff induction leads to more SSA-like behavior than SIA-like, at least in the limit, so will reject the presumptuous philosopher's argument.

Asserting that there are n people takes at least K(n) bits, so large universe sizes have to get less likely at some point.

Comment by jessicata (jessica.liu.taylor) on Nihilism doesn't matter · 2020-05-21T19:09:35.365Z · LW · GW

Active nihilism described in the paragraph definitely includes, but is not limited to, the negation of values. The active nihilists of a moral parliament may paralyze the parliament as a means to an end; perhaps, to cause systems other than the moral parliament to be the primary determinants of action, rather than the moral parliament.

Comment by jessicata (jessica.liu.taylor) on Nihilism doesn't matter · 2020-05-21T18:39:04.151Z · LW · GW

What you are describing is a passive sort of nihilism. Active nihilism, on the other hand, would actively try to negate the other values. Imagine a parliament where whenever a non-nihilist votes in favor of X, a nihilist votes against X, such that these votes exactly cancel out. Now, if (active) nihilists are a majority, they will ensure that the parliament as a whole has no aggregate preferences.

Comment by jessicata (jessica.liu.taylor) on Modeling naturalized decision problems in linear logic · 2020-05-17T15:23:00.849Z · LW · GW

CDT and EDT have known problems on 5 and 10. TDT/UDT are insufficiently formalized, and seem like they might rely on known-to-be-unformalizable logical counterfactuals.

So 5 and 10 isn't trivial even without spurious counterfactuals.

What does this add over modal UDT?

  • No requirement to do infinite proof search
  • More elegant handling of multi-step decision problems
  • Also works on problems where the agent doesn't know its source code (of course, this prevents logical dependencies due to source code from being taken into account)

Philosophically, it works as a nice derivation of similar conclusions to modal UDT. The modal UDT algorithm doesn't by itself seem entirely well-motivated; why would material implication be what to search for? On the other hand, every step in the linear logic derivation is quite natural, building action into the logic, and encoding facts about what the agent can be assured of upon taking different actions. This makes it easier to think clearly about what the solution says about counterfactuals, e.g. in a section of this post.

Comment by jessicata (jessica.liu.taylor) on Consistent Glomarization should be feasible · 2020-05-04T20:59:41.157Z · LW · GW

Why lie when the d100 comes up 1, rather than saying you can neither confirm nor deny?

Comment by jessicata (jessica.liu.taylor) on "Don't even think about hell" · 2020-05-03T02:10:13.458Z · LW · GW

Note: the provided utility function is incredibly insecure; even a not-very-powerful individual can manipulate the AI by writing down that hash code under certain conditions.

Also, the best way to minimize V + W is to minimize both V and W (i.e. write the hash code and create hell). If we replace this with min(V, W) then the AI becomes nihilistic if someone writes down the hash code, also a significant security vulnerability.

Comment by jessicata (jessica.liu.taylor) on Topological metaphysics: relating point-set topology and locale theory · 2020-05-01T19:32:32.726Z · LW · GW

Reals are still defined as sets of (a, b) rational intervals. The locale contains countable unions of these, but all these are determined by which (a, b) intervals contain the real number.
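As a toy illustration of a point being determined by its rational intervals, here is a Python sketch (the membership test for sqrt(2) is worked out by hand; this is just one representative point, not a general construction):

```python
from fractions import Fraction as F

# Toy sketch: representing sqrt(2) purely by which rational-delimited
# open intervals (a, b) contain it.
def contains_sqrt2(a, b):
    """Does the open interval (a, b) of rationals contain sqrt(2)?"""
    below = a < 0 or a * a < 2   # a < sqrt(2), decided without irrationals
    above = b > 0 and b * b > 2  # sqrt(2) < b
    return below and above

print(contains_sqrt2(F(1), F(2)))     # True:  1 < sqrt(2) < 2
print(contains_sqrt2(F(3, 2), F(2)))  # False: 3/2 > sqrt(2)
```

The set of intervals for which this predicate holds is the point, in the sense of the comment above.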

Comment by jessicata (jessica.liu.taylor) on Topological metaphysics: relating point-set topology and locale theory · 2020-05-01T17:07:56.148Z · LW · GW

Good point; I've changed the wording to make it clear that the rational-delimited open intervals are the basis, not all the locale elements. Luckily, points can be defined as sets of basis elements containing them, since all other properties follow. (Making the locale itself countable requires weakening the definition by making the sets to form unions over countable, e.g. by requiring them to be recursively enumerable)

Comment by jessicata (jessica.liu.taylor) on Motivating Abstraction-First Decision Theory · 2020-04-29T20:36:42.574Z · LW · GW

I've also been thinking about the application of agency abstractions to decision theory, from a somewhat different angle.

It seems like what you're doing is considering relations between high-level third-person abstractions and low-level third-person abstractions. In contrast, I'm primarily considering relations between high-level first-person abstractions and low-level first-person abstractions.

The VNM abstraction itself assumes that "you" are deciding between different options, each of which has different (stochastic) consequences; thus, it is inherently first-personal. (Applying it to some other agent requires conjecturing things about that agent's first-person perspective: the consequences it expects from different actions)

In general, conditions of rationality are first-personal, in the sense that they tell a given perspective what they must believe in order to be consistent.

The determinism vs. free will paradox comes about when trying to determine when a VNM-like choice abstraction is valid of a third-personal physical world.

My present view of physics is that it is also first-personal, in the sense that:

  1. If physical entities are considered perceptible, then there is an assumed relation between them and first-personal observations.
  2. If physical entities are causal in a Pearlian sense, then there is an assumed relation between them and metaphysically-real interventions, which are produced through first-personal actions.

Decision theory problems, considered linguistically, are also first-personal. In the five and ten problem, things are said about "you" being in a given room, choosing between two items on "the" table, presumably the one in front of "you". If the ability to choose different dollar bills is, linguistically, considered a part of the decision problem, then the decision problem already contains in it a first-personal VNM-like choice abstraction.

The naturalization problem is to show how such high-level, first-personal decision theory problems could be compatible with physics. Such naturalization is hard, perhaps impossible, if physics is assumed to be third-personal, but may be possible if physics is assumed to be first-personal.

Comment by jessicata (jessica.liu.taylor) on Subjective implication decision theory in critical agentialism · 2020-04-28T21:54:24.444Z · LW · GW

Looking back on this, it does seem quite similar to EDT. I'm actually, at this point, not clear on how EDT and TDT differ, except in that EDT has potential problems in cases where it's sure about its own action. I'll change the text so it notes the similarity to EDT.

On XOR blackmail, SIDT will indeed pay up.

Comment by jessicata (jessica.liu.taylor) on Two Alternatives to Logical Counterfactuals · 2020-04-12T05:52:57.863Z · LW · GW

Yes, it's about no backwards assumption. Linear has lots of meanings, I'm not concerned about this getting confused with linear algebra, but you can suggest a better term if you have one.

Comment by jessicata (jessica.liu.taylor) on Seemingly Popular Covid-19 Model is Obvious Nonsense · 2020-04-12T00:23:35.484Z · LW · GW

Epistemic Status: Something Is Wrong On The Internet.

If you think this applies, it would seem that "The Internet" is being construed so broadly that it includes the mainstream media, policymaking, and a substantial fraction of people, such that the "Something Is Wrong On The Internet" heuristic points against correction of public disinformation in general.

This is a post that is especially informative, aligned with justice, and likely to save lives, and so it would be a shame if this heuristic were to dissuade you from writing it.

Comment by jessicata (jessica.liu.taylor) on In Defense of Politics · 2020-04-10T22:03:00.977Z · LW · GW

The presumption with conspiracies is that they are engaged in for some local benefit by the conspiracy at the detriment of the broader society. Hence, the "unilateralist's curse" is a blessing in this case, as the overestimation by one member of a conspiracy of their own utility in having the secret exposed, brings their estimation more in line with the estimation of the broader society, whose interests differ from those of the conspirators.

If differences between the interests of different groups were not a problem, then there would be no motive to form a conspiracy.

In general, I am quite annoyed at the idea of the unilateralist's curse being used as a general argument against the revelation of the truth, without careful checking of the correspondence between the decision theoretic model of the unilateralist's curse and the actual situation, which includes crime and conflict.

Comment by jessicata (jessica.liu.taylor) on Solipsism is Underrated · 2020-04-10T20:44:25.573Z · LW · GW

A major problem with physicalist dismissal of experiential evidence (as I've discussed previously) is that the conventional case for believing in physics is that it explains experiential evidence, e.g. experimental results. Solomonoff induction, among the best formalizations of Occam's razor, believes in "my observations".

If basic facts like "I have observations" are being doubted, then any case for belief in physics has to go through something independent of its explanations of experiential evidence. This looks to be a difficult problem.

You could potentially resolve the problem by saying that only some observations, such as those of mechanical measuring devices, count; however, this still leads to an analogous problem to the hard problem of consciousness, namely, what is the mapping between physics and the outputs of the mechanical measuring devices that are being explained by theories? (The same problem comes up of "what data is the theorizing trying to explain" whether the theorizing happens in a single brain or in a distributed intelligence, e.g. a collection of people using the scientific method)

Comment by jessicata (jessica.liu.taylor) on Two Alternatives to Logical Counterfactuals · 2020-04-08T23:52:37.208Z · LW · GW

Basically, the assumption that you're participating in a POMDP. The idea is that there's some hidden state that your actions interact with in a temporally linear fashion (i.e. action 1 affects state 2), such that your late actions can't affect early states/observations.

Comment by jessicata (jessica.liu.taylor) on Two Alternatives to Logical Counterfactuals · 2020-04-07T03:49:35.526Z · LW · GW

The way you are using it doesn’t necessarily imply real control, it may be imaginary control.

I'm discussing a hypothetical agent who believes itself to have control. So its beliefs include "I have free will". Its belief isn't "I believe that I have free will".

It’s a “para-consistent material conditional” by which I mean the algorithm is limited in such a way as to prevent this explosion.

Yes, that makes sense.

However, were you flowing this all the way back in time?

Yes (see thread with Abram Demski).

What do you mean by dualistic?

Already factorized as an agent interacting with an environment.

Comment by jessicata (jessica.liu.taylor) on Two Alternatives to Logical Counterfactuals · 2020-04-06T18:55:42.139Z · LW · GW

Secondly, “free will” is such a loaded word that using it in a non-standard fashion simply obscures and confuses the discussion.

Wikipedia says "Free will is the ability to choose between different possible courses of action unimpeded." SEP says "The term “free will” has emerged over the past two millennia as the canonical designator for a significant kind of control over one’s actions." So my usage seems pretty standard.

For example, recently I’ve been arguing in favour of what counts as a valid counterfactual being at least partially a matter of social convention.

All word definitions are determined in large part by social convention. The question is whether the social convention corresponds to a definition (e.g. with truth conditions) or not. If it does, then the social convention is realist, if not, it's nonrealist (perhaps emotivist, etc).

Material conditions only provide the outcome when we have a consistent counterfactual.

Not necessarily. An agent may be uncertain over its own action, and thus have uncertainty about material conditionals involving its action. The "possible worlds" represented by this uncertainty may be logically inconsistent, in ways the agent can't determine before making the decision.

> Proof-based UDT doesn’t quite use material conditionals, it uses a paraconsistent version of them instead.

I don't understand this? I thought it searched for proofs of the form "if I take this action, then I get at least this much utility", which is a material conditional.
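To make the material-conditional reading concrete, here is a toy sketch (my own simplification, not the actual Löbian proof search): "provable" is replaced by "true in every candidate world the agent hasn't ruled out", and the agent searches, from the highest utility bound down, for a conditional of the form "action = a implies utility ≥ u" that holds in all of them.

```python
# Toy model of proof-based UDT's search (an illustrative simplification,
# not the actual formal-logic version): "provable" means "true in every
# candidate world the agent considers possible".

def material(p, q):
    # Material conditional: false only when p holds and q fails.
    return (not p) or q

def udt_search(actions, bounds, worlds):
    """worlds: list of (action_taken, utility) pairs the agent can't yet
    rule out. Returns the first (action, bound) whose conditional
    "action_taken == a implies utility >= u" holds in every world."""
    for u in sorted(bounds, reverse=True):
        for a in actions:
            if all(material(act == a, util >= u) for act, util in worlds):
                return a, u
    return None
```

Note the spurious-proof behavior this reading invites: if no candidate world has the agent going right, then "right implies utility ≥ 1000" is vacuously true in every world, so the search can settle on it. This is the sort of explosion a paraconsistent restriction would be trying to prevent.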

> So, to imagine counterfactually taking action Y we replace the agent doing X with another agent doing Y and flow causation both forwards and backwards.

Policy-dependent source code does this; one's source code depends on one's policy.

> I guess from a philosophical perspective it makes sense to first consider whether policy-dependent source code makes sense and then if it does further ask whether UDT makes sense.

I think UDT makes sense in "dualistic" decision problems that are already factorized as "this policy leads to these consequences". Extending it to a nondualist case brings up difficulties, including the free will / determinism issue. Policy-dependent source code is a way of interpreting UDT in a setting with deterministic, knowable physics.

Comment by jessicata (jessica.liu.taylor) on Two Alternatives to Logical Counterfactuals · 2020-04-05T22:03:54.269Z · LW · GW

I think it's worth examining more closely what it means to be "not a pure optimizer". Formally, a VNM utility function is a rationalization of a coherent policy. Say that you have some idea about what your utility function is, U. Suppose you then decide to follow a policy that does not maximize U. Logically, it follows that U is not really your utility function; either your policy doesn't coherently maximize any utility function, or it maximizes some other utility function. (Because the utility function is, by definition, a rationalization of the policy)

Failing to disambiguate these two notions of "the agent's utility function" is a map-territory error.

Decision theories require, as input, a utility function to maximize, and output a policy. If a decision theory is adopted by an agent who is using it to determine their policy (rather than already knowing their policy), then they are operating on some preliminary idea about what their utility function is. Their "actual" utility function is dependent on their policy; it need not match up with their idea.

So, it is very much possible for an agent who is operating on an idea U of their utility function, to evaluate counterfactuals in which their true behavioral utility function is not U. Indeed, this is implied by the fact that utility functions are rationalizations of policies.
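A finite toy version of "a utility function is a rationalization of a coherent policy" (my own illustrative sketch; the setup and names are hypothetical): for a deterministic pairwise choice function, some utility function rationalizes it exactly when its revealed preferences are acyclic, so checking whether a policy maximizes any utility function reduces to cycle detection.

```python
# Sketch: recover a utility function that rationalizes a pairwise choice
# function, or report that none exists (cyclic revealed preferences).

def rationalizing_utility(options, choose):
    """Return {option: utility} rationalizing `choose`, or None if the
    revealed preferences are cyclic (no utility function fits)."""
    # Revealed preference: choose(a, b) == a means a is preferred to b.
    prefers = {a: {b for b in options if b != a and choose(a, b) == a}
               for a in options}
    remaining, utility, level = set(options), {}, 0
    while remaining:
        # Bottom elements: preferred to nothing still remaining.
        bottom = [a for a in remaining if not (prefers[a] & remaining)]
        if not bottom:
            return None  # cycle: the policy maximizes no utility function
        for a in bottom:
            utility[a] = level
            remaining.discard(a)
        level += 1
    return utility
```

A policy that always prefers "left" is rationalized by a utility function ranking "left" highest; a cyclic choice function gets None, landing in case (c) above.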

Let's look at the "turn left/right" example. The agent is operating on a utility function idea U, which is higher the more the agent turns left. When they evaluate the policy of turning "right" on the 10th time, they must conclude that, in this hypothetical, either (a) "right" maximizes U, (b) they are maximizing some utility function other than U, or (c) they aren't a maximizer at all.

The logical counterfactual framework says the answer is (a): that the fixed computation of U-maximization results in turning right, not left. But, this is actually the weirdest of the three worlds. It is hard to imagine ways that "right" maximizes U, whereas it is easy to imagine that the agent is maximizing a utility function other than U, or is not a maximizer.

Yes, the (b) and (c) worlds may be weird in a problematic way. However, it is hard to imagine these being nearly as weird as (a).

One way they could be weird is that an agent having a complex utility function is likely to have been produced by a different process than an agent with a simple utility function. So the more weird, exceptional decisions you make, the stronger the evidence that you were produced by the sort of process that produces complex utility functions.

This is pretty similar to the smoking lesion problem, then. I expect that policy-dependent source code will have a lot in common with EDT, as they both consider "what sort of agent I am" to be a consequence of one's policy. (However, as you've pointed out, there are important complications with the framing of the smoking lesion problem)

I think further disambiguation on this could benefit from re-analyzing the smoking lesion problem (or a similar problem), but I'm not sure if I have the right set of concepts for this yet.

Comment by jessicata (jessica.liu.taylor) on Referencing the Unreferencable · 2020-04-04T18:54:37.634Z · LW · GW

If you fix a notion of referenceability rather than equivocating, then the point that talking of unreferenceable entities is absurd will stand.

If you equivocate, then very little can be said in general about referenceability.

(I would say that "our universe's simulators" is referenceable, since it's positing something that causes sensory inputs)

Comment by jessicata (jessica.liu.taylor) on Two Alternatives to Logical Counterfactuals · 2020-04-04T18:46:10.569Z · LW · GW

It seems the approaches we're using are similar: both start from an observation/action history with posited falsifiable laws, with the agent's source code not known a priori, and with the agent considering different policies.

Learning "my source code is A" is quite similar to learning "Omega predicts my action is equal to A()", so these would lead to similar results.

Policy-dependent source code, then, corresponds to Omega making different predictions depending on the agent's intended policy, such that when comparing policies, the agent has to imagine Omega predicting differently (as it would imagine learning different source code under policy-dependent source code).

Comment by jessicata (jessica.liu.taylor) on Two Alternatives to Logical Counterfactuals · 2020-04-03T21:48:37.739Z · LW · GW

I agree this is a problem, but isn't it a problem for logical counterfactual approaches as well? Isn't it also weird for known, fixed optimizer source code to produce a different result on this decision, where it's obvious that 'left' is the best decision?

If you assume that the agent chose 'right', it's more reasonable to think it's because it's not a pure optimizer than that a pure optimizer would have chosen 'right', in my view.

If you form the intent to, as a policy, go 'right' on the 100th turn, you should anticipate learning that your source code is not the code of a pure optimizer.