Modeling naturalized decision problems in linear logic 2020-05-06T00:15:15.400Z · score: 15 (5 votes)
Topological metaphysics: relating point-set topology and locale theory 2020-05-01T03:57:11.899Z · score: 13 (4 votes)
Two Alternatives to Logical Counterfactuals 2020-04-01T09:48:29.619Z · score: 30 (10 votes)
The absurdity of un-referenceable entities 2020-03-14T17:40:37.750Z · score: 22 (8 votes)
Puzzles for Physicalists 2020-03-12T01:37:13.353Z · score: 43 (18 votes)
A conversation on theory of mind, subjectivity, and objectivity 2020-03-10T04:59:23.266Z · score: 13 (4 votes)
Subjective implication decision theory in critical agentialism 2020-03-05T23:30:42.694Z · score: 16 (5 votes)
A critical agential account of free will, causation, and physics 2020-03-05T07:57:38.193Z · score: 18 (8 votes)
On the falsifiability of hypercomputation, part 2: finite input streams 2020-02-17T03:51:57.238Z · score: 21 (5 votes)
On the falsifiability of hypercomputation 2020-02-07T08:16:07.268Z · score: 26 (5 votes)
Philosophical self-ratification 2020-02-03T22:48:46.985Z · score: 25 (6 votes)
High-precision claims may be refuted without being replaced with other high-precision claims 2020-01-30T23:08:33.792Z · score: 63 (26 votes)
On hiding the source of knowledge 2020-01-26T02:48:51.310Z · score: 109 (33 votes)
On the ontological development of consciousness 2020-01-25T05:56:43.244Z · score: 37 (13 votes)
Is requires ought 2019-10-28T02:36:43.196Z · score: 23 (10 votes)
Metaphorical extensions and conceptual figure-ground inversions 2019-07-24T06:21:54.487Z · score: 34 (9 votes)
Dialogue on Appeals to Consequences 2019-07-18T02:34:52.497Z · score: 35 (21 votes)
Why artificial optimism? 2019-07-15T21:41:24.223Z · score: 64 (21 votes)
The AI Timelines Scam 2019-07-11T02:52:58.917Z · score: 44 (79 votes)
Self-consciousness wants to make everything about itself 2019-07-03T01:44:41.204Z · score: 43 (30 votes)
Writing children's picture books 2019-06-25T21:43:45.578Z · score: 114 (37 votes)
Conditional revealed preference 2019-04-16T19:16:55.396Z · score: 18 (7 votes)
Boundaries enable positive material-informational feedback loops 2018-12-22T02:46:48.938Z · score: 30 (12 votes)
Act of Charity 2018-11-17T05:19:20.786Z · score: 181 (68 votes)
EDT solves 5 and 10 with conditional oracles 2018-09-30T07:57:35.136Z · score: 62 (19 votes)
Reducing collective rationality to individual optimization in common-payoff games using MCMC 2018-08-20T00:51:29.499Z · score: 58 (18 votes)
Buridan's ass in coordination games 2018-07-16T02:51:30.561Z · score: 55 (19 votes)
Decision theory and zero-sum game theory, NP and PSPACE 2018-05-24T08:03:18.721Z · score: 111 (37 votes)
In the presence of disinformation, collective epistemology requires local modeling 2017-12-15T09:54:09.543Z · score: 127 (44 votes)
Autopoietic systems and difficulty of AGI alignment 2017-08-20T01:05:10.000Z · score: 9 (4 votes)
Current thoughts on Paul Christiano's research agenda 2017-07-16T21:08:47.000Z · score: 19 (9 votes)
Why I am not currently working on the AAMLS agenda 2017-06-01T17:57:24.000Z · score: 19 (10 votes)
A correlated analogue of reflective oracles 2017-05-07T07:00:38.000Z · score: 4 (4 votes)
Finding reflective oracle distributions using a Kakutani map 2017-05-02T02:12:06.000Z · score: 1 (1 votes)
Some problems with making induction benign, and approaches to them 2017-03-27T06:49:54.000Z · score: 3 (3 votes)
Maximally efficient agents will probably have an anti-daemon immune system 2017-02-23T00:40:47.000Z · score: 4 (4 votes)
Are daemons a problem for ideal agents? 2017-02-11T08:29:26.000Z · score: 5 (2 votes)
How likely is a random AGI to be honest? 2017-02-11T03:32:22.000Z · score: 1 (1 votes)
My current take on the Paul-MIRI disagreement on alignability of messy AI 2017-01-29T20:52:12.000Z · score: 17 (9 votes)
On motivations for MIRI's highly reliable agent design research 2017-01-29T19:34:37.000Z · score: 19 (10 votes)
Strategies for coalitions in unit-sum games 2017-01-23T04:20:31.000Z · score: 3 (3 votes)
An impossibility result for doing without good priors 2017-01-20T05:44:26.000Z · score: 1 (1 votes)
Pursuing convergent instrumental subgoals on the user's behalf doesn't always require good priors 2016-12-30T02:36:48.000Z · score: 7 (5 votes)
Predicting HCH using expert advice 2016-11-28T03:38:05.000Z · score: 5 (4 votes)
ALBA requires incremental design of good long-term memory systems 2016-11-28T02:10:53.000Z · score: 1 (1 votes)
Modeling the capabilities of advanced AI systems as episodic reinforcement learning 2016-08-19T02:52:13.000Z · score: 4 (2 votes)
Generative adversarial models, informed by arguments 2016-06-27T19:28:27.000Z · score: 0 (0 votes)
In memoryless Cartesian environments, every UDT policy is a CDT+SIA policy 2016-06-11T04:05:47.000Z · score: 21 (5 votes)
Two problems with causal-counterfactual utility indifference 2016-05-26T06:21:07.000Z · score: 3 (3 votes)
Anything you can do with n AIs, you can do with two (with directly opposed objectives) 2016-05-04T23:14:31.000Z · score: 2 (2 votes)


Comment by jessica-liu-taylor on Why artificial optimism? · 2020-06-13T06:49:05.078Z · score: 4 (2 votes) · LW · GW

I don't have a great theory here, but some pointers at non-hedonic values are:

  • "Wanting" as a separate thing from "liking"; what is planned/steered towards, versus what affective states are generated? See this. In a literal sense, people don't very much want to be happy.
  • It's common to speak in terms of "mental functions", e.g. perception and planning. The mind has a sort of "telos"/direction, which is not primarily towards maximizing happiness (if it were, we'd be happier); rather, the happiness signal has a function as part of the mind's functioning.
  • The desire to not be deceived, or to be correct, requires a correspondence between states of mind and objective states. To be deceived about, say, which mathematical results are true/interesting means exploring a much more impoverished space of mathematical reasoning than one could with intact mathematical judgment.
  • Related to deception, social emotions are referential: they refer to other beings. The emotion can be present without the other beings existing, but this is a case of deception. Living in a simulation in which all apparent intelligent beings are actually (convincing) nonsentient robots seems undesirable.
  • Desire for variety. Having the same happy mind replicated everywhere is unsatisfying compared to having a diversity of mental states being explored. Perhaps you could erase your memory so you could re-experience the same great movie/art/whatever repeatedly, but would you want to?
  • Relatedly, the best art integrates positive and negative emotions. Having only positive emotions is like painting using only warm colors.

In epistemic matters we accept that beliefs about what is true may be wrong, in the sense that they may be incoherent, incompatible with other information, fail to take into account certain hypotheses, etc. Similarly, we may accept that beliefs about the quality of one's experience may be wrong, in that they may be incoherent, incompatible with other information, fail to take into account certain hypotheses, etc. There has to be a starting point for investigation (as there is in epistemic matters), which might or might not be hedonic, but coherence criteria and so on will modify the starting point.

I suspect that some of my opinions here are influenced by certain meditative experiences that reduce the degree to which experiential valence seems important, in comparison to variety, coherence, and functionality.

Comment by jessica-liu-taylor on Why artificial optimism? · 2020-06-13T05:25:06.864Z · score: 6 (3 votes) · LW · GW

Experiences of sentient beings are valuable, but have to be "about" something to properly be experiences, rather than, say, imagination.

I would rather that conditions in the universe are good for the lifeforms, and that the lifeforms' emotions track the situation, such that the lifeforms are happy. But if the universe is bad, then it's better (IMO) for the lifeforms to be sad about that.

The issue with evolution is that it's a puzzle that evolution would create animals that try to wirehead themselves; it's not a moral argument against wireheading.

Comment by jessica-liu-taylor on Why artificial optimism? · 2020-06-13T00:06:18.689Z · score: 4 (2 votes) · LW · GW

"Isn't the score I get in the game I'm playing one of the most important part of the 'actual state of affairs'? How would you measure the value of the actual state of affairs other than according to how it affects your (or others') scores?"

I'm not sure if this analogy is, by itself, convincing. But it's suggestive, in that happiness is a simple, scalar-like thing, and it would be strange for such a simple thing to have a high degree of intrinsic value. Rather, on a broad perspective, it would seem that the things of most intrinsic value are those that are computationally interesting, which can explore and cohere different sources of information, and so on, rather than very simple scalars. (Of course, scalars can offer information about other things.)

On an evolutionary account, why would it be fit for an organism to care about a scalar quantity, except insofar as that quantity is correlated with the organism's fitness? It would seem that wireheading is a bug, from a design perspective.

Comment by jessica-liu-taylor on Jimrandomh's Shortform · 2020-06-10T15:45:15.495Z · score: 10 (4 votes) · LW · GW

It's been over 72 hours and the case count is under 110, as would be expected from linear extrapolation.

Comment by jessica-liu-taylor on Jimrandomh's Shortform · 2020-06-10T15:44:49.912Z · score: 2 (1 votes) · LW · GW

It's been over 72 hours and the case count is under 110.

Comment by jessica-liu-taylor on Estimating COVID-19 Mortality Rates · 2020-06-07T22:50:07.470Z · score: 6 (3 votes) · LW · GW

The intro paragraph seems to be talking about IFR ("around 2% of people who got COVID-19 would die") and suggesting that "we have enough data to check", i.e. that you're estimating IFR and have good data on it.

Comment by jessica-liu-taylor on The Presumptuous Philosopher, self-locating information, and Solomonoff induction · 2020-06-01T15:45:47.437Z · score: 2 (1 votes) · LW · GW

I mean efficiently in terms of number of bits, not computation time; the number of bits is what contributes to posterior probability.

Comment by jessica-liu-taylor on The Presumptuous Philosopher, self-locating information, and Solomonoff induction · 2020-06-01T15:10:04.132Z · score: 4 (2 votes) · LW · GW

Yes, I agree. "Reference class" is a property of some models, not all models.

Comment by jessica-liu-taylor on The Presumptuous Philosopher, self-locating information, and Solomonoff induction · 2020-06-01T14:48:19.817Z · score: 4 (2 votes) · LW · GW

At this point it seems simplest to construct your reference class so as to only contain agents that can be found using the same procedure as yourself. Since you have to be decidable for the hypothesis to predict your observations, all others in your reference class are also decidable.

Comment by jessica-liu-taylor on The Presumptuous Philosopher, self-locating information, and Solomonoff induction · 2020-06-01T03:39:32.762Z · score: 7 (4 votes) · LW · GW

If there's a constant-length function mapping the universe description to the number of agents in that universe, doesn't that mean K(n) can't be more than the Kolmogorov complexity of the universe by more than that constant length?

If it isn't constant-length, then it seems strange to assume Solomonoff induction would posit a large objective universe, given that such positing wouldn't help it predict its inputs efficiently (since such prediction requires locating agents).

This still leads to the behavior I'm talking about in the limit; the sum of 1/2^K(n) over all n can be at most 1, so the probabilities on any particular n have to become arbitrarily small in the limit.
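The limiting claim can be stated compactly via the Kraft inequality (assuming, as is standard, a prefix-free universal machine):

```latex
\sum_{n=1}^{\infty} 2^{-K(n)} \;\le\; 1
```

Since the series is bounded by 1, for any $\varepsilon > 0$ only finitely many $n$ can have prior weight $2^{-K(n)} > \varepsilon$; hence the weight assigned to "there are $n$ observers" must become arbitrarily small for large $n$.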

Comment by jessica-liu-taylor on The Presumptuous Philosopher, self-locating information, and Solomonoff induction · 2020-05-31T20:12:40.757Z · score: 4 (2 votes) · LW · GW

My understanding is that Solomonoff induction leads to more SSA-like behavior than SIA-like, at least in the limit, so it will reject the presumptuous philosopher's argument.

Asserting that there are n people takes at least K(n) bits, so large universe sizes have to get less likely at some point.

Comment by jessica-liu-taylor on Nihilism doesn't matter · 2020-05-21T19:09:35.365Z · score: 2 (1 votes) · LW · GW

Active nihilism as described in the paragraph definitely includes, but is not limited to, the negation of values. The active nihilists of a moral parliament may paralyze the parliament as a means to an end; perhaps to cause systems other than the moral parliament to be the primary determinants of action.

Comment by jessica-liu-taylor on Nihilism doesn't matter · 2020-05-21T18:39:04.151Z · score: 2 (1 votes) · LW · GW

What you are describing is a passive sort of nihilism. Active nihilism, on the other hand, would actively try to negate the other values. Imagine a parliament where whenever a non-nihilist votes in favor of X, a nihilist votes against X, such that these votes exactly cancel out. Now, if (active) nihilists are a majority, they will ensure that the parliament as a whole has no aggregate preferences.
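The vote-canceling dynamic can be sketched as a toy model (the +1/-1 vote scheme and the pairing-off rule are illustrative assumptions, not the original parliament formalism):

```python
# Toy moral parliament: non-nihilist delegates cast +1 votes for measures
# they favor; each active nihilist cancels exactly one non-nihilist vote.

def aggregate_votes(non_nihilist_votes, n_active_nihilists):
    """Return net votes per measure after active nihilists cancel votes.

    non_nihilist_votes: dict mapping measure -> number of +1 votes.
    n_active_nihilists: each active nihilist cancels one +1 vote.
    """
    remaining = dict(non_nihilist_votes)
    budget = n_active_nihilists
    for measure in remaining:
        cancel = min(budget, remaining[measure])
        remaining[measure] -= cancel
        budget -= cancel
    return remaining

# With active nihilists at (or above) parity, every vote is canceled and
# the parliament expresses no aggregate preference at all.
votes = {"fund_x": 2, "ban_y": 1}      # 3 non-nihilist votes total
print(aggregate_votes(votes, 3))       # {'fund_x': 0, 'ban_y': 0}
```

With fewer nihilists than non-nihilist votes, some aggregate preference survives; the paralysis result is specific to the active-nihilist majority case.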

Comment by jessica-liu-taylor on Modeling naturalized decision problems in linear logic · 2020-05-17T15:23:00.849Z · score: 2 (1 votes) · LW · GW

CDT and EDT have known problems on 5 and 10. TDT/UDT are insufficiently formalized, and seem like they might rely on known-to-be-unformalizable logical counterfactuals.

So 5 and 10 isn't trivial even without spurious counterfactuals.

What does this add over modal UDT?

  • No requirement to do infinite proof search
  • More elegant handling of multi-step decision problems
  • Also works on problems where the agent doesn't know its source code (of course, this prevents logical dependencies due to source code from being taken into account)

Philosophically, it works as a nice derivation of similar conclusions to modal UDT. The modal UDT algorithm doesn't by itself seem entirely well-motivated; why would material implication be what to search for? On the other hand, every step in the linear logic derivation is quite natural, building action into the logic, and encoding facts about what the agent can be assured of upon taking different actions. This makes it easier to think clearly about what the solution says about counterfactuals, e.g. in a section of this post.

Comment by jessica-liu-taylor on Consistent Glomarization should be feasible · 2020-05-04T20:59:41.157Z · score: 4 (2 votes) · LW · GW

Why lie on the d100 coming up 1 instead of "can neither confirm nor deny"?

Comment by jessica-liu-taylor on "Don't even think about hell" · 2020-05-03T02:10:13.458Z · score: 5 (3 votes) · LW · GW

Note: the provided utility function is incredibly insecure; even a not-very-powerful individual can manipulate the AI by writing down that hash code under certain conditions.

Also, the best way to minimize V + W is to minimize both V and W (i.e. write the hash code and create hell). If we replace this with min(V, W) then the AI becomes nihilistic if someone writes down the hash code, also a significant security vulnerability.
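A toy payoff table makes both failure modes concrete (the 0/1 payoffs below are illustrative assumptions, not the original post's utility function):

```python
from itertools import product

# Toy model: V is minimized when the hash code is written down;
# W is minimized when hell is created.
def V(hash_written, hell):
    return 0 if hash_written else 1

def W(hash_written, hell):
    return 0 if hell else 1

outcomes = list(product([False, True], repeat=2))  # (hash_written, hell)

# Minimizing V + W: the unique minimum writes the hash AND creates hell.
best_sum = min(outcomes, key=lambda o: V(*o) + W(*o))
print(best_sum)  # (True, True)

# Minimizing min(V, W): once anyone writes the hash, V = 0, so the
# objective is 0 whether or not hell exists -- the AI is indifferent.
values = {o: min(V(*o), W(*o)) for o in outcomes}
print(values[(True, False)], values[(True, True)])  # 0 0
```

The second failure is the "nihilistic" one: after the hash is written, nothing the AI does moves the objective, so any behavior is optimal.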

Comment by jessica-liu-taylor on Topological metaphysics: relating point-set topology and locale theory · 2020-05-01T19:32:32.726Z · score: 3 (2 votes) · LW · GW

Reals are still defined as sets of (a, b) rational intervals. The locale contains countable unions of these, but all of these unions are determined by which (a, b) intervals contain the real number.

Comment by jessica-liu-taylor on Topological metaphysics: relating point-set topology and locale theory · 2020-05-01T17:07:56.148Z · score: 5 (3 votes) · LW · GW

Good point; I've changed the wording to make it clear that the rational-delimited open intervals are the basis, not all the locale elements. Luckily, points can be defined as sets of basis elements containing them, since all other properties follow. (Making the locale itself countable requires weakening the definition by making the sets over which unions are taken countable, e.g. by requiring them to be recursively enumerable.)

Comment by jessica-liu-taylor on Motivating Abstraction-First Decision Theory · 2020-04-29T20:36:42.574Z · score: 12 (6 votes) · LW · GW

I've also been thinking about the application of agency abstractions to decision theory, from a somewhat different angle.

It seems like what you're doing is considering relations between high-level third-person abstractions and low-level third-person abstractions. In contrast, I'm primarily considering relations between high-level first-person abstractions and low-level first-person abstractions.

The VNM abstraction itself assumes that "you" are deciding between different options, each of which has different (stochastic) consequences; thus, it is inherently first-personal. (Applying it to some other agent requires conjecturing things about that agent's first-person perspective: the consequences it expects from different actions)

In general, conditions of rationality are first-personal, in the sense that they tell a given perspective what they must believe in order to be consistent.

The determinism vs. free will paradox comes about when trying to determine when a VNM-like choice abstraction is valid of a third-personal physical world.

My present view of physics is that it is also first-personal, in the sense that:

  1. If physical entities are considered perceptible, then there is an assumed relation between them and first-personal observations.
  2. If physical entities are causal in a Pearlian sense, then there is an assumed relation between them and metaphysically-real interventions, which are produced through first-personal actions.

Decision theory problems, considered linguistically, are also first-personal. In the five and ten problem, things are said about "you" being in a given room, choosing between two items on "the" table, presumably the one in front of "you". If the ability to choose different dollar bills is, linguistically, considered a part of the decision problem, then the decision problem already contains in it a first-personal VNM-like choice abstraction.

The naturalization problem is to show how such high-level, first-personal decision theory problems could be compatible with physics. Such naturalization is hard, perhaps impossible, if physics is assumed to be third-personal, but may be possible if physics is assumed to be first-personal.

Comment by jessica-liu-taylor on Subjective implication decision theory in critical agentialism · 2020-04-28T21:54:24.444Z · score: 6 (3 votes) · LW · GW

Looking back on this, it does seem quite similar to EDT. I'm actually, at this point, not clear on how EDT and TDT differ, except in that EDT has potential problems in cases where it's sure about its own action. I'll change the text so it notes the similarity to EDT.

On XOR blackmail, SIDT will indeed pay up.

Comment by jessica-liu-taylor on Two Alternatives to Logical Counterfactuals · 2020-04-12T05:52:57.863Z · score: 2 (1 votes) · LW · GW

Yes, it's about no backwards assumption. Linear has lots of meanings, I'm not concerned about this getting confused with linear algebra, but you can suggest a better term if you have one.

Comment by jessica-liu-taylor on Seemingly Popular Covid-19 Model is Obvious Nonsense · 2020-04-12T00:23:35.484Z · score: 12 (11 votes) · LW · GW

Epistemic Status: Something Is Wrong On The Internet.

If you think this applies, it would seem that "The Internet" is being construed so broadly that it includes the mainstream media, policymaking, and a substantial fraction of people, such that the "Something Is Wrong On The Internet" heuristic points against correction of public disinformation in general.

This is a post that is especially informative, aligned with justice, and likely to save lives, and so it would be a shame if this heuristic were to dissuade you from writing it.

Comment by jessica-liu-taylor on In Defense of Politics · 2020-04-10T22:03:00.977Z · score: 21 (6 votes) · LW · GW

The presumption with conspiracies is that they are engaged in for some local benefit to the conspirators, at a detriment to the broader society. Hence, the "unilateralist's curse" is a blessing in this case: one conspirator's overestimation of their own utility in having the secret exposed brings their estimate more in line with that of the broader society, whose interests differ from those of the conspirators.

If differences between the interests of different groups were not a problem, then there would be no motive to form a conspiracy.

In general, I am quite annoyed at the idea of the unilateralist's curse being used as a general argument against the revelation of the truth, without careful checking of the correspondence between the decision theoretic model of the unilateralist's curse and the actual situation, which includes crime and conflict.

Comment by jessica-liu-taylor on Solipsism is Underrated · 2020-04-10T20:44:25.573Z · score: 3 (2 votes) · LW · GW

A major problem with physicalist dismissal of experiential evidence (as I've discussed previously) is that the conventional case for believing in physics is that it explains experiential evidence, e.g. experimental results. Solomonoff induction, among the best formalizations of Occam's razor, believes in "my observations".

If basic facts like "I have observations" are being doubted, then any case for belief in physics has to go through something independent of its explanations of experiential evidence. This looks to be a difficult problem.

You could potentially resolve the problem by saying that only some observations, such as those of mechanical measuring devices, count; however, this still leads to an analogous problem to the hard problem of consciousness, namely, what is the mapping between physics and the outputs of the mechanical measuring devices that are being explained by theories? (The same problem comes up of "what data is the theorizing trying to explain" whether the theorizing happens in a single brain or in a distributed intelligence, e.g. a collection of people using the scientific method)

Comment by jessica-liu-taylor on Two Alternatives to Logical Counterfactuals · 2020-04-08T23:52:37.208Z · score: 2 (1 votes) · LW · GW

Basically, the assumption that you're participating in a POMDP. The idea is that there's some hidden state that your actions interact with in a temporally linear fashion (i.e. action 1 affects state 2), such that your late actions can't affect early states/observations.
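The temporal structure being assumed can be sketched as a minimal POMDP rollout (the particular transition and observation functions here are arbitrary illustrations): since s[t+1] depends only on (s[t], a[t]), a later action cannot change an earlier state or observation.

```python
# Minimal POMDP-style rollout. Hidden state evolves as s[t+1] = T(s[t], a[t]);
# the agent only ever sees o[t] = O(s[t]).

def T(state, action):   # transition: the action at time t affects state t+1
    return state + action

def O(state):           # observation function over the hidden state
    return state % 2

def rollout(s0, actions):
    states, obs = [s0], [O(s0)]
    for a in actions:
        states.append(T(states[-1], a))
        obs.append(O(states[-1]))
    return states, obs

# Changing a later action leaves all earlier states/observations fixed.
states1, obs1 = rollout(0, [1, 1, 1])
states2, obs2 = rollout(0, [1, 1, 5])   # only the final action differs
print(states1[:3] == states2[:3], obs1[:3] == obs2[:3])  # True True
```

This is exactly the "no backwards assumption" at issue: in this factorization there is simply no channel by which a late action could reach an early state.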

Comment by jessica-liu-taylor on Two Alternatives to Logical Counterfactuals · 2020-04-07T03:49:35.526Z · score: 2 (1 votes) · LW · GW

The way you are using it doesn’t necessarily imply real control, it may be imaginary control.

I'm discussing a hypothetical agent who believes itself to have control. So its beliefs include "I have free will". Its belief isn't "I believe that I have free will".

It’s a “para-consistent material conditional” by which I mean the algorithm is limited in such a way as to prevent this explosion.

Yes, that makes sense.

However, were you flowing this all the way back in time?

Yes (see thread with Abram Demski).

What do you mean by dualistic?

Already factorized as an agent interacting with an environment.

Comment by jessica-liu-taylor on Two Alternatives to Logical Counterfactuals · 2020-04-06T18:55:42.139Z · score: 4 (2 votes) · LW · GW

Secondly, “free will” is such a loaded word that using it in a non-standard fashion simply obscures and confuses the discussion.

Wikipedia says "Free will is the ability to choose between different possible courses of action unimpeded." SEP says "The term “free will” has emerged over the past two millennia as the canonical designator for a significant kind of control over one’s actions." So my usage seems pretty standard.

For example, recently I’ve been arguing in favour of what counts as a valid counterfactual being at least partially a matter of social convention.

All word definitions are determined in large part by social convention. The question is whether the social convention corresponds to a definition (e.g. with truth conditions) or not. If it does, then the social convention is realist, if not, it's nonrealist (perhaps emotivist, etc).

Material conditions only provide the outcome when we have a consistent counterfactual.

Not necessarily. An agent may be uncertain over its own action, and thus have uncertainty about material conditionals involving its action. The "possible worlds" represented by this uncertainty may be logically inconsistent, in ways the agent can't determine before making the decision.

Proof-based UDT doesn’t quite use material conditionals, it uses a paraconsistent version of them instead.

I don't understand this? I thought it searched for proofs of the form "if I take this action, then I get at least this much utility", which is a material conditional.
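For concreteness, the search target as I understand it (writing A() for the agent's source code, U() for the universe's utility, and with the arrow being the ordinary material conditional of the base theory):

```latex
\vdash \; \big(A() = a\big) \rightarrow \big(U() \geq u\big)
```

The agent searches for such proofs for each action $a$ and threshold $u$, then takes the action with the best provable guarantee.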

So, to imagine counterfactually taking action Y we replace the agent doing X with another agent doing Y and flow causation both forwards and backwards.

Policy-dependent source code does this; one's source code depends on one's policy.

I guess from a philosophical perspective it makes sense to first consider whether policy-dependent source code makes sense and then if it does further ask whether UDT makes sense.

I think UDT makes sense in "dualistic" decision problems that are already factorized as "this policy leads to these consequences". Extending it to a nondualist case brings up difficulties, including the free will / determinism issue. Policy-dependent source code is a way of interpreting UDT in a setting with deterministic, knowable physics.

Comment by jessica-liu-taylor on Two Alternatives to Logical Counterfactuals · 2020-04-05T22:03:54.269Z · score: 6 (3 votes) · LW · GW

I think it's worth examining more closely what it means to be "not a pure optimizer". Formally, a VNM utility function is a rationalization of a coherent policy. Say that you have some idea about what your utility function is, U. Suppose you then decide to follow a policy that does not maximize U. Logically, it follows that U is not really your utility function; either your policy doesn't coherently maximize any utility function, or it maximizes some other utility function. (Because the utility function is, by definition, a rationalization of the policy)

Failing to disambiguate these two notions of "the agent's utility function" is a map-territory error.

Decision theories require, as input, a utility function to maximize, and output a policy. If a decision theory is adopted by an agent who is using it to determine their policy (rather than already knowing their policy), then they are operating on some preliminary idea about what their utility function is. Their "actual" utility function is dependent on their policy; it need not match up with their idea.

So, it is very much possible for an agent who is operating on an idea U of their utility function, to evaluate counterfactuals in which their true behavioral utility function is not U. Indeed, this is implied by the fact that utility functions are rationalizations of policies.

Let's look at the "turn left/right" example. The agent is operating on a utility function idea U, which is higher the more the agent turns left. When they evaluate the policy of turning "right" on the 10th time, they must conclude that, in this hypothetical, either (a) "right" maximizes U, (b) they are maximizing some utility function other than U, or (c) they aren't a maximizer at all.

The logical counterfactual framework says the answer is (a): that the fixed computation of U-maximization results in turning right, not left. But, this is actually the weirdest of the three worlds. It is hard to imagine ways that "right" maximizes U, whereas it is easy to imagine that the agent is maximizing a utility function other than U, or is not a maximizer.

Yes, the (b) and (c) worlds may be weird in a problematic way. However, it is hard to imagine these being nearly as weird as (a).

One way they could be weird is that an agent having a complex utility function is likely to have been produced by a different process than an agent with a simple utility function. So the more weird exceptional decisions you make, the greater the evidence is that you were produced by the sort of process that produces complex utility functions.

This is pretty similar to the smoking lesion problem, then. I expect that policy-dependent source code will have a lot in common with EDT, as they both consider "what sort of agent I am" to be a consequence of one's policy. (However, as you've pointed out, there are important complications with the framing of the smoking lesion problem)

I think further disambiguation on this could benefit from re-analyzing the smoking lesion problem (or a similar problem), but I'm not sure if I have the right set of concepts for this yet.

Comment by jessica-liu-taylor on Referencing the Unreferencable · 2020-04-04T18:54:37.634Z · score: 4 (2 votes) · LW · GW

If you fix a notion of referenceability rather than equivocating, then the point that talking of unreferenceable entities is absurd will stand.

If you equivocate, then very little can be said in general about referenceability.

(I would say that "our universe's simulators" is referenceable, since it's positing something that causes sensory inputs)

Comment by jessica-liu-taylor on Two Alternatives to Logical Counterfactuals · 2020-04-04T18:46:10.569Z · score: 3 (2 votes) · LW · GW

It seems the approaches we're using are similar, in that they both are starting from observation/action history with posited falsifiable laws, with the agent's source code not known a priori, and the agent considering different policies.

Learning "my source code is A" is quite similar to learning "Omega predicts my action is equal to A()", so these would lead to similar results.

Policy-dependent source code, then, corresponds to Omega making different predictions depending on the agent's intended policy, such that when comparing policies, the agent has to imagine Omega predicting differently (as it would imagine learning different source code under policy-dependent source code).

Comment by jessica-liu-taylor on Two Alternatives to Logical Counterfactuals · 2020-04-03T21:48:37.739Z · score: 5 (3 votes) · LW · GW

I agree this is a problem, but isn't this a problem for logical counterfactual approaches as well? Isn't it also weird for a known fixed optimizer source code to produce a different result on this decision where it's obvious that 'left' is the best decision?

If you assume that the agent chose 'right', it's more reasonable to think it's because it's not a pure optimizer than that a pure optimizer would have chosen 'right', in my view.

If you form the intent to, as a policy, go 'right' on the 100th turn, you should anticipate learning that your source code is not the code of a pure optimizer.

Comment by jessica-liu-taylor on Two Alternatives to Logical Counterfactuals · 2020-04-03T21:43:43.291Z · score: 2 (1 votes) · LW · GW

This indeed makes sense when "obs" is itself a logical fact. If obs is a sensory input, though, 'A(obs) = act' is a logical fact, not a logical counterfactual. (I'm not trying to avoid causal interpretations of source code interpreters here, just logical counterfactuals)

Comment by jessica-liu-taylor on Two Alternatives to Logical Counterfactuals · 2020-04-02T22:20:08.537Z · score: 9 (5 votes) · LW · GW

In the happy dance problem, when the agent is considering doing a happy dance, the agent should have already updated on M. This is more like timeless decision theory than updateless decision theory.

Conditioning on 'A(obs) = act' is still a conditional, not a counterfactual. The difference between conditionals and counterfactuals is the difference between "If Oswald didn't kill Kennedy, then someone else did" and "If Oswald didn't kill Kennedy, then someone else would have".

Indeed, troll bridge will present a problem for "playing chicken" approaches, which are probably necessary in counterfactual nonrealism.

For policy-dependent source code, I intend for the agent to be logically updateful, while updateless about observations.

Why is this much better than counterfactuals which keep the source code fixed but imagine the execution trace being different?

Because it doesn't lead to logical incoherence, so reasoning about counterfactuals doesn't have to be limited.

This seems to only push the rough spots further back—there can still be contradictions, e.g. between the source code and the process by which programmers wrote the source code.

If you see your source code is B instead of A, you should anticipate learning that the programmers programmed B instead of A, which means something was different in the process. So the counterfactual has implications backwards in physical time.

At some point it will ground out in: different indexical facts, different laws of physics, different initial conditions, different random events...

This theory isn't worked out yet, but it doesn't seem that it will run into logical incoherence the way logical counterfactuals do.

But then we are faced with the usual questions about spurious counterfactuals, chicken rule, exploration, and Troll Bridge.

Maybe some of these.

Spurious counterfactuals require getting a proof of "I will take action X". The proof proceeds by showing "source code A outputs action X". But an agent who accepts policy-dependent source code will believe they have source code other than A if they don't take action X. So the spurious proof doesn't prevent the counterfactual from being evaluated.

Chicken rule is hence unnecessary.

Exploration is a matter of whether the world model is any good; the world model may, for example, map a policy to a distribution of expected observations. (That is, the world model already has policy counterfactuals as part of it; theories such as physics provide constraints on the world model rather than fully determining it). Learning a good world model is of course a problem in any approach.
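As a toy illustration (my own hypothetical sketch, not from the original discussion), a world model that maps policies directly to outcome distributions can be as simple as a function from policy to payoff. Here it is specialized to a Newcomb-like problem, where Omega's prediction tracks the agent's policy itself:

```python
# Hypothetical sketch: a world model with policy counterfactuals built in,
# so comparing policies never requires a logical counterfactual.
# Newcomb-like setup: Omega's prediction matches the agent's policy.

def world_model(policy):
    """Payoff for adopting `policy`; Omega predicts the policy itself."""
    prediction = policy()                        # Omega's prediction
    box_b = 1_000_000 if prediction == "one-box" else 0
    action = policy()                            # the action actually taken
    bonus = 1_000 if action == "two-box" else 0  # transparent box A
    return box_b + bonus

def one_box():
    return "one-box"

def two_box():
    return "two-box"

# Comparing policies only requires evaluating the model on each one.
best = max([one_box, two_box], key=world_model)
print(best.__name__, world_model(best))  # one_box 1000000
```

Physics-style theories would then constrain `world_model` rather than fully determine it; learning a good such model is the hard part in any approach.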

Whether troll bridge is a problem depends on how the source code counterfactual is evaluated. Indeed, many ways of running this counterfactual (e.g. inserting special cases into the source code) are "stupid" and could be punished in a troll bridge problem.

I by no means think "policy-dependent source code" is presently a well worked-out theory; the advantage relative to logical counterfactuals is that in the latter case, there is a strong theoretical obstacle to ever having a well worked-out theory, namely logical incoherence of the counterfactuals. Hence, coming up with a theory of policy-dependent source code seems more likely to succeed than coming up with a theory of logical counterfactuals.

Comment by jessica-liu-taylor on The absurdity of un-referenceable entities · 2020-04-02T20:39:16.322Z · score: 6 (3 votes) · LW · GW

It seems fine to have categories that are necessarily empty, such as "numbers that are both odd and even". "Non-ontologizable thing" may be such a category. Or it may be vaguer than that; I'm not sure.

Comment by jessica-liu-taylor on Two Alternatives to Logical Counterfactuals · 2020-04-02T18:10:55.234Z · score: 2 (1 votes) · LW · GW

I'm not using "free will" to mean something distinct from "the ability of an agent, from its perspective, to choose one of multiple possible actions". Maybe this usage is nonstandard but find/replace yields the right meaning.

Comment by jessica-liu-taylor on Two Alternatives to Logical Counterfactuals · 2020-04-02T00:50:02.362Z · score: 4 (2 votes) · LW · GW

For counterfactual nonrealism, it's simply the uncertainty an agent has about their own action, while believing themselves to control their action.

For policy-dependent source code, the "different possibilities" correspond to different source code. An agent with fixed source code can only take one possible action (from a logically omniscient perspective), but the counterfactuals change the agent's source code, getting around this constraint.
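
A minimal sketch (my own illustration, with hypothetical names) of how this avoids the contradiction: in a 5-and-10-style problem where 'left' is obviously best, each candidate action corresponds to a *different* program, so no fixed program is ever assumed to output something it doesn't.

```python
# Hypothetical sketch: counterfactuals as alternative source code.
# Each candidate action corresponds to a distinct program, so we never
# assume a fixed program outputs something other than what it computes.

def utility(action):
    return {"left": 10, "right": 5}[action]

# Each entry stands in for a distinct piece of source code.
candidate_source_codes = {
    "A_left": lambda: "left",
    "A_right": lambda: "right",
}

def evaluate(source_code):
    # Run the alternative program itself; no contradiction arises with
    # the output of any other fixed program.
    return utility(source_code())

best = max(candidate_source_codes,
           key=lambda name: evaluate(candidate_source_codes[name]))
print(best)  # A_left
```

The evaluation step substitutes a whole program rather than holding one program fixed and imagining a different execution trace.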

Comment by jessica-liu-taylor on The absurdity of un-referenceable entities · 2020-04-01T22:33:26.774Z · score: 4 (2 votes) · LW · GW

The absurdity comes not from believing that some agent lacks the ability to reference some entity that you can reference, but from believing that you lack the ability to reference some entity that you are nonetheless talking about.

In the second case, you are ontologizing something that is by definition not ontologizable.

If there's a particular agent thinking about me, then I can refer to that agent ("the one thinking about me"), hence referring to whatever they can refer to. It is indeed easy to neglect the possibility that someone is thinking about me, but that differs from in-principle unreferenceability.

I don't believe in views from nowhere; I don't think the concept holds up to scrutiny. In contrast, particular directions of zoom-out lead to views from particular referenceable places.

Comment by jessica-liu-taylor on Two Alternatives to Logical Counterfactuals · 2020-04-01T19:05:53.402Z · score: 2 (1 votes) · LW · GW

Those are the conditions in which logical counterfactuals are most well-motivated. If there isn't determinism or known source code then there isn't an obvious reason to be considering impossible possible worlds.

Comment by jessica-liu-taylor on Two Alternatives to Logical Counterfactuals · 2020-04-01T18:38:58.125Z · score: 2 (1 votes) · LW · GW

They are logically incoherent in themselves though. Suppose the agent's source code is "A". Suppose that in fact, A returns action X. Consider a logical counterfactual "possible world" where A returns action Y. In this logical counterfactual, it is possible to deduce a contradiction: A returns X (by computation/logic) and returns Y (by assumption) and X is not equal to Y. Hence by the principle of explosion, everything is true.

It isn't necessary to observe that A returns X in real life, it can be deduced from logic.

(Note that this doesn't exclude the logical material conditionals described in the post, only logical counterfactuals)

Comment by jessica-liu-taylor on Two Alternatives to Logical Counterfactuals · 2020-04-01T18:14:04.589Z · score: 4 (2 votes) · LW · GW

Yes, this is a specific way of doing policy-dependent source code, which minimizes how much the source code has to change to handle the counterfactual.

Haven't looked deeply into the paper yet but the basic idea seems sound.

Comment by jessica-liu-taylor on Two Alternatives to Logical Counterfactuals · 2020-04-01T18:06:45.065Z · score: 2 (1 votes) · LW · GW

They're logically incoherent so your reasoning about them is limited. If you gain in computing power then you need to stop being a realist about them or else your reasoning explodes.

Comment by jessica-liu-taylor on Two Alternatives to Logical Counterfactuals · 2020-04-01T17:42:20.725Z · score: 2 (1 votes) · LW · GW

This is exactly what is described in the counterfactual nonrealism section.

Comment by jessica-liu-taylor on Two Alternatives to Logical Counterfactuals · 2020-04-01T16:56:26.539Z · score: 4 (2 votes) · LW · GW

Without some assumption similar to "free will" it is hard to do any decision theory at all, as you can't compare different actions; there is only one possible action.

The counterfactual nonrealist position is closer to determinism than the policy-dependent source code position is. It assumes that the algorithm controls the decision while the output of the algorithm is unknown.

Comment by jessica-liu-taylor on Two Alternatives to Logical Counterfactuals · 2020-04-01T16:52:36.461Z · score: 4 (2 votes) · LW · GW

The summary is correct.

Indeed, it is underdetermined what the alternative source code is. Sometimes it doesn't matter (this is the case in most decision problems), and sometimes there is a family of programs that can be assumed. But this still presents theoretical problems.

The motivation is to be a nonrealist about logical counterfactuals while being a realist about some counterfactuals.

Comment by jessica-liu-taylor on How special are human brains among animal brains? · 2020-04-01T05:56:25.070Z · score: 7 (4 votes) · LW · GW

The most quintessentially human intellectual accomplishments (e.g. proving theorems, composing symphonies, going into space) were only made possible by culture post-agricultural revolution.

I'm guessing you mean the beginning of agriculture and not the Agricultural Revolution (18th century), which came much later than math and after Baroque music. But the wording is ambiguous.

Comment by jessica-liu-taylor on A critical agential account of free will, causation, and physics · 2020-03-30T22:26:48.316Z · score: 2 (1 votes) · LW · GW

It's a subjectivist approach similar to Bayesianism, starting from the perspective of a given subject. Unlike in idealism, there is no assertion that everything is mental.

Comment by jessica-liu-taylor on Thinking About Filtered Evidence Is (Very!) Hard · 2020-03-23T18:24:36.348Z · score: 2 (1 votes) · LW · GW

Thanks to the rich hypothesis space assumption, the listener will assign some probability to the speaker enumerating theorems of PA (Peano Arithmetic). Since this hypothesis makes distinct predictions, it is possible for the confidence to rise above 50% after finitely many observations.

But this isn't computable so it won't be contained even in a rich hypothesis space. It seems like you're simultaneously assuming that PA is computable (so it's in the rich hypothesis space) and uncomputable (to reach contradiction).

On the other hand, you could consider the listener's beliefs to be expressed as a function of the speaker's statements, in which case the function is computable (being the identity function). However, in this case it is unsurprising that the listener's beliefs (as the output of the function) are uncomputable, as they are a function of the speaker's statements, which were not assumed to be computable.

Comment by jessica-liu-taylor on Why Telling People They Don't Need Masks Backfired · 2020-03-18T07:42:33.872Z · score: 8 (6 votes) · LW · GW

It takes 10x as much work to refute bullshit as to produce it, and 100x as much work to show that the bullshit has net negative consequences as it does to produce it (numbers approximate, of course). This has strong implications for social epistemology, constitutional law, and security.

Comment by jessica-liu-taylor on A critical agential account of free will, causation, and physics · 2020-03-16T20:39:30.237Z · score: 2 (1 votes) · LW · GW

Let R be a relation. Then "R(X, Y)" contradicts "not R(X, Y)". Of course people can misuse language to be tricky about this.

Special relativity is falsifiable even though it defines position/velocity relationally.

Falsifiability, properly understood, is subjective, in that the falsifier must be some cognitive process that can make observations. Experimental results can only falsify theories if those results are observed by some cognitive process that can conceptualize the theory. Unobservable experimental results are of no use.

(Yes, the cognitive process may be a standardized intersubjective one, if the observations and theories are common knowledge; Popper emphasizes this intersubjectivity in The Logic of Scientific Discovery. However, if Robinson Crusoe is theoretically capable of science, this intersubjectivity is not strictly necessary)

Comment by jessica-liu-taylor on A Sketch of Answers for Physicalists · 2020-03-15T21:09:42.039Z · score: 4 (2 votes) · LW · GW

See this thread on Game of Life.

The simulator's perspective is outside "our universe" but not outside the totality; there are multiple possible simulable universes, like there are different video games. Mario's (hypothetical) notion of "a view outside this world" refers to a view of the world of Super Mario Bros, and this differs depending on the video game character. Additionally, the video game players / simulators live in their own world, which is part of the totality.

Any given perspective can imagine zooming out by a given "distance" (in terms of space, time, simulation level, multiversal branch, perhaps others). This yields a sequence of views, each of which is dependent on the initial perspective. Perhaps the "view from nowhere" may be considered the limit of this process. I am not convinced this limit can be coherently reified as a referenceable thing, however. In addition, such a view would be infinitely far from our own, and it would take an infinite time to zoom in from there to our actual here-and-now location.