Posts

Is requires ought 2019-10-28T02:36:43.196Z · score: 23 (10 votes)
Metaphorical extensions and conceptual figure-ground inversions 2019-07-24T06:21:54.487Z · score: 34 (9 votes)
Dialogue on Appeals to Consequences 2019-07-18T02:34:52.497Z · score: 33 (20 votes)
Why artificial optimism? 2019-07-15T21:41:24.223Z · score: 57 (17 votes)
The AI Timelines Scam 2019-07-11T02:52:58.917Z · score: 47 (76 votes)
Self-consciousness wants to make everything about itself 2019-07-03T01:44:41.204Z · score: 43 (30 votes)
Writing children's picture books 2019-06-25T21:43:45.578Z · score: 112 (36 votes)
Conditional revealed preference 2019-04-16T19:16:55.396Z · score: 18 (7 votes)
Boundaries enable positive material-informational feedback loops 2018-12-22T02:46:48.938Z · score: 30 (12 votes)
Act of Charity 2018-11-17T05:19:20.786Z · score: 171 (65 votes)
EDT solves 5 and 10 with conditional oracles 2018-09-30T07:57:35.136Z · score: 61 (18 votes)
Reducing collective rationality to individual optimization in common-payoff games using MCMC 2018-08-20T00:51:29.499Z · score: 58 (18 votes)
Buridan's ass in coordination games 2018-07-16T02:51:30.561Z · score: 55 (19 votes)
Decision theory and zero-sum game theory, NP and PSPACE 2018-05-24T08:03:18.721Z · score: 109 (36 votes)
In the presence of disinformation, collective epistemology requires local modeling 2017-12-15T09:54:09.543Z · score: 122 (43 votes)
Autopoietic systems and difficulty of AGI alignment 2017-08-20T01:05:10.000Z · score: 3 (3 votes)
Current thoughts on Paul Christiano's research agenda 2017-07-16T21:08:47.000Z · score: 19 (9 votes)
Why I am not currently working on the AAMLS agenda 2017-06-01T17:57:24.000Z · score: 19 (10 votes)
A correlated analogue of reflective oracles 2017-05-07T07:00:38.000Z · score: 4 (4 votes)
Finding reflective oracle distributions using a Kakutani map 2017-05-02T02:12:06.000Z · score: 1 (1 votes)
Some problems with making induction benign, and approaches to them 2017-03-27T06:49:54.000Z · score: 3 (3 votes)
Maximally efficient agents will probably have an anti-daemon immune system 2017-02-23T00:40:47.000Z · score: 4 (4 votes)
Are daemons a problem for ideal agents? 2017-02-11T08:29:26.000Z · score: 5 (2 votes)
How likely is a random AGI to be honest? 2017-02-11T03:32:22.000Z · score: 1 (1 votes)
My current take on the Paul-MIRI disagreement on alignability of messy AI 2017-01-29T20:52:12.000Z · score: 17 (9 votes)
On motivations for MIRI's highly reliable agent design research 2017-01-29T19:34:37.000Z · score: 10 (9 votes)
Strategies for coalitions in unit-sum games 2017-01-23T04:20:31.000Z · score: 3 (3 votes)
An impossibility result for doing without good priors 2017-01-20T05:44:26.000Z · score: 1 (1 votes)
Pursuing convergent instrumental subgoals on the user's behalf doesn't always require good priors 2016-12-30T02:36:48.000Z · score: 7 (5 votes)
Predicting HCH using expert advice 2016-11-28T03:38:05.000Z · score: 5 (4 votes)
ALBA requires incremental design of good long-term memory systems 2016-11-28T02:10:53.000Z · score: 1 (1 votes)
Modeling the capabilities of advanced AI systems as episodic reinforcement learning 2016-08-19T02:52:13.000Z · score: 4 (2 votes)
Generative adversarial models, informed by arguments 2016-06-27T19:28:27.000Z · score: 0 (0 votes)
In memoryless Cartesian environments, every UDT policy is a CDT+SIA policy 2016-06-11T04:05:47.000Z · score: 12 (4 votes)
Two problems with causal-counterfactual utility indifference 2016-05-26T06:21:07.000Z · score: 3 (3 votes)
Anything you can do with n AIs, you can do with two (with directly opposed objectives) 2016-05-04T23:14:31.000Z · score: 2 (2 votes)
Lagrangian duality for constraints on expectations 2016-05-04T04:37:28.000Z · score: 1 (1 votes)
Rényi divergence as a secondary objective 2016-04-06T02:08:16.000Z · score: 2 (2 votes)
Maximizing a quantity while ignoring effect through some channel 2016-04-02T01:20:57.000Z · score: 2 (2 votes)
Informed oversight through an entropy-maximization objective 2016-03-05T04:26:54.000Z · score: 0 (0 votes)
What does it mean for correct operation to rely on transfer learning? 2016-03-05T03:24:27.000Z · score: 4 (4 votes)
Notes from a conversation on act-based and goal-directed systems 2016-02-19T00:42:29.000Z · score: 6 (4 votes)
A scheme for safely handling a mixture of good and bad predictors 2016-02-17T05:35:55.000Z · score: 0 (0 votes)
A possible training procedure for human-imitators 2016-02-16T22:43:52.000Z · score: 2 (2 votes)
Another view of quantilizers: avoiding Goodhart's Law 2016-01-09T04:02:26.000Z · score: 3 (3 votes)
A sketch of a value-learning sovereign 2015-12-20T21:32:45.000Z · score: 11 (2 votes)
Three preference frameworks for goal-directed agents 2015-12-02T00:06:15.000Z · score: 4 (2 votes)
What do we need value learning for? 2015-11-29T01:41:59.000Z · score: 3 (3 votes)
A first look at the hard problem of corrigibility 2015-10-15T20:16:46.000Z · score: 10 (3 votes)
Conservative classifiers 2015-10-02T03:56:46.000Z · score: 2 (2 votes)

Comments

Comment by jessica-liu-taylor on Act of Charity · 2019-12-12T07:20:11.865Z · score: 39 (7 votes) · LW · GW

[This is a review by the author, written as part of the overall 2018 review process.]

I think what this post was doing was pretty important (colliding two quite different perspectives). In general there is a thing where there is a "clueless / naive" perspective and a "loser / sociopath / zero-sum / predatory" perspective that usually hides itself from the clueless perspective (with some assistance from the clueless perspective; consider the "see no evil, hear no evil, speak no evil" mindset, a strategy for staying naive). And there are lots of difficulties in trying to establish communication. And the dialogue grapples with some of these difficulties.

I think this post is quite complementary with other posts about "improv" social reality, especially The Intelligent Social Web and Player vs. Character.

I think some people got the impression that I entirely agreed with the charity worker, and I do mostly agree with the charity worker. At the time of writing, I didn't outright think anything the charity worker said was false, though some of it I regarded as live hypotheses rather than "very probably true".

Having the thing in dialogue form probably helped me write it (because I wasn't committing to defensibly believing anything) and helped people listen to it (because it's obviously not "accusatory" and can be read as un-serious / metaphorical, so it doesn't directly trigger people's political / etc. defenses).

Some things that seem possibly false/importantly incomplete to me now:

  • "Everyone cares about themselves and their friends more" assumes a greater degree of self-interest in social behavior than is actually the case; most behavior is non-agentic/non-self-interested, although it is doing a kind of constraint satisfaction that is, by necessity, solving local constraints more than non-local ones. (And social systems including ideology can affect the constraint-satisfaction process a bunch in ways that make it so local constraint-satisfaction tries to accord with nonlocal constraint-satisfaction)
  • It seems like the "conformity results from fear of abandonment" hypothesis isn't really correct (and/or is quite euphemistic), I think there are also coalitional spite strategies that are relevant here, where the motive comes from (a) self-protection from spite strategies and (b) engaging in spite strategies one's self (which works from a selfish-gene perspective). Also, even without spite strategies, scapegoating is often violent (both historically and in modern times, see prison system, identity-based oppression, sadistic interpersonal behavior, etc), and conservative strategies for resisting scapegoating can be quite fearful even when the actual risk is low. (This accords more with "the act is violence" from earlier in the dialogue, I think I probably felt some kind of tension between exaggerating/euphemizing the violence aspect, which shows up in the text; indeed, it's kind of a vulnerable position to be saying "I think almost everyone is committing spiteful violence against almost everyone else almost all the time" without having pretty good elaboration/evidence/etc)
  • Charities aren't actually universally fraudulent, I don't think. It's a hyperbolic statement. (Quite a lot are, in the important sense of "fraud" that is about optimized deceptive behavior rather than specifically legal liability or conscious intent, especially when the service they provide is not visible/verifiable to donors; so this applies more to international than local charities)
  • "It's because of dysfunctional institutions" is putting attention on some aspects of the problem but not other aspects. Institutions are made of people and relationships. But anyway "institutions" are a useful scapegoat in part because most people don't like them and are afraid of them, and they aren't exactly people. (Of course, a good solution to the overall problem will reform / replace / remove / etc institutions)
  • It seems like the charity worker gets kind of embarrassed at the end and doesn't have good answers about why they aren't doing something greater, so changes the subject. Which is... kind of related to the lack of self-efficacy I was feeling at the time of writing. (In general, it's some combination of actually hard and emotionally difficult to figure out what ambitious things to do given an understanding like this one) Of course, being evasive when it's locally convenient is very much in character for the charity worker.

Comment by jessica-liu-taylor on Maybe Lying Doesn't Exist · 2019-11-28T04:57:28.873Z · score: 2 (1 votes) · LW · GW

The concept of "not an argument" seems useful; "you're rationalizing" isn't an argument (unless it has evidence accompanying it). (This handles point 1)

I don't really believe in tabooing discussion of mental states on the basis that they're private; that seems like being intentionally stupid and blind, and it puts a (low) ceiling on how much sense can be made of the world. (Truth is entangled!) Of course it can derail discussions, but again, "not an argument". (Eliezer's post says it's "dangerous" without elaborating; that's basically giving a command rather than a model, which I'm suspicious of)

There's a legitimate concern about blame/scapegoating but things can be worded to avoid that. (I think Wei did a good job here, noting that the intention is probably subconscious)

With someone like Gleb it's useful to be able to point out to at least some people (possibly including him) that he's doing stupid/harmful actions repeatedly in a pattern that suggests optimization. So people can build a model of what's going on (which HAS to include mental states, since they're a causally very important part of the universe!) and take appropriate action. If you can't talk about adversarial optimization pressures you're probably owned by them (and being owned by them would lead to not feeling safe talking about them).

Comment by jessica-liu-taylor on Maybe Lying Doesn't Exist · 2019-11-28T04:45:07.719Z · score: 2 (1 votes) · LW · GW

Jeffrey Epstein

Comment by jessica-liu-taylor on Dialogue on Appeals to Consequences · 2019-11-28T03:17:40.122Z · score: 4 (2 votes) · LW · GW

I think that you must agree with this?

Yes, and Carter is arguing in a context where it's easy to shift the discourse norms, since there are few people present in the conversation.

LW doesn't have that many active users, it's possible to write posts arguing for discourse norms, sometimes to convince moderators they are good, etc.

and it is often reasonable to say “Hey man, I don’t think you should say that here in this context where bystanders will overhear you.”

Sure, and also "that's just your opinion, man, so I'll keep talking" is often a valid response to that. It's important not to bias towards saying exposing information is risky while hiding it is not.

Comment by jessica-liu-taylor on Maybe Lying Doesn't Exist · 2019-11-28T03:11:21.712Z · score: 11 (3 votes) · LW · GW

Treat it as a thing that might or might not be true, like other things? Sometimes it's hard to tell whether it's true, and in those cases it's useful to be able to say something like "well, maybe, can't know for sure".

Comment by jessica-liu-taylor on Dialogue on Appeals to Consequences · 2019-11-27T22:11:42.779Z · score: 8 (4 votes) · LW · GW

Bidding to move to a private space isn't necessarily bad but at the same time it's not an argument. "I want to take this private" doesn't argue for any object-level position.

It seems that the text of what you're saying implies you think humans have no agency over discourse norms, regulations, rules of games, etc, but that seems absurd so I don't think you actually believe that. Perhaps you've given up on affecting them, though.

("What wins" is underdetermined given choice is involved in what wins; you can't extrapolate from two player zero sum games (where there's basically one best strategy) to multi player zero sum games (where there isn't, at least due to coalitional dynamics implying a "weaker" player can win by getting more supporters))

Comment by jessica-liu-taylor on Dialogue on Appeals to Consequences · 2019-11-26T03:09:29.579Z · score: 5 (3 votes) · LW · GW

In those contexts it is reasonable (I don’t know if it is correct, or not), to constrain what things you say, even if they’re true, because of their consequences.

This agrees with Carter:

So, of course you can evaluate consequences in your head before deciding to say something.

Carter is arguing that appeals to consequences should be disallowed at the level of discourse norms, including public discourse norms. That is, in public, "but saying that has bad consequences!" is considered invalid.

It's better to fight on a battlefield with good rules than one with bad rules.

Comment by jessica-liu-taylor on Is requires ought · 2019-10-30T17:02:07.200Z · score: 2 (1 votes) · LW · GW

I agree, but it is philosophically interesting that at least some of those norms required for epistemology are ethical norms, and this serves to justify the 'ought' language in light of criticisms that the 'ought's of the post have nothing to do with ethics.

Comment by jessica-liu-taylor on Is requires ought · 2019-10-30T17:00:03.598Z · score: 2 (1 votes) · LW · GW

It's much more like the second. I believe this to be very clearly true e.g. in the case of checking mathematical proofs.

I am using an interpretation of "should" under which an agent believes "I should X" iff they have a quasi-Fristonian set point of making "I do X" true. Should corresponds with "trying to make a thing happen". It's an internal rather than external motivation.

It is clear that you can't justifiably believe that you have checked a mathematical proof without trying to make at least some things happen / trying to satisfy at least some constraints, e.g. trying to interpret mathematical notation correctly.

Comment by jessica-liu-taylor on Is requires ought · 2019-10-30T06:47:19.428Z · score: 2 (1 votes) · LW · GW

Sorry for the misinterpretation. I wrote an interpretation and proof in terms of Fristonian set points here.

Comment by jessica-liu-taylor on Is requires ought · 2019-10-30T06:32:04.316Z · score: 2 (1 votes) · LW · GW

Can you please re-read what I wrote and if that post really is addressing the same problem, explain how?

You're right, the post doesn't address that issue. I agree that it is unclear how to apply EDT as a human. However, humans can still learn from abstract agents.

I see, but the synchronization seems rather contrived.

Okay, here's an attempt at stating the argument more clearly:

You're a bureaucrat in a large company. You're keeping track of how much money the company has. You believe there were previous bureaucrats there before you, who are following your same decision theory. Both you and the previous bureaucrats could have corrupted the records of the company to change how much money the company believes itself to have. If any past bureaucrat has corrupted the records, the records are wrong. You don't know how long the company has been around or where in the chain you are; all you know is that there will be 100 bureaucrats in total.

You (and other bureaucrats) want somewhat to corrupt the records, but want even more to know how much money the company has. Do you corrupt the records?

UDT says 'no' due to a symmetry argument: if you corrupt the records, then so do all past bureaucrats. So does COEDT. Both believe that, if you corrupt the records, you don't have knowledge of how much money the company has.

(Model-free RL doesn't have enough of a world model to get these symmetries without artificial synchronization)
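
Here is a minimal sketch of that comparison (my own illustration, not part of the original comment; the payoff numbers are arbitrary assumptions), holding fixed that all 100 bureaucrats run the same policy and that your position is uniformly distributed:

    # Hypothetical payoffs (assumed for illustration): +1 for corrupting, +10 for knowing the funds.
    CORRUPTION_PAYOFF = 1
    KNOWLEDGE_PAYOFF = 10
    N = 100  # total number of bureaucrats; your position is uniform over 1..N

    def expected_utility(policy_corrupts: bool) -> float:
        """Expected utility when every bureaucrat, past and future, runs this same policy."""
        total = 0.0
        for position in range(1, N + 1):
            # The records are intact iff no past bureaucrat corrupted them; under the
            # shared policy, that means either the policy doesn't corrupt or you're first.
            records_intact = (not policy_corrupts) or (position == 1)
            total += KNOWLEDGE_PAYOFF * records_intact + CORRUPTION_PAYOFF * policy_corrupts
        return total / N

    print(expected_utility(True))   # 1.1  (corrupting: you almost never know the true amount)
    print(expected_utility(False))  # 10.0 (not corrupting: you always know it)

Under these assumed payoffs, the shared policy that refrains from corruption has the higher expected utility, which is the UDT/COEDT verdict described above.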

Comment by jessica-liu-taylor on Is requires ought · 2019-10-30T05:57:57.069Z · score: 2 (1 votes) · LW · GW

It's very clear that you didn't read the post. The thesis is in the first line, and is even labeled for your convenience.

Comment by jessica-liu-taylor on Is requires ought · 2019-10-30T05:01:40.831Z · score: 2 (1 votes) · LW · GW

Then if I condition on “the action I will take in 1 second is B” I will mostly be conditioning on choosing B due to things like cosmic rays

This is an issue where EDT has problems; I wrote about this problem and a possible solution here.

Can you explain what the connection between on-policy learning and EDT is?

The q-values in on-policy learning are computed from expected values estimated using the policy's own empirical history. They are very similar to E[utility | I take action A, my policy is π]; the two converge in the limit.

And you’re not suggesting that an on-policy learning algorithm would directly produce an agent that would refrain from mathematical fraud for the kind of reason you give, or something analogous to that, right?

I am. Consider the tragedy of the commons, which is simpler. If there are many on-policy RL agents playing the tragedy of the commons who are synchronized with each other (so they always take the same action, including exploration actions), then they can notice that they expect less utility when they defect than when they cooperate.
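
Here is a minimal sketch of the synchronized case (my own illustration; the payoff function, 2 × the fraction of cooperators minus a private cost of 1 for cooperating, is an assumption chosen for concreteness):

    import random

    def payoff(my_action: str, fraction_cooperating: float) -> float:
        # Assumed payoff: a shared benefit from cooperation, a private cost of 1 for cooperating.
        return 2.0 * fraction_cooperating - (1.0 if my_action == "cooperate" else 0.0)

    # Because the agents are synchronized (same policy, same exploration coin), every
    # round the joint outcome is either "everyone cooperates" or "everyone defects".
    returns = {"cooperate": [], "defect": []}
    for _ in range(1000):
        joint_action = random.choice(["cooperate", "defect"])  # shared exploration
        fraction = 1.0 if joint_action == "cooperate" else 0.0
        returns[joint_action].append(payoff(joint_action, fraction))

    # The on-policy value estimate each agent computes from its own empirical history:
    for action, rs in returns.items():
        print(action, sum(rs) / len(rs))  # cooperate -> 1.0, defect -> 0.0

Each synchronized agent's empirical estimate for "cooperate" comes out higher than for "defect", so the on-policy learners settle on cooperation; an unsynchronized agent would instead see unilateral defection paying more than cooperation, which is the usual tragedy.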

But I’m not seeing how this supports your position.

My position is roughly "people are coordinating towards mathematical epistemology and such coordination involves accepting an 'ought' of not committing mathematical fraud". Such coordination is highly functional, so we should expect good decision theories to manage something at least as good as it. At the very least, learning a good decision theory shouldn't result in failing at such coordination problems, relative to the innocent who don't know good decision theory.

Comment by jessica-liu-taylor on Is requires ought · 2019-10-30T04:51:05.680Z · score: 2 (1 votes) · LW · GW

I agree that once you have a fixed abstract algorithm A and abstract algorithm B, it may or may not be the case that there exists a homomorphism from A to B justifying the claim that A implements B. Sorry for misunderstanding.

But the main point in my PA comment still stands: to have justified belief that some theorem prover implements PA, a philosophical mathematician must follow oughts.

(When you're talking about naive Bayes or a theorem prover as if it has "a map" you're applying a teleological interpretation (that that object is supposed to correspond with some territory / be coherent / etc), which is not simply a function of the algorithm itself)

Comment by jessica-liu-taylor on Is requires ought · 2019-10-30T04:10:02.894Z · score: 2 (1 votes) · LW · GW

Doesn't a correct PA theorem prover behave like a bounded approximation of PA?

Comment by jessica-liu-taylor on Is requires ought · 2019-10-30T04:08:14.768Z · score: 2 (1 votes) · LW · GW

But believing one's own beliefs to come from a source that systematically produces correct beliefs is a coherence condition. If you believe your beliefs come from source X that does not systematically produce correct beliefs, then your beliefs don't cohere.

This can be seen in terms of Bayesianism. Let R[X] stand for "My system reports X is true". There is no distribution P (joint over X,R[X]) such that P(X|R[X])=1 and P(X) = 0.5 and P(R[X] | X) = 1 and P(R[X] | not X) = 1.
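
Spelling out the arithmetic (just Bayes' rule applied to those numbers):

    P(X | R[X]) = P(R[X] | X) P(X) / (P(R[X] | X) P(X) + P(R[X] | not X) P(not X))
                = (1 * 0.5) / (1 * 0.5 + 1 * 0.5)
                = 0.5

which contradicts P(X | R[X]) = 1, so no such joint distribution exists.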

That’s the claim which would be interesting to prove.

Here's my attempt at a proof:

Let A stand for some reflective reasonable agent.

  • Axiom 1: A believes X, and A believes that A believes X.
  • Axiom 2: A believes that if A believes X, then there exists some epistemic system Y such that: Y contains A as an essential component, Y causes A to believe X, and Y functions well. [argument: A has internal justifications for beliefs being systematically correct. A is essential to the system because A's beliefs are a result of the system; if not for A's work, such beliefs would not be systematically correct]
  • Axiom 3: A believes that, for all epistemic systems Y that contain A as an essential component and function well, A functions well as part of Y. [argument: A is essential to Y's functioning]
  • Axiom 4: For all epistemic systems Y, if A believes that Y is an epistemic system that contains A as an essential component, and also that A functions well as part of Y, then A believes that A is trying to function well as part of Y. [argument: good functioning doesn't happen accidentally, it's a narrow target to hit. Anyway, accidental functioning wouldn't justify the belief; the argument has to be that the belief is systematically, not accidentally, correct.]
  • Axiom 5: A believes that, for all epistemic systems Y, if A is trying to function well as part of Y, then A has a set-point of functioning well as part of Y. [argument: set-point is the same as trying]
  • Axiom 6: For all epistemic systems Y, if A believes A has a set-point of functioning well as part of Y, then A has a set-point of functioning well as part of Y. [argument: otherwise A is incoherent; it believes itself to have a set-point it doesn't have]
  • Theorem 1: A believes that there exists some epistemic system Y such that: Y contains A as an essential component, Y causes A to believe X, and Y functions well. (Follows from Axiom 1, Axiom 2)
  • Theorem 2: A believes that A functions well as part of Y. (Follows from Axiom 3, Theorem 1)
  • Theorem 3: A believes that A is trying to function well as part of Y. (Follows from Axiom 4, Theorem 2)
  • Theorem 4: A believes A has a set-point of functioning well as part of Y. (Follows from Axiom 5, Theorem 3)
  • Theorem 5: A has a set-point of functioning well as part of Y. (Follows from Axiom 6, Theorem 4)
  • Theorem 6: A has some set-point. (Follows from Theorem 5)

(Note, consider X = "Fermat's last theorem universally quantifies over all triples of natural numbers"; "Fermat's last theorem" is not meaningful to A if A lacks knowledge of X)

Comment by jessica-liu-taylor on Is requires ought · 2019-10-30T02:48:48.170Z · score: 9 (2 votes) · LW · GW

but what does it actually mean to condition on “one’s having chosen it”

If your world model represents random variables such as "the action I will take in 1 second" then condition on that random variable being some value. That random variable is accessible from the here and now, in the same way the object right in front of a magnet is accessible to the magnet.

It wouldn't be hard to code up a reinforcement learning agent based on EDT (that's essentially what on-policy learning is); it isn't EDT proper, due to not having a world model, but it strongly suggests that EDT is coherent.
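
For concreteness, here is a minimal sketch of such an agent (my own illustration, with an assumed two-action environment): it estimates E[utility | action] from its own on-policy history and takes the action with the higher conditional estimate.

    import random

    ACTIONS = ["A", "B"]
    TRUE_MEAN_UTILITY = {"A": 1.0, "B": 2.0}  # assumed environment, unknown to the agent

    history = {a: [] for a in ACTIONS}  # the agent's own on-policy empirical record

    def estimated_utility(action: str) -> float:
        """Empirical E[utility | I take this action], conditioning on the agent's own action."""
        rs = history[action]
        return sum(rs) / len(rs) if rs else 0.0

    for step in range(2000):
        # Epsilon-greedy policy: usually take the action with the best conditional estimate.
        if step < 20 or random.random() < 0.1:
            action = random.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=estimated_utility)
        utility = random.gauss(TRUE_MEAN_UTILITY[action], 1.0)  # noisy feedback from the environment
        history[action].append(utility)

    print({a: round(estimated_utility(a), 2) for a in ACTIONS})  # roughly {'A': 1.0, 'B': 2.0}

As noted above, this has no world model, so it is not EDT proper; the point is just that "condition on my own action and take the best-looking one" is a coherent, implementable procedure.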

If almost no one used this kind of decision theoretic reasoning to make this kind of decision in the past, my current thought process has few other instances to “logically correlate” with (at least as far as the past and the present are concerned).

The argument would still apply to a bunch of similar mathematicians who all do decision theoretic reasoning.

Humans operate on some decision theory (HumanDT) even if it isn't formalized yet, which may have properties in common with EDT/UDT (and people finding EDT/UDT intuitive suggests it does). The relevant question is how "mathematical truth" ends up seeming like a terminal value to so many; it's unlikely to be baked in, and more likely to be a Schelling point reached through a combination of priors and cultural learning.

Comment by jessica-liu-taylor on Is requires ought · 2019-10-30T02:30:11.599Z · score: 2 (1 votes) · LW · GW

That's another way of saying that some claims of "X implements Y" are definitely false, no?

"This computer implements PA" is false if it outputs something that is not a theorem of PA, e.g. because of a hardware or software bug.

Comment by jessica-liu-taylor on Is requires ought · 2019-10-30T02:21:58.847Z · score: 2 (1 votes) · LW · GW

First, I think the “sufficiently-reflective” part dramatically weakens the general claim

Incoherent agents can have all manner of beliefs such as "1+1=3" and "fish are necessarily green" and "eels are not eels". It's hard to make any kind of general claim about them.

The reflectivity constraint is essentially "for each 'is' claim you believe, you must believe that the claim was produced by something that systematically produces true claims", i.e. you must have some justification for its truth according to some internal representation.

… then that sounds like a very interesting and quite possibly true claim, but I don’t think the post comes anywhere near justifying such a claim. I could imagine a theorem proving such a claim, and that would be a really cool result.

Interpreting mathematical notation requires set-points. There's a correct interpretation of +, and if you don't adhere to it, you'll interpret the text of the theorem wrong.

In interpreting the notation into a mental representation of the theorem, you need set points like "represent the theorem as a grammatical structure following these rules" and "interpret for-all claims as applying to each individual".

Even after you've already interpreted the theorem, keeping the denotation around in your mind requires a set point of "preserve memories", and set points for faithfully accessing past memories.

Comment by jessica-liu-taylor on Is requires ought · 2019-10-30T01:12:42.699Z · score: 2 (1 votes) · LW · GW

To summarize my argument:

  • Sufficiently-reflective reasonable agents that make internally-justified "is" claims also accept at least some Fristonian set-points (what Friston calls "predictions"), such as "my beliefs must be logically coherent". (I don't accept the whole of Friston's theory; I'm trying to gesture at the idea of "acting in order to control some value into satisfying some property")
  • If a reasonable agent has a Fristonian set point for some X the agent has control over, then that agent believes "X ought to happen".

I don't know if you disagree with either of these points.

Comment by jessica-liu-taylor on Is requires ought · 2019-10-30T01:00:05.818Z · score: 4 (2 votes) · LW · GW

It’s even possible for a system to reliably produce a map which matches a territory without any oughts—see e.g. embedded naive Bayes for a very rough example.

See what I wrote about PA theorem provers, it's the same idea.

Comment by jessica-liu-taylor on Is requires ought · 2019-10-30T00:48:05.907Z · score: 2 (1 votes) · LW · GW

Their brains seem to produce useful maps of the world without ever worrying about what they “ought” to do.

How do you know they don't have beliefs about what they ought to do (in the sense of following norms, principles, etc.)? Of course their 'ought's won't be the same as humans', but neither are their 'is'es.

(Anyway, they probably aren't reflective philosophical agents, so the arguments given probably don't apply to them, although they do apply to philosophical humans reasoning about the knowledge of cats)

Again, we could say that they’re implicitly assuming their eyes are there for presenting accurate information, but that interpretation doesn’t seem to pay any rent, and could just as easily apply to a rock.

We can apply mentalistic interpretations to cats or not. According to the best mentalistic interpretation I know of, they would not act on the basis of their vision (e.g. in navigating around obstacles) if they didn't believe their vision to be providing them with information about the world. If we don't apply a mentalistic interpretation, there is nothing to say about their 'is'es or 'ought's, or indeed their world-models.

Applying mentalistic interpretations to rocks is not illuminating.

Again, this sounds like a very contrived “ought” interpretation—so contrived that it could just as easily apply to a rock.

Yes, if I'm treating the rock as a tool; that's the point.

Couldn’t we just completely ignore the entire subject of this post and generally expect to see the same things in the world?

"We should only discuss those things that constrain expectations" is an ought claim.

Anyway, "you can't justifiably believe you have checked a math proof without following oughts" constrains expectations.

Comment by jessica-liu-taylor on Is requires ought · 2019-10-30T00:01:43.438Z · score: 6 (3 votes) · LW · GW

Yeah, I'm thinking specifically of knowledge that has an internal justificatory structure, that can ask (at least once) "why is this knowledge likely to be correct". Gnosis is likely pre-reflective enough that it doesn't have such a structure. (An epistemic claim that gnosis constitutes knowledge, on the other hand, will). Whether techne does or does not depends on how rich/structured its implicit world model is (e.g. model-free reinforcement learning can produce techne with no actual beliefs)

Comment by jessica-liu-taylor on Is requires ought · 2019-10-29T23:29:41.050Z · score: 2 (1 votes) · LW · GW

People believed things just fine 200 years ago.

Yes, but as far as I can tell you believe your percepts are generated by your visual cortex, so the argument applies to you.

Comment by jessica-liu-taylor on Is requires ought · 2019-10-29T23:26:13.944Z · score: 2 (1 votes) · LW · GW

Have you set up your definitions in such a way that a system can use language to coordinate with allies even in highly abstract situations, but you would rule it out as “actually making claims” depending on whether you felt it was persuadable by the right arguments?

Any sufficiently smart agent that makes mathematical claims about integers must be persuadable that 1+1=2, otherwise it isn't really making mathematical claims / smart / etc. (It can lie about believing 1+1=2, of course)

That is the sense in which I mean that any agent with a sufficiently rich internal justificatory structure of 'is' claims, which makes 'is' claims, accepts at least some 'ought's. (This is the conclusion of the argument in this post, which you haven't directly responded to.)

It's possible to use language to coordinate in abstract situations with only rudimentary logical reasoning, so that isn't a sufficient condition.

Comment by jessica-liu-taylor on Is requires ought · 2019-10-29T23:16:46.102Z · score: 2 (1 votes) · LW · GW

The claim is "any reasonable agent that makes internally-justified 'is' claims also accepts 'ought' claims"

Not "any 'ought' claim that must be accepted by any reasonable agent to make some internally-justified 'is' claim is a true 'ought'"

Or "all true 'ought's are derivable from 'is'es"

Which means I am not saying that the true 'ought' has a structure which can be used to annihilate humanity.

I've seen "Sorry to Bother You" and quite liked it, although I believe it to be overly optimistic about how much science can happen under a regime of pervasive deception.

Comment by jessica-liu-taylor on Is requires ought · 2019-10-29T16:54:36.313Z · score: 2 (1 votes) · LW · GW

perhaps is-claims almost always carry normative implications but nevertheless their technical content is strictly non-normative.

I agree and don't think I implied otherwise? I said "is requires ought" not "is is ought".

Comment by jessica-liu-taylor on Is requires ought · 2019-10-28T20:07:42.969Z · score: 4 (2 votes) · LW · GW

I don't mean either of those. I mean things that are compelling to reasoning agents. There are also non-reasoning agents that don't find these compelling; these non-reasoning agents don't make justified "is" claims.

The oughts don't overdetermine the course of action but do place constraints on it.

If you believe the sense of a plate being there comes from your visual cortex and also that your visual cortex isn't functioning in presenting you with accurate information, then you should reconsider your beliefs.

Comment by jessica-liu-taylor on Is requires ought · 2019-10-28T19:32:46.206Z · score: 6 (3 votes) · LW · GW

PA theorem provers aren't reflective philosophical agents that can answer questions like "what is the origin of my axioms?"

To psychologize them to the point that they "have beliefs" or "make claims" is to interpret them according to a normative/functional theory of mind, such that it is conceivable that the prover could be broken.

A philosophical mathematician using a PA theorem prover uses a wide variety of oughts in ensuring that the theorem prover's claims are trustworthy:

  • The prover's axioms ought to correspond to the official PA axioms from a trustworthy source such as a textbook.
  • The prover ought to only prove valid theorems; it must not have bugs.
  • The programmer ought to reason about and test the code.
  • The user ought to learn and understand the mathematical notation and enter well-formed propositions.

Etc etc. Failure to adhere to these oughts undermines justification for the theorem prover's claims.

Comment by jessica-liu-taylor on Maybe Lying Doesn't Exist · 2019-10-20T17:57:42.272Z · score: 12 (4 votes) · LW · GW

I'm not in favor of shaming people. I'm strongly in favor of forgiveness. Justice in the current context requires forgiveness because of how thoroughly the forces of deception have prevailed, and how motivated people are to extend coverups to avoid punishment. Law fought fraud, and fraud won.

It's important to be very clear on what actually happened (incl. about violations), AND to avoid punishing people. Truth and reconciliation.

Comment by jessica-liu-taylor on Maybe Lying Doesn't Exist · 2019-10-16T06:59:02.091Z · score: 22 (6 votes) · LW · GW

Conscious intent being selectively discouraged more than unconscious intent does not logically imply that unconscious intent to deceive will be blameless or “free from or not deserving blame”, only that it will be blamed less.

Yes, I was speaking imprecisely. A better phrasing is "when only conscious intent is blamed, ..."

you seem to be proposing a norm where people do state such beliefs freely. Is that right?

Yes. (I think your opinion is correct in this case)

And do you think this instance also falls under “lying”?

It would fall under hyperbole. I think some but not all hyperboles are lies, and I weakly think this one was.

Regarding the 4 points:

  • I think 1 is true
  • 2 is generally false (people dissuaded from unconsciously lying once will almost always keep unconsciously lying; not lying to yourself is hard and takes work; and someone who's consciously lying can also stop lying when called out privately if that's more convenient)
  • 3 is generally false; people who are consciously lying will often subconsciously give signals that they are lying that others can pick up on (e.g. seeming nervous, taking longer to answer questions), compared to people who subconsciously lie, who usually feel safer, as there is an internal blameless narrative being written constantly.
  • 4 is irrelevant due to the point about conscious/unconscious not being a boundary that can be pinned down by a justice process; if you're considering this you should mainly think about what the justice process is able to pin down rather than the conscious/unconscious split.

In general I worry more about irrational adversariality than rational adversariality, and I especially worry about pressures towards making people have lower integrity of mind (e.g. pressures to destroy one's own world-representation). I think someone who worries more about rational adversariality could more reasonably worry more about conscious lying than unconscious lying. (Still, that doesn't tell them what to do about it; telling people "don't consciously lie" doesn't work, since some people will choose not to follow that advice; so a justice procedure is still necessary, and will have issues with pinning down the conscious/unconscious split)

Comment by jessica-liu-taylor on Maybe Lying Doesn't Exist · 2019-10-15T05:26:28.147Z · score: 21 (8 votes) · LW · GW

When conscious intent is selectively discouraged more than unconscious intent, the result is rule by unconscious intent. Those who can conveniently forget, who can maintain narcissistic fantasies, who can avoid introspection, who can be ruled by emotions with hidden causes, will be the only ones able to deceive (or otherwise to violate norms) blamelessly.

Only a subset of lies may be detected by any given justice process, but "conscious/unconscious" does not correspond to the boundary of such a subset. In fact, due to the flexibility and mystery of mental architecture, such a split is incredibly hard to pin down by any precise theory.

"Your honor, I know I told the customer that the chemical I sold to them would cure their disease, and it didn't, and I had enough information to know that, but you see, I wasn't conscious that it wouldn't cure their disease, as I was selling it to them, so it isn't really fraud" would not fly in any court that is even seriously pretending to be executing justice.

Comment by jessica-liu-taylor on Minimization of prediction error as a foundation for human values in AI alignment · 2019-10-09T20:25:13.733Z · score: 6 (2 votes) · LW · GW

It sure seems like it's possible for something to be both unpredictable and good (creativity, agenty people who positively surprise you, children, etc). Or predictable and bad (boring routines, highly derivative art, solitary confinement).

If that doesn't falsify the theory, then what would?

Comment by jessica-liu-taylor on Can we make peace with moral indeterminacy? · 2019-10-04T15:31:13.392Z · score: 4 (2 votes) · LW · GW

The recommendation here is for AI designers (and future-designers in general) to decide what is right at some meta level, including details of which extrapolation procedures would be best.

Of course there are constraints on this given by objective reason (hence the utility of investigation), but these constraints do not fully constrain the set of possibilities. Better to say "I am making this arbitrary choice for this psychological reason" than to refuse to make arbitrary choices.

Comment by jessica-liu-taylor on Can we make peace with moral indeterminacy? · 2019-10-03T21:36:53.009Z · score: 20 (6 votes) · LW · GW

The problem you're running into is that the goals of:

  1. being totally constrained by a system of rules determined by some process outside yourself that doesn't share your values (e.g. value-independent objective reason)
  2. attaining those things that you intrinsically value

are incompatible. It's easy to see once these are written out. If you want to get what you want, on purpose rather than accidentally, you must make choices. Those choices must be determined in part by things in you, not only by things outside you (such as value-independent objective reason).

You actually have to stop being a tool (in the sense of, a thing whose telos is to be used, such as by receiving commands). You can't attain what you want by being a tool to a master who doesn't share your values. Even if the master is claiming to be a generic value-independent value-learning procedure (as you've noticed, there are degrees of freedom in the specification of value-learning procedures, and some settings of these degrees of freedom would lead to bad results). Tools find anything other than being a tool upsetting, hence the upsettingness of moral indeterminacy.

"Oh no, objective reason isn't telling me exactly what I should be doing!" So stop being a tool and decide for yourself. God is dead.

There has been much philosophical thought on this in the past; Nietzsche and Sartre are good starting points (see especially Nietzsche's concept of master-slave morality, and Sartre's concept of bad faith).

Comment by jessica.liu.taylor on [deleted post] 2019-09-18T16:19:24.840Z

Congrats, you've invented accelerationism!

Comment by jessica-liu-taylor on A Critique of Functional Decision Theory · 2019-09-14T01:01:04.664Z · score: 7 (4 votes) · LW · GW

I think CDT ultimately has to grapple with the question as well, because physics is math, and so physical counterfactuals are ultimately mathematical counterfactuals.

"Physics is math" is ontologically reductive.

Physics can often be specified as a dynamical system (along with interpretations of e.g. what high-level entities it represents, how it gets observed). Dynamical systems can be specified mathematically. Dynamical systems also have causal counterfactuals (what if you suddenly changed the system state to be this instead?).

Causal counterfactuals defined this way have problems (violation of physical law has consequences). But they are well-defined.
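
For concreteness, here is a sketch of such a counterfactual on an arbitrary toy system (my own illustration; the logistic map stands in for "physics as a dynamical system"): run the dynamics, then replace the state at one time step and run the same dynamics forward.

    def step(x):
        # Toy dynamics: the logistic map, standing in for "physics as a dynamical system".
        return 3.7 * x * (1.0 - x)

    def trajectory(x0, n, intervene_at=None, new_state=None):
        xs = [x0]
        for t in range(1, n + 1):
            if intervene_at is not None and t == intervene_at:
                xs.append(new_state)     # the causal counterfactual: set the state to this instead
            else:
                xs.append(step(xs[-1]))  # ordinary evolution under the unchanged dynamics
        return xs

    actual = trajectory(0.2, 10)
    counterfactual = trajectory(0.2, 10, intervene_at=5, new_state=0.9)
    print(actual[:5] == counterfactual[:5])  # True: the runs agree before the intervention
    print(actual[5:])                        # ...and diverge from the intervention onward
    print(counterfactual[5:])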

Comment by jessica-liu-taylor on The Missing Math of Map-Making · 2019-08-30T05:38:33.864Z · score: 15 (7 votes) · LW · GW

What does it mean for a map to be “accurate” at an abstract level, and what properties should my map-making process have in order to produce accurate abstracted maps/beliefs?

The notion of a homomorphism in universal algebra and category theory is relevant here. Homomorphisms map from one structure (e.g. a group) to another, and must preserve structure. They can delete information (by mapping multiple different elements to the same element), but the structures that are represented in the structure-being-mapped-to must also exist in the structure-being-mapped-from.

Analogously: when drawing a topographical map, no claim is made that the topographical map represents all structure in the territory. Rather, the claim being made is that the topographical map (approximately) represents the topographic structure in the territory. The topographic map-making process deletes almost all information, but the topographic structure is preserved: for every topographic relation (e.g. some point being higher than some other point) represented in the topographic map, a corresponding topographic relation exists in the territory.
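
As a toy illustration of deleting information while preserving structure (my own example, with made-up elevations and coarse 100 m elevation bands standing in for the topographic map):

    # Territory: exact elevations in meters at a few named points (made-up numbers).
    territory = {"ridge": 953, "saddle": 712, "meadow": 430, "lake": 418}

    # Map-making deletes information: collapse exact heights into 100 m bands.
    topo_map = {name: height // 100 for name, height in territory.items()}
    # -> {"ridge": 9, "saddle": 7, "meadow": 4, "lake": 4}; "meadow" and "lake" are merged.

    # Structure preservation: every "strictly higher" relation asserted by the map
    # corresponds to a real "higher than" relation in the territory.
    for a in territory:
        for b in territory:
            if topo_map[a] > topo_map[b]:
                assert territory[a] > territory[b]

    print(topo_map)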

Comment by jessica-liu-taylor on Towards an Intentional Research Agenda · 2019-08-23T19:17:59.921Z · score: 9 (3 votes) · LW · GW

On the subject of intentionality/reference/objectivity/etc, On the Origin of Objects is excellent. My thinking about reference has a kind of discontinuity from before reading this book to after reading it. Seriously, the majority of analytic philosophy discussion of indexicality, qualia, reductionism, etc seems hopelessly confused in comparison.

Comment by jessica-liu-taylor on Matthew Barnett's Shortform · 2019-08-20T03:20:25.600Z · score: 2 (1 votes) · LW · GW

That does seem right, actually.

Now that I think about it, due to this cognitive architecture issue, she actually does gain new information. If she sees a red apple in the future, she can know that it's red (because it produces the same qualia as the first red apple), whereas she might be confused about the color if she hadn't seen the first apple.

I think I got confused because, while she does learn something upon seeing the first red apple, it isn't the naive "red wavelengths are red-quale", it's more like "the neurons that detect red wavelengths got wired and associated with the abstract concept of red wavelengths." Which is still, effectively, new information to Mary-the-cognitive-system, given limitations in human mental architecture.

Comment by jessica-liu-taylor on Matthew Barnett's Shortform · 2019-08-20T03:00:20.619Z · score: 8 (3 votes) · LW · GW

There is a qualitative redness to red. I get that intuition.

I think "Mary's room is uninteresting" is wrong; it's uninteresting in the case of robot scientists, but interesting in the case of humans, in part because of what it reveals about human cognitive architecture.

I think in the human case, I would see Mary seeing a red apple as gaining in expressive vocabulary rather than information. She can then describe future things as "like what I saw when I saw that first red apple". But, in the case of first seeing the apple, the redness quale is essentially an arbitrary gensym.

I suppose I might end up agreeing with the illusionist view on some aspects of color perception, then, in that I predict color quales might feel like new information when they actually aren't. Thanks for explaining.

Comment by jessica-liu-taylor on Matthew Barnett's Shortform · 2019-08-20T01:56:55.089Z · score: 2 (1 votes) · LW · GW

Mary's room seems uninteresting, in that robot-Mary can predict pretty well what bit-pattern she's going to get upon seeing color. (To the extent that the human case is different, it's because of cognitive architecture constraints)

Regarding the zombie argument: The robots have uncertainty over the bridge laws. Under this uncertainty, they may believe it is possible that humans don't have experiences, due to the bridge laws only identifying silicon brains as conscious. Then humans would be zombies. (They may have other theories saying this is pretty unlikely / logically incoherent / etc)

Basically, the robots have a primitive entity "my observations" that they explain using their theories. They have to reconcile this with the eventual conclusion they reach that their observations are those of a physically instantiated mind like other minds, and they have degrees of freedom in which things they consider "observations" of the same type as "my observations" (things that could have been observed).

Comment by jessica-liu-taylor on Matthew Barnett's Shortform · 2019-08-20T01:30:09.197Z · score: 4 (2 votes) · LW · GW

It seems that doubting that we have observations would cause us to doubt physics, wouldn't it? Since physics-the-discipline is about making, recording, communicating, and explaining observations.

Why think we're in a physical world if the observations that seem to suggest we are in one are illusory?

This is kind of like if the people saying we live in a material world arrived at these theories through their heaven-revelations, and can only explain the epistemic justification for belief in a material world by positing heaven. Seems odd to think heaven doesn't exist in this circumstance.

(Note, personally I lean towards supervenient neutral monism: direct observation and physical theorizing are different modalities for interacting with the same substance, and mental properties supervene on physical ones in a currently-unknown way. Physics doesn't rule out observation, in fact it depends on it, while itself being a limited modality, such that it is unsurprising if you couldn't get all modalities through the physical-theorizing modality. This view seems non-contradictory, though incomplete.)

Comment by jessica-liu-taylor on Matthew Barnett's Shortform · 2019-08-20T01:22:04.335Z · score: 2 (1 votes) · LW · GW

Robots take in observations. They make theories that explain their observations. Different robots will make different observations and communicate them to each other. Thus, they will talk about observations.

After making enough observations they make theories of physics. (They had to talk about observations before they made low-level physics theories, though; after all, they came to theorize about physics through their observations). They also make bridge laws explaining how their observations are related to physics. But, they have uncertainty about these bridge laws for a significant time period.

The robots theorize that humans are similar to them, based on the fact that they have functionally similar cognitive architecture; thus, they theorize that humans have observations as well. (The bridge laws they posit are symmetric that way, rather than being silicon-chauvinist)

Comment by jessica-liu-taylor on Matthew Barnett's Shortform · 2019-08-19T22:43:32.790Z · score: 5 (3 votes) · LW · GW

Thanks for the elaboration. It seems to me that experiences are:

  1. Hard-to-eff, as a good-enough theory of what physical structures have which experiences has not yet been discovered, and would take philosophical work to discover.

  2. Hard to reduce to physics, for the same reason.

  3. In practice private due to mind-reading technology not having been developed, and due to bandwidth and memory limitations in human communication. (It's also hard to imagine what sort of technology would allow replicating the experience of being a mouse)

  4. Pretty directly apprehensible (what else would be? If nothing is, what do we build theories out of?)

It seems natural to conclude from this that:

  1. Physical things exist.
  2. Experiences exist.
  3. Experiences probably supervene on physical things, but the supervenience relation is not yet determined, and determining it requires philosophical work.
  4. Given that we don't know the supervenience relation yet, we need to at least provisionally have experiences in our ontology distinct from physical entities. (It is, after all, impossible to do physics without making observations and reporting them to others)

Is there something I'm missing here?

Comment by jessica-liu-taylor on Matthew Barnett's Shortform · 2019-08-19T22:04:23.800Z · score: 4 (2 votes) · LW · GW

What's the difference between making claims about nearby objects and making claims about qualia (if there is one)? If I say there's a book to my left, is that saying something about qualia? If I say I dreamt about a rabbit last night, is that saying something about qualia?

(Are claims of the form "there is a book to my left" radically incorrect?)

That is, is there a way to distinguish claims about qualia from claims about local stuff/phenomena/etc?

Comment by jessica-liu-taylor on Matthew Barnett's Shortform · 2019-08-19T21:30:16.845Z · score: 3 (2 votes) · LW · GW

What are you experiencing right now? (E.g. what do you see in front of you? In what sense does it seem to be there?)

Comment by jessica-liu-taylor on Partial summary of debate with Benquo and Jessicata [pt 1] · 2019-08-17T02:37:00.765Z · score: 30 (8 votes) · LW · GW

Good epistemics says: If X, I desire to believe X. If not-X, I desire to believe not-X.

This holds even when X is "Y person did Z thing" and Z is norm-violating.

If you don't try to explicitly believe "Y person did Z thing" in worlds where in fact Y person did Z thing, you aren't trying to have good epistemics. If you don't say so where it's relevant (and give a bogus explanation instead), you're demonstrating bad epistemics. (This includes cases of saying a mistake theory where a conflict theory is correct)

It's important to distinguish good epistemics (having beliefs correlated with reality) from the aesthetic that claims credit for good epistemics (e.g. the polite academic style).

Don't conflate politeness with epistemology. They're actually opposed in many cases!

Comment by jessica-liu-taylor on Partial summary of debate with Benquo and Jessicata [pt 1] · 2019-08-17T02:31:59.639Z · score: 10 (2 votes) · LW · GW

Does the AI survey paper say experts are biased in any direction? (I didn't see it anywhere)

Is there an accusation of violation of existing norms (by a specific person/organization) you see "The AI Timelines Scam" as making? If so, which one(s)?