Conditional revealed preference

2019-04-16T19:16:55.396Z · score: 18 (7 votes)
Comment by jessica-liu-taylor on User GPT2 Has a Warning for Violating Frontpage Commenting Guidelines · 2019-04-01T22:02:28.777Z · score: 2 (1 votes) · LW · GW

The numbering in this comment is clearly Markdown auto-numbering. Is there a different comment with numbering that you meant?

For reference, this is how Markdown numbers a list in 3, 2, 1 order:

  1. item

  2. item

  3. item

Comment by jessica-liu-taylor on User GPT2 Has a Warning for Violating Frontpage Commenting Guidelines · 2019-04-01T21:30:19.057Z · score: 6 (4 votes) · LW · GW

Seems like a bot to me, are there signs of humanity you can point to?

[EDIT: replies by GPT2 come in way too fast (like, 5 seconds) for this to be a human]

Comment by jessica-liu-taylor on User GPT2 Has a Warning for Violating Frontpage Commenting Guidelines · 2019-04-01T21:28:00.523Z · score: 6 (2 votes) · LW · GW

Markdown numbers lists in order even if you use different numbers.

Comment by jessica-liu-taylor on Privacy · 2019-03-18T20:42:30.351Z · score: 11 (6 votes) · LW · GW

OK, you're right that less privacy gives significant advantage to non-generative conformity-based strategies, which seems like a problem. Hmm.

Comment by jessica-liu-taylor on Privacy · 2019-03-17T17:44:48.433Z · score: 4 (3 votes) · LW · GW

OK, I can defend this claim, which seems different from the "less privacy means we get closer to a world of angels" claim; it's about asymmetric advantages in conflict situations.

In the example you gave, more generally available information about people's locations helps Big Bad Wolf more than Little Red Hood. If I'm strategically identifying with Big Bad Wolf then I want more information available, and if I'm strategically identifying with Little Red Hood then I want less information available. I haven't seen a good argument that my strategic position is more like Little Red Hood's than Big Bad Wolf's (yes, the names here are producing moral connotations that I think are off).

So, why would info help us more than our enemies? I think efforts to do big, important things (e.g. solve AI safety or aging) really often get derailed by predatory patterns (see Geeks, Mops, Sociopaths), which usually aren't obvious, for a while, to the people cooperating with the original goal. These patterns derail the group and cause it to stop actually targeting its original mission. It seems like having more information about strategies would help solve this problem.

Of course, it also gives the predators more information. But I think it helps defense more than offense, since there are more non-predators to start with than predators, and non-predators are (presently) at a more severe information disadvantage than the predators are, with respect to this conflict.

Anyway, I'm not that confident in the overall judgment, but I currently think more available info about strategies is good in expectation with respect to conflict situations.

Comment by jessica-liu-taylor on Has "politics is the mind-killer" been a mind-killer? · 2019-03-17T09:45:15.159Z · score: 8 (5 votes) · LW · GW

With stakes so high, how can you justify placing good faith debate above using whatever tactics are necessary to avoid losing?

Local validity!

[EDIT: also, you could actually be uncertain, or could be talking to aligned people who are uncertain, in which case having more-informative discussions about politics helps you and your friends make better decisions!]

Comment by jessica-liu-taylor on Privacy · 2019-03-17T09:13:35.308Z · score: 2 (1 votes) · LW · GW

You do seem to be saying that removing privacy would get us closer to a world of angels.

Where? (I actually think I am uncertain about this)

Comment by jessica-liu-taylor on Privacy · 2019-03-17T08:45:51.848Z · score: 16 (7 votes) · LW · GW

That isn't the same as arguing against privacy. If someone says "I think X because Y" and I say "Y is false for this reason" that isn't (necessarily) arguing against X. People can have wrong reasons for correct beliefs.

It's epistemically harmful to frame efforts towards increasing local validity as attempts to control the outcome of a discussion process; they're good independent of whether they push one way or the other in expectation.

In other words, you're treating arguments as soldiers here.

(Additionally, in the original comment, I was mostly not saying that Zvi's arguments were unsound (although I did say that for a few), but that they reflected a certain background understanding of how the world works)

Comment by jessica-liu-taylor on Privacy · 2019-03-17T07:44:46.156Z · score: 1 (4 votes) · LW · GW

Why do you think I'm arguing against privacy in my comment (the one you replied to)? I don't think I've been taking a strong stance on it.

Comment by jessica-liu-taylor on Privacy · 2019-03-17T07:24:27.175Z · score: 3 (2 votes) · LW · GW

Not sure what the connection to “market for lemons” is.

People who haven't gotten an education are, on average, unproductive, since productive people have a better alternative to not getting an education (namely, getting an education). Similarly, in a market for lemons, cars on the market are, on average, low-quality, since people with high-quality cars have a better alternative to putting them on an open market (namely, continuing to use the car, or selling it in a higher-trust market).

I agree that is still a Nash equilibrium and I think even a Perfect Bayesian Equilibrium, but there may be a stronger formal equilibrium concept that rules it out?

It's possible, I don't know the formal stronger equilibrium concepts though.

Now that I think about it, there are even simpler cases of more-available information making Nash equilibria worse. In any finite iterated prisoner's dilemma with known horizon, the only Nash equilibrium is to always defect. But, in a finite iterated prisoner's dilemma with unknown geometrically-distributed horizon (sufficiently far away in expectation), there are Nash equilibria that generate mutual cooperation (due to folk theorems).
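
To make the folk-theorem point concrete, here's a quick numerical check (the payoff values T=5, R=3, P=1 and the continuation probabilities are my own illustrative choices, not from the thread): with a geometrically-distributed horizon, grim trigger against grim trigger is a Nash equilibrium exactly when cooperating forever beats defecting once and facing mutual defection thereafter.

```python
# Standard PD payoffs (illustrative): T = temptation, R = mutual cooperation,
# P = mutual defection.
T, R, P = 5.0, 3.0, 1.0

def cooperate_forever(delta):
    """Expected value of mutual cooperation under grim trigger, continuation probability delta."""
    return R / (1 - delta)

def defect_once(delta):
    """Expected value of defecting now, then facing mutual defection forever after."""
    return T + delta * P / (1 - delta)

for delta in (0.2, 0.5, 0.9):
    print(delta, cooperate_forever(delta) >= defect_once(delta))
# 0.2 False, 0.5 True, 0.9 True: with a sufficiently distant expected horizon,
# grim trigger is a best response to itself, so mutual cooperation is sustained --
# unlike the known-finite-horizon case, where only always-defect survives.
```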

Comment by jessica-liu-taylor on Privacy · 2019-03-17T05:59:37.374Z · score: 4 (2 votes) · LW · GW

OK, looking at the argument, I think it makes sense that signalling equilibria can potentially be Pareto-worse than non-signalling equilibria, as they can have more of a "market for lemons" problem. Worth noting that not all equilibria in the game-with-signalling are worse than non-signalling equilibria (I think "no one gets education, everyone gets paid average productivity" is still a Nash equilibrium), it's just that signalling enables additional equilibria, some of which are bad.
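
For concreteness, here is a toy version of that comparison with made-up numbers (two productivity types in equal proportion, education cost inversely proportional to productivity; none of these specifics come from the thread). It just computes payoffs in the pooling (non-signalling) equilibrium and in a separating (signalling) equilibrium, showing the latter can be Pareto-worse.

```python
LOW, HIGH = 1.0, 2.0  # productivities of the two worker types (equal proportions)

# Pooling / non-signalling equilibrium: no one gets education,
# everyone is paid average productivity.
pooling_wage = (LOW + HIGH) / 2
pooling = {"low type": pooling_wage, "high type": pooling_wage}

# Separating (signalling) equilibrium: high types buy e_star units of education,
# low types buy none; e units of education cost e / productivity. Incentive
# compatibility for the low type requires HIGH - e_star / LOW <= LOW, i.e. e_star >= 1.
e_star = 1.2
separating = {"low type": LOW, "high type": HIGH - e_star / HIGH}

print(pooling)     # {'low type': 1.5, 'high type': 1.5}
print(separating)  # {'low type': 1.0, 'high type': 1.4} -- both types are worse off
```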

Comment by jessica-liu-taylor on Has "politics is the mind-killer" been a mind-killer? · 2019-03-17T05:41:34.586Z · score: 16 (10 votes) · LW · GW

Everything in this post seems correct. The original post wasn't even that wrong (your changes are all good corrections), but it seems like many people took from it a shallow, slogan-like interpretation of "politics is the mind-killer", taking it to mean that rationally thinking about or discussing politics isn't even worth trying.

See also: Politics is hard mode, for previous discussion.

Comment by jessica-liu-taylor on Privacy · 2019-03-17T05:15:25.447Z · score: 5 (3 votes) · LW · GW

I get a 404 for the paper. The part you quoted says "maybe this might happen" but doesn't give an economic argument that it could happen, it just says "maybe employers don't pay people enough for it to be worth it". Is there somewhere where the argument is actually made?

Comment by jessica-liu-taylor on Privacy · 2019-03-16T23:25:04.979Z · score: -11 (10 votes) · LW · GW

The two posts you linked are not even a little relevant to the question of whether, in general, bounded agents do better or worse by having more information (Yes, choice paralysis might make some information about what choices you have costly, but more info also reduces choice paralysis by increasing certainty about how good the different options are, and the posts make no claim about the overall direction of info being good or bad for bounded agents). To avoid feeding the trolls, I'm going to stop responding here.

Comment by jessica-liu-taylor on Privacy · 2019-03-16T22:28:47.739Z · score: -8 (7 votes) · LW · GW

You're continuing to miss the completely obvious point that a just process does no worse (in expectation) by having more information potentially available to it, which it can decide what to do with. Like, either you are missing really basic decision theory stuff covered in the Sequences or you are trolling.

(Agree that rewards affect thoughts too, and that these can cause distortions when done unjustly)
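
Here's a tiny worked version of that decision-theory point, with numbers invented purely for illustration: a decision-maker choosing between rewarding and doing nothing can only do better in expectation when a signal is available, because ignoring the signal is always one of the available policies.

```python
states = {"good": 0.7, "bad": 0.3}                       # prior over the hidden state
utility = {("reward", "good"): 1, ("reward", "bad"): -2,
           ("ignore", "good"): 0, ("ignore", "bad"): 0}
signal = {"good": {"hi": 0.9, "lo": 0.1},                # P(signal | state)
          "bad":  {"hi": 0.2, "lo": 0.8}}

def eu_without_signal():
    # Best action chosen on the prior alone.
    return max(sum(p * utility[(a, s)] for s, p in states.items())
               for a in ("reward", "ignore"))

def eu_with_signal():
    # Best action chosen separately for each signal value (which always includes
    # the option of acting exactly as one would without the signal).
    return sum(max(sum(states[s] * signal[s][sig] * utility[(a, s)] for s in states)
                   for a in ("reward", "ignore"))
               for sig in ("hi", "lo"))

print(eu_without_signal())  # ~0.1
print(eu_with_signal())     # ~0.51 -- never lower than the no-signal value
```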

Comment by jessica-liu-taylor on Privacy · 2019-03-16T22:00:48.062Z · score: 9 (4 votes) · LW · GW

As Raemon says, knowing that others are making correct inferences about your behavior means you can’t relax. No, idk, watching soap operas, because that’s an indicator of being less likely to repay your loans, and your premia go up.

This is really, really clearly false!

  1. This assumes that, upon more facts being revealed, insurance companies will think I am less (not more) likely to repay my loans, by default (e.g. if I don't change my TV viewing behavior).
  2. More egregiously, this assumes that I have to keep putting in effort into reducing my insurance premiums until I have no slack left, because these premiums really, really, really matter. (I don't even spend that much on insurance premiums!)

If you meant this more generally, and insurance was just a bad example, why is the situation worse in terms of slack than it was before? (I already have the ability to spend leisure time on gaining more money, signalling, etc.)

Comment by jessica-liu-taylor on Blackmailers are privateers in the war on hypocrisy · 2019-03-16T21:54:46.667Z · score: 2 (1 votes) · LW · GW

You seem to be conflating "a general force that, globally, naively improves consistency is good" with "in every particular case, naively improving consistency is good". Obviously a global force is going to have benefits and drawbacks in different places, the question is whether the benefits outweigh the drawbacks.

Comment by jessica-liu-taylor on Humans aren't agents - what then for value learning? · 2019-03-16T21:08:41.474Z · score: 2 (1 votes) · LW · GW

Thanks for explaining, your position makes more sense now. I think I agree with your overall point that there isn't a "platonic Want" that can be directly inferred from physical state, at least without substantial additional psychology/philosophy investigation (which could, among other things, define bargaining solutions among the different wants).

So, there are at least a few different issues here for contingent wants:

Wants vary over time.

OK, so add a time parameter, and do what I want right now.

People could potentially use different "wanting" models for themselves.

Yes, but some models are better than others. (There's a discussion of arbitrariness of models here which seems relevant)

In practice the brain is going to use some weighting procedure between them. If this procedure isn't doing necessary messy work (it's really not clear if it is), then it can be replaced with an algorithm. If it is, then perhaps the top priority for value learning is "figure out what this thingy is doing and form moral opinions about it".

"Wanting" models are fallible.

Not necessarily a problem (but see next point); the main thing with AI alignment is to do much better than the "default" policy of having aligned humans continue to take actions, using whatever brain they have, without using AGI assistance. If people manage with having fallible "wanting" models, then perhaps the machinery people use to manage this can be understood?

"Wanting" models have limited domains of applicability.

This seems like Wei's partial utility function problem and is related to the ontology identification problem. It's pretty serious and is also a problem independently of value learning. Solving this problem would require either directly solving the philosophical problem, or doing psychology to figure out what machinery does ontology updates (and form moral opinions about that).

Comment by jessica-liu-taylor on Humans aren't agents - what then for value learning? · 2019-03-16T20:42:29.942Z · score: 2 (1 votes) · LW · GW

Or one might imagine a model that has psychological parts, but distributes the function fulfilled by “wants” in an agent model among several different pieces, which might conflict or reinforce each other depending on context.

Hmm, so with enough compute (like, using parts of your brain to model the different psychological parts), perhaps you could do something like this for yourself. But you couldn't predict the results of the behavior of people smarter than you. For example, you would have a hard time predicting that Kasparov would win a chess game against a random chess player, without being as good at chess as Kasparov yourself, though even with the intentional stance you can't predict his actions. (You could obviously predict this using statistics, but that wouldn't be based on just the mechanical model itself)

That is, it seems like the intentional stance often involves using much less compute than the person being modeled in order to predict that things will go in the direction of the person's wants (limited by the person's capabilities), without predicting each of the person's actions.

Comment by jessica-liu-taylor on Privacy · 2019-03-16T20:19:23.332Z · score: 0 (3 votes) · LW · GW

Whence fear of unjust punishment if there is no unjust punishment? Hypothetically there could be (justified) fear of a counterfactual that never happens, but this isn't a stable arrangement (in practice, some people will not work as hard to avoid the unjust punishment, and so will get punished)

Comment by jessica-liu-taylor on Humans aren't agents - what then for value learning? · 2019-03-16T04:40:34.191Z · score: 5 (3 votes) · LW · GW

A fictional dialogue to illustrate:

A: Humans aren't agents, humans don't want things. It would be bad to make an AI that assumes these things.

B: What do you mean by "bad"?

A: Well, there are multiple metaethical theories, but for this conversation, let's say "bad" means "not leading to what the agents in this context collectively want".

B: Aha, but what does "want" mean?

A: ...

[EDIT: what I am suggesting is something like "find your wants in your metaphysical orientation, not your ontology, although perhaps use your ontology for more information about your wants".]

[EDIT2: Also, your metaphysical orientation might be confused, in which case the solution is to resolve that confusion, producing a new metaphysical orientation, plausibly one that doesn't have "wanting" and for which there is therefore no proper "AI alignment" problem, although it might still have AI-related philosophical problems]

Comment by jessica-liu-taylor on Humans aren't agents - what then for value learning? · 2019-03-16T04:12:08.720Z · score: 5 (3 votes) · LW · GW

I think the obvious extreme is a detailed microscopic model that reproduces human behavior without using the intentional stance—is this a model that doesn’t generate itself, or is this a model that assigns agency to some humans?

It would generate itself given enough compute, but you can't, as a human, use physics to predict that humans will invent physics, without using some agency concept. Anyway, there are decision theoretic issues with modeling yourself as a pure mechanism; to make decisions, you think of yourself as controlling what this mechanism does. (This is getting somewhat speculative; I guess my main point here is that you, in practice, have to use the intentional stance to actually predict human behavior as complex as making models of humans, which doesn't mean an AI would)

Does it seem clear to you that if you model a human as a somewhat complicated thermostat (perhaps making decisions according to some kind of flowchart) then you aren't going to predict that a human would write a post about humans being somewhat complicated thermostats?

There’s wanting, and then there’s Wanting.

When I say "suppose you want something" I mean "actual wanting" with respect to the purposes of this conversation, which might map to your Wanting. It's hard to specify exactly. The thing I'm saying here is that a notion of what "wanting" is is implicit in many discourses, including discourse on what AI we should build (notice the word "should" in that sentence).

Relevant: this discussion of proofs of the existence of God makes the similar point that perhaps proofs of God are about revealing a notion of God already implicit in the society's discourse. I'm proposing a similar thing about "wanting".

(note: this comment and my previous one should both be read as speculative research idea generation, not solidified confident opinions)

Comment by jessica-liu-taylor on Privacy · 2019-03-16T00:28:23.508Z · score: 4 (2 votes) · LW · GW

Good point, I updated towards the extraction rate being higher than I thought (will edit my comment). Rich people do end up existing but they're rare and are often under additional constraints.

Comment by jessica-liu-taylor on Privacy · 2019-03-15T23:55:49.126Z · score: -1 (3 votes) · LW · GW

Even in one of the possible-just-worlds, it seems like you’re going to incentivize the last one much more than the 2nd or 3rd.

This is not responsive to what I said! If you can see (or infer) the process by which someone decided to have one thought or another, you can reward them for doing things that have higher expected returns, e.g. having heretical thoughts when heresy is net positive in expectation. If you can't implement a process that complicated, you can just stop punishing people for heresy, entirely ignoring their thoughts if necessary.

the key implication I believe in, is that humans are not nearly smart enough at present to coordinate on anything like a just world, even if everyone were incredibly well intentioned. This whole conversation is in fact probably not possible for the average person to follow.

Average people don't need to do it, someone needs to do it. The first target isn't "make the whole world just", it's "make some local context just". Actually, before that, it's "produce common knowledge in some local context that the world is unjust but that justice is desirable", which might actually be accomplished in this very thread, I'm not sure.

And this implication in this sentence right here right now is something that could get me punished in many circles, even by people trying hard to do the right thing.

Thanks for adding this information. I appreciate that you're making these parts of your worldview clear.

Comment by jessica-liu-taylor on Privacy · 2019-03-15T23:44:15.256Z · score: 4 (2 votes) · LW · GW

Good point. I think there is a lot of scapegoating (in the sense you mean here) but that's a further claim than that it's unjust punishment, and I don't believe this strongly enough to argue it right now.

Comment by jessica-liu-taylor on Privacy · 2019-03-15T23:39:55.203Z · score: 1 (2 votes) · LW · GW

I agree with Raemon that “judge” here means something closer to one of its standard usages, “to make inferences about”.

The post implies it is bad to be judged. I could have misinterpreted why, but that implication is there. If judge just meant "make inferences about" why would it be bad?

One, if the world is a place where may more vulnerabilities are more known, this incentivizes more people to specialize in exploiting those vulnerabilities.

But it also helps in knowing who's exploiting them! Why does it give more advantages to the "bad" side?

Two, as a flawed human there are probably some stressors against which you can’t credibly play the “won’t negotiate with terrorists” card.

Why would you expect the terrorists to be miscalibrated about this before the reduction in privacy, to the point where they think people won't negotiate with them when they actually will, and less privacy predictably changes this opinion?

I think the assumption is these are ~baseline humans we’re talking about, and most human brains can’t hold norms of sufficient sophistication to capture true ethical law

Perhaps the optimal set of norms for these people is "there are no rules, do what you want". If you can improve on that, then that would constitute a norm-set that is more just than normlessness. Capturing true ethical law in the norms most people follow isn't necessary.

I guess you can imagine the maximally inconvenient case where motivated people with low cost of time and few compunctions know your resources and full utility function, and can proceed to extract ~all liquid value from you.

Sure, but doesn't it help me against them too?

Comment by jessica-liu-taylor on Privacy · 2019-03-15T23:22:02.033Z · score: 1 (3 votes) · LW · GW

(In particular, ‘scapegoating’ feels like a very different frame than the one I’d use here)

Having read Zvi's post and my comment, do you think the norm-enforcement process is just, or even not very unjust? If not, what makes it not scapegoating?

Comment by jessica-liu-taylor on Privacy · 2019-03-15T23:17:44.771Z · score: 0 (4 votes) · LW · GW

Regarding that sentence, I edited my comment at about the same time you posted this.

But, fully-and-justly-transparent-world can still mean that fewer people think original or interesting thoughts because doing so is too risky.

If someone taking a risk is good with respect to the social good, then the justice process should be able to see that they did that and reward them (or at least not punish them) for it, right? This gets easier the more information is available to the justice process.

Comment by jessica-liu-taylor on Humans aren't agents - what then for value learning? · 2019-03-15T23:06:01.824Z · score: 9 (6 votes) · LW · GW

Suppose you are building an AI and want something from it. Then you are an agent with respect to that thing, since you want it. Probably, you also want the AI to infer your want and act on it. If you don't want things, then you have no reason to build an AI (or not to build an AI).

Models of humans based on control theory aren't generative enough to generate control theory; a group of people just acting on stimulus/response won't spontaneously write a book about control theory. If your model of humans is generative enough to generate itself, then it will assign agency to at least some humans, enough to reflect your goals in making the model.

And, if we're in the context of making models of humans (for the purpose of AI), it's sufficient (in this context, with respect to this context) to achieve the goals of this context.

Comment by jessica-liu-taylor on Privacy · 2019-03-15T23:00:18.301Z · score: -4 (3 votes) · LW · GW

If privacy in general is reduced, then they get to see others' thoughts too [EDIT: this sentence isn't critical, the rest works even if they can only see your thoughts]. If they're acting justly, then they will take into account that others might modify their thoughts to look smarter, and make basically well-calibrated (if not always accurate) judgments about how smart different people are. (People who are trying can detect posers a lot of the time, even without mind-reading). So, them having more information means they are more likely to make a correct judgment, hiring the smarter person (or, generally, whoever can do the job better). At worst, even if they are very bad at detecting posers, they can see everyone's thoughts and choose to ignore them, making the judgment they would make without having this information (But, they were probably already vulnerable to posers, it's just that seeing people's thoughts doesn't have to make them more vulnerable).

Comment by jessica-liu-taylor on Privacy · 2019-03-15T22:32:22.084Z · score: 13 (10 votes) · LW · GW

In a scapegoating environment, having privacy yourself is obviously pretty important. However, you seem to be making a stronger point, which is that privacy in general is good (e.g. we shouldn't have things like blackmail and surveillance which generally reduce privacy, not just our own privacy). I'm going to respond assuming you are arguing in favor of the stronger point.

This post rests on several background assumptions about how the world works, which are worth making explicit. I think many of these are empirically true but are, importantly, not necessarily true, and not all of them are true.

We need a realm shielded from signaling and judgment. A place where what we do does not change what everyone thinks about us, or get us rewarded and punished.

Implication: it's bad for people to have much more information about other people (generally), because they would reward/punish them based on that info, and such rewarding/punishing would be unjust. We currently have scapegoating, not justice. (Note that a just system for rewarding/punishing people will do no worse by having more information, and in particular will do no worse than the null strategy of not rewarding/punishing behavior based on certain subsets of information)

We need people there with us who won’t judge us. Who won’t use information against us.

Implication: "judge" means to use information against someone. Linguistic norms related to the word "judgment" are thoroughly corrupt enough that it's worth ceding to these, linguistically, and using "judge" to mean (usually unjustly!) using information against people.

A complete transformation of our norms and norm principles, beyond anything I can think of in a healthy historical society, would be required to even attempt full non-contextual strong enforcement of all remaining norms.

Implication (in the context of the overall argument): a general reduction in privacy wouldn't lead to norms changing or being enforced less strongly, it would lead to the same norms being enforced strongly. Whatever or whoever decides which norms to enforce and how to enforce them is reflexive rather than responsive to information. We live in a reflex-based control system.

There are also known dilemmas where any action taken would be a norm violation of a sacred value.

Implication: the system of norms is so corrupt that they will regularly put people in situations where they are guaranteed to be blamed, regardless of their actions. They won't adjust even when this is obvious.

Part of the job of making sausage is to allow others not to see it. We still get reliably disgusted when we see it.

Implication: people expect to lose value by knowing some things. Probably, it is because they would expect to be punished due to it being revealed they know these things (as in 1984). It is all an act, and it's better not to know that in concrete detail.

We constantly must claim ‘everything is going to be all right’ or ‘everything is OK.’ That’s never true. Ever.

Implication: the control system demands optimistic stories regardless of the facts. There is something or someone forcing everyone to call the deer a horse under threat of punishment, to maintain a lie about how good things are, probably to prop up an unjust regime.

But these problems, while improved, wouldn’t go away in a better or less hypocritical time. Norms are not a system that can have full well-specified context dependence and be universally enforced. That’s not how norms work.

Implication: even in the most just possible system of norms, it would be good to sometimes violate those norms and hide the fact that you violated them. (This seems incorrect to me!)

If others know exactly what resources we have, they can and will take all of them.

Implication: the bad guys won; we have rule by gangsters, who aren't concerned with sustainable production, and just take as much stuff as possible in the short term. (This seems on the right track but partially false; the top marginal tax rate isn't 100% [EDIT: see Ben's comment, the actual rate of extraction is higher than the marginal tax rate])

If it is known how we respond to any given action, others find best responses. They will respond to incentives. They exploit exactly the amount we won’t retaliate against. They feel safe.

Implication: more generally available information about what strategies people are using helps "our" enemies more than it helps "us". (This seems false to me, for notions of "us" that I usually use in strategy)

World peace, and doing anything at all that interacts with others, depends upon both strategic confidence in some places, and strategic ambiguity in others. We need to choose carefully where to use which.

Implication (in context): strategic ambiguity isn't just necessary for us given our circumstances, it's necessary in general, even if we lived in a surveillance state. (Huh?)

To conclude: if you think the arguments in this post are sound (with the conclusion being that we shouldn't drastically reduce privacy in general), you also believe the implications I just listed, unless I (or you) misinterpreted something.

Comment by jessica-liu-taylor on Blackmailers are privateers in the war on hypocrisy · 2019-03-14T23:06:21.362Z · score: 1 (4 votes) · LW · GW

Information that would damage someone is very different from information that will damage someone.

What do you mean?

Comment by jessica-liu-taylor on Blegg Mode · 2019-03-14T08:16:43.210Z · score: 3 (2 votes) · LW · GW

To be clear, brevity and simplicity are not the same as kindness and surface presentation, and confusing these two seems like a mistake 8 year olds can almost always avoid making. (No pressure to respond; in any case I meant to talk about the abstract issue of accurate summaries which seems not to be politically charged except in the sense that epistemology itself is a political issue, which it is)

Comment by jessica-liu-taylor on Blegg Mode · 2019-03-14T00:49:40.901Z · score: 3 (2 votes) · LW · GW

No one has time to look into the details of everything. If someone isn't going to look into the details of something, they benefit from the summaries being accurate, in the sense that they reflect how an honest party would summarize the details if they knew them. (Also, how would you know which things you should look into further if the low-resolution summaries are lies?)

This seems pretty basic and it seems like you were disagreeing with this by saying the description should be based on kindness and surface presentation. Obviously some hidden attributes matter more than others (and matter more or less context-dependently), my assertion here is that summaries should be based primarily on how they reflect the way the thing is (in all its details) rather than on kindness and surface presentation.

Comment by jessica-liu-taylor on Blegg Mode · 2019-03-13T22:21:23.973Z · score: 3 (2 votes) · LW · GW

in unspecific conversation where details don’t matter, one should prefer kindness and surface presentation.

Why? Doesn't this lead to summaries being inaccurate and people having bad world models (ones that would assign lower probability to the actual details, compared to ones based on accurate summaries)?

Comment by jessica-liu-taylor on Blegg Mode · 2019-03-13T07:55:46.634Z · score: 10 (4 votes) · LW · GW

Since I’m not there yet, if I just take at intuitive amateur guess at how I might expect this to work, it seems pretty intuitively plausible that we’re going to want the category node to be especially sensitive to cheap-to-observe features that correlate with goal-relevant features? Like, yes, we ultimately just want to know as much as possible about the decision-relevant variables, but if some observations are more expensive to make than others, that seems like the sort of thing the network should be able to take into account, right??

I think the mathematically correct thing here is to use something like the expectation maximization algorithm. Let's say you have a dataset that is a list of elements, each of which has some subset of its attributes known to you, and the others unknown. EM does the following:

  1. Start with some parameters (parameters tell you things like what the cluster means/covariance matrices are; it's different depending on the probabilistic model)
  2. Use your parameters, plus the observed variables, to infer the unobserved variables (and cluster assignments) and put Bayesian distributions over them
  3. Do something mathematically equivalent to generating a bunch of "virtual" datasets by sampling the unobserved variables from these distributions, then setting the parameters to assign high probability to the union of these virtual datasets (EM isn't usually described this way but it's easier to think about IMO)
  4. Repeat starting from step 2

This doesn't assign any special importance to observed features. Since step 3 is just a function of the virtual datasets (not taking into account additional info about which variables are easy to observe), the fitted parameters are going to take all the features, observable or not, into account. However, the hard-to-observe features are going to have more uncertainty to them, which affects the virtual datasets. With enough data, this shouldn't matter that much, but the argument for this is a little complicated.

Another way to solve this problem (which is easier to reason about) is by fully observing a sufficiently high number of samples. Then there isn't a need for EM, you can just do clustering (or whatever other parameter fitting) on the dataset (actually, clustering can be framed in terms of EM, but doesn't have to be). Of course, this assigns no special importance to easy-to-observe features. (After learning the parameters, we can use them to infer the unobserved variables probabilistically)
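
As a concrete (and much simplified) illustration of the parameter-fitting loop described above, here is a minimal EM sketch for a two-component 1-D Gaussian mixture with fully observed data; the dataset, initialization, and iteration count are arbitrary choices of mine. The missing-data variant adds an imputation step for unobserved attributes, but the parameters-to-posteriors-to-parameters structure is the same.

```python
import numpy as np

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2, 1, 500), rng.normal(3, 1, 500)])

# Step 1: start with some parameters (means, variances, mixing weights).
mu, var, mix = np.array([-1.0, 1.0]), np.array([1.0, 1.0]), np.array([0.5, 0.5])

for _ in range(50):
    # Step 2: use the current parameters to put a Bayesian distribution over the
    # unobserved cluster assignments (the "responsibilities").
    dens = mix * np.exp(-(data[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    resp = dens / dens.sum(axis=1, keepdims=True)

    # Step 3: refit the parameters to assign high probability to the data, weighting
    # each point by its responsibilities (fitting the "virtual datasets" in expectation).
    weight = resp.sum(axis=0)
    mu = (resp * data[:, None]).sum(axis=0) / weight
    var = (resp * (data[:, None] - mu) ** 2).sum(axis=0) / weight
    mix = weight / len(data)
    # Step 4: repeat.

print(mu, var, mix)  # means end up near -2 and 3
```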

Philosophically, "functions of easily-observed features" seem more like percepts than concepts (this post describes the distinction). These are still useful, and neural nets are automatically going to learn high-level percepts (i.e. functions of observed features), since that's what the intermediate layers are optimized for. However, a Bayesian inference method isn't going to assign special importance to observed features, as it treats the observations as causally downstream of the ontological reality rather than causally upstream of it.

Comment by jessica-liu-taylor on Some Thoughts on Metaphilosophy · 2019-02-11T02:09:01.594Z · score: 6 (3 votes) · LW · GW

More over, I am skeptical that going on meta-level simplifies the problem to the level that it will be solvable by humans (the same about meta-ethics and theory of human values).

This is also my reason for being pessimistic about solving metaphilosophy before a good number of object-level philosophical problems have been solved (e.g. in decision theory, ontology/metaphysics, and epistemology). If we imagine being in a state where we believe running computation X would solve hard philosophical problem Y, then it would seem that we already have a great deal of philosophical knowledge about Y, or a more general class of problems that includes Y.

More generally, we could look at the historical difficulty of solving a problem vs. the difficulty of automating it. For example: the difficulty of walking vs. the difficulty of programming a robot to walk; the difficulty of adding numbers vs. the difficulty of specifying an addition algorithm; the difficulty of discovering electricity vs. the difficulty of solving philosophy of science to the point where it's clear how a reasoner could have discovered (and been confident in) electricity; and so on.

The plausible story I have that looks most optimistic for metaphilosophy looks something like:

  1. Some philosophical community makes large progress on a bunch of philosophical problems, at a high level of technical sophistication.
  2. As part of their work, they discover some "generators" that generate a bunch of the object-level solutions when translated across domains; these generators might involve e.g. translating a philosophical problem to one of a number of standard forms and then solving the standard form.
  3. They also find philosophical reasons to believe that these generators will generate good object-level solutions to new problems, not just the ones that have already been studied.
  4. These generators would then constitute a solution to metaphilosophy.

Comment by jessica-liu-taylor on Is Agent Simulates Predictor a "fair" problem? · 2019-01-24T21:32:29.464Z · score: 7 (3 votes) · LW · GW

There are some formal notions of fairness that include ASP. See Asymptotic Decision Theory.

Here's one way of thinking about this. Imagine a long sequence of instances of ASP. Both the agent and predictor in a later instance know what happened in all the earlier instances (say, because the amount of compute available in later instances is much higher, such that all previous instances can be simulated). The predictor in ASP is a logical inductor predicting what the agent will do this time.

Looking at the problem this way, it looks pretty fair. Since logical inductors can do induction, if an agent takes actions according to a certain policy, then the predictor will eventually learn this, regardless of the agent's source code. So only the policy matters, not the source code.
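
As a toy illustration of "only the policy matters" (my own simplification: a Laplace-rule frequency estimator stands in for the logical inductor, with standard Newcomb-style payoffs; nothing here is from the original post):

```python
def run_iterated_asp(policy, rounds=1000):
    """Average payoff when the predictor learns from all earlier instances."""
    history, total = [], 0
    for _ in range(rounds):
        # Predictor: estimated probability of one-boxing, via Laplace's rule on
        # the earlier instances (a crude stand-in for a logical inductor).
        p_onebox = (sum(history) + 1) / (len(history) + 2)
        box_filled = p_onebox > 0.5
        action = policy(history)  # 1 = one-box, 0 = two-box
        total += (1_000_000 if box_filled else 0) + (0 if action == 1 else 1_000)
        history.append(action)
    return total / rounds

print(run_iterated_asp(lambda h: 1))  # always one-box: average ~ 999,000
print(run_iterated_asp(lambda h: 0))  # always two-box: average ~ 1,000
```

After a few instances the predictor locks onto whatever policy the agent is following, so the long-run payoff depends only on the policy, not on the agent's source code.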

See also In Logical Time, All Games are Iterated Games.

Comment by jessica-liu-taylor on Imitation learning considered unsafe? · 2019-01-07T01:10:19.774Z · score: 5 (2 votes) · LW · GW

In trying to argue for the safety of imitation learning, the key property of the imitation might be something like "the imitation is indistinguishable from the actual human on this distribution, where the distinguishers may be from this model class and have this information". GANs get a property like this pretty directly, although there is an issue in that they don't have access to the imitator's internal states, so e.g. they can't tell whether the imitator is using pseudorandomness or the input-based randomness imitators have access to.
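
To make the "indistinguishable relative to a model class and a distribution" property concrete, here is a toy check (the 2-D Gaussian data, the sklearn logistic-regression discriminator, and all numbers are assumptions for illustration, not part of any actual imitation-learning setup): an imitator counts as good relative to linear discriminators on this data exactly when such a discriminator can't beat chance.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
human = rng.normal(loc=[0.0, 1.0], scale=1.0, size=(5000, 2))
good_imitator = rng.normal(loc=[0.0, 1.0], scale=1.0, size=(5000, 2))  # same distribution
bad_imitator = rng.normal(loc=[0.5, 1.0], scale=1.0, size=(5000, 2))   # slightly shifted

def discriminator_accuracy(a, b):
    # How well a linear discriminator can tell the two sample sets apart.
    X = np.vstack([a, b])
    y = np.concatenate([np.zeros(len(a)), np.ones(len(b))])
    return LogisticRegression().fit(X, y).score(X, y)

print(discriminator_accuracy(human, good_imitator))  # ~0.5: indistinguishable to this class
print(discriminator_accuracy(human, bad_imitator))   # noticeably above 0.5
```

Note that accuracy near 0.5 only certifies indistinguishability for this distinguisher class on this distribution, which is exactly the (limited) sense of the property above.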

Suppose we have an imitation system with a property like this. Then the argument for safety is going to look something like: "The human being imitated is corrigible. If the imitator weren't corrigible, then the discriminator could tell, because this would result in distinguishable actions."

For example, let's say that the imitated human is able to violently take control of the AI lab. In fact the imitated human doesn't do this (in any of the training data), and, arguably, it doesn't take a smart discriminator to tell that the human probably wouldn't do this in a new situation. So, if the imitator did do this (because e.g. it incorrectly inferred the human's decision theory), the discriminator could tell it apart. Of course, arguing that the discriminator generalizes this well would require some robustness argument; this particular problem seems easy (if the method for taking control involves taking really obvious actions like using weapons) but there might be more subtle ways of taking control. In those cases we would want some argument that, if the imitator comes up with a malign/incorrigible plan, then a discriminator with access to the imitator's internal states can notice this and notice that the imitated human wouldn't do this, because this isn't harder than coming up with the plan in the first place, and the discriminator is at least as capable as the imitator.

In general, while there are potential problems, I expect them to be more subtle than "the imitator incorrectly infers the human's decision theory and pursues convergent instrumental goals".

(Worth noting other problems with imitation learning, discussed in this post and this post)

Comment by jessica-liu-taylor on Will humans build goal-directed agents? · 2019-01-05T12:03:48.617Z · score: 2 (1 votes) · LW · GW

Can you clarify the argument? Are you optimizing for an obvious AI disaster to happen as soon as possible so people take the issue more seriously?

Comment by jessica-liu-taylor on Logical inductors in multistable situations. · 2019-01-04T02:06:42.936Z · score: 9 (5 votes) · LW · GW

Different logical inductors will give different probabilities for each of these statements. The logical induction criterion does not require any answer in particular.

Any particular deterministic algorithm for finding a logical inductor (such as the one in the paper) will yield a logical inductor that gives particular probabilities for these statements, which are close to fixed points in the limit. The algorithm in the paper is parameterized over some measure on Turing machines, and will give different answers depending on this measure. You could analyze which measures would lead to which fixed points, but this doesn't seem very interesting.

Comment by jessica-liu-taylor on Predictors as Agents · 2019-01-02T00:37:40.804Z · score: 3 (2 votes) · LW · GW

I think the fixed point finder won't optimize the fixed point for minimizing expected log loss. I'm going to give a concrete algorithm and show that it doesn't exhibit this behavior. If you disagree, can you present an alternative algorithm?

Here's the algorithm. Start with some oracle (not a reflective oracle). Sample ~1000000 universes based on this oracle, getting 1000000 data points for what the reflective oracle outputs. Move the oracle 1% of the way from its current position towards the oracle that would answer queries correctly given the distribution over universes implied by the data points. Repeat this procedure a lot of times (~10,000). This procedure is similar to gradient descent.

Here's an example universe:

Note the presence of two reflective oracles that are stable equilibria, one of which has lower expected log loss than the other.

Let's parameterize oracles by numbers in [0, 1] representing the answer to the one relevant query. Start with oracle 0.5. If we sample 1000000 universes, about 45% of them have outcome 1. So the oracle that answers correctly given these data points is the one parameterized by 1. So we move our current oracle (0.5) 1% of the way towards the oracle 1, yielding oracle 0.505. We repeat this a bunch of times, eventually getting an oracle parameterized by a number very close to 1.

So, this procedure yields an oracle with suboptimal expected log loss. It is not the case that the fixed point finder minimizes expected log loss. The neural net case is different, but not that much; it would give the same answer in this particular case, since the model can just be parameterized by a single real number.
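
Since the example universe above didn't survive formatting, here is a stand-in toy (the universe map, the noise level, and the step counts are all my own choices, not the original example) showing the same qualitative behavior of the procedure: iterating "sample outcomes, then move 1% toward the oracle that answers correctly for the empirical distribution" converges to whichever stable fixed point is nearby, rather than to the fixed point that globally optimizes any score.

```python
import random

def sample_outcome(p, rng):
    # Assumed universe: it queries the oracle about itself and mostly echoes the
    # oracle's (possibly randomized) answer, with 10% noise -- giving stable fixed
    # points near 0 and 1 and an unstable one at 1/2.
    oracle_answer = 1 if rng.random() < p else 0
    return oracle_answer if rng.random() < 0.9 else 1 - oracle_answer

def best_response(freq):
    # The oracle that answers the one relevant query correctly for an empirical
    # distribution: say "yes" iff outcome 1 has frequency at least 1/2.
    return 1.0 if freq >= 0.5 else 0.0

def find_oracle(p0, steps=1000, samples=2000, lr=0.01, seed=0):
    rng = random.Random(seed)
    p = p0
    for _ in range(steps):
        freq = sum(sample_outcome(p, rng) for _ in range(samples)) / samples
        p += lr * (best_response(freq) - p)  # move 1% toward the best response
    return p

print(find_oracle(0.4))  # converges toward 0
print(find_oracle(0.6))  # converges toward 1
```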

Comment by jessica-liu-taylor on Predictors as Agents · 2019-01-01T05:26:07.272Z · score: 3 (2 votes) · LW · GW

The capacity for agency arises because, in a complex environment, there will be multiple possible fixed-points. It’s quite likely that these fixed-points will differ in how the predictor is scored, either due to inherent randomness, logical uncertainty, or computational intractability(predictors could be powerfully superhuman while still being logically uncertain and computationally limited). Then the predictor will output the fixed-point on which it scores the best.

Reflective oracles won't automatically do this. They won't minimize log loss or any other cost function. For a given situation, there can be multiple reflective oracles; for example, in a universe that asks the reflective oracle whether it (the universe) equals 1 with probability greater or less than 50%, and outputs accordingly, there are three reflective oracles: 0, 1/2, and 1. There isn't any defined procedure for selecting which of these reflective oracles is the real one. A reflective oracle that says 0 or 1 will get a lower average log loss than one that says 1/2; however, these are all considered to be reflective oracles.

Is there a reason you think a reflective oracle (or equivalent) can't just be selected "arbitrarily", and will likely be selected to maximize some score? (In this example there's an issue in that the 1/2 reflective oracle is an unstable equilibrium, so natural ways of finding reflective oracles using gradient descent will be unlikely to find it, however it is possible to set up situations where gradient descent leads to reflective oracles with suboptimal Bayes score.)

My sense is that the simplest methods for finding a reflective oracle will do something similar to finding a correlated equilibrium using gradient descent on each player's strategy individually. This certainly does a kind of optimization, though since it's similar to a multiplayer game it won't correspond to global optimization like finding the reflective oracle with the lowest expected log loss. The kind of optimization it does more resembles "given my current reflective oracle, and the expected future states resulting from this, how should I adjust this oracle to better match this distribution of future states?"

(For more on natural methods for finding (correlated) reflective oracles, I recommend looking at lectures 17-18 of this course and this post on correlated reflective oracles.)

Comment by jessica-liu-taylor on Boundaries enable positive material-informational feedback loops · 2018-12-23T23:34:52.280Z · score: 4 (2 votes) · LW · GW

This is helpful, thanks!

Comment by jessica-liu-taylor on Act of Charity · 2018-12-23T23:22:32.699Z · score: 3 (2 votes) · LW · GW

Re the section you quoted: if you watched for 1 minute longer you would see that the issue is that the local net manufacturers can't scale up because they would have to compete with free nets, so the local infrastructure atrophies. (Absent aid, they could scale up, it would just take longer; scaling up might require additional capital or training)

The issue isn't number of jobs. The issue is (a) infrastructure to solve your own problems and (b) motivation to solve your own problems rather than waiting to have someone else solve them for you. This is all covered in the "Dependency" section of the video, which I am still not convinced you have watched.

Comment by jessica-liu-taylor on Boundaries enable positive material-informational feedback loops · 2018-12-23T22:00:11.847Z · score: 12 (3 votes) · LW · GW

Are there any economics textbooks you'd recommend?

Comment by jessica-liu-taylor on Boundaries enable positive material-informational feedback loops · 2018-12-23T21:39:18.890Z · score: 10 (2 votes) · LW · GW

Rent seeking and regulatory capture are major themes in public choice theory.

Oh, I got public choice theory confused with social choice theory, you're right.

What is the explanation that your framework generates?

So, I'm still confused about this (I plan to investigate this more), but the framework of this post would posit some combination of: processes that produce houses are hard to imitate and have gotten worse over time as knowledge is lost; regulatory capture; coordination being harder as better attacks on existing coordination systems have been developed (e.g. more ways of pretending to work, more bullshit jobs that are considered part of a normal business); some kind of coordination among house-builders. Since there are lots of possible explanations, this isn't very enlightening on its own, and more investigation is needed.

Upon writing this, it seems like my framework isn't clearly better at explaining the phenomenon than the field of economics (they both posit that there could be many causes), until further investigation has been done; however, that isn't the assertion I was originally making, and it's also kind of a moot point since I had previously thought you were arguing that the content of this post was obvious, and was responding to that.

Comment by jessica-liu-taylor on Boundaries enable positive material-informational feedback loops · 2018-12-23T21:11:19.324Z · score: 4 (2 votes) · LW · GW

Given that the “first principles” must by necessity be much simplified compared to the real world, how do you know whether the derivations come anywhere close to explaining it?

Academia's models are also simplified. Given things like the replication crisis, I am really not convinced that academia is good at vetting things outside STEM. Generation of known-good ideas can be separated into generating ideas and checking them; generating ideas can be useful even if they can't be fully checked yet. In practice, to vet my models, I look at things like: which conclusions are logically sound (e.g. that growing economies require positive material feedback loops), whether it matches up with information I have, compatibility with other models/intuitions (where incompatibility could mean either model is wrong), and so on. I don't think this is that different from what the most generative academics do. If I were reading academic papers, I would be doing the same checks to determine what to trust and how to integrate the ideas into my own models.

Can your framework derive the well known consequences of asymmetric information in economics, such as it often leading to failure to agree upon mutually beneficial deals?

Kind of. I haven't discussed rational agents yet. But, it's possible to say that some systems will be more ecologically fit if they hide certain information, and some will be more ecologically fit if they only make trades given the info that making this trade would gain some necessary resource. From this it follows that ecologically fit systems will sometimes fail to make trades that would increase the fitness of both parties, where ecological fitness could be defined in terms of the speed and sustainability of a positive feedback loop. There are different advantages/disadvantages to thinking about things this way instead of in terms of rational agency.

Comment by jessica-liu-taylor on Boundaries enable positive material-informational feedback loops · 2018-12-23T19:54:03.264Z · score: 12 (3 votes) · LW · GW

At a very high level, a post-EA strategy might look something like this:

  1. Invest in yourself, your friends, and your projects.
  2. Become less confused about how the world works and what strategies are effective for doing things in it. Reading from a variety of fields (math, history, decision theory, economics, etc) is a good way to start, as is doing first-principles analyses and writing about them, and visiting places you think might be important to investigate.
  3. Find friends you can talk to about this, and who you can coordinate with.
  4. Make one or more plans for causing nice things to happen, which will probably involve environmentally sustainable positive feedback loops with positive externalities.
  5. Execute on these plans until they become obsolete.

Not everyone has to do all these things, some can support people doing this, or be one member of a group that does this. Also, some people don't care enough to actually do these things, and probably these people shouldn't do strategy, and should instead do things they actually care about.

Comment by jessica-liu-taylor on Boundaries enable positive material-informational feedback loops · 2018-12-23T19:35:14.310Z · score: 2 (1 votes) · LW · GW

I worry that the obviousness is misleading, because economies actually are made of more or less rational agents, and as a result have a lot of anti-inductive elements, which are not captured by a control theory based derivation.

This is a really good point, thanks. This points at some areas of my strategy that aren't explained in this post (these areas contain things like asymmetric weapons and using decision theories that pass the mirror test).

In that case it's probably explained by preferences for greater quality and size in housing (explained by the income effect and housing being in part a positional good), plus increased regulation (studied under public choice theory).

The first explanation would imply that people are willing to pay 6x as much for a 2000-style house than a 1950-style house (ignoring factors like 1950-style houses being scarcer now than they were in 1950), which seems false. A public choice theory framework for regulation assumes that these regulations are generally in people's interest, whereas they often aren't (people aren't very informed about what housing regulations are good, and regulatory capture is a thing); indeed, if people wouldn't be willing to pay 6x as much for a 2000-style house than a 1950-style house after knowing about these additional regulations, then that indicates that these regulations aren't in their interest. Perhaps there are one or more good explanations within the field of economics for this phenomenon, but it does not seem like the search strategy you are using to produce economics explanations for this phenomenon is getting good explanations at a very high rate, which indicates that the field of economics is drawing little attention to good explanations for this, or is drawing attention away from the good explanations.

Boundaries enable positive material-informational feedback loops

2018-12-22T02:46:48.938Z · score: 30 (12 votes)

Act of Charity

2018-11-17T05:19:20.786Z · score: 142 (56 votes)

EDT solves 5 and 10 with conditional oracles

2018-09-30T07:57:35.136Z · score: 61 (18 votes)

Reducing collective rationality to individual optimization in common-payoff games using MCMC

2018-08-20T00:51:29.499Z · score: 50 (17 votes)

Buridan's ass in coordination games

2018-07-16T02:51:30.561Z · score: 55 (19 votes)

Decision theory and zero-sum game theory, NP and PSPACE

2018-05-24T08:03:18.721Z · score: 109 (36 votes)

In the presence of disinformation, collective epistemology requires local modeling

2017-12-15T09:54:09.543Z · score: 116 (42 votes)

Autopoietic systems and difficulty of AGI alignment

2017-08-20T01:05:10.000Z · score: 3 (3 votes)

Current thoughts on Paul Christano's research agenda

2017-07-16T21:08:47.000Z · score: 6 (6 votes)

Why I am not currently working on the AAMLS agenda

2017-06-01T17:57:24.000Z · score: 15 (8 votes)

A correlated analogue of reflective oracles

2017-05-07T07:00:38.000Z · score: 4 (4 votes)

Finding reflective oracle distributions using a Kakutani map

2017-05-02T02:12:06.000Z · score: 1 (1 votes)

Some problems with making induction benign, and approaches to them

2017-03-27T06:49:54.000Z · score: 3 (3 votes)

Maximally efficient agents will probably have an anti-daemon immune system

2017-02-23T00:40:47.000Z · score: 3 (3 votes)

Are daemons a problem for ideal agents?

2017-02-11T08:29:26.000Z · score: 5 (2 votes)

How likely is a random AGI to be honest?

2017-02-11T03:32:22.000Z · score: 0 (0 votes)

My current take on the Paul-MIRI disagreement on alignability of messy AI

2017-01-29T20:52:12.000Z · score: 14 (7 votes)

On motivations for MIRI's highly reliable agent design research

2017-01-29T19:34:37.000Z · score: 8 (8 votes)

Strategies for coalitions in unit-sum games

2017-01-23T04:20:31.000Z · score: 2 (2 votes)

An impossibility result for doing without good priors

2017-01-20T05:44:26.000Z · score: 1 (1 votes)

Pursuing convergent instrumental subgoals on the user's behalf doesn't always require good priors

2016-12-30T02:36:48.000Z · score: 6 (4 votes)

Predicting HCH using expert advice

2016-11-28T03:38:05.000Z · score: 3 (3 votes)

ALBA requires incremental design of good long-term memory systems

2016-11-28T02:10:53.000Z · score: 1 (1 votes)

Modeling the capabilities of advanced AI systems as episodic reinforcement learning

2016-08-19T02:52:13.000Z · score: 4 (2 votes)

Generative adversarial models, informed by arguments

2016-06-27T19:28:27.000Z · score: 0 (0 votes)

In memoryless Cartesian environments, every UDT policy is a CDT+SIA policy

2016-06-11T04:05:47.000Z · score: 3 (3 votes)

Two problems with causal-counterfactual utility indifference

2016-05-26T06:21:07.000Z · score: 3 (3 votes)

Anything you can do with n AIs, you can do with two (with directly opposed objectives)

2016-05-04T23:14:31.000Z · score: 2 (2 votes)

Lagrangian duality for constraints on expectations

2016-05-04T04:37:28.000Z · score: 1 (1 votes)

Rényi divergence as a secondary objective

2016-04-06T02:08:16.000Z · score: 2 (2 votes)

Maximizing a quantity while ignoring effect through some channel

2016-04-02T01:20:57.000Z · score: 2 (2 votes)

Informed oversight through an entropy-maximization objective

2016-03-05T04:26:54.000Z · score: 0 (0 votes)

What does it mean for correct operation to rely on transfer learning?

2016-03-05T03:24:27.000Z · score: 4 (4 votes)

Notes from a conversation on act-based and goal-directed systems

2016-02-19T00:42:29.000Z · score: 3 (3 votes)

A scheme for safely handling a mixture of good and bad predictors

2016-02-17T05:35:55.000Z · score: 0 (0 votes)

A possible training procedure for human-imitators

2016-02-16T22:43:52.000Z · score: 2 (2 votes)

Another view of quantilizers: avoiding Goodhart's Law

2016-01-09T04:02:26.000Z · score: 3 (3 votes)

A sketch of a value-learning sovereign

2015-12-20T21:32:45.000Z · score: 11 (2 votes)

Three preference frameworks for goal-directed agents

2015-12-02T00:06:15.000Z · score: 4 (2 votes)

What do we need value learning for?

2015-11-29T01:41:59.000Z · score: 3 (3 votes)

A first look at the hard problem of corrigibility

2015-10-15T20:16:46.000Z · score: 3 (3 votes)

Conservative classifiers

2015-10-02T03:56:46.000Z · score: 2 (2 votes)

Quantilizers maximize expected utility subject to a conservative cost constraint

2015-09-28T02:17:38.000Z · score: 3 (3 votes)

A problem with resource-bounded Solomonoff induction and unpredictable environments

2015-07-27T03:03:25.000Z · score: 2 (2 votes)

PA+100 cannot always predict modal UDT

2015-05-12T20:26:53.000Z · score: 3 (3 votes)

MIRIx Stanford report

2015-05-11T06:11:26.000Z · score: 1 (1 votes)

Reflective probabilistic logic cannot assign positive probability to its own coherence and an inner reflection principle

2015-05-07T21:00:10.000Z · score: 5 (5 votes)

Learning a concept using only positive examples

2015-04-28T03:57:24.000Z · score: 3 (3 votes)

Minimax as an approach to reduced-impact AI

2015-04-02T22:00:04.000Z · score: 3 (3 votes)