Do agents with (mutually known) identical utility functions but irreconcilable knowledge sometimes fight?

post by mako yass (MakoYass) · 2023-08-23T08:13:05.631Z · LW · GW · 1 comment

This is a question post.


Been pondering: will conflict always exist? A major subquestion: suppose we all merge utility functions and form an interstellar community devoted to optimizing the merger. It'll probably make sense for us to specialize in different parts of the work, which means accumulating specialist domain knowledge and becoming mutually illegible.

When people have very different domain knowledge, they also fall out of agreement about where the borders of their domains lie. (EG: A decision theorist insists that they know things about the trajectory of AI that ML researchers don't. The ML researchers don't believe them and don't heed their advice.) In these situations, even when all parties are acting in good faith, they know that they won't be able to reconcile certain disagreements, and it may seem to make sense, from some perspectives, to try to just impose their own way in those disputed regions.

Would there be any difference between the dispute resolution methods that would be used here, and the dispute resolution methods that would be used between agents with different core values? (war, peace deals, and most saliently,)

Would the parties in the conflict use war proxies that take physical advantages in different domains into account? (EG: Would the decision theorist block ML research in disputed domains where their knowledge of decision theory would give them a force advantage?)

Answers

answer by A.H. · 2023-08-23T10:50:45.748Z · LW(p) · GW(p)

even when all parties are acting in good faith, they know that they won't be able to reconcile certain disagreements, and it may seem to make sense, from some perspectives, to try to just impose their own way in those disputed regions.

Aumann's agreement theorem, which is discussed in the paper 'Are Disagreements Honest?' by Hanson and Cowen, suggests that perfectly rational agents (updating via Bayes' theorem) should not disagree in this fashion, even if their life experiences were different, provided that their opinions on all topics are common knowledge and they have common priors. This is often framed as saying that such agents cannot 'agree to disagree'.

I'm a bit hazy on the details, but broadly, two agents with common priors but different evidence (i.e. different life experiences or expertise) can share their knowledge and mutually update on it, eventually converging on an agreed probability distribution.
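To make this concrete, here is a minimal sketch (my own toy numbers, assuming a shared prior and independent pieces of evidence, i.e. the idealized setting rather than anything from the papers above): before pooling, the two agents disagree; after pooling all the evidence, Bayes' theorem forces the same posterior on both.

```python
from math import prod

def posterior(prior_h, likelihoods_h, likelihoods_not_h):
    """P(H | evidence) when the pieces of evidence are independent given H."""
    p_e_given_h = prod(likelihoods_h)
    p_e_given_not_h = prod(likelihoods_not_h)
    numerator = prior_h * p_e_given_h
    return numerator / (numerator + (1 - prior_h) * p_e_given_not_h)

prior = 0.5  # common prior on hypothesis H

# Each agent's private evidence: per-observation likelihoods under H and under not-H.
alice_h, alice_not_h = [0.8, 0.7], [0.3, 0.4]
bob_h, bob_not_h = [0.2, 0.6], [0.5, 0.5]

print(posterior(prior, alice_h, alice_not_h))                       # Alice alone: ~0.82
print(posterior(prior, bob_h, bob_not_h))                           # Bob alone:   ~0.32
print(posterior(prior, alice_h + bob_h, alice_not_h + bob_not_h))   # pooled:      ~0.69 for both
```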

Of course, humans are not perfectly rational, so this rarely happens (this is discussed in the Hanson/Cowen paper). There are some results which seem to suggest that you can relax some of the assumptions of Aumann's theorem to make them more realistic and still get similar results. Scott Aaronson showed that Aumann's theorem holds (to a high degree) even when the agreement of agents over priors isn't perfect and the agents can exchange only limited amounts of information.

Maybe the agents who are alive in the future will not be perfectly rational, but I guess we can hope that they will be rational enough to get close enough to agreement that they don't fight over important issues.

comment by tailcalled · 2023-08-23T14:38:54.498Z · LW(p) · GW(p)

More detailed comment than mine, so strong upvote. However, there's one important error in the comment:

Of course, humans are not perfectly rational, so this rarely happens

Actually, it constantly happens. For instance, yesterday I had a call with my dad and told him about my vacation in Norway, where the Bergen train had been cancelled due to the floods. He believed me, which is an immediate example of Aumann's agreement theorem applying.

Furthermore, there were a bunch of things I had to do to handle the cancellations, which also relied on Aumannian agreement. For instance, I didn't know where I could get news about the floods, which put me in disagreement with Google and Twitter, which had a bunch of concrete suggestions; so I adopted Google's/Twitter's view and then investigated further to update more. I also didn't know where I could get alternate transportation, but again Google had some flight suggestions that I Aumann-agreed to and then investigated further.

As another example, in Norway I was at a museum about an explorer who sailed the Atlantic on a bamboo raft. At first I had some disagreements with the museum, e.g. I didn't know that one of the people on the raft fell in the water and had to be rescued, but the museum told me that he did and so I Aumann-agreed with that.

I think Aumann agreement is the default thing that happens when communicating; it's just that usually it happens so quickly that we don't even register it as a "disagreement". Persistent public disagreements require that the preconditions for Aumann's theorem fail, and so our idea of "disagreement" ends up connoting precisely the disagreements where Aumann's theorem fails.

comment by A.H. (AlfredHarwood) · 2023-08-23T22:41:51.520Z · LW(p) · GW(p)

Good point! Notably, some of your examples are 'one-way': one party updated while the other did not. In the case of Google/Twitter and the museum, you updated but they didn't, so this sounds like standard Bayesian updating, not specifically Aumann-like (though maybe this distinction doesn't matter, as the latter is a special case of the former).

When I wrote the answer, I guess I was thinking about Aumann updating where both parties end up changing their probabilities (i.e. Alice starts with a high probability for some proposition P and Bob starts with a low probability for P and, after discussing their disagreement, they converge to a middling probability). This didn't seem to me to be as common among humans.

In the example with your dad, it also seems one-way: he updated and you didn't. However, maybe the fact that he didn't know there was a flood would have caused you to update slightly, but the update would be so small as to be negligible. So I guess you are right, and that would count as an Aumann agreement!

Your last paragraph is really good. I will ponder it...

comment by tailcalled · 2023-08-24T21:11:57.882Z · LW(p) · GW(p)

When I wrote the answer, I guess I was thinking about Aumann updating where both parties end up changing their probabilities (i.e. Alice starts with a high probability for some proposition P and Bob starts with a low probability for P and, after discussing their disagreement, they converge to a middling probability). This didn't seem to me to be as common among humans.

I think this is the wrong picture to have in mind for Aumannian updating. It's about pooling evidence, and sometimes you can end up with more extreme views than you started with. While the exact way you update can vary depending on the prior and the evidence, one simple example I like is this:

You both start with having your log-odds being some vector x according to some shared prior. You then observe some evidence y, updating your log-odds to be x+y, while they observe some independent evidence z, updating their log-odds to be x+z. If you exchange all your information, then this updates your shared log-odds to be x+y+z, which is most likely going to be an even more radical departure from x than either x+y or x+z alone.

While this general argument is overly idealistic because it assumes independent evidence, I think the point that Aumannian agreement doesn't mean moderation is important.
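A tiny numeric sketch of the log-odds version (made-up numbers, and again assuming the two pieces of evidence really are independent):

```python
from math import exp

def prob(log_odds):
    """Convert log-odds back to a probability."""
    return 1 / (1 + exp(-log_odds))

x = 0.0  # shared prior: log-odds 0, i.e. P = 0.5
y = 1.5  # your private evidence, in log-odds units
z = 2.0  # their independent private evidence

print(prob(x + y))      # your view before talking:   ~0.82
print(prob(x + z))      # their view before talking:  ~0.88
print(prob(x + y + z))  # shared view after pooling:  ~0.97, more extreme than either
```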

That said, there is one place where Aumannian agreement locally leads to moderation: if, during the conversation, you both learn that the sources you relied on were unreliable, then presumably you would mostly revert to the prior. However, in the context of politics (probably the main place where people want to think about this), the sources tend to be political coalitions, so updating that they were unreliable means updating that one cannot trust any political coalition. In a sense that is already common knowledge, but taking it seriously is quite radical, because then you need to start doubting all the things you thought you knew.

Good point! Notably, some of your examples are 'one-way': one party updated while the other did not. In the case of Google/Twitter and the museum, you updated but they didn't, so this sounds like standard Bayesian updating, not specifically Aumann-like (though maybe this distinction doesn't matter, as the latter is a special case of the former).

There were a couple of multi-way cases too. For instance, at one point we told someone that we intended to take the Bergen train, expecting that this would resolve the disagreement (they hadn't known we were planning to take it). But they continued disagreeing and told us that the Bergen train was cancelled, which instead updated us towards thinking we wouldn't be taking it.

But I think disagreements would generally be exponentially short? Each time you share a piece of information that you expect to change the other person's mind, the probability that they still haven't changed their mind drops by a constant factor, so it falls off exponentially with the number of pieces of information shared.
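As a toy illustration (assuming each shared piece of information independently changes the other person's mind with some fixed probability p, which is obviously an idealization):

```python
p = 0.5  # chance that any one piece of information changes their mind

for n in range(1, 6):
    # probability they *still* haven't changed their mind after n pieces of information
    print(n, (1 - p) ** n)
# prints 0.5, 0.25, 0.125, ... -- persistent disagreement becomes exponentially unlikely
```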

comment by dr_s · 2023-08-24T10:27:19.350Z · LW(p) · GW(p)

I'm a bit hazy on the details, but broadly, two agents with common priors but different evidence (i.e. different life experiences or expertise) can share their knowledge and mutually update on it, eventually converging on an agreed probability distribution.

But whether I believe the info you give me depends on my belief in your credibility, and vice versa. So it's entirely possible to exchange information and still end up with different posteriors.

comment by mako yass (MakoYass) · 2023-08-23T20:59:03.850Z · LW(p) · GW(p)

You seem to be looking away from the aspect of the question where usefully specialized agencies cannot synchronize domain knowledge (a condition which reasserts itself because the value of specialization is an incentive to deepen knowledge differences over time, and to bring differently specialized agents closer together; though of course, they need to be mutually legible in some ways to benefit from it). This is the most interesting and challenging part of the question, so that was kind of galling.

But the Aaronson paper is interesting. It's possible that it addresses this. Thanks for that.

answer by StartAtTheEnd · 2023-11-16T09:07:11.125Z · LW(p) · GW(p)

I have thought about similar things, with just humans as the subject. I'm hoping that the overlap is great enough that some of these ideas may be useful.

Firstly, what is meant by conflict? Hostility? More intelligent agents seem to fight less as the scope they consider increases. Stupid people have a very small scope, which might even be limited to themselves and the moment in question. Smarter people start thinking about the future and the environment they're in (as an extension of themselves). People who are smarter still seem to dislike tribalism and other conflict between small groups; their scope is larger, often being a superset of at least both groups, which leads to the belief that conflict between these groups is mistaken and only of negative utility. (But perhaps any agent will wish to increase the coherence inside the scope of whatever system they wish to improve.)

Secondly, by thinking about games, I have come to the idea of "sportsmanship", which is essentially good-faith conflict. You could say it's competition with mutual benefit. Perhaps this is not the right way to put it, since the terms seem contradictory. Anyway, I've personally come to enjoy interacting with people who are different from me, who think differently and even value things differently than I do. This can sometimes lead to good-faith conflict akin to what's seen in games. Imagine a policeman who has caught a criminal, only for the criminal to say "Got me! Impressive, how did you manage?", and for the policeman to reply, perhaps, "It wasn't easy, you're a sneaky one! You see, I noticed that ...".

I've personally observed a decrease in this "sportsmanship", but it correlates with something like intelligence or wisdom. I don't know if it's a function of intelligence or of humanity (or morality), though.

That said, my own "Live and let live", or even "Play whatever role you want and I will do the same", kind of thinking might be a result of an exotic utility function. My self-actualization has gotten me much higher on the needs hierarchy than the greed and egoism you tend to see in any person who fears for their own survival/well-being. Perhaps this anti-Moloch way of living is a result of wanting to experience life rather than attempting to control/alter it, which is another interesting idea.

answer by tailcalled · 2023-08-23T10:22:15.379Z · LW(p) · GW(p)

Most investigations into this question end up proving something called Aumann's Agreement Theorem, which roughly speaking states that if the different agents correctly trust each other, then they will end up agreeing with each other. Maybe there are some types of knowledge differences which prevent this once one deviates from ideal Bayesianism, but if so, it is not known what they are.

answer by Gurkenglas · 2023-08-23T09:44:28.664Z · LW(p) · GW(p)

Nope! If we're optimizing the merger, we'd see that problem coming and install whatever transhumanism is necessary to avert this.

comment by mako yass (MakoYass) · 2023-08-23T09:50:07.220Z · LW(p) · GW(p)

It's not necessarily going to be seen as a problem; it would be seen as an unavoidable inefficiency.

Note, I don't expect the fight to play out. It's a question about what sorts of tensions the conflict resolution processes reflect. This is explained in the question body.

comment by Gurkenglas · 2023-08-23T10:19:48.324Z · LW(p) · GW(p)

The Future has way too much freedom to design minds for that inefficiency to be unavoidable. "Who would win a fight?" is completely irrelevant to who is right, so why should they pay attention to it? They could just, say, concatenate their memories and let the resulting mind decide. If that depends on the order, the programmer needs to be fired.

comment by Gurkenglas · 2023-08-23T11:12:46.329Z · LW(p) · GW(p)

To be fair, your note guessed correctly that I had misread your question's last two paragraphs, and I'm overly attached to my initial response. But my reasoning holds up: cryptography is hard because the attacker moves last; mind design is easy because nature doesn't get to respond to our design. The reason we'd get conflict is nostalgia, much as Star Trek's Federation judged an Enterprise manned by holograms to be like Disneyland without children.

1 comment


comment by mako yass (MakoYass) · 2023-08-23T08:20:32.047Z · LW(p) · GW(p)

Conjecture: There is no way to simplify the analysis of the situation, or the negotiation process, by paraphrasing an irreconcilable epistemic conflict as a values conflict (there is no useful equivalence between an error theory and a conflict theory). I expect this to be so because the conflict is a result of irreducible complexity in the knowledge sets (and the parties' inability to hold the same knowledge). So applying another transform to the difference between the knowledge sets won't give you a clearer image of the disputed borders. You just won't be able to apply the transform.

(Note: if true, this would be a useful thing to say to many conflict theorists: by exaggerating differences in material interests, you make your proposals less informed and so less legitimate.)