# Stuart_Armstrong's Shortform

post by Stuart_Armstrong · 2019-09-30T12:08:13.617Z · LW · GW · 6 comments

## 6 comments


Lexicographical preference orderings seem to come naturally to humans. Sentiments like "no amount of money is worth one human life" are commonly expressed.

Now, that particular sentiment is wrong because money can be used to purchase human lives.

The other problem comes from using probability and expected utility, which makes anything lexicographically second completely worthless in all realistic cases. It's one thing to say that you prefer apples to pears lexicographically when there are ten of each lying around and everything is deterministic (just take the ten apples first, then the ten pears afterwards). But does it make sense to say that you'd prefer one chance in a trillion of extending someone's life by a microsecond over a billion euros of free consumption?
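To make the problem concrete, here is a minimal sketch of a strict lexicographic comparison applied to expected values. The pair layout (life-seconds first, euros second) and the helper name `expected_pair` are assumptions made for illustration only; the numbers are the ones from the example above.

```python
from fractions import Fraction

# Each option is scored as (expected life-seconds gained, expected euros gained).
# Python compares tuples lexicographically, which matches a strict
# "life first, money second" ordering applied to expected values.
def expected_pair(prob, life_seconds, euros):
    return (prob * life_seconds, prob * euros)

# One chance in a trillion of a one-microsecond life extension.
long_shot = expected_pair(Fraction(1, 10**12), Fraction(1, 10**6), 0)
# A certain billion euros of free consumption.
sure_money = expected_pair(Fraction(1), 0, 10**9)

# The first coordinate decides the comparison whenever it differs at all,
# so the money coordinate is never consulted here.
strict_prefers_long_shot = long_shot > sure_money
print(strict_prefers_long_shot)
```

Any nonzero expected gain in the first coordinate, however tiny, settles the comparison, which is why the second coordinate ends up mattering only on exact ties.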

So this short post will propose a more sensible, smoothed version of lexicographical ordering, suitable to capture the basic intuition, but usable with expected utility.

If the utility $U$ has lexicographical priority and $V$ is subordinate to it, then choose a value $a$ and maximise:

$$W = U + \frac{a}{2}\tanh(V)$$

In that case, increases in expected $V$ always cause non-trivial increases in expected $W$, but an increase in $U$ of $a$ will always be more important than any possible increase in $V$.
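A minimal sketch of one such smoothing, assuming the combined utility takes the form U + (a/2)·tanh(V). The names `u`, `v`, `a`, the function name, and the sample values are all illustrative; any bounded, strictly increasing squashing function whose total range is at most `a` would behave the same way.

```python
import math

def smoothed_lexicographic(u, v, a=1.0):
    """Combined utility u + (a/2) * tanh(v); tanh is an assumed squashing function."""
    return u + (a / 2.0) * math.tanh(v)

a = 1.0

# Raising the subordinate utility alone always raises the combined utility...
low_v = smoothed_lexicographic(0.0, -5.0, a)
high_v = smoothed_lexicographic(0.0, 5.0, a)

# ...but the total swing available from v is strictly less than a, so a gain
# of a in u wins even against a move from very bad v to very good v.
# (For very large |v|, floating-point tanh saturates at exactly +/-1,
# so moderate values are used here.)
u_gain_with_bad_v = smoothed_lexicographic(a, -5.0, a)
no_gain_with_good_v = smoothed_lexicographic(0.0, 5.0, a)

print(high_v > low_v, u_gain_with_bad_v > no_gain_with_good_v)
```

The parameter `a` sets the exchange rate: differences in the priority utility smaller than `a` can be outweighed by the subordinate one, while differences of `a` or more cannot.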

This seems related to scope insensitivity and availability bias. No amount of money (that I have direct control of) is worth one human life (in my Dunbar group). No money (which my mind exemplifies as $100k or whatever) is worth the life of my example human, a coworker. Even then, it's false, but it's understandable.

More importantly, categorizations of resources (and of people, probably) are map, not territory. The only rational preference ranking is over reachable states of the universe. Or, if you lean a bit far towards skepticism/solipsism, over sums of future experiences.

Preferences exist in the map, in human brains, and we want to port them to the territory with the minimum of distortion.

This is a link to "An Increasingly Manipulative Newsfeed" about potential social media manipulation incentives (e.g. Facebook).

I'm putting the link here because I keep losing the original post (since it wasn't published by me, but I co-wrote it).

Bayesian agents that knowingly disagree

A minor stub, caveating Aumann's agreement theorem; put here to reference in future posts, if needed.

Aumann's agreement theorem states that rational agents with common knowledge of each other's beliefs cannot agree to disagree. If they exchange their estimates, they will swiftly come to an agreement.

However, that doesn't mean that agents cannot disagree; indeed, they can disagree and know that they disagree. For example, suppose that there are a thousand doors; behind $999$ of these there are goats, and behind one there is a flying aircraft carrier. The two agents are in separate rooms, and a host will go into each room and execute the following algorithm: they will choose a door at random among the $999$ that contain a goat. Then, with probability $99/100$, they will tell that door number to the agent; with probability $1/100$, they will tell the door number with the aircraft carrier.

Then each agent will have probability $1/100$ of the named door being the aircraft carrier door, and probability $\frac{99}{100 \times 999} = \frac{11}{11100}$ on each of the other doors; so the most likely door is the one named by the host.
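The arithmetic can be checked exactly with a short sketch. It assumes 999 goat doors and a 1/100 chance that the host names the aircraft carrier door, which is the rate implied by the D100 version of the protocol; the constant names are illustrative.

```python
from fractions import Fraction

N_DOORS = 1000               # 999 goats, one aircraft carrier
P_TRUTH = Fraction(1, 100)   # assumed chance the host names the carrier door

# Probability that the named door hides the carrier.
p_named = P_TRUTH

# Otherwise the host named a random goat door, and by symmetry the carrier
# is equally likely to be behind any of the other N_DOORS - 1 doors.
p_each_other = (1 - P_TRUTH) / (N_DOORS - 1)

print(p_named, p_each_other, p_named > p_each_other)
```

The named door carries about ten times the probability of any other single door, even though it is still very unlikely to be the right one.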

We can modify the protocol so that the host will never name the same door to both agents (roll a D100; if it comes up 1, tell the truth to the first agent and lie to the second; if it comes up 2, do the opposite; anything else means tell a different lie to each agent). In that case, each agent will have a best guess for the aircraft carrier, and the certainty that the other agent's best guess is different.
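A small simulation of this modified protocol, as a sketch. It assumes a thousand doors with a single aircraft carrier and reads "a different lie" as two distinct goat doors chosen at random; the function name, seed, and trial count are arbitrary.

```python
import random

N_DOORS = 1000

def modified_host(rng):
    """Return (carrier door, door told to agent 1, door told to agent 2)."""
    carrier = rng.randrange(N_DOORS)
    goats = [d for d in range(N_DOORS) if d != carrier]
    roll = rng.randint(1, 100)           # the D100
    if roll == 1:                        # truth to agent 1, lie to agent 2
        return carrier, carrier, rng.choice(goats)
    if roll == 2:                        # lie to agent 1, truth to agent 2
        return carrier, rng.choice(goats), carrier
    lie_1, lie_2 = rng.sample(goats, 2)  # two distinct goat doors
    return carrier, lie_1, lie_2

rng = random.Random(0)
trials = [modified_host(rng) for _ in range(5000)]

# The two agents are never told the same door...
always_differ = all(d1 != d2 for _, d1, d2 in trials)
# ...and each agent is still told the truth roughly one time in a hundred.
truth_rate_1 = sum(c == d1 for c, d1, _ in trials) / len(trials)
truth_rate_2 = sum(c == d2 for c, _, d2 in trials) / len(trials)
print(always_differ, truth_rate_1, truth_rate_2)
```

From each agent's own point of view nothing has changed: their named door is still their best guess, but they now know for certain that the other agent's named door is a different one.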

If the agents exchanged information, they would swiftly converge on the same distribution; but until that happens, they disagree, and know that they disagree.
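To see what the shared distribution looks like, here is an exact calculation of the posterior once both named doors are known, as a sketch under the D100 protocol described above (a thousand doors, a uniform prior on the carrier's location, and lies drawn uniformly from the goat doors). The function and variable names are illustrative.

```python
from fractions import Fraction

N_DOORS = 1000
GOATS = N_DOORS - 1

def pooled_posterior():
    """Posterior over the carrier door given both named doors d1 != d2.

    Returns (probability for each named door, probability for each unnamed door).
    The uniform prior over the carrier's location cancels out.
    """
    # Carrier behind d1: roll was 1, and the lie to agent 2 happened to be d2.
    # (The case of the carrier behind d2 is symmetric.)
    like_named = Fraction(1, 100) * Fraction(1, GOATS)
    # Carrier behind some third door: roll was 3..100, and the ordered pair
    # of distinct lies happened to be (d1, d2).
    like_other = Fraction(98, 100) * Fraction(1, GOATS * (GOATS - 1))
    total = 2 * like_named + (N_DOORS - 2) * like_other
    return like_named / total, like_other / total

p_named, p_unnamed = pooled_posterior()
print(p_named, p_unnamed)
```

Under these assumptions, both agents end up with the same distribution after the exchange: the two named doors are tied as the most likely candidates, and every unnamed door is slightly less likely than it was before.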