
comment by Vladimir_Nesov · 2022-06-02T16:16:46.523Z · LW(p) · GW(p)

should the result be BE, then the row player would have known it, and he would have instead picked A rather than B

I think this thing that gets predictably falsified by accepted policy shouldn't be called knowledge; it's merely a meaningful proposition, and in this case a false one.

Replies from: gfourny
comment by Ghislain Fourny (gfourny) · 2022-06-02T16:47:58.666Z · LW(p) · GW(p)

Thank you for your comment, Vladimir_Nesov.

It is indeed correct that "the result be BE" is a false proposition in the real world. In fact, this is the reason why they are called counterfactuals and why the subjunctive mood ("would have") is used.

Nashian game theory is based on the indicative mood; for example, common knowledge is entirely built on the indicative mood (A knows that B knows that A knows, etc.). Semantically, knowledge can be modelled with set inclusion in Kripke semantics: A knows P if the set of accessible worlds (i.e., worlds compatible with A's actual knowledge) is included in the set of possible worlds in which P is true (and we can canonically identify this set with P, i.e., conflate a logical proposition with the set of worlds in which it is true).
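To make the set-inclusion reading concrete, here is a minimal Python sketch; the three worlds, the accessibility relation, and the proposition are all invented for illustration and do not come from the post.

```python
# A minimal sketch of "A knows P" as set inclusion in Kripke semantics.
# The worlds, the accessibility relation, and the proposition below are made up.

WORLDS = {"w_AE", "w_BE", "w_BF"}          # hypothetical possible worlds

# Worlds the row player cannot distinguish from each world (accessibility).
ACCESSIBLE = {
    "w_AE": {"w_AE", "w_BE"},
    "w_BE": {"w_AE", "w_BE"},
    "w_BF": {"w_BF"},
}

def knows(accessible: dict, world: str, proposition: set) -> bool:
    """The agent knows P at `world` iff every accessible world is in P."""
    return accessible[world] <= proposition   # set inclusion

P = {"w_BE", "w_BF"}                # a proposition = the set of worlds where it holds
print(knows(ACCESSIBLE, "w_BF", P)) # True: all accessible worlds are in P
print(knows(ACCESSIBLE, "w_BE", P)) # False: w_AE is accessible but not in P
```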

What is important to understand is that counterfactuals can be rigorously captured and anchored into the actual world. This has been researched in particular by Lewis and Stalnaker in the 1960s and 1970s. A statement such as "should the result be BE, then the row player would have known it" is mathematically modelled as a counterfactual implication P > Q, with P = "the result is BE" and Q = "the row player knows that the result is BE", where P and Q are predicates on all possible worlds (including counterfactual worlds) that can be understood as subsets of the set of all possible worlds.

Q is itself a compound predicate, because it is a knowledge statement: Q is true in a specific world ω if the set of worlds accessible to the row player from ω is included in P.

Counterfactuals imply some sort of distance between possible worlds: the more different they are, the farther apart they are. Given P and Q, the predicate P > Q is defined as true in a world ω if, in the closest world to ω in which P is true, Q is also true.

So all in all, the statement is formally modelled as the predicate P > Q being true in the actual world, and it is to be distinguished from the logical implication P ⇒ Q, which is trivially true in the actual world if P is false. The counterfactual implication P > Q, on the other hand, is not trivial and can potentially be false in the actual world even when P is false.
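As a minimal sketch of this closest-world semantics (the worlds, distances, and propositions below are toy stand-ins I made up, and ties in distance are ignored here), the contrast with the logical implication can be checked directly:

```python
# Sketch of the truth condition described above:
# P > Q holds at w iff Q holds at the closest P-world to w.
# Worlds, distances, and propositions are invented for illustration only.

WORLDS = {"w0", "w1", "w2"}

# Hypothetical distance between worlds.
DIST = {
    ("w0", "w0"): 0, ("w0", "w1"): 1, ("w0", "w2"): 2,
    ("w1", "w0"): 1, ("w1", "w1"): 0, ("w1", "w2"): 1,
    ("w2", "w0"): 2, ("w2", "w1"): 1, ("w2", "w2"): 0,
}

def closest_world(w: str, P: set) -> str:
    """The P-world nearest to w (assumes P is non-empty and ties don't matter)."""
    return min(P, key=lambda v: DIST[(w, v)])

def counterfactual(w: str, P: set, Q: set) -> bool:
    """P > Q at w: Q holds in the closest world to w where P holds."""
    return closest_world(w, P) in Q

def material(w: str, P: set, Q: set) -> bool:
    """P => Q at w: trivially true whenever P is false at w."""
    return (w not in P) or (w in Q)

P = {"w1", "w2"}     # stand-in for "the result is BE"
Q = {"w2"}           # stand-in for "the row player knows the result is BE"

# At w0, P is false, so the logical implication is trivially true...
print(material("w0", P, Q))        # True
# ...but the counterfactual is false: the closest P-world is w1, which is not in Q.
print(counterfactual("w0", P, Q))  # False
```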

I hope this helps clarify! For more details, the 1973 book by Lewis, "Counterfactuals", is a very interesting read.

Replies from: Vladimir_Nesov
comment by Vladimir_Nesov · 2022-06-02T17:50:42.357Z · LW(p) · GW(p)

Counterfactuals imply some sort of distance between possible worlds [...] predicate P>Q is defined as true in a world ω if, in the closest world to ω in which P is true, Q is also true

What does "closest world to ω in which P is true" mean? Is this still data extracted from a Kripke frame, a set of worlds plus accessibility, or does this need more data ("some sort of distance")? What sort of distance is this, and what if there are multiple worlds in P at the same distance from ω, possibly with different truth values for Q?

Keeping to the example at hand, what are the possible worlds/counterfactuals here, just the (row, column) pairs? Their combination with possibly-false-in-that-world beliefs? Something else intractably informal that can't be divided by equivalence for irrelevant distinctions to give an explicit description? What is the accessibility in the Kripke frame? What are the distances? Is "the result is BE" just the one-world proposition true in the world BE? If some of the assumptions in my questions are wrong (as I expect them to be), what are the worlds where "the result is BE" holds? What does it mean to enact row A in the situation where the result is BE (or believed to be BE)? Or is "he would have instead picked A rather than B" referring to something that shouldn't be thought of as enactment?

It is indeed correct that "the result be BE" is a false proposition in the real world

(I meant that it's false in the world where it's believed: as a result of that belief, row A would be taken instead of B in that world, so the belief that row B is taken there is false. I didn't mean to imply that I'm talking about the real world.)

comment by Charlie Steiner · 2022-06-02T11:04:53.937Z · LW(p) · GW(p)

So is the idea that when there's only one Pareto-optimal outcome above both players' fallback price, they'll pick it?

In this case, it seems like the failure mode for this concept is when there's more than one such outcome and each player prefers a different one, but by chance there's a barely-above-fallback outcome at the intersection of the outcomes the players really prefer, causing the procedure to return something that might not even be Pareto-optimal. (Normally that situation would be indicated by there being no equilibrium of this type; the actual outcome might then still depend on details of the players' source code, even after they meet your rationality standards.)

Or maybe I'm misunderstanding, since you said you had a proof?

E.g.:

(5,1) (6,2) (4,4)
(9,6) (2,2) (2,6)
(5,5) (6,9) (1,5)
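Reading each cell as (row payoff, column payoff), and assuming "fallback" means the pure-strategy maximin, here is a small sketch that just lists the Pareto-optimal cells and the two fallback values for this matrix; it is not the post's procedure.

```python
# Payoff matrix from the example above: cell (i, j) = (row payoff, column payoff).
payoffs = [
    [(5, 1), (6, 2), (4, 4)],
    [(9, 6), (2, 2), (2, 6)],
    [(5, 5), (6, 9), (1, 5)],
]

cells = [(i, j, payoffs[i][j]) for i in range(3) for j in range(3)]

def dominated(p, q):
    """q Pareto-dominates p: at least as good for both, strictly better for one."""
    return q[0] >= p[0] and q[1] >= p[1] and q != p

pareto = [(i, j, p) for i, j, p in cells
          if not any(dominated(p, q) for _, _, q in cells)]

# Pure-strategy maximin ("fallback") for each player.
row_maximin = max(min(r for r, _ in row) for row in payoffs)
col_maximin = max(min(payoffs[i][j][1] for i in range(3)) for j in range(3))

print(pareto)                     # [(1, 0, (9, 6)), (2, 1, (6, 9))]
print(row_maximin, col_maximin)   # 4 4
```

It prints two Pareto-optimal cells, (9,6) and (6,9), both above the fallback of 4 for each player and each preferred by a different player, which is the ambiguous case described above.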

comment by Coafos (CoafOS) · 2022-06-02T02:04:51.679Z · LW(p) · GW(p)

Have you heard about Infra-Bayesianism [LW · GW]?

If I get it correctly, the core idea is to "consider every possible scenario and use a maximin policy while caring about counterfactual branches", which is very similar to the idea presented in the linked post. The "Nirvana trick" in the other post is similar to just eliminating the branches/cells where the agent would take a different action from the predicted policy.

Non-Nashian game theory is Pareto-optimal, and Infra-Bayesianism implements Updateless Decision Theory (UDT). If the two are connected, that could mean that UDT and Pareto-optimality are connected too.

comment by philh · 2022-06-10T19:05:33.906Z · LW(p) · GW(p)

Apparently every finite game has at least one Nash equilibrium if you allow mixed strategies (i.e., players select probability distributions over their moves, not single moves). Do you happen to know how mixed strategies affect the existence and uniqueness of PTEs?
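As a minimal illustration of that parenthetical (matching pennies is my example, not the article's): a game can have no pure Nash equilibrium while the uniform mix over moves is one.

```python
from itertools import product

# Matching pennies: zero-sum, no pure Nash equilibrium, mixed one at (1/2, 1/2).
A = [[1, -1], [-1, 1]]            # row player's payoffs; the column player gets the negation

def pure_nash():
    eqs = []
    for i, j in product(range(2), range(2)):
        row_ok = all(A[i][j] >= A[k][j] for k in range(2))    # row can't gain by deviating
        col_ok = all(-A[i][j] >= -A[i][k] for k in range(2))  # column can't gain either
        if row_ok and col_ok:
            eqs.append((i, j))
    return eqs

print(pure_nash())   # [] -- no pure equilibrium

# Against the uniform mix, every pure action of the opponent yields expected payoff 0,
# so neither player can profit by deviating: (1/2, 1/2) vs (1/2, 1/2) is an equilibrium.
p = q = [0.5, 0.5]
row_expected = [sum(q[j] * A[i][j] for j in range(2)) for i in range(2)]
col_expected = [sum(p[i] * -A[i][j] for i in range(2)) for j in range(2)]
print(row_expected, col_expected)   # [0.0, 0.0] [0.0, 0.0]
```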

(Aside, it seems that every image appears twice in the article.)

comment by Epirito (epirito) · 2022-06-01T23:00:07.190Z · LW(p) · GW(p)

I see the Nash equilibrium as rationally justified in a limit-like sort of way: it is what you get as you get arbitrarily close to perfect rationality. Having a good enough model of another's preferences is something you can actually achieve, or almost achieve, but you can't really have a good enough grasp of your opponent's source code to acausally coerce him into cooperating with you unless you have truly God-like knowledge (or maybe if you are in a very particular situation, such as something involving AI and literal source code).

As a mere mortal becomes smarter and smarter, he becomes more and more able to get the best deal by having a better grasp of the Nashian game-theoretic intricacies of a given situation, but he won't become any more able to trade acausally. It's all or nothing.

I think your whole line of reasoning is a bit like objecting to calculus on the grounds that instantaneous change is an oxymoron (as people did when calculus still rested on less rigorous foundations). Non-Nashian game theory is technically correct, but less useful, just like pointing out to Leibniz that "(x^2 + 0 - x^2)/(x + 0 - x) = undefined" or whatever.

comment by Roman Leventov · 2022-11-17T06:43:02.428Z · LW(p) · GW(p)

I think this is an important direction of work. Despite lots of concerns on this forum about the interpretability and explainability of ML, I think that in practice (in the worlds where we survive, at least) we should expect AI agents cooperating within systems (or playing against each other in games, or a mixture of these two modes) to be more transparent to each other than humans are to each other.

People always think in private; they sometimes give away their "real" thoughts and intentions via facial microexpressions and flushing, but less so in the era of remote communication.

AIs, on the other hand, always learn by receiving and processing data and crunching numbers on it, and we should probably expect that we will build infrastructure for logging this data and looking for signs of deception in the weights or activations selectively saved for future security processing. Moreover, if a single integrated stream of thought (like the one people have in their heads) proves important for general intelligence, we should expect all these streams of thought to be recorded.

I think it's also important to put insights from mathematical constructs (including UDT, FDT, superrationality, and games with perfect prediction) on a physical footing. Here [LW(p) · GW(p)], I argued that FDT should be seen as a group (collective) decision theory, i.e., a piece of collective intelligence. In "An active inference model of collective intelligence", Kaufmann et al. propose a physics-based explanatory theory of collective intelligence (I don't have an opinion about this theory; I just point to it as one of the proposals out there).

In such theories and game setups, I think it's important to consider bounded rationality (see "Resource-rational analysis: Understanding human cognition as the optimal use of limited computational resources" and "Information overload for (bounded) rational agents"), communication costs, communication delays, the delays (and costs) of reaching consensus, and the non-zero cost of updating one's own beliefs (which bears on "free will" and the assumption that agents can defect momentarily, which leads to the idea of pre-commitments in the Newcomb problem and Parfit's hitchhiker, without considering that basically any belief is a micro pre-commitment to that belief). Also, in iterated games, we should make sure we don't model agents as making ergodicity assumptions when that is not "real-world rational".