Random thoughts on game theory and what it means to be a good person
It seems to me that there isn't any good writing on game theory from a TDT (timeless decision theory) perspective. Whenever I read classical game theory, I feel like the equilibria being described obviously fall apart when counterfactuals are properly brought into the mix (like D/D in prisoner's dilemmas).
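To see concretely how D/D gets shaky once exact copies are in play, here is a minimal Python sketch (my own toy illustration; the payoff numbers and bot names are invented) of the program-equilibrium idea: a player that cooperates exactly with literal copies of itself outscores, against itself, the always-defect player that the classical analysis recommends.

```python
import inspect

# Toy payoff matrix for a one-shot prisoner's dilemma:
# keys are (my_move, their_move); values are (my_payoff, their_payoff).
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def defect_bot(opponent_source):
    """The classical 'dominant strategy' player: always defect."""
    return "D"

def mirror_bot(opponent_source):
    """Cooperate exactly when the opponent is a literal copy of me."""
    return "C" if opponent_source == inspect.getsource(mirror_bot) else "D"

def play(a, b):
    """Each player sees the other's source code before moving."""
    return PAYOFFS[(a(inspect.getsource(b)), b(inspect.getsource(a)))]

print(play(defect_bot, defect_bot))  # (1, 1): the classical D/D equilibrium
print(play(mirror_bot, mirror_bot))  # (3, 3): exact copies coordinate on C/C
```

Of course this trick only covers the easy case of literal copies; as soon as the opponent's code merely resembles yours, the equality check fails, which is exactly where things get confusing.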
The obvious problem with TDT-based game theory, just as with Bayesian epistemology, is that the vast majority of direct applications are completely computationally intractable. It's kind of obvious what should happen in games with lots of copies of yourself, but as soon as any participant isn't a precise copy, everything gets a lot more confusing. So it is not fully clear what a practical game-theory literature from a TDT perspective would look like, though maybe the existing LessWrong literature on Bayesian epistemology would be a good inspiration.
Even when you can't fully compute everything (and we don't even really know how to compute everything in principle), you might still be able to go through concrete scenarios and list considerations and perspectives that incorporate TDT. I guess in that sense a significant fraction of Zvi's writing could be described as practical game theory, though I do think there is a lot of value in trying to formalize the theory and make things as explicit as possible, which I feel like Zvi doesn't do most of the time.
Critch (Academian) tends to take the perspective of trying to figure out what a "robust agent" would do, in the sense of an agent that would at the very least be able to reliably cooperate with copies of itself, and that adopts cooperation and coordination principles which allow it to achieve very good equilibria with agents that adopt the same kind of norms. And I do think there is something really valuable here, though I am also worried that the part where you have to cooperate with agents who haven't adopted very similar cooperation norms is actually the more important one (at least until something like AGI).
And I do think that the majority of the concepts we have for what it means to be a "good person" are ultimately attempts at figuring out how to coordinate effectively with other people, in a way that a more grounded game theory would help a lot with.
Maybe a good place to start would be to brainstorm a list of concrete situations in which I am uncertain what the correct action is. Here is an attempt at that:
How should one deal with threats to take strongly negative-sum actions? What is the correct response in the following concrete instances?
You are in the room with someone holding the launch buttons for the USA’s nuclear arsenal and they are threatening to launch them if you don’t hand over your wallet
You are the head of state of the U.S., and another nation state is threatening a small-scale nuclear attack on one of your cities if you don't provide some kind of economic subsidy to them
You are at a party and your assigned driver ended up drinking, even though they said they would not (the driver was chosen by a random draw)
I feel like I have some hint of an answer to all of these, but any answer I can come up with makes me exploitable in some way, which leaves me suspecting there is no meta-level on which an ideal strategy exists.
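To make the exploitability worry concrete, here is a toy model (a sketch of my own, with invented numbers) in which a threatener only issues threats when your concession policy makes threatening profitable:

```python
# Toy threat game; all numbers are invented for illustration.
CONCESSION_VALUE = 10    # what the threatener gains (and you lose) if you concede
THREAT_COST_SELF = 5     # cost to the threatener of carrying out the threat
THREAT_COST_VICTIM = 50  # cost to you if the threat is carried out
FOLLOW_THROUGH = 0.5     # chance an ignored threat is actually carried out

def threatener_ev(p_concede):
    """Threatener's expected gain from issuing a threat, given the
    probability p_concede that your policy gives in."""
    return (p_concede * CONCESSION_VALUE
            - (1 - p_concede) * FOLLOW_THROUGH * THREAT_COST_SELF)

def victim_ev(p_concede):
    """Your expected loss per issued threat under the same policy."""
    return -(p_concede * CONCESSION_VALUE
             + (1 - p_concede) * FOLLOW_THROUGH * THREAT_COST_VICTIM)

for p in (0.0, 0.5, 1.0):
    issued = threatener_ev(p) > 0  # threats only come when they pay
    loss = victim_ev(p) if issued else 0.0
    print(f"p_concede={p}: threat issued={issued}, your expected loss={loss}")
```

In this toy model, never conceding deters threats entirely, but that conclusion is fragile: it assumes the threatener reads and responds to your policy, which is exactly what fails against agents who don't share your decision theory.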

ccnation on Is AI safety doomed in the long term?
I think the best way to deal with AI alignment is to create AI not as a separate entity, but as an extension and augmentation of ourselves. We are much better at using AI in narrow contexts than in real-world AGI scenarios, and we still have time to think about this before willy-nilly making autonomous agents. If humans can use AI and their own smarts to create functional brain-computer interfaces, the problem of aligned AI may not become a problem at all. Because the artificial intelligence is just an extension of yourself, of course it will be aligned with you - it is you! What I mean is that as humans become better at interfacing with technology, the line between AI and human blurs.

alexei on Newcomb's Problem: A Solution
Seems fine as a practical solution. But it's still nice to do the math to figure out the formula, just like we have a formula for gravity.
" in this case, "trust" is equivalent to changing the payout structure to include points for self-image and social cohesion "
I guess I'm just trying to model trust in TD without changing the payoff matrix. The payoff matrix of the "vague" TD works in promoting trust--a player has no incentive breaking a promise.rorschak on Why the empirical results of the Traveller’s Dilemma deviate strongly away from the Nash Equilibrium and seems to be close to the social optimum?

rorschak on Why the empirical results of the Traveller's Dilemma deviate strongly away from the Nash Equilibrium and seems to be close to the social optimum?
This is true. The issue is that the Nash equilibrium formulation of TD predicts that everyone will bid $2, which is counter-intuitive and does not match empirical findings.
I'm trying to convince myself that the NE formulation of TD is not entirely rational.

rorschak on Why the empirical results of the Traveller's Dilemma deviate strongly away from the Nash Equilibrium and seems to be close to the social optimum?
If Alice claims close to $100 (say, $80), Bob gets a higher payoff by claiming $100 (getting $78) than by claiming $2 (getting $4).
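For concreteness, here is a short Python sketch of the standard Traveller's Dilemma payoff rule (the $2 bonus/penalty version; my own illustrative code), reproducing the numbers above:

```python
def td_payoff(my_claim, other_claim, bonus=2):
    """Traveller's Dilemma: both players are paid the lower claim;
    the lower claimant gets a bonus, the higher claimant pays a penalty."""
    low = min(my_claim, other_claim)
    if my_claim < other_claim:
        return low + bonus
    if my_claim > other_claim:
        return low - bonus
    return low

# Alice claims $80; Bob's payoff for a few candidate claims:
for bob_claim in (100, 80, 79, 2):
    print(bob_claim, td_payoff(bob_claim, 80))
# -> 100: 78, 80: 80, 79: 81, 2: 4
```

Note that Bob's best response to $80 is actually $79 (getting $81), not $100; iterating that undercutting logic all the way down is what drives the Nash equilibrium to $2, even though both players claiming near $100 pays far better.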

shminux on Does the Higgs-boson exist?
"for all you know maybe all that exists is just you with your current set of memories and observations?"
That solipsistic model is not very useful, is it? It doesn't offer any useful predictions, so why entertain it?
"If so, I'm curious what your views on values and decision making are."
I posted about my views on decision making about a year ago. That model seems quite useful to me, as it avoids the pitfalls of logical vs. environmental counterfactuals, and a bunch of otherwise confusing dilemmas.
"If you're agnostic about the existence of everything except your current memories and observations and models"
I didn't say I made an exception at all. I just don't like using terms like "exist", "real" and "true"; they can be quite misleading. If anything, I would suggest people try to taboo them and see what happens to the statements they make.
"how do you figure out what actions are better than other actions?"
Like most people, I have an illusion of making decisions; that is the implication of our current best physics models. The post I linked above explains how to compare possible worlds, which is the closest one can get to "making decisions" without implying magical free will separate from physical processes.
If what you are really asking is "how do you reconcile Model A, which you use in situation 1, with Model B, which you use in situation 2?", then my reply is that every model has its own domain of validity and breaks when stretched beyond it. There is nothing unusual about that: in physics, quantum mechanics and general relativity are both very useful yet mutually incompatible models. You can probably name a few like that in your own area of expertise.

milan-cvitkovic on Is value drift net-positive, net-negative, or neither?
I would argue that the concept of value drift (meaning "a change in human values from whatever they are currently") isn't really sensible to talk about. Here's a reductio argument to that effect: Avoiding bad value drift is as important as solving value alignment: https://docs.google.com/document/d/1TDA9vHBT7kN9oJ69-MEtbXZ_GiAGk3aRhLXFgkLv8XM/edit?usp=sharing
It's hard to compare values on their "goodness". I prefer to think of them as phenotypes and compare them on their adaptive benefits to the agents that hold them. After all, it doesn't really matter what's right: it matters what wins.

habryka4 on Newcomb's Problem: A Solution
Hey, you've been making a lot of comments lately, and to be honest I've been failing to parse a large fraction of them, and another significant fraction that I have been able to parse haven't been very good. I think it would be better for you to make slightly fewer comments and invest more time in each individual one.
(This isn't really a moderator warning yet, but I do think it's plausible that we would give you a temporary ban if you continue commenting at your current volume and quality level.)

slider on 0.999...=1: Another Rationality Litmus Test
You can have f(x) > g(x) for all x but lim f(x) = lim g(x) = 0. Just because f gets there "later" does not mean it gets any less deep.
Repeating decimals are far enough removed from terminating decimals that it's like mixing rationals and integers.
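A concrete pair satisfying the inequality-and-equal-limits claim above (my own example, not from the comment):

```latex
f(x) = \frac{2}{x}, \quad g(x) = \frac{1}{x} \qquad (x > 0):
\qquad f(x) > g(x) \text{ for all } x > 0, \quad \text{yet} \quad
\lim_{x \to \infty} f(x) = \lim_{x \to \infty} g(x) = 0.
```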