(answered: yes) Has anyone written up a consideration of Downs's "Paradox of Voting" from the perspective of MIRI-ish decision theories (UDT, FDT, or even just EDT)?
post by Jameson Quinn (jameson-quinn) · 2020-07-06T18:26:01.933Z · LW · GW · 9 comments
This is a question post.
Contents
Answers 10 strangepoop 8 Daniel Kokotajlo 3 Vanessa Kosoy 2 Dagon None 9 comments
The Paradox of Voting, simply stated, is that voting in a large election almost certainly isn't worth your time (unless you think it's the most fun thing you could be doing). The guaranteed opportunity cost of going to vote will in most cases easily and predictably outweigh the expected benefits — the chance that your vote (along with everyone else's) would be pivotal because the margin was 1 vote, multiplied by your expected marginal utility payoff from your chosen candidate winning.
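The comparison described above can be sketched in a few lines. This is only an illustration: the function names and every number below (pivot probability, payoff, cost) are hypothetical placeholders, not estimates taken from the post or from any real election.

```python
def expected_net_value_of_voting(p_pivotal: float, payoff_if_win: float, cost: float) -> float:
    """Expected benefit of voting minus the guaranteed opportunity cost."""
    return p_pivotal * payoff_if_win - cost


def break_even_payoff(p_pivotal: float, cost: float) -> float:
    """Payoff from your candidate winning at which voting exactly pays for itself."""
    return cost / p_pivotal


# Hypothetical inputs, chosen only to show the shape of the argument.
P_PIVOTAL = 1e-7   # assumed chance that the margin is exactly one vote
PAYOFF = 10_000.0  # assumed personal value of your candidate winning
COST = 20.0        # assumed value of the time spent voting

print(expected_net_value_of_voting(P_PIVOTAL, PAYOFF, COST))
print(break_even_payoff(P_PIVOTAL, COST))
```

Under these made-up inputs the net value is negative, which is the paradox in miniature; the rest of the thread is largely about what `p_pivotal` and `payoff_if_win` should really be.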
There are various well-known responses to this issue, listed in the Wikipedia article linked above. But to me, one of the obvious responses is to see this as just another instance of a chicken/snowdrift game, and to invoke the logic you might use to support cooperation in such games; that is, decision theory. I think this may even be one of the most common real-world instances where UDT/FDT might apply. I think it would also be a source of interesting edge cases for exploring the limits of UDT/FDT; that is, even small changes in how strictly you delimit which other (potential) voters to consider as UDT/FDT "co-agents" could easily swing the prescriptions you'd get. But doing a few quick Google searches doesn't turn up any write-ups considering this issue in this light. Am I missing something, or is this idea really "new" (at least, undocumented)?
ETA: Thanks to @Vanessa Kosoy, @Daniel Kokotajlo, and @strangepoop, I now have sufficient references for prior discussions of this idea. Thanks! Honorable mention to @Ikaxas, who suggested a connection to Kantian ethics which is relevant, though more remote than the references given by the above three.
Answers
There's a whole section on voting in the LDT For Economists page on Arbital. Also see the one for analytic philosophers, which has a few other angles on voting.
From what I can tell from your other comments on this page, you might already have internalized all the relevant intuitions, but it might be useful anyway. Superrationality is also discussed.
Sidenote: I'm a little surprised no one else mentioned it already. Somehow Arbital posts by Eliezer aren't considered as canon as the Sequences; maybe it's the structure (rather than just the content)?
Gary Drescher talks about this in his old book Good and Real (p. 299). It was especially cool in that it said that even altruist CDTers can't account for the rationality of voting in sufficiently large elections. I haven't verified whether that's true or not.
↑ comment by Lukas Finnveden (Lanrian) · 2020-07-08T18:56:49.835Z · LW(p) · GW(p)
It was especially cool in that it said that even altruist CDTers can’t account for the rationality of voting in sufficiently large elections.
That's pretty surprising. I checked out the page, and he unfortunately doesn't motivate what kind of model he's using, so it's hard to verify. From the book:
If the importance of the election is presumed proportionate to the size of the electorate, then for large enough elections, expected-utility calculations cannot justify the effort of voting by appeal to the small but heavily weighted possibility that your vote will be a tiebreaker. The odds of that outcome decrease faster than linearly with the number of voters, so the expected value of your vote as a tiebreaker approaches zero— even taking account of the value to everyone combined, not just yourself. Given enough voters, then, the causal value (even to everyone) of your vote is overshadowed by the inconvenience to you of going out to vote."
In an election with two choices, in a model where everybody has 50% chance of voting for either side, I don't think the claim is true. Maybe he's assuming that the outcomes of elections become easier to predict as they grow larger, because individual variability becomes less important? If everyone has a 51% probability of voting for a certain side, the election would be pretty much guaranteed for an arbitrarily large population, in which case a CDTer wouldn't have any reason to vote (even if there was a coalition of CDTers who could swing the election). I'm not sure if it's true that elections in larger countries are more predictable, though.
↑ comment by Vaniver · 2020-07-08T22:27:20.376Z · LW(p) · GW(p)
In an election with two choices, in a model where everybody has 50% chance of voting for either side, I don't think the claim is true.
I also think that in that case, the odds of a tie don't decrease faster than linearly, but you need to take into account symmetry arguments and precision arguments. That is:
Suppose there are 2N other voters and everyone else votes by flipping a coin. Then the number of votes for side A is binomially distributed, A ~ Binomial(2N, 0.5), with mean N; the votes for side B will be 2N-A, and the net votes A - B will be 2A-2N, with an expected value of 0.
But how likely is it to be 0 exactly (i.e. a tie that you flip to a win)? Well, that's the probability that A is N exactly, which is a decreasing function of N. Suppose N is 1,000 (i.e. there are 2,000 voters); then it's 1.7%. Suppose it's 1,000,000; then it's 0.05%. But 1.7% divided by a thousand is less than 0.05%.
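The figures above can be checked numerically. The following is a minimal sketch, assuming the same fair-coin model with 2N other voters; the function name `tie_probability` is mine, and the computation is done in log space because the binomial coefficients are far too large to evaluate directly.

```python
from math import exp, lgamma, log, pi, sqrt


def tie_probability(n: int) -> float:
    """P(exactly n of 2n fair-coin voters choose side A)."""
    # log C(2n, n) - 2n * log 2, evaluated with log-gamma to avoid overflow.
    log_p = lgamma(2 * n + 1) - 2 * lgamma(n + 1) - 2 * n * log(2)
    return exp(log_p)


for n in (1_000, 1_000_000):
    p = tie_probability(n)
    approx = 1 / sqrt(pi * n)  # Stirling approximation for comparison
    print(f"N={n:,}: P(tie)={p:.6f}  N*P(tie)={n * p:.1f}  1/sqrt(pi*N)={approx:.6f}")
```

Since the tie probability shrinks roughly like 1/sqrt(pi*N), the product N*P(tie) grows with N in this model, which is the sense in which the odds of a tie fall more slowly than linearly.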
But from the perspective of everyone in the election, it's not clear why 'you chose last.' Presumably everyone on the side with one extra vote would think "aha, it would have been a tied election if I hadn't voted," and splitting that up gives us our linear factor.
As well, this hinged on the probability being 0.5 exactly. If instead each voter favored A with probability 50.1%, the odds of a tie are basically unchanged for the 2,000 voter election (we've only shifted the number of expected A voters by 2), but drop to 1e-5 for the 2M voter election, a drop by a factor larger than a thousand relative to the 2,000 voter case. (The expected number of A voters is now 2,000 above the tie point, which is a much higher barrier to overcome by chance.)
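A small sketch of the biased-coin case, under the assumption that each of the 2N other voters independently picks A with a fixed probability p. The function name and the log-space evaluation are my own choices, not something from the comment.

```python
from math import exp, lgamma, log


def tie_probability(n: int, p: float) -> float:
    """P(exactly n of 2n voters choose A) when each chooses A with probability p."""
    log_choose = lgamma(2 * n + 1) - 2 * lgamma(n + 1)  # log C(2n, n)
    return exp(log_choose + n * log(p) + n * log(1 - p))


for n in (1_000, 1_000_000):
    fair = tie_probability(n, 0.5)
    biased = tie_probability(n, 0.501)
    print(f"N={n:,}: fair={fair:.3e}  p=0.501: {biased:.3e}  ratio={biased / fair:.4f}")
```

The ratio column shows how much a 0.1 point bias costs at each scale: close to 1 for the 2,000 voter election, and a small fraction for the 2M voter election.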
However, symmetry doesn't help us here. Suppose you have a distribution over the 'bias' of the coin the other voters are flipping; a tie is just as unlikely if A is favored as if B is favored, and the more spread out our distribution over the bias is, the worse the odds of a tie are, because for large elections only biases very close to p=0.5 contribute any meaningful chance of a tie.
↑ comment by Jameson Quinn (jameson-quinn) · 2020-07-14T11:55:19.735Z · LW(p) · GW(p)
Consider a 2-option election, with 2N voters, each of whom has probability p of choosing the first option. If p is a fixed number, then as N goes to infinity, (chances of an exact tie times N) go to 0 if p isn't exactly .5, and to infinity if it is. Since the event that p is exactly .5 has measure 0, this model supports the paradox of voting (PoV).
But! If p itself is drawn from an ordinary continuous distribution with nonzero probability density d around .5, then (chances of an exact tie times N) go to ... I think it's just d/2. Maybe there's some correction factor that comes into play for bizarre distributions of p, but if we make the conventional assumption that it's beta-distributed, then d/2 is the answer.
I think that the PoV literature is relying on the "fixed p" model. I think the "uncertain p" model is more realistic, but it's still worth engaging with "fixed p" and seeing the implications of those assumptions.
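The d/2 limit for the "uncertain p" model can be checked numerically when p is beta-distributed, because the tie probability then has a closed form: C(2N, N) * B(a + N, b + N) / B(a, b), where B is the beta function. The sketch below assumes exactly that setup; the function names and the example parameters (a = b = 1 and a = b = 50) are illustrative choices of mine.

```python
from math import exp, lgamma, log


def log_beta(x: float, y: float) -> float:
    """Natural log of the beta function B(x, y)."""
    return lgamma(x) + lgamma(y) - lgamma(x + y)


def tie_probability_beta(n: int, a: float, b: float) -> float:
    """P(exact tie among 2n voters) when p ~ Beta(a, b) and votes are i.i.d. given p."""
    log_choose = lgamma(2 * n + 1) - 2 * lgamma(n + 1)
    return exp(log_choose + log_beta(a + n, b + n) - log_beta(a, b))


def beta_density_at_half(a: float, b: float) -> float:
    """Density d of Beta(a, b) at p = 0.5."""
    return exp((a + b - 2) * log(0.5) - log_beta(a, b))


for a, b in ((1.0, 1.0), (50.0, 50.0)):
    d = beta_density_at_half(a, b)
    for n in (1_000, 1_000_000):
        scaled = n * tie_probability_beta(n, a, b)
        print(f"Beta({a:g},{b:g}) N={n:,}: N*P(tie)={scaled:.4f}  d/2={d / 2:.4f}")
```

For the uniform case (a = b = 1) the tie probability is exactly 1/(2N + 1), so N times it tends to 1/2, matching d/2 with d = 1.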
↑ comment by Lukas Finnveden (Lanrian) · 2020-07-08T19:13:01.470Z · LW(p) · GW(p)
As an aside, for really large populations, it would probably be socially optimal to only have a small fraction of the population voting (at least if we ignore things like legitimacy, feeling of participation, etc). As long as that fraction is randomly sampled, you could get good statistical guarantees that the outcome of the election would be the same as if everyone voted. South Korea did a pretty cool experiment where they exposed a representative sample of 500 people to pro- and anti-nuclear experts, and then let them decide how much nuclear power the country should have.
I don't think this is why CDTers refuse to vote, though.
↑ comment by Jameson Quinn (jameson-quinn) · 2020-07-07T14:07:17.729Z · LW(p) · GW(p)
That book is from 2006. I understand that it deals with the Paradox of Voting, but does it have anything that would be directly relevant to considering it in light of "acausal decision theories"? As far as I know, such theories pretty much didn't exist back then.
↑ comment by Daniel Kokotajlo (daniel-kokotajlo) · 2020-07-07T15:08:55.788Z · LW(p) · GW(p)
Drescher coined the term "acausal" in the context of decision theory, in Good and Real. His arguments and ideas are remarkably similar to things Yudkowsky and others on LessWrong have said in the decade or so since. One of my side projects (which I keep putting off) is to explore his proposed decision theory (which differs from CDT and EDT and, notably, one-boxes even in Transparent Newcomb!) in more detail, to see how it compares to stuff LessWrong talks about.
This idea is certainly not new. For example, in an essay about TDT from 2009, Yudkowsky wrote:
Some concluding chiding of those philosophers who blithely decided that the "rational" course of action systematically loses... And celebrating of the fact that rationalists can cooperate with each other, vote in elections, and do many other nice things that philosophers have claimed they can't...
(emphasis mine)
The relevance of TDT/UDT/FDT to voting surfaced in discussions many times, but possibly nobody wrote a detailed essay on the subject.
I don't think any of the more interesting decision theories differ from CDT on a trivial expected value calculation, with no acausal paths to the payoffs. How do you see it working? Can you put some probabilities and payoffs in place to show why you think this is relevant?
↑ comment by Jameson Quinn (jameson-quinn) · 2020-07-06T20:44:16.990Z · LW(p) · GW(p)
But there is an obvious acausal path in this case. If other voters are using the same algorithm you are to decide whether or not to vote, or a "sufficiently similar" one (in some sense that would have to be fleshed out), then that inflates the probability that "your" decision of whether or not to vote is pivotal, because "you" are effectively multiple voters.
Is that sufficient, or do you need actual numbers? (I'd guess it is and you don't.)
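For readers who do want numbers, here is a toy model of the "you are effectively multiple voters" point. It is my own simplification, not a formal FDT/UDT analysis: the 2N other voters flip fair coins, and a bloc of k voters who share your algorithm either all vote for A or all stay home. The bloc is counted as pivotal when A fails to win without it (ties included) and wins with it.

```python
from math import exp, lgamma, log


def bloc_pivotal_probability(n: int, k: int) -> float:
    """P(a bloc of k votes for A flips the result) given 2n fair-coin other voters."""
    # With a votes for A among the others, the bloc is pivotal when
    # a <= n (A does not win alone) and a + k > 2n - a (A wins with the bloc).
    lowest = max(0, n - (k - 1) // 2)
    log_norm = lgamma(2 * n + 1) - 2 * n * log(2)
    return sum(
        exp(log_norm - lgamma(a + 1) - lgamma(2 * n - a + 1))
        for a in range(lowest, n + 1)
    )


N = 1_000_000
single = bloc_pivotal_probability(N, 1)
for k in (1, 11, 101):
    p = bloc_pivotal_probability(N, k)
    print(f"k={k:>3}: P(pivotal)={p:.6f}  ratio to k=1: {p / single:.1f}")
```

In this model the pivot probability grows roughly in proportion to k as long as k is small compared with sqrt(N), so the disagreement in the thread reduces to how large k plausibly is.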
↑ comment by Dagon · 2020-07-06T20:52:04.210Z · LW(p) · GW(p)
I guess it is, but I'd edit your question to mention that you include https://en.wikipedia.org/wiki/Superrationality in your assumptions. Personally, I don't think that other potential voters are all that similar to myself, so all decision theories lead to the same result (negative EV for voting, when considering only cost of time spent vs chance of pivotal outcome).
↑ comment by Jameson Quinn (jameson-quinn) · 2020-07-06T21:03:16.371Z · LW(p) · GW(p)
I very much do not include superrationality in my assumptions. I'm not assuming that all other voters, or even any specific individual other voter, is explicitly using a meta-rational decision theory; I'm simply allowing the possibility that the "expected acausal impact" of my decision is greater than 0 other voters. There are, I believe, a number of ways this could be "true".
In simpler terms: I think that my beliefs (and definitions) about whether (how many) other voters are "like" me are orders of magnitude different from yours, in a way that is probably not empirically resolvable. I understand that taking your point of view as a given would make my original question relatively trivial, but I hope you understand that it is still an interesting question from my point of view, and that exploring it in that sense might even lead to productive insights that generalize over to your point of view (even though we'd probably still disagree about voting).
If you like, I guess, we could discuss this in a hypothetical world with a substantial number of superrational voters. For you this would merely be a hypothetical, which I think would be interesting for its own sake. For me, this would be a hypothetical special case of acausal links between voters, links which I believe do exist though not in that specific form.
9 comments
Comments sorted by top scores.
comment by Richard_Kennaway · 2020-07-06T22:12:02.599Z · LW(p) · GW(p)
the chance that your vote (along with everyone else's) would be pivotal because the margin was 1 vote,
I have never understood this criterion for your vote "mattering". It has the consequence that if (as will almost always be the case for a large electorate) the winner has a majority of at least 3, then no-one's vote mattered. If a committee of 5 people votes 4 to 1, then no-one's vote mattered. Two votes mattered, but no-one's vote mattered. If one of the yes voters had stayed at home that day, then every yes vote would matter, but the no vote wouldn't matter.
This does not seem like a sensible concept to attach to the word "matter". If someone on that committee was very anxious that the vote should go the way they would like, they will have done everything they could to persuade every other persuadable member to vote their way. Far from no-one's vote mattering, every vote in that situation matters. This is a frequent occurrence in parliamentary votes, when there is any doubt beforehand whether the motion will pass, and the result is of great importance to both sides. In the forthcoming US presidential election, both parties will be making tremendous efforts to "get out the vote". Yet no-one's vote "matters"?
↑ comment by Vaughn Papenhausen (Ikaxas) · 2020-07-07T22:36:48.913Z · LW(p) · GW(p)
There has been some philosophical work that makes just this point. In particular, Julia Nefsky (who I think has some EA ties?) has a whole series of papers about this. Probably the best one to start with is her PhilCompass paper here: https://onlinelibrary.wiley.com/doi/abs/10.1111/phc3.12587
Obviously I don't mean this to address the original question, though, since it's not from an FDT/UDT perspective.
↑ comment by Jameson Quinn (jameson-quinn) · 2020-07-06T23:12:22.648Z · LW(p) · GW(p)
I agree that this definition of "matters" is odd; it is not the one most people use in everyday speech. I think that there are ways to make other definitions rigorous (in ways that aren't addressed in the Wikipedia article I linked). But this is the narrowly consequentialist definition, so it does deserve analysis.
comment by Rafael Harth (sil-ver) · 2020-07-07T08:26:40.675Z · LW(p) · GW(p)
I echo Dagon's claim that there is no difference between CDT and FDT or UDT here (although with the disclaimer that I'm not an expert). This is so because you play the game with many other non-UDT agents, and UDT tends to do the same thing CDT does wrt. cooperation with other non-UDT agents. (Where non-UDT is everything that doesn't implement ideas from the TDT/UDT/FDT bundle.)
However, a reasonable calculation shows that a vote is worth quite a lot (at least if you live in a swing state) if you consider the benefit for everyone rather than just for yourself – which seems to be what rationalists tend to do anyway on things like x-risk prevention and charity. And if you don't live in a swing state, you can try to trade your vote with someone who does. (I believe EY did this in 2016.)
↑ comment by Jameson Quinn (jameson-quinn) · 2020-07-07T13:46:15.009Z · LW(p) · GW(p)
Wait, what?
"You play the game with many other CDT agents" — this seems demonstrably false, at least, if we accept the Paradox of Voting as being a thing, in which case, CDT agents have by assumption removed themselves from the game. (I understand your response that voting may be altruistically-CDT-rational; as you know, it's been discussed before, and very rightly so. But I also think it's still worth considering the boundedly-altruistic/diagonally-dominant case.)
It seems to me that the only way you can claim there's "many other CDT agents" is if "CDT" is being used as a catch-all for "not explicitly FDT/UDT", and I'd strongly dispute that usage. I think that memetically/genetically evolved heuristics are likely to differ systematically from CDT. It may be best to create an entirely separate model for people operating under such heuristics, but if you want to force them into a pure CDT-vs-UDT-vs-random-noise (ie, mixture distribution) paradigm, I'd say they would be substantially more than 0% UDT.
ETA: I guess I can parse "other voters are CDT" as a sensible assumption if you're explicitly doing repeated-game analysis, but such an analysis would pretty much dissolve both the Paradox of Voting and the CDT vs. acausal-DTs distinction.
↑ comment by Rafael Harth (sil-ver) · 2020-07-07T15:39:18.227Z · LW(p) · GW(p)
I think that memetically/genetically evolved heuristics are likely to differ systematically from CDT.
On reflection, I'm not sure whether I agree with this or not. I'll edit the post.
However, the point is non-essential. What I've said holds true if you replace "CDT" with "weird bundle of heuristics." The point is that it's not UDT: a UDT agent needs other agents to be UDT or similar to cooperate with them for stuff like voting. (Or at least that's what I believe is true and what matters for this question.) And I certainly think the UDT proportion is small enough to be modeled as 0.
↑ comment by Vaughn Papenhausen (Ikaxas) · 2020-07-07T16:15:35.838Z · LW(p) · GW(p)
I think there is a strong similarity between FDT (can't speak to UDT/TDT) and Kantian lines of thought in ethics. (To bring this out: the Kantian thought is roughly to consider yourself simply as an instance of a rational agent, and ask "can I will that all rational agents in these circumstances do what I'm considering doing?" FDT basically says "consider all agents that implement my algorithm or something sufficiently similar. What action should all those algorithm-instances output in these circumstances?" It's not identical, but it's pretty close.) Lots of people have Kantian intuitions, and to the extent that they do, I think they are implementing something quite similar to FDT. Lots of people probably vote because they think something like "well, if everyone didn't vote, that would be bad, so I'd better vote." (Insert hedging and caveats here about how there's a ton of debate over whether Kantianism is/should be consequentialist or not.) So they may be countable as at least partially FDT agents for purposes of FDT reasoning.
I think that memetically/genetically evolved heuristics are likely to differ systematically from CDT.
Here's a brief argument why they would (and why they might diverge specifically in the direction of FDT): the metric evolution optimizes for is inclusive genetic fitness, not merely fitness of the organism. Witness kin selection. The heuristics that evolution would install to exploit this would tend to be: act as if there are other organisms in the environment running a similar algorithm to you (i.e. those that share lots of genes with you), and cooperate with those. This is roughly FDT-reasoning, not CDT-reasoning.
↑ comment by Rafael Harth (sil-ver) · 2020-07-07T17:01:30.105Z · LW(p) · GW(p)
[...] Lots of people have Kantian intuitions, and to the extent that they do, I think they are implementing something quite similar to FDT.
I've never thought about this, but your comment is persuasive. I've un-endorsed my answer and moved it to the comments.
comment by jmh · 2020-07-07T00:16:37.337Z · LW(p) · GW(p)
Not sure where or if this fits into your thought or not. In many ways I see that both the paradox and many of the attempts to explain it may well stem from incorrectly specifying the question. The argument is that the payoff from voting for any given person is lower than the costs incurred, so why vote?
However, since people clearly do vote, isn't the better question to ask: what did we miss in specifying the equation that results in the implication that all these people are irrational and imposing costs on themselves?
In other words, rather than accepting the claimed paradox, why not just take the empirical observation and then look for the underlying explanation? Would a good scientist ever talk about the paradox of flight once observed?