Posts

Practical post on 4 #ProRep methods (Proportional Representation) 2021-01-09T14:56:29.431Z
How should I back up and redo, in a publicly-edited article? 2020-07-28T19:07:13.046Z
[not ongoing] Thoughts on Proportional voting methods 2020-07-22T02:46:27.490Z
What should be the topic of my LW mini-talk this Sunday (July 18th)? 2020-07-16T16:32:54.241Z
(answered: yes) Has anyone written up a consideration of Downs's "Paradox of Voting" from the perspective of MIRI-ish decision theories (UDT, FDT, or even just EDT)? 2020-07-06T18:26:01.933Z
Is voting theory important? An attempt to check my bias. 2019-02-17T23:45:57.960Z
EA grants available (to individuals) 2019-02-07T15:17:38.921Z
Does the EA community do "basic science" grants? How do I get one? 2019-02-06T18:10:00.827Z
A Rationalist Argument for Voting 2018-06-07T17:05:42.668Z
The Devil's Advocate: when is it right to be wrong? 2018-05-15T17:12:16.681Z
Upvotes, knowledge, stocks, and flows 2018-05-10T14:18:15.087Z
Multi-winner Voting: a question of Alignment 2018-04-17T18:51:09.062Z
5 general voting pathologies: lesser names of Moloch 2018-04-13T18:38:41.279Z
A voting theory primer for rationalists 2018-04-12T15:15:23.823Z

Comments

Comment by jameson-quinn on What is going on in the world? · 2021-01-19T02:20:02.558Z · LW · GW

Our sense-experiences are "unitary" (in some sense which I hope we can agree on without defining rigorously), so of course we use unitary measure to predict them. Branching worlds are not unitary in that sense, so carrying over unitarity from the former to the latter seems an entirely arbitrary assumption.

A finite number (say, the number of particles in the known universe), raised to a finite number (say, the number of Planck time intervals before dark energy tears the universe apart), gives a finite number. No need for divergence. (I think both of those are severe overestimates for the actual possible branching, but they are reasonable as handwavy demonstrations of the existence of finite upper bounds)

Comment by jameson-quinn on What is going on in the world? · 2021-01-18T22:09:15.535Z · LW · GW

I don't think the point you were arguing against is the same as the one I'm making here, though I understand why you think so.

My understanding of your model is that, simplifying relativistic issues so that "simultaneous" has a single unambiguous meaning, total measure across quantum branches of a simultaneous time slice is preserved; and your argument is that, otherwise, we'd have to assign equal measure to each unique moment of consciousness, which would lead to ridiculous "Boltzmann brain" scenarios. I'd agree that your argument is convincing that different simultaneous branches have different weight according to the rules of QM, but that does not at all imply that total weight across branches is constant across time.

Comment by jameson-quinn on D&D.Sci II Evaluation and Ruleset · 2021-01-18T16:38:14.490Z · LW · GW

I didn't do this problem, but I can imagine I might have been tripped up by the fact that "hammer" and "axe" are tools and not weapons. In standard DnD terminology, these are often considered "simple weapons"; distinct from "martial weapons" like warhammer and battleaxe, but still within the category of "weapons".

I guess that the "toolish" abstractions might have tipped me off, though. And even if I had made this mistake, it would only have mattered for "simple-weapon" tools with a modifier.

Comment by jameson-quinn on What is going on in the world? · 2021-01-18T14:57:19.739Z · LW · GW

This is certainly a cogent counterargument. Either side of this debate relies on a theory of "measure of consciousness" that is, as far as I can tell, not obviously self-contradictory. We won't work out the details here.

In other words: this is a point on which I think we can respectfully agree to disagree.

Comment by jameson-quinn on What is going on in the world? · 2021-01-17T22:05:37.619Z · LW · GW

It seems to me that exact duplicate timelines don't "count", but duplicates that split and/or rejoin do. YMMV.

Comment by jameson-quinn on What is going on in the world? · 2021-01-17T19:41:39.207Z · LW · GW

I think both your question and self-response are pertinent. I have nothing to add to either, save a personal intuition that large-scale fully-quantum simulators are probably highly impractical. (I have no particular opinion about partially-quantum simulators — even possibly using quantum subcomponents larger than today's computers — but they wouldn't change the substance of my not-in-a-sim argument.)

Comment by jameson-quinn on Excerpt from Arbital Solomonoff induction dialogue · 2021-01-17T17:41:59.565Z · LW · GW

Yes, your restatement feels to me like a clear improvement.

In fact, considering it, I think that if algorithm A is "truly more intelligent" than algorithm B, and f(x) is the compute that it takes for B to perform as well as or better than A does with compute x, I'd expect that f(x) could even be super-exponential in x. Exponential would be the lower bound; what you'd get from a mere incremental improvement in pruning. From this perspective, anything polynomial would be "just implementation", not "real intelligence".

Comment by jameson-quinn on What is going on in the world? · 2021-01-17T16:42:38.670Z · LW · GW

Though I've posted 3 more-or-less-strong disagreements with this list, I don't want to give the impression that I think it has no merit. Most specifically: I strongly agree that "Institutions could be way better across the board", and I've decided to devote much of my spare cognitive and physical resources to gaining a better handle on that question specifically in regards to democracy and voting.

Comment by jameson-quinn on What is going on in the world? · 2021-01-17T16:37:45.104Z · LW · GW

Third, separate disagreement: This list states that "vastly more is at stake in [existential risks] than in anything else going on". This seems to reflect a model in which "everything else going on" — including power struggles whose overt stakes are much much lower — does not substantially or predictably causally impact outcomes of existential risk questions. I think I disagree with that model, though my confidence in this is far, far less than for the other two disagreements I've posted.

Comment by jameson-quinn on What is going on in the world? · 2021-01-17T16:31:37.588Z · LW · GW

Separate point: I also strongly disagree with the idea that "there's a strong chance we live in a simulation". Any such simulation must be either:

  • fully-quantum, in which case it would require the simulating hardware to be at least as massive as the simulated matter, and probably orders of magnitude more massive. The log-odds of being inside such a simulation must therefore be negative by at least those orders of magnitude.
  • not-fully-quantum, in which case the quantum branching factor per time interval is many many many orders of magnitude less than that of an unsimulated reality. In this case, the log-odds of being inside such a simulation would be very very very negative.
  • based on some substrate governed by physics whose "computational branching power" is even greater than quantum mechanics, in which case we should anthropically expect to live in that simulator's world and not this simulated one.

Unlike my separate point about the great filter, I can claim no special expertise on this; though both my parents have PhDs in physics, I couldn't even write the Dirac equation without looking it up (though, given a week to work through things, I could probably do a passable job reconstructing Shor's algorithm with nothing more than access to Wikipedia articles on non-quantum FFT). Still, I'm decently confident about this point, too.

Comment by jameson-quinn on What is going on in the world? · 2021-01-17T16:15:00.396Z · LW · GW

Strongly disagree about the "great filter" point.

Any sane understanding of our prior on how many alien civilizations we should have expected to see is structured (or at least, has much of its structure that is) more or less like the Drake equation: a series of terms, each with more or less prior uncertainty around it, that multiply together to get an outcome. Furthermore, that point is, to some degree, fractal; the terms themselves can be — often and substantially, though not always and completely — understood as the products of sub-terms.

By the Central Limit Theorem (applied to the logarithm of the product), as the number of such terms and sub-terms increases, this prior approaches a log-normal distribution; that is, if you take the inverse (proportional to the amount of work we'd expect to have to do to find the first extraterrestrial civilization), the mean is much higher than the median, dominated by a long upper tail. That point applies not just to the prior, but to the posterior after conditioning on evidence. (In fact, as we come to have less uncertainty about the basic structure of the Drake-type equation, that is, about which terms it comprises, even though we may still have substantial uncertainty about the values of those terms, the argument that the posterior must be approximately log-normal only grows stronger than it was for the prior.)
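To make the "mean much higher than median" point concrete, here is a minimal simulation sketch. Everything in it is an illustrative assumption rather than an actual Drake-equation estimate: the number of terms, the log-uniform uncertainty of two orders of magnitude per term, and the function name `simulate_product` are all made up for the demonstration.

```python
import random
import statistics

def simulate_product(num_terms, num_draws, seed=0):
    """Draw products of independent, positive, uncertain factors.

    Assumption for illustration: each factor is log-uniform between
    10**-1 and 10**1, so the log of the product is a sum of i.i.d. terms.
    """
    rng = random.Random(seed)
    products = []
    for _ in range(num_draws):
        # Sum the log10 of each factor; the CLT applies to this sum.
        log10_product = sum(rng.uniform(-1.0, 1.0) for _ in range(num_terms))
        products.append(10.0 ** log10_product)
    return products

products = simulate_product(num_terms=8, num_draws=20000)
mean = statistics.fmean(products)
median = statistics.median(products)
print(f"mean = {mean:.2f}, median = {median:.2f}, ratio = {mean / median:.1f}")
```

The median sits near the "typical" product, while the mean is pulled far upward by rare draws in which most factors happen to be large; that is the long upper tail described above.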

In this situation, given the substantial initial uncertainty about the value of the terms associated with steps that have already happened, the evidence we can draw from the Great Silence about any steps in the future is very, very weak.

As a statistics PhD with professional experience in Bayesian inference, I have pretty high confidence in the above. That is, I would be willing to bet on this at basically any odds, as long as the potential payoff was high enough to compensate me for the time it would take to do due diligence on the bet (that is, make sure I wasn't going to get "cider in my ear", as Sky Masterson says). That's not to say that I'd bet strongly against any future "Great Filter"; I'd just bet strongly against the idea that a sufficiently well-informed observer would conclude, post-hoc, that the bullet point above about the "great filter" was at all well-justified based on the evidence implicitly cited.

Comment by jameson-quinn on Excerpt from Arbital Solomonoff induction dialogue · 2021-01-17T06:11:26.093Z · LW · GW

I'm not sure if this comment goes best here, or in the "Against Strong Bayesianism" post. But I'll put it here, because this is fresher.

I think it's important to be careful when you're taking limits. 

I think it's true that "The policy that would result from a naive implementation of Solomonoff induction followed by expected utility maximization, given infinite computing power, is the ideal policy, in that there is no rational process (even using arbitrarily much computing power) that leads to a policy that beats it." 

But say somebody offered you an arbitrarily large-and-fast, but still finite, computer. That is to say, you're allowed to ask for a googolplex operations per second and a googolplex of RAM, or even Graham's number of each, but you have to name a number then live with it. The above statement does NOT mean that the program you should run on that hyper-computer is a naive implementation of Solomonoff induction. You would still want to use the known tricks for improving the efficiency of Bayesian approximations; that is, things like MCMC, SMC, efficient neural proposal distributions with importance-weighted sampling, efficient pruning of simulations to just the parts that are relevant for predicting input (which, in turn, includes all kinds of causality logic), smart allocation of computational resources between different modes and fallbacks, etc. Such tricks, even just the ones we have already discovered, look a lot more like "intelligence" than naive Solomonoff induction does, even if, when appropriately combined, their limit as computation goes to infinity is the same as the limit of Solomonoff induction as computation goes to infinity.

In other words, saying "the limit as amount-of-computation X goes to infinity of program A, strictly beats program B with amount Y of finite computation, for any B and Y"; or even "the limit as amount-of-computation X goes to infinity of program A, is as good or better than the limit as amount-of-computation Y goes to infinity of program B, for any B" ... is true, but not very surprising or important, because it absolutely does not imply that "as computation X goes to infinity, program A with X resources beats program B with X resources, for any B".

Comment by jameson-quinn on Practical post on 4 #ProRep methods (Proportional Representation) · 2021-01-11T14:17:10.712Z · LW · GW

PLACE is compatible with primaries; primaries would still be used in the US.

Thus, PLACE has all the same (weak) incentives for the local winner to represent any nonpartisan interests of the local district, along with strong incentives to represent the interests of their party X district combo. The extra (weaker) incentives for those other winners who have the district in their territory to represent the interests of their different party X district combos, to fill out the matrix, make PLACE's representation strictly better.

Comment by jameson-quinn on Practical post on 4 #ProRep methods (Proportional Representation) · 2021-01-11T14:09:53.751Z · LW · GW

Also worth noting that both AV and FPTP are winner-take-all methods, unlike the proportional methods I discuss here. The AV referendum question was essentially "do you want to take a disruptive half-step that lines you up for maybe, sometime in the future, actually fixing the problem?"; I'm not the only one who believes it was intentionally engineered to fail.

Comment by jameson-quinn on Practical post on 4 #ProRep methods (Proportional Representation) · 2021-01-10T16:29:23.154Z · LW · GW

It seems that most of what you're talking about are single-winner reforms (including single-winner pathologies such as center squeeze). In particular, the RCV you're talking about is RCV1, single-winner, while the one I discuss in this article is RCV5, multi-winner; there are important differences. For discussing single-winner, I'd recommend the first two articles linked at the top; this article is about multi-winner reforms.

Personally, I think that the potential benefits of both kinds of reform are huge, but there are some benefits that only multi-winner can give. For instance, no single-winner reform can really fix gerrymandering, while almost any multi-winner one will.

The issue of politicians not wanting to "do surgery on the hand that feeds them" (I like that metaphor) is a real one. The four methods I've chosen to discuss are all chosen partly with an eye to that issue; that is, to being as nondisruptive as possible to incumbents (unless those incumbents owe their seats to gerrymandering, in which case, fixing gerrymandering has to take precedence). Actually, of the four methods, RCV5 is the most disruptive, so if this is your main concern, I'd look more closely at the other three methods I discuss.

Comment by jameson-quinn on Practical post on 4 #ProRep methods (Proportional Representation) · 2021-01-10T16:22:38.775Z · LW · GW

Formally speaking, nothing. Indirectly speaking: the candidate is a Schelling point for voters in those districts, especially if they are not excited by the that-party candidate in their own district. So those voters are a potential source of direct votes for that candidate, which help win not just directly, but also by moving the candidate up in the preference order that gets filled in on ballots cast for other candidates.

Comment by jameson-quinn on Practical post on 4 #ProRep methods (Proportional Representation) · 2021-01-10T16:18:59.781Z · LW · GW

This is not an article about the specific circumstances in the US. Suffice it to say that, while you make good points, I stand by my assessment that things are more hopeful for electoral reform in the US some time in the next decade, than they have been in my 25 years of engagement with the issue. That doesn't mean hopes are very high in an absolute sense, but they're high enough to be noticeably higher.

Comment by jameson-quinn on Practical post on 4 #ProRep methods (Proportional Representation) · 2021-01-10T00:31:42.666Z · LW · GW

You're right, the sentence you quoted is only a small part of the necessary ingredients for reform; finding a proposal that's minimally disruptive to incumbents (unless they owe their seat to gerrymandering) is key to getting something passed; and even then, it's a heavy lift. 

The 4 methods I chose here are the ones I think have the best chances, from exactly those perspectives. It's still a long shot, but IMO realistic enough to be worth talking about. 

Comment by jameson-quinn on A voting theory primer for rationalists · 2021-01-08T00:03:47.994Z · LW · GW

You've described, essentially, a weighted-seats closed-list method.

List methods: meh. It's actually possible to be biproportional — that is, to represent both party/faction and geography pretty fairly — so reducing it to just party (and not geography or even faction) is a step down IMO. But you can make reasonable arguments either way.

Closed methods (party, not voters, decides who gets their seats): yuck. Why take power from the people to give it to some party elite?

Weighted methods: who knows, it's scarcely been tried. A few points:

  • If voting weights are too unequal, then effective voting power can get out of whack. For instance, if there are 3 people with 2 votes each, and 1 person with 1 vote, then that last person has no power to ever shift the majority, even though you might have thought they had half as much power as the others.
  • I think that part of the point of representative democracy is deliberation in the second stage. For that purpose, it's important to preserve cognitive diversity and equal voice. So that makes me skeptical of weighted methods. But note that this is a theoretical, not an empirical, argument, so add whatever grains of salt you'd like.
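The first bullet's example can be checked by brute force. The sketch below counts, for each voter in a weighted majority vote, the coalitions of other voters that this voter could tip from losing to winning (a raw Banzhaf-style swing count). The function name `swing_counts` and the strict-majority quota are assumptions made for this illustration.

```python
from itertools import combinations

def swing_counts(weights):
    """Count, per voter, the coalitions of others that the voter can swing.

    Assumes a strict-majority quota: more than half of the total weight.
    """
    quota = sum(weights) // 2 + 1
    counts = [0] * len(weights)
    for i, w in enumerate(weights):
        others = [x for j, x in enumerate(weights) if j != i]
        for size in range(len(others) + 1):
            for coalition in combinations(others, size):
                s = sum(coalition)
                # Voter i is critical if the coalition loses without
                # them and wins with them.
                if s < quota <= s + w:
                    counts[i] += 1
    return counts

# Three voters with 2 votes each and one voter with 1 vote.
print(swing_counts([2, 2, 2, 1]))
```

With weights 2, 2, 2, 1 the quota is 4 of 7, and the other voters' totals are always even, so adding the single vote never crosses the quota: the last voter's swing count is zero.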

General points: I like your willingness to experiment; it is possible to design voting methods that are better than even the best ones in common use. But it's not easy, so I wouldn't want to adopt a method that somebody had just come up with; important to at least let experienced theoreticians kick it around some first.

Comment by jameson-quinn on [not ongoing] Thoughts on Proportional voting methods · 2020-10-01T20:17:09.295Z · LW · GW

V0.9.1: Another terminology tweak. New terms: Average Voter Effectiveness, Average Voter Choice, Average Voter Effective Choice. Also, post-mortem will happen in a separate document. (This same version might later be changed to 1.0)

Comment by jameson-quinn on [not ongoing] Thoughts on Proportional voting methods · 2020-10-01T20:16:45.785Z · LW · GW

V0.9: I decided to make a few final edits (mostly, adding a summary and a post-mortem section, which is currently unfinished) then freeze this as is.

(This was done some time ago but I forgot to post it here.)

Comment by jameson-quinn on A voting theory primer for rationalists · 2020-08-31T14:25:36.347Z · LW · GW

You seem to be comparing Arrow's theorem to Lord Vetinari, implying that both are undisputed sovereigns? If so, I disagree. The part you left out about Arrow's theorem — that it only applies to ranked voting methods (not "systems") — means that its dominion is far more limited than that of the Gibbard-Satterthwaite theorem.

As for the RL-voting paper you cite: thanks, that's interesting. Trying to automate voting strategy is hard; since most voters most of the time are not pivotal, the direct strategic signal for a learning agent is weak. In order to deal with this, you have to give the agents some ability, implicit or explicit, to reason about counterfactuals. Reasoning about counterfactuals requires making assumptions, or having information, about the generative model that they're drawn from; and so, that model is super-important. And frankly, I think that the model used in the paper bears very little relationship to any political reality I know of. I've never seen a group of voters who believe "I would love it if any two of these three laws pass, but I would hate it if all three of them passed or none of them passed" for any set of laws that are seriously proposed and argued-for.

Comment by jameson-quinn on [not ongoing] Thoughts on Proportional voting methods · 2020-08-08T16:26:16.602Z · LW · GW

V0.7.3 Still tweaking terminology. Now, Vote Power Fairness, Average Voter Choice, Average Voter Effectiveness. Finished (at least first draft) of closed list/Israel analysis.

Comment by jameson-quinn on [not ongoing] Thoughts on Proportional voting methods · 2020-08-04T21:43:50.878Z · LW · GW

V 0.7.2: A terminology change. New terms: Retroactive Power, Effective Voting Equality, Effective Choice, Average Voter Effectiveness. (The term "effective" is a nod to Catherine Helen Spence). The math is the same except for some ultimately-inconsequential changes in when you subtract from 1. Also, started to add a closed list example from Israel; not done yet.

Comment by jameson-quinn on [not ongoing] Thoughts on Proportional voting methods · 2020-08-01T21:39:06.893Z · LW · GW

V 0.7.1: added a digression on dimensionality, in italics, to the "Measuring "Representation quality", separate from power" section. Finished converting the existing examples from RF to VW.

Comment by jameson-quinn on [not ongoing] Thoughts on Proportional voting methods · 2020-07-30T15:51:50.216Z · LW · GW

V 0.7.0: Switched from "Representational Fairness" to the more-interpretable "Vote Wastage". Wrote enough so that it's possible to understand what I mean by VW, but this still needs revision for clarity/convincingness. Also pending, change my calculations for specific methods from RF to VW.

Comment by jameson-quinn on [not ongoing] Thoughts on Proportional voting methods · 2020-07-29T15:25:16.009Z · LW · GW

I am rewriting the overall "XXX: a xxx proportionality metric" section because I've thought of a more-interpretable metric. So, where it used to be "Representational fairness: an overall proportionality metric", now it will be "Vote wastage: a combined proportionality metric". Here's the old version, before I erase it:


Since we've structured RQ_d as an "efficiency" — 100% at best, 0% at worst — we can take each voter's "quality-weighted voter power" (QWVP) to be the sum of their responsibility for electing each candidate, times their RQ_1 for that candidate. Ideally, this would be 1 for each voter; so we can define the overall "quality-weighted proportionality" (QWP) of an outcome as the average of squared differences between voters' QWVP and 1, shifted and scaled so that no difference gives a QWP of 100% and uniform zeros gives a QWP of 0. (Note that in principle, a dictatorship could score substantially less than 0, depending on the number of voters).

(To do: better notation and LaTeX)

Since realistic voting methods will usually have at least 1 Droop quota of wasted votes (or, in the case of Hare-quota-based methods, just over half a Hare quota of double-powered votes and just under half of wasted votes; which amounts to much the same thing in QWP terms), the highest QWP you could reasonably expect for a voting method would be S/(S+1).
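A small sketch of the QWP arithmetic described above, under stated assumptions. The formula used here, one minus the mean squared difference between each voter's QWVP and 1, is my reading of "shifted and scaled so that no difference gives 100% and uniform zeros gives 0"; the function name `qwp`, the toy electorates, and the choice to give a dictator all N units of power are hypothetical.

```python
from fractions import Fraction

def qwp(qwvp_values):
    """Quality-weighted proportionality: 1 - mean((QWVP_i - 1)**2).

    Assumed reading of the definition: all-ones gives 1 (100%),
    all-zeros gives 0.
    """
    n = len(qwvp_values)
    mse = sum((Fraction(q) - 1) ** 2 for q in qwvp_values) / n
    return 1 - mse

# Ideal case: every voter has QWVP 1.
print(qwp([1] * 10))

# S = 4 seats, one Droop quota (1/(S+1) of voters) wasted.
# Assumes the remaining voters each have QWVP exactly 1.
print(qwp([1] * 4 + [0] * 1))

# Dictatorship among 10 voters, assuming the dictator holds all 10
# units of power so that average power is still 1.
print(qwp([10] + [0] * 9))
```

Under these assumptions the wasted-quota case comes out to S/(S+1), and the dictatorship score falls further below zero as the number of voters grows.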

(show the math for the QWP of the IRV example above. Key point: the D>C>A>B voters have zero responsibility for electing A, so all they do is lower the average RQ of the B>C>A>D and C>B>A>D voters)

Note that this QWP metric, in combining the ideas of overall equality and representation quality, is no longer perfectly optimizing either of those aspects in itself. That is to say, in some cases it will push methods to sacrifice proportionality, in search of better representation, in a way that would tend to hurt the MSE from God's perspective. I think those cases are likely to be rare enough, especially for voting methods that weren't specifically designed to optimize to this metric, that I'm OK with this slight mis-alignment. That is to say: I think the true ideal quality ordering would be closer to a lexical sort with priority on proportionality ("optimize for proportionality, then optimize RQ only insofar as it doesn't harm proportionality"); but I think most seriously-proposed, practically-feasible voting methods are far enough from the Pareto frontier that "optimize the product of the two" is fine as an approximation of that ideal goal.

One more note: in passing, this rigorous framework for an overarching proportional metric also helps define the simple concept of "wasted vote"; any vote with 0 responsibility for electing any winner. Although "wasted votes" are already commonly discussed in the political science literature, I believe this is actually the first time the idea has been given a general definition, as opposed to ad-hoc definitions for each voting method.

Comment by jameson-quinn on [not ongoing] Thoughts on Proportional voting methods · 2020-07-27T15:26:20.119Z · LW · GW

V 0.6.0: Coined the term "Representational Fairness" for my metric. Did a worked example of Single transferable vote (STV), and began to discuss the example. Bumping version because I'm now beginning to actually discuss concrete methods instead of just abstract metrics.

Comment by jameson-quinn on [not ongoing] Thoughts on Proportional voting methods · 2020-07-25T13:13:14.441Z · LW · GW

V 0.5.5: wrote a key paragraph about NESS: similar outcomes just before Pascal's Other Wager? (The Problem of Points). Added the obvious normalizing constant so that average voter power is 1. Analyzed some simple plurality cases in Retrospective voting power in single-winner plurality.

Comment by jameson-quinn on [not ongoing] Thoughts on Proportional voting methods · 2020-07-23T15:13:51.553Z · LW · GW

V 0.5.4: Meaningful rewrite to "Shorter "solution" statement", which focuses not on power to elect an individual, but power to elect some member of a set, of whom only 1 won in reality.

Comment by jameson-quinn on [AN #109]: Teaching neural nets to generalize the way humans would · 2020-07-22T19:19:17.868Z · LW · GW

Finding "Z-best" is not the same as finding the posterior over Z, and in fact differs systematically. In particular, because you're not being a real Bayesian, you're not getting the advantage of the Bayesian Occam's Razor, so you'll systematically tend to get lower-entropy-than-optimal (aka more-complex-than-optimal, overfitted) Zs. Adding an entropy-based loss term might help — but then, I'd expect that H already includes entropy-based loss, so this risks double-counting.

The above critique is specific and nitpicky. Separately from that, this whole schema feels intuitively wrong to me. I think there must be ways to do the math in a way that favors a low-entropy, causally-realistic, quasi-symbolic likelihood-like function that can be combined with a predictive, uninterpretably-neural learned Z to give a posterior that is better at intuitive leaps than the former but better at generalizing than the latter. All of this would be intrinsic, and human alignment would be a separate problem. Intuitively it seems to me that trying to do human alignment and generalizability using the same trick is the wrong approach.

Comment by jameson-quinn on [AN #109]: Teaching neural nets to generalize the way humans would · 2020-07-22T17:42:30.135Z · LW · GW

This seems to be unreadably mis-formatted for me in Safari.

Comment by jameson-quinn on Swiss Political System: More than You ever Wanted to Know (II.) · 2020-07-22T17:37:14.645Z · LW · GW

Thank you.

Bit of trivia on Switzerland and voting methods: I've heard (but have not seen primary sources for) that in 1798 the briefly-independent city-state of Geneva used the median-based voting method we anachronistically know as "Bucklin" after its US-based reinventor. This was at the (posthumous) suggestion of the Marquis de Condorcet. Notably that suggestion was not to use what we now know as "Condorcet" voting, as that would have been logistically too complex for the time.

Also, if I'm not mistaken, Swiss municipal councils use a biproportional voting method; one of the only such methods in public use.

In other words, Switzerland, like Sweden, is a place for interesting voting methods.

Comment by jameson-quinn on [not ongoing] Thoughts on Proportional voting methods · 2020-07-22T17:23:15.027Z · LW · GW

V 0.5.3: Added "Bringing it all together" (85% complete)

Comment by jameson-quinn on [not ongoing] Thoughts on Proportional voting methods · 2020-07-22T00:50:17.783Z · LW · GW

Version 0.5.2: Added "Equalizing voter power" and "Measuring "Representation quality", separate from power" sections.

Comment by jameson-quinn on [not ongoing] Thoughts on Proportional voting methods · 2020-07-21T13:43:28.285Z · LW · GW

Done.

Comment by jameson-quinn on [not ongoing] Thoughts on Proportional voting methods · 2020-07-20T21:05:55.734Z · LW · GW

I rewrote the article to incorporate your contribution. I think you'd be interested to read what I added afterwards discussing this idea.

Comment by jameson-quinn on [not ongoing] Thoughts on Proportional voting methods · 2020-07-20T20:57:06.928Z · LW · GW

Ping 2.5 of 3; that is, unexpectedly, I got input that superseded the old ping 2 of 3, and I've now incorporated it.

Comment by jameson-quinn on [not ongoing] Thoughts on Proportional voting methods · 2020-07-20T20:55:48.321Z · LW · GW

Rewritten to reflect Thomas Sepulchre's contribution. Which is awesome, by the way.

Or in other words...

V 0.5.1: the main changes since the previous version 0.5.0 are a complete rewrite of the "Tentative Answer" section based on a helpful comment by a reader here, with further discussion of that solution; including the new Shorter "Solution" Statement subsection. I also added a sketch to visualize the loss I'm using.

Comment by jameson-quinn on The Credit Assignment Problem · 2020-07-20T18:57:24.567Z · LW · GW

(Comment rewritten from scratch after comment editor glitched.)

This article is not about what I expected from the title. I've been thinking about "retroactively allocating responsibility", which sounds a lot like "assigning credit", in the context of multi-winner voting methods: which voters get credit for ensuring a given candidate won? The problem here is that in most cases no individual voter could change their vote to have any impact whatsoever on the outcome; in ML terms, this is a form of "vanishing gradient". The solution I've arrived at was suggested to me here; essentially, it's imposing a (random) artificial order on the inputs, then using a model to reason about the "worst"-possible outcome after any subset of the inputs, and giving each of the inputs "credit" for the change in that "worst" outcome.

I'm not sure if this is relevant to the article as it stands, but it's certainly relevant to the title as it stands.

Comment by jameson-quinn on The Credit Assignment Problem · 2020-07-20T18:51:35.120Z · LW · GW

I just read this, and from the title, I thought it was going to be about something else. Essentially, the problem I've been thinking about recently is a form of "credit assignment" between agents in highly multi-agent scenarios; for instance, "which voter(s) are responsible for this election outcome". In ML terms, this is roughly the problem of the vanishing gradient; that is, in most cases, no individual voter could have changed the outcome by changing their vote. The solution I've arrived at was suggested here —

Comment by jameson-quinn on [not ongoing] Thoughts on Proportional voting methods · 2020-07-17T18:13:00.364Z · LW · GW

Nice. Thank you!!!

This corresponds to the Shapley-Shubik index. I had previously discounted this idea but after your comment I took another look and I think it's the right answer. So I'm sincerely grateful to you for this comment.
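For readers unfamiliar with the Shapley-Shubik index, here is a minimal sketch for the simplest setting, a weighted majority vote. It is not the multi-winner version developed in the article; the function name `shapley_shubik`, the strict-majority quota, and the example weights are assumptions for illustration. Each ordering of the voters credits the one voter whose arrival first pushes the running total to the quota, and the index is the share of orderings in which a voter is that pivot.

```python
from fractions import Fraction
from itertools import permutations

def shapley_shubik(weights):
    """Shapley-Shubik index for a weighted majority game.

    Assumes a strict-majority quota. Enumerates all orderings, so it is
    only practical for a handful of voters.
    """
    quota = sum(weights) // 2 + 1
    pivots = [0] * len(weights)
    orderings = 0
    for order in permutations(range(len(weights))):
        orderings += 1
        running = 0
        for voter in order:
            running += weights[voter]
            if running >= quota:
                # This voter turned a losing coalition into a winning one.
                pivots[voter] += 1
                break
    return [Fraction(p, orderings) for p in pivots]

# Hypothetical example: weights 3, 2, 1, 1 with quota 4 of 7.
print(shapley_shubik([3, 2, 1, 1]))
```

The indices always sum to 1, since exactly one voter is pivotal in each ordering. Averaging over random orderings instead of enumerating all of them gives a Monte Carlo approximation when there are too many voters to enumerate.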

Comment by jameson-quinn on What should be the topic of my LW mini-talk this Sunday (July 18th)? · 2020-07-17T09:27:42.845Z · LW · GW

I think I'm gonna do something about why high-dimensional is hard. I'll mention the voting context, but mostly discuss the problem in abstract.

Comment by jameson-quinn on Classification of AI alignment research: deconfusion, "good enough" non-superintelligent AI alignment, superintelligent AI alignment · 2020-07-16T16:53:34.559Z · LW · GW

This is very well-said, but I still want to dispute the possibility of "perfect alignment". In your clustering analogy: the very existence of clusters presupposes definitions of entities-that-correspond-to-points, dimensions-of-the-space-of-points, and measurements-of-given-points-in-given-dimensions. All of those definitions involve imperfect modeling assumptions and simplifications. Your analogy also assumes that a normal-mixture-model is capable of perfectly capturing reality; I'm aware that this is provably asymptotically true for an infinite-cluster Dirichlet process mixture, but we don't live in asymptopia and in reality it is effectively yet another strong assumption that holds at best weakly.

In other words, while I agree with (and appreciate your clear expression of) your main point that it's possible to have a well-defined category without being able to do perfect categorization, I dispute the idea that it is possible even in theory to have a perfectly-defined one.

Comment by jameson-quinn on [not ongoing] Thoughts on Proportional voting methods · 2020-07-16T14:07:47.052Z · LW · GW

Ping 2 of 3

Comment by jameson-quinn on [not ongoing] Thoughts on Proportional voting methods · 2020-07-16T14:05:46.176Z · LW · GW

OK, I think I'm "over the hump". That is, there's still plenty left to write and plenty of details left to work out, but I think that an ideal reader could begin to see the light at the end of the tunnel, could begin to grasp the overall scale and shape of what I'm trying to do here. I'm going to ping again.

Comment by jameson-quinn on [not ongoing] Thoughts on Proportional voting methods · 2020-07-15T14:45:33.503Z · LW · GW

Key to strange punctuation I use: "((...))" are technical notes that you can safely skip if you don't understand; the extra parens are to emphasize the skippability. "))(...)((" are digressions that break the narrative flow, but are actually important; the inner normal parens are to mark digression and the doubled outer reverse parens are to mark importance.

Also I wrote TLD̦R instead of TL;DR (too long; didn't read) because on my keyboard the semicolon is a dead key that combines with space or enter to be an ordinary semicolon but combines with letters to add accents. I think making TLD̦R one character shorter is in the spirit of TL;DR.

Comment by jameson-quinn on [not ongoing] Thoughts on Proportional voting methods · 2020-07-15T03:27:10.764Z · LW · GW

Beginning "a proposed solution" but it's slow going; haven't actually gotten to the solution part of the solution yet. Also wrote a bit more above on what "MSE" means in this context; it's less intuitive than you might think when you first read the acronym.

Comment by jameson-quinn on (answered: yes) Has anyone written up a consideration of Downs's "Paradox of Voting" from the perspective of MIRI-ish decision theories (UDT, FDT, or even just EDT)? · 2020-07-14T11:55:19.735Z · LW · GW

Consider a 2-option election, with 2N voters, each of whom has probability p of choosing the first option. If p is a fixed number, then as N goes to infinity, (chances of an exact tie times N) go to 0 if p isn't exactly .5, and to infinity if it is. Since the event that p is exactly .5 has measure 0, this model supports the paradox of voting (PoV).

But! If p itself is drawn from an ordinary continuous distribution with nonzero probability density d around .5, then (chances of an exact tie times N) go to ... I think it's just d/2. Maybe there's some correction factor that comes into play for bizarre distributions of p, but if we make the conventional assumption that it's beta-distributed, then d/2 is the answer.
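A quick numerical check of both models, as a sketch rather than a proof. It assumes voters are independent given p, and that in the "uncertain p" case p is Beta(a, b)-distributed, so the tie probability has the closed form C(2N, N) · B(N+a, N+b) / B(a, b). All function names are made up for this illustration.

```python
import math


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_central_binom(n):
    """Log of the central binomial coefficient C(2n, n)."""
    return math.lgamma(2 * n + 1) - 2 * math.lgamma(n + 1)


def tie_prob_fixed(n, p):
    """P(exact tie) among 2n voters when p is a known constant."""
    return math.exp(log_central_binom(n) + n * math.log(p) + n * math.log(1 - p))


def tie_prob_beta(n, a, b):
    """P(exact tie) among 2n voters when p ~ Beta(a, b)."""
    return math.exp(log_central_binom(n) + log_beta(n + a, n + b) - log_beta(a, b))


def beta_density_at_half(a, b):
    """Density d of Beta(a, b) at p = 0.5."""
    return math.exp((a + b - 2) * math.log(0.5) - log_beta(a, b))


if __name__ == "__main__":
    n = 100_000
    print("fixed p=0.51:", n * tie_prob_fixed(n, 0.51))  # shrinks toward 0
    print("fixed p=0.50:", n * tie_prob_fixed(n, 0.50))  # grows like sqrt(n/pi)
    for a, b in [(1, 1), (2, 2)]:
        d = beta_density_at_half(a, b)
        print(f"Beta({a},{b}): N*P(tie) =", n * tie_prob_beta(n, a, b), " d/2 =", d / 2)
```

For the uniform case (a = b = 1) the tie probability is exactly 1/(2N+1), so N times it is N/(2N+1), which tends to 1/2 = d/2.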

I think that the PoV literature is relying on the "fixed p" model. I think the "uncertain p" model is more realistic, but it's still worth engaging with "fixed p" and seeing the implications of those assumptions.

Comment by jameson-quinn on (answered: yes) Has anyone written up a consideration of Downs's "Paradox of Voting" from the perspective of MIRI-ish decision theories (UDT, FDT, or even just EDT)? · 2020-07-07T14:07:17.729Z · LW · GW

That book is from 2006. I understand that it deals with the Paradox of Voting, but does it have anything that would be directly relevant to considering it in light of "acausal decision theories"? As far as I know, such theories pretty much didn't exist back then.