Weighted Voting Delenda Est

post by Czynski (JacobKopczynski) · 2021-03-01T20:52:27.929Z · LW · GW · 20 comments

Unlike a number of other issues, this one I didn't call in advance, though in retrospect it is, if anything, much more obvious than the things I did call. Weighted voting on LW (at minimum its visibility, and preferably any effect it has on anything at all beyond the number displayed on a user's profile page) is a catastrophic failure in progress and must be destroyed.

I've said in the past that

The Hamming problem of group rationality [LW(p) · GW(p)], and possibly the Hamming problem of rationality generally, is how to preserve epistemic rationality under the inherent political pressures that existing in a group produces.

It is the Hamming problem because if it isn’t solved, everything else, including all the progress made on individual rationality, is doomed to become utterly worthless. We are not designed to be rational, and this is most harmful in group contexts, where the elephants in our brains take the most control from the riders and we have the least idea of what goals we are actually working towards.

And, closely connected but somewhat separable:

Most things we do [LW(p) · GW(p)] are status-motivated, even when we think we have a clear picture of what our motivations are and status is not included in that picture. Our picture of what the truth looks like is fundamentally warped by status in ways that are very hard to fully adjust for.

I also said, particularly for the latter, that "the moderation policies of new LessWrong double down on this". I stand by that, but I missed a bigger issue: the voting system, where higher karma grants a bigger vote, also doubles down on it. Big names are overrepresented on the front page, at the top of the comments section, and everywhere else you can discover new LW content. This was somewhat understandable when LW was working itself out of its doldrums and influential people were making an effort to put good content here, but if that was the driver, it would have gotten less noticeable over time, and instead it has gotten more blatant.
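The mechanism at issue, votes whose strength scales with the voter's karma, can be sketched in a few lines. The tier thresholds and weights below are hypothetical, chosen only to illustrate the dynamic the post objects to; they are not LessWrong's actual values.

```python
# Illustrative sketch of a karma-weighted voting scheme of the kind the
# post criticizes. The tiers and weights are hypothetical, not LessWrong's
# real parameters.

def vote_weight(karma: int) -> int:
    """Map a voter's karma to the strength of their vote (assumed tiers)."""
    tiers = [(25_000, 10), (10_000, 8), (5_000, 7), (1_000, 5),
             (500, 4), (100, 3), (10, 2)]
    for threshold, weight in tiers:
        if karma >= threshold:
            return weight
    return 1  # new accounts get a single point per vote

def post_score(votes: list[tuple[int, int]]) -> int:
    """Sum (karma, direction) votes, where direction is +1 or -1."""
    return sum(direction * vote_weight(karma) for karma, direction in votes)

# One established user's upvote outweighs three newer users' downvotes:
print(post_score([(12_000, +1), (50, -1), (50, -1), (50, -1)]))  # → 2
```

Under any scheme of this shape, what the front page displays is not a head count but a karma-weighted sum, which is exactly the social-proof amplifier the post is objecting to.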

Individuals can opt out of seeing these votes, but to a first approximation that's useless. Everybody knows that everyone can see the strength of votes, even if that isn't strictly true; social proof is stronger than abstract inference. Social proof is bad, very bad, at best something to be used like someone carrying around two slightly-subcritical uranium masses in their pocket, where a small slip could make them fit together and kick off a chain reaction. It is Dark Arts at their most insidious because, like the Unbreakable Vow, it's tightly integrated into society, extremely useful for some goals we endorse, and very difficult to stop using.

And we can each opt out of individually seeing this signal, but we can't opt out of the community seeing and displaying social proof and 'everybody knowing' that, if not them, everybody else is doing so. Even if, in point of fact, 90% of users are opting out of seeing vote totals*, each user 'knows' that everyone, or nearly everyone, other than themself, sees them, and knows that everyone else sees them, and knows that they know, etc., etc. Social proof is a very effective means of establishing common knowledge, which makes it extremely useful, except that it is virtually as effective at establishing inaccurate common knowledge as accurate.

The medium... is the mess.

It is not sufficient, for establishing common knowledge of a fact, that the fact be true. But it is also, crucially, not necessary. There's a party game called 'Hive Mind': you get a prompt, and write down six things that fit it. You get points based on how many other people wrote them down. If the prompt is "insect", one of the six should say "spider". You know a spider is not an insect; probably so does everyone else around the table. But everybody knows that a spider is a bug and a bug is an insect, so everybody knows "spider" belongs on the list. Never mind that it's false; a spider is not an insect but there's no common knowledge of that fact, and there is common knowledge of its opposite.

So, much like the spider: everybody knows that the big names are more correct than the little fish. Just about everyone can, and occasionally does, notice and remind themself that this is not inherently true, and that the big names should get more weight only because they have demonstrated the ability to generate good ideas in the past and thereby earned a big name. But there is no common knowledge of that, because the voting system is structured to promote common knowledge that the big names are always right. This is a catastrophe, even if the big names are almost-always right.

Possible solutions, in ascending order of estimated usefulness starting from the mildest:

I don't really expect any of this to be done. No one seems to be willing to treat small-group politics or status-corrupting instincts as important, and people who are respected much more than me are actively working in the opposite direction in the name of instrumental rationality. But it needs to be said.

* I do not believe this; I would guess about 5% of users opt out. I would be interested to learn the true number.


Comments sorted by top scores.

comment by Ben Pace (Benito) · 2021-03-02T02:36:34.609Z · LW(p) · GW(p)

I appreciate your thoughtful list of changes. But I don’t agree that weighted voting is bad. Overall I see the following things happening:

1. New writers coming to the site, writing excellent posts, getting reputation, and going on to positively shape the site’s culture (e.g. Alkjash, Wentworth, TurnTrout, Daniel Kokotajlo, Evan Hubinger, Lsusr, Alex Flint, Jameson Quinn, and many more).
2. Very small amounts of internal politicking.
3. Very small amounts of brigading from external groups or the dominant culture.

I agree that weighted karma is a strong bet on the current culture (or the culture at the time) being healthy and able to grow in good directions, and I think overall that bet seems to be going fairly well.

I don’t want most people on the internet to have an equal vote here, I want the people who’ve proven themselves to have more say.

I do think that people Goodhart on short-term rewards (e.g. karma, number of comments), and to build more aligned long-term incentives the team has organized the annual review (2018 [LW · GW], 2019 [LW · GW]), the results of which notably do not just track post karma, and we have published the top 40 posts in a professionally designed book set [? · GW].

I agree the theoretical case would be pretty compelling alone, and I agree that some part of you should always be terrified by group incentive mechanisms in your environment, but I don’t feel the theoretical case here is strong and I think ‘what I see when I look’ is a lot of thoughtful and insightful writing about interesting and important ideas.

I also am pretty scared of messing up group incentives and coordination mechanisms – for example there are many kinds of growth that I have explicitly avoided because I think they would overwhelm our current reward and credit-allocation systems.

Replies from: JacobKopczynski
comment by Czynski (JacobKopczynski) · 2021-03-03T06:18:59.586Z · LW(p) · GW(p)

What I see when I look is almost nothing of value which is less than five years old, and comment sections which have nothing at all of value and are complete wastes of time to read. And I see lackluster posts by well-known names getting tons of praise and little-to-no meaningful argument; the one which ultimately prompted this post was Anna's post about PR, which is poorly reasoned and doesn't seem meant to endure scrutiny.

The annual reviews are no exception; I've read partway through several and gave up because they were far lower in quality than random posts from personal blogs. Sample purely at random from Zvi's archives or the SSC archives and you'll get something better than the best of the annual review nine times out of ten. I get far more value out of an RSS subscription to a dozen 'one or two posts a year' blogs, like those of Nate Soares or Jessica Taylor, than the annual review has even approached.

You think that the bet on "the current culture (or the culture at the time) being healthy and being able to grow into good directions[...] seems to be going fairly well." I do not see any reason to believe this is going well. The culture has been bad since nuLW went up, and getting steadily worse; things were better when the site was old, Reddit-based, and mostly dead. The site maintainers are among the groups of people who are receiving the benefit of undeserved social proof, and this is among the significant factors responsible for this steady degradation. (Also half of the team are people who had a demonstrated history of getting this kind of dynamic badly wrong and doing the collective epistemic rationality equivalent of enthusiastically juggling subcritical uranium masses, so this outcome was predictable; I did in fact predict it.)

I also resent the characterization of my list as 'babble'; this imputes that it is a bunch of ideas thrown against the wall, rather than a considered list. It is a well-considered list, presented in long form because I don't expect any action to be taken on any of it but I know no action would be taken if all I presented was the things I thought would be sufficient.

Replies from: Benito, Vladimir_Nesov
comment by Ben Pace (Benito) · 2021-03-03T07:13:24.952Z · LW(p) · GW(p)

I have some inclination to engage you on particular posts, but I don’t know what would be compelling/cruxy for you.

I could say that Anna’s is in some ways like Wentworth’s recent post “Making Vaccine”, not in being notably successful/groundbreaking but in being a move in an important direction that should be rewarded — I think making your own vaccine is relatively easy and I am facepalming that I did not try to make a proper mRNA vaccine back in April. Similarly, I think Anna’s post is correctly taking a common naive consequentialist refrain that I think is very damaging and contrasting it with a virtue ethics perspective that I think is a healthy remedy, and I regularly see people failing to live up to virtues when faced with naive consequentialist reasoning. No, it was not especially rigorous or especially brilliantly communicated, the way Tim Urban explained how Neuralink works. But I think that there’s space for rigorous, worked-out research like Cartesian Frames or Radical Probabilism, as well as off-the-cuff ideas like the PR/Honor one.

Or I could talk about how valuable new ideas have been explained and built on and discussed. I could talk about Mesa-Optimizers and then follow-on work where people have done Explain Like I’m 12 [LW · GW]. I could talk about discussion building on the Simulacra Levels [? · GW] ideas that I think LW has helped move along (although I expect you’ll point out that many of the people writing on it like Benquo, Zvi, and Elizabeth have their own blogs). I could talk about the time Jameson Quinn spent a month or two writing up a research question he had in voting theory and a commenter came in and solved it [LW · GW]. I don’t know if you’ll find this stuff compelling, in each case I can imagine a reason to not be excited. But in my mind this is all contributions to our understanding of rationality and how minds work, and I think it’s pretty positive. And maybe you’ll agree and just think it’s nowhere near enough progress. And on that I might even agree with you, and would say I am planning something fairly more ambitious than this in the longer term.

The single best thing on LessWrong 2.0 so far I’d say is the Embedded Agency sequence [? · GW]. I think this was a lot of work done primarily by Scott and Abram (employed by MIRI), and I think LessWrong gave it a home and encouraged Abram to do it in the cartoon style (after the hit success of An Untrollable Mathematician), which I think improved it massively, making it more Feynman-esque in its attempt at simplicity. Had the LW audience not been around for it, it would probably have stayed in the long drought of editing far longer, been read far less, and been built on far less. I would call this a big deal and a major insight. That would be somewhat cruxy for me, and I’d be overall quite surprised if I came to think it didn’t represent philosophical progress in our understanding of rationality and LessWrong hadn’t helped it (and follow-up work like this [LW · GW] and this [LW · GW]) get written up well.

Added: You’re right, it wasn’t a babble, it was quite thoughtful. Edited.

Replies from: Vaniver
comment by Vaniver · 2021-03-03T22:43:57.731Z · LW(p) · GW(p)

I could talk about the time Jameson Quinn spent a month or two writing up an open research question in voting theory and a commenter came in and solved it [LW · GW].

I do think this is overselling it a little, given that the Shapley value already existed. [Like, 'open research question' feels to me like "the field didn't know how to do this", when it was more like "Jameson Quinn discounted the solution to his problem after knowing about it, and then reading a LW comment changed his mind about that."]

Replies from: Benito
comment by Ben Pace (Benito) · 2021-03-04T00:38:23.379Z · LW(p) · GW(p)

Thx, edited.

comment by Vladimir_Nesov · 2021-03-03T09:26:59.838Z · LW(p) · GW(p)

This is a much clearer statement of the problem you are pointing at than the post.

(I don't see how it's apparent that the voting system deserves significant blame for the overall low-standard-in-your-estimation of LW posts. A more apparent effect is probably bad-in-your-estimation posts getting heavily upvoted or winning in annual reviews, but it's less clear where to go from that observation.)

comment by shminux · 2021-03-01T22:12:54.426Z · LW(p) · GW(p)

I assume you have evidence of your conjectures that voting is a problem? If so, can you list a few high-quality posts with strangely low voting total by less-known users here?

Replies from: lsusr
comment by lsusr · 2021-03-03T09:57:19.908Z · LW(p) · GW(p)

Czynski claims to find "almost nothing of value which is less than five years old" [LW(p) · GW(p)]. It may be more efficient for Czynski to write high-quality posts instead.

comment by maia · 2021-03-01T20:58:14.107Z · LW(p) · GW(p)

It seems you think that people weighting how much to believe something based on whether the author is a Big Name is a bad thing. I get that. But I don't understand why you think weighted voting in particular makes this problem worse?

Replies from: JacobKopczynski
comment by Czynski (JacobKopczynski) · 2021-03-01T21:03:11.027Z · LW(p) · GW(p)

Fair point. The short version is that it expands the scope of 'what is endorsed by the respected' from just the things they say themselves to the things they indicate they endorse, and this expands the scope of what social proof is affecting.

It seems obvious in my head, but I should have elaborated (and may edit it in, actually, once I have a long version).

comment by G Gordon Worley III (gworley) · 2021-03-02T15:25:52.100Z · LW(p) · GW(p)

As I see it, voting on LessWrong isn't directly a measure of anything other than how much other readers on LessWrong chose to click the upvote and downvote buttons. The site uses it as a proxy to guess how likely other readers are to want to see a post, but it's pretty easy to use the site in a way that just ignores that (say, by using the All Posts page).

So why does it matter if the voting is bad at measuring things you consider important? Which is really just asking: why should we prefer to be better at measuring whatever it is you would like us to measure (it seems you want something like "higher score = more true")?

I mean that seriously. If you want the voting to be different, it's not enough to say you don't like it and that it's status-oriented (and most of this post reads to me like complaining that today's votes-to-status mapping doesn't match your desired mapping, which is just its own meta-status play). You've got to make a persuasive bid that the thing you want voting to track instead is better than whatever is happening today, and then downstream from that propose a mechanism (ideally by explaining how the gears of it will get you what you want). Instead you've given us a kind of outline with all the essential details missing or implied.

Replies from: JacobKopczynski
comment by Czynski (JacobKopczynski) · 2021-03-03T06:23:08.206Z · LW(p) · GW(p)

The point of LessWrong is to refine the art of rationality. All structure of the site should be pointed toward that goal. This structure points directly away from that goal.

Replies from: gjm
comment by gjm · 2021-03-03T14:31:17.426Z · LW(p) · GW(p)

I don't think you've established that "this structure points directly away from that goal".

Your thesis (if I'm understanding it right) is that weighted voting increases the role of "social proof", which will be bad to whatever extent (1) valuable outside perspectives are getting drowned by less-valuable[1] insider-approved posts and/or (2) the highest-karma users have systematically worse judgement than lower-karma users do. This trades off against (2') whatever tendency there may be for the highest-karma users to have better judgement. (Almost-equivalently: for people with better judgement to get higher karma.)

If 2' is a real thing (which it seems to me one should certainly expect), simply saying "social proof is a bad thing" isn't enough to show that weighted voting is bad. The badness of giving more weight to something akin to status could be outweighed by the goodness of improving the signal-to-noise ratio in estimates of post quality.

You haven't provided any evidence that either 1 or 2 is actually happening. You've said that you think the content here is of low quality, but that's not (directly) the relevant question; it could be that the content here is of low quality but weighted voting is actually helping the situation by keeping outright junk less prominent.

My guess is that if you're right about the quality being low, the primary reason isn't poor selection, or poor incentives, but simply that the people here aren't, in aggregate, sufficiently good at having and refining good ideas; and that the main effect of removing weighted voting would be to make the overall quality a bit worse. I could of course be wrong, but so far as I can tell my guess is a plausible one; do you have evidence that it's wrong?

[1] Less valuable in context. Outside stuff of slightly lower quality that provides greater diversity of opinions could be more valuable on net, for instance.

comment by John_Maxwell (John_Maxwell_IV) · 2021-03-03T21:15:47.185Z · LW(p) · GW(p)

For whatever it's worth, I believe I was the first to propose [LW · GW] weighted voting on LW, and I've come to agree with Czynski that this is a big downside. Not necessarily enough to outweigh the upsides, and probably insufficient to account for all the things Czynski dislikes about LW, but I'm embarrassed that I didn't foresee it as a potential problem. If I were starting a new forum today, I think I'd experiment with no voting at all [LW(p) · GW(p)] -- maybe try achieving quality control by having an application process for new users? Does anyone have thoughts about that?

Replies from: lsusr, Vaniver, JacobKopczynski
comment by lsusr · 2021-03-04T08:48:07.619Z · LW(p) · GW(p)

Personally, I am allergic to application processes. Especially opaque ones. I likely would have never joined this website if there was an application process for new users. I don't think the site is too crowded with bad content right now, though that's certainly a potential problem if more people choose to write posts. If lots more people flood this site with low quality posts then an alternative solution could be to just tighten the frontpage criteria.

For context: I was not part of Less Wrong 1.0. I have only known Less Wrong 2.0.

Replies from: John_Maxwell_IV
comment by John_Maxwell (John_Maxwell_IV) · 2021-03-04T08:55:32.001Z · LW(p) · GW(p)

Good to know! I was thinking the application process would be very transparent and non-demanding, but maybe it's better to ditch it altogether.

comment by Vaniver · 2021-03-03T22:50:29.515Z · LW(p) · GW(p)

IMO the thing voting is mostly useful for is sorting content, not users. You might imagine me writing twenty different things, and then only some of them making it in front of the eyes of most users, and this is done primarily through people upvoting and downvoting to say "I want to see more/less content like this", and then more/less people being shown that content.

Yes, this has first-mover problems and various other things, but so do things like 'recent discussion' (where the number of comments that are spawned by something determines its 'effective karma').

Now, in situations where all the users see all the things, I don't think you need this sort of thing--but I'm assuming LW-ish things are hoping to be larger than that scale.

Replies from: John_Maxwell_IV
comment by John_Maxwell (John_Maxwell_IV) · 2021-03-04T08:30:02.666Z · LW(p) · GW(p)

Makes sense, thanks.

comment by Czynski (JacobKopczynski) · 2021-03-24T23:36:26.365Z · LW(p) · GW(p)

The 'application process' used by Overcoming Bias back in the day, namely 'you have to send an email with your post and name', would probably be entirely sufficient. It screens out almost everyone, after all.

But in actuality, what I'd most favor would be everyone maintaining their own blog and the central repository being nothing but a blogroll. Maybe allow voting on the blogroll's ordering.