The secret of Wikipedia's success
post by aaronb50 · 2021-04-14T22:18:52.871Z · LW · GW · 10 comments
This is a link post for https://aaronbergman.substack.com/p/the-secret-of-wikipedias-success
Why its reputation for unreliability is Wikipedia's greatest asset
Wikipedia is everything the internet was supposed to be. Before social media became a battleground for foreign election meddling, corporate and political messaging wars, and algorithmic competition for our attention, it was going to be a means of sharing information across geographic, social, and political borders.
Over the last 30 years, this egalitarian-techno-optimist naiveté has given way to pragmatism about the social and economic forces governing the internet. But one beacon of innocent, brilliant functionality reminiscent of the old ethos remains: wikipedia.org.
If you’ve read a few of my other pieces, you may know that I love linking to Wikipedia. It is my default source for any event, item, or concept that I think readers might not know a ton about. The Capitol riots? Got that. A book summary? Yup. An author’s background? That too.
Is it any good?
Lots of people have commented on Wikipedia’s notable, and frankly surprising, reliability, breadth, and depth, all fueled by earnest online volunteers and a little over $100 million each year, or about 2% of American annual spending on ice cream. Although anyone can edit just about any Wikipedia page to say just about anything, several studies find that Wikipedia is very reliable, albeit not always comprehensive or very analytical.
In my opinion, Wikipedia is often the best source of information on topics with an intermediate amount of salience. That is, extremely popular (say, the views of two presidential candidates) or extremely banal (say, “rice”) topics of inquiry naturally attract so much attention that there are likely other excellent resources on the topic.
On the other end, Wikipedia pages on the very specific or obscure likely cannot attract enough attention to warrant confidence that something important has not been omitted. But for things in between - juggling, the city of Oakland, CA, or flip-flops - Wikipedia is often unambiguously the best single, easily accessible resource.
How is this possible?
There are several articles (like this one and this one) out there praising Wikipedia and speculating about the reasons for its success. Without a doubt, the organization makes good use of explicit rules and guidelines, community norms, and a social structure that awards status to people for high-quality contributions.
However, it strikes me that Wikipedia’s secret to reliability is something paradoxical that I’ve never seen explicitly addressed: its reputation for unreliability.
If I had a dollar for every time I was told in school that, no, I can’t cite Wikipedia as a source for my paper, I would be able to stop feeling guilty for ignoring this popup when I use the site.
To be clear, I agree that scholarly work should not cite Wikipedia itself as a source. Anyone can edit it, there’s no accountability, blah, blah, blah. Little did I realize, until a few days ago, that every teacher and librarian who pounded this into my head from kindergarten on was likely doing Wikipedia a huge service. Let me explain…
The cost of a good reputation
There is an interesting class of beliefs that become more true the fewer people hold them or the less strongly they are held. For example, the belief “my vote matters” will lead more people to go to the polls, which in turn means that every individual vote matters less.
Something similar is happening with Wikipedia. For sources of information widely regarded as reliable—not just by individuals, but by authorities and institutions like government, academia, and the media—there is a massive incentive to get them to say what you want them to.
Newspapers are the most obvious example. Politicians and government agencies strategically craft press releases, offer quotes, and make timed leaks to shape the media narrative around some event. Companies pay PR people big money to say something sympathetic that will be quoted in an article. Why? Because many people (at least, the people in power) think that newspapers are reliable. Or, perhaps more accurately, everyone thinks that everyone else thinks that newspapers are reliable.
The same thing holds true for scientific research, government reports, and more. Their reputation for reliability (whether deserved or otherwise) makes them a prime target for any institution with a story to sell. That’s why biomedical research is a big, juicy steak for the pharmaceutical industry, and nutrition research is an important lever of influence for agribusiness.
This isn’t an original point. I heard it most clearly expressed by Will Wilkinson on The Wright Show. But there is an obvious corollary I have not heard: sources that are not seen as reliable are much less tempting prey for narrative-shaping predators. For example, I solemnly swear that exactly zero corporations, politicians, or government agencies (that I know of) have tried to get me to say (or not say) something on this blog. The reason is pretty obvious: I’m just a random guy on the internet, and I am not regarded by society at large as a reliable source of information.
How Wikipedia hacked the system
Usually, to a rough first approximation, sources regarded as reliable are actually more reliable than those that are not. Yes, media bias, Manufacturing Consent, the replication crisis, etc. etc. I’m probably more skeptical of institutional authority than the median non-Trump supporter, precisely because of these concerns.
That said, “unreliable” sources generally are pretty unreliable. Consider a list of things not generally trusted by the Powers that Be:
- Blogs (by individuals, at least)
- Reddit posts
- Donald Trump
- Company press releases
- Undergrad research papers not published or endorsed by someone high-profile
That doesn’t mean these things are wrong. There are specific blogs (not my own), Reddit posts, and undergrad research papers (my own, obviously) that I trust more than an arbitrarily-selected Washington Post article or scientific paper. But I would not trust an arbitrary blog or Reddit post more than an arbitrary WaPo article or paper. This relationship holds for the first five items on that list.
But Wikipedia is different. I do trust a random Wikipedia article (use en.wikipedia.org/wiki/Special:Random to get one) more than a random newspaper article or scientific paper, although this wouldn’t hold if we limited the papers to, say, those in the hard sciences with >100 citations. (Footnote: Of course, this isn’t a “fair” comparison. Wikipedia articles can be about random shit completely unrelated to an ideology or the culture war, whereas news tends to be precisely the opposite. I suspect, though, that this would hold true even if I weighted each Wikipedia page by its number of views and then took a random sample.) If you think I’m crazy for writing this, read “What's Wrong with Social Science and How to Fix It” and check out several random Wikipedia pages and then get back to me.
So how did Wikipedia hack the system? How is it able to be so reliable? Because it alone, as far as I can tell, maintains a set of incentives and processes for generating content that simultaneously produces reliable content and is coded as “unreliable.”
Giving anyone online the ability to write anything they want on almost any article sounds like the kind of thing that would generate a cesspool of disinformation and nonsense. However, features peculiar to Wikipedia (in particular, the fact that every article has only one version, so “opposing sides” have to reach some sort of equilibrium instead of everyone just publishing their own version) do effectively incentivize internet randos to write things that are true instead of false.
These two opposing forces mean that Wikipedia has managed to do something analogous to landing a flipped coin on its side: generate reliable content without gaining an institutional reputation for reliability that would incentivize a massive effort to shape its content (though, unfortunately, the coin may be wobbling).
Not convinced? Imagine, for a minute, that every Wikipedia page was afforded the same degree of authority as a major newspaper, government agency, or even a “real” encyclopedia. All hell would break loose. “Trump Wins 2020 Election!” would have been splashed across every half-relevant page, and bots would be created to re-edit the pages each time they were corrected. Companies would spread rumors about rival firms. Investors would short a stock and then announce that the company had been falsifying its quarterly statements. Random people would “award” themselves a Nobel Prize.
Obviously, this isn’t a stable equilibrium. Within a few hours at most, people would realize that Wikipedia was (genuinely) unreliable and it would be downgraded to the epistemic equivalent of a flat-earther Facebook group.
So why aren’t other “unreliable” sources actually reliable, if they aren’t being attacked by anyone trying to shape a narrative? A bunch of reasons. As mentioned above, Wikipedia works as a “marketplace of ideas” because everyone ultimately has to collaborate to create a single page for a given topic. It’s similar to the idealized competitive market of economics, in which the collective self-interest of thousands of buyers and sellers yields an optimal single “market price.”
In the blogosphere or on Reddit, Democrats and Republicans, or Kantians and Utilitarians, or Yankees and Red Sox fans don’t have to do this type of adversarial collaboration, ultimately producing just a single product. Instead, every individual or group can have their own blog and write their own Reddit posts.
Even an individual earnestly seeking the truth cannot generally expect to beat the “marketplace of ideas,” in the same way that a hypothetical benevolent politburo couldn’t expect to price goods and services more efficiently than the free market, except under certain rare circumstances (say, large externalities but no ability to tax or subsidize, or extremely low trading volume).
I always thought the aphorism “power corrupts, and absolute power corrupts absolutely” was a little silly, but maybe I was missing the point. Perhaps it isn’t that the thing or person with power suddenly sheds its values, but rather that a thing only attracts external corrupting influences once it gains power.
The U.S. president’s power, for example, inevitably attracts media attention, lobbyists, and even personally-targeted commercials. No matter how principled he or she is, human psychology is fundamentally responsive to the stimuli it receives, and the president’s actions will inevitably be influenced to some degree. If my theory is generally right, I hope it serves to illustrate a general cautionary principle: be careful before empowering or elevating the salience of something good.
Wikipedia is good not in spite of but because of its limited power. Intellectual or social movements might be maximally productive when concentrated among a few earnest, hard-core supporters. Musicians might produce their best work before they feel the need to cater to a growing mass of fans. My blog, insofar as it is interesting at all, can attribute its ‘goodness’ in part to the fact that I have few ‘corrupting influences,’ and make $0 in revenue.
This doesn’t mean everything good should be zealously guarded against acquiring power and influence; Wikipedia wouldn’t be so awesome if nobody knew about it, after all. Rather, influence is simultaneously symbiotic and parasitic, with a net positive impact. And Wikipedia, it seems to me, is at the top of the curve.