The value of the online hive mind

post by VipulNaik · 2014-04-09T16:52:38.818Z · LW · GW · Legacy · 21 comments

  What have I used the hive mind for?
  How good are people at using these resources, and what advice is being offered to them?
  Some pre-emptive remarks
21 comments

The phrase "wisdom of crowds" was made popular in James Surowiecki's eponymous book. The idea of aggregating a diverse range of opinions has been proposed in different forms, ranging from polling to prediction markets. Empirically, prediction markets perform somewhat better than crude polling, but just the act of aggregation itself improves significantly over not aggregating. Even crude aggregation mechanisms can be beneficial.

Aggregation over larger numbers of people can be beneficial even if most people aren't experts. However, it's important to note that aggregation is beneficial only if enough people have at least a rudimentary knowledge of the subject, and those who don't know anything are either unbiased or have biases that cancel out (see The Myth of the Rational Voter for more). Aggregation with a certain level of filtering to sieve out the signal from the noise can overcome the problem of ignorance or even bias, as long as there is enough signal on the whole (i.e., enough people in absolute terms who know what they're talking about).
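A minimal sketch of the point about bias, under assumptions that are not in the post: each person's guess is modeled as the truth plus a bias shared by everyone plus independent Gaussian noise, and the crowd's answer is the plain average. The function name and all numbers are hypothetical, chosen only for illustration.

```python
import random
import statistics


def crowd_error(n_people, shared_bias, noise_sd, truth=100.0, trials=500, seed=0):
    """Average absolute error of the crowd's mean guess over many trials.

    Toy model (an assumption, not a claim about real crowds):
    guess = truth + shared_bias + independent Gaussian noise.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        guesses = [truth + shared_bias + rng.gauss(0.0, noise_sd)
                   for _ in range(n_people)]
        errors.append(abs(statistics.fmean(guesses) - truth))
    return statistics.fmean(errors)


# Independent noise shrinks as the crowd grows; a shared bias does not.
print("1 person, no shared bias:   ", crowd_error(1, 0.0, 20.0))
print("100 people, no shared bias: ", crowd_error(100, 0.0, 20.0))
print("100 people, shared bias +10:", crowd_error(100, 10.0, 20.0))
```

In this toy model, averaging removes most of the individual noise, but an error that everyone shares survives aggregation no matter how many people are asked.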

When you're stuck with a question, whether personal, professional, or academic, it is often effective to turn to the hive mind for suggestions. Not that the hive mind can, or should, make your decisions for you. But it can offer valuable input that would otherwise take you a lot of time to collect.

In the past, few people had access to the wisdom of the hive mind when it came to their own questions. Now, however, we have the Internet, and Internet research is a powerful way that people can access the hive mind for far more specific questions than they could have dreamed of before. There are many different types of online hive mind you could access:

(1) The Google/Internet hive mind: Search what the Internet as a whole has to say, using Google as your discovery tool. There's a lot of wisdom out there. The advantage is that you can access a huge corpus of knowledge. The disadvantage is that you cannot ask your own questions and the knowledge isn't arranged in a question-and-answer format.
(2) The Wikipedia hive mind: Avail of an "encyclopedia" that's been written through the collaborative efforts of hundreds of thousands of people, and is regularly updated, to fill in the gaps in your knowledge and make an informed decision.
(3) The Quora/LessWrong/StackExchange/Reddit/discussion forum/blogosphere hive mind: Avail of stuff that's explicitly designed for intellectual consumption, including stuff in the question-answer format. Also, ask your own questions and get answers (though not necessarily quickly).
(4) The Facebook(/Twitter?) hive mind: Ask quick questions and get quick answers from a select group of friends.

Of these, (1) and (2) don't rely much on your existing network of friends or followers. As long as your research skills are good, you can turn up the same material regardless of how good your friends and followers are at research. (3) involves a mix of research skills and the quality and size of your network of friends and followers. (4) is very heavily focused on the set of friends and followers you've accumulated.

Is the hive mind actually helpful? To a large extent, this depends on how much the people involved know and/or have interesting things to say about the questions you pose to them. The narrower and more specialized your domain of inquiry, the more likely it is that the hive mind will not be any use. And for the Facebook hive mind (type (4) in my list), you need to have friends who have knowledge of the subject, check Facebook regularly, and are willing to comment. I now turn to my own experience.

What have I used the hive mind for?

The Google and Wikipedia hive minds are the ones I've used the longest, and they're both indispensable to my process of discovery and research for the vast majority of subjects I try to learn about.

I've used the Quora hive mind since I joined the site in June 2011, though my level of use has varied considerably.

For other things that I've been interested in, either professionally or as a hobby, I've found the Facebook hive mind useful. This was not the case when I joined Facebook. It really started happening around late December 2012 and early January 2013, by which time I had accumulated a sufficiently large collection of Facebook friends who were (together) sufficiently widely knowledgeable and spent a sufficient amount of time in total on Facebook. By "sufficient" here I mean "sufficient to make sure that enough of my posts attracted valuable comment feedback that I thought posting passed a cost-benefit analysis." I've posted about a varied range of topics, from mathematics teaching to education in general to technological progress and social and political issues, and I often learn things from the comments that I would probably either not have discovered by myself or have taken a much longer time to discover.

However, these general-purpose hive minds are often not of much use for specific technical topics. I've also benefited from access to hive minds associated with more niche communities, some of them on Facebook or Quora, and others on their own websites or blogs. Back when I was working on my Ph.D. in group theory, the Facebook hive mind and Quora hive mind were of little use for my research: fewer than a dozen of my friends knew enough group theory, and those who did didn't check Facebook often enough. For the most part, I had to figure things out by myself, ask my advisor, or handpick individuals who would be likely to know. But I did have access to one hive mind, namely MathOverflow, that I used productively to ask many questions, one of which turned out to fill in an important gap in my thesis.

How good are people at using these resources, and what advice is being offered to them?

Let's look at the four types of hive minds mentioned and how far people are from making use of them:

(1) The Google/Internet hive mind: There is a fair amount of research as well as commentary on how people use search engines for school work and other research. For instance, here's a slideshare presentation from October 2010 (by these people) describing how people's web research skills fall short and how they can be fixed. I'm not very confident of the quality of the advice offered, or of its continued validity: much of it was written before some of the recent improvements in Google Search such as Google Instant and the knowledge graph (see this timeline of Google Search), and a lot of the advice doesn't jibe with my personal experience. But at any rate it's a somewhat well-understood problem where people are actually trying to advise others on how to do it well rather than debating whether to do it at all.
(2) The Wikipedia hive mind: Effective use of Wikipedia has received a fair amount of attention. Wikipedia has its own page on Wikipedia research skills, including some cautionary notes about the particular issues with citing and using Wikipedia because of its role as an often-unvetted tertiary source. There are also other articles and videos on the subject.
(3) The Quora/LessWrong/StackExchange/Reddit/discussion forum/blogosphere hive mind: These are relatively new, and "best practices" for these haven't percolated to the people who write advice on study habits or general research skills. A bigger problem is that a lot of people haven't even heard of relevant websites like Quora, LessWrong, Stack Exchange, or the appropriate niche communities for them. So there's some clear low-hanging fruit just in making them aware of the appropriate resources. That said, there are a few articles on effective use of Quora in particular, but these are largely in niche websites or the technology press rather than in stuff aimed at the general public. As described here, my experience with Cognito Mentoring advisees suggests that recommending that people join Quora is among the low-hanging fruit in terms of value we have been able to provide advisees.
(4) The Facebook(/Twitter?) hive mind: The problem here might be most severe, even though a fairly large number of people use Facebook and a reasonable number of people use Twitter. A fair number of people use Facebook as a hive mind for personal problems (such as opinions on a restaurant) but it's not used for academic or research-related questions as much as it could be. Moreover, its use in this respect is generally not encouraged and not considered high-status. I'll talk more about this in a subsequent post.

I'm curious to learn about the personal experiences of LessWrong users on tapping into the online hive minds of various sorts, including categories that I've missed. In addition, views on how effectively most other people tap into the various online hive minds would also be much appreciated.

Some pre-emptive remarks

Pre-empting some criticisms I expect:

I don't mean to imply that the only or even the primary purpose of websites such as Facebook is to answer one's questions. Clearly, there are many other ways people derive value from the websites. This post is focused on the hive mind component of the value, and does not assert that that is or should be the most important reason for people to use Facebook.
The privacy issues surrounding websites such as Facebook and Quora are taken quite seriously by a number of people. I'm not trying to evaluate here whether the benefits of using these websites exceed the (perceived) privacy costs of doing so. I'm simply discussing one item that (I think) would go on the benefits side of the ledger.

PS: Chris Hendrix comments on Facebook:

It seems to me that there's a logic of how to develop your various hivemind levels here. If you attempt to simply start with a FB group as your wisdom of the crowds you may not have enough knowledge to be able to determine whether or not your crowd selection is systematically biased in ways that don't correlate with finding truth. Instead I think there's a logic to building up each level of hivemind usage from the previous. From Google searches you will often be directed to Wikipedia. Wikipedia can then direct you to effective discussion sites (you hear about a discussion site, you check wikipedia to see if there are any criticisms of obvious failure modes). Finally, once you've found effective discussion sites, you've been learning what are useful and what are non-useful contributions. Since these sites will include a number of effective contributors you can pick and choose among this group to find people you can make into good facebook friends.

I think done well, this can be a supplement (or perhaps even an alternative) to professional and academic networking for answering complex and non-obvious questions (the less complex and obvious ones are simply answered at the Google or Wikipedia levels normally).

Cross-posted on Quora here and on the Cognito Mentoring blog here.

21 comments

Comments sorted by top scores.

comment by Stefan_Schubert · 2014-04-10T14:19:48.417Z · LW(p) · GW(p)

In the beginning you write about aggregation of judgments, but later you turn to something else:

When you're stuck with a question, whether personal, professional, or academic, it is often effective to turn to the hive mind for suggestions.

The hive mind can be useful for both of these things, but they are distinct and should be kept apart. The first idea is the one most famously discussed by Condorcet, namely that if a number of voters are on average at least slightly more likely to be right than wrong, then the probability that the majority is right goes to 1 as the number of voters goes to infinity. Generally speaking, if groups are not systematically biased, then their aggregated judgment tends to be (if the aggregation procedure is reasonable) better than the vast majority of the individual voters' judgments.
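A minimal sketch of Condorcet's result, assuming the simplest textbook setting: an odd number of voters who vote independently and who each have the same probability of being right. The function name and the 0.55 figure are illustrative assumptions, not taken from the discussion.

```python
from math import comb


def majority_correct_probability(n_voters, p_correct):
    """Probability that a strict majority of voters picks the right answer.

    Assumes independent voters with identical accuracy p_correct and an
    odd n_voters, so that ties cannot occur.
    """
    if n_voters < 1 or n_voters % 2 == 0:
        raise ValueError("n_voters must be a positive odd number")
    needed = n_voters // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n_voters, k) * p_correct**k * (1 - p_correct) ** (n_voters - k)
        for k in range(needed, n_voters + 1)
    )


# Voters who are only slightly better than chance still add up.
for n in (1, 11, 101):
    print(n, majority_correct_probability(n, 0.55))
```

The same formula cuts the other way: if the typical voter is slightly more likely to be wrong than right, the majority becomes more reliably wrong as the group grows, which is why the condition about systematic bias matters.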

The other use of the hive mind is rather that as you ask more people, then the probability that at least one has precisely the kind of knowledge you need increases. In this case, the fact that some people are completely ignorant doesn't hurt you (though it doesn't help either), since all you're looking for is this one person who has the kind of knowledge that you need.
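This second use can be put in numbers with a small sketch, assuming (purely for illustration) that each person you ask independently has the same small chance q of holding the knowledge you need. The function names and the 1% and 95% figures are hypothetical.

```python
import math


def prob_at_least_one_expert(n_people, q):
    """Chance that at least one of n_people has the needed knowledge.

    Assumes each person independently has it with probability q.
    """
    return 1 - (1 - q) ** n_people


def people_needed(q, target):
    """Smallest audience size whose chance of containing an expert is >= target."""
    if not (0 < q < 1 and 0 < target < 1):
        raise ValueError("q and target must be strictly between 0 and 1")
    return math.ceil(math.log(1 - target) / math.log(1 - q))


# If 1 in 100 people knows the answer, how large an audience gives a 95% chance?
n = people_needed(0.01, 0.95)
print(n, prob_at_least_one_expert(n, 0.01))
```

Note that ignorant respondents only enter through the exponent: they never lower the probability, they just fail to raise it.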

I'm working on this at the moment. My suggestion is that people have not paid sufficient attention to the set-up. In order to make the best use of "the hive mind", you need to give people incentives to give sincere votes, and weigh the more reliable voters more heavily. Prediction markets are probably the system that does this best at the moment, but they are impractical in many situations.
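One way to make "weigh the more reliable voters more heavily" concrete is the sketch below. It assumes a yes/no question, independent voters, and known accuracies, and it uses log-odds weights, which is a standard textbook choice under those assumptions rather than anything proposed in this thread. All names and numbers are hypothetical.

```python
import math


def weighted_vote(votes):
    """Return True if the accuracy-weighted vote favors 'yes'.

    votes is a list of (choice, accuracy) pairs, where choice is a bool and
    accuracy is that voter's assumed probability of being right (0 < a < 1).
    Each voter is weighted by the log-odds of their accuracy.
    """
    score = 0.0
    for choice, accuracy in votes:
        weight = math.log(accuracy / (1 - accuracy))
        score += weight if choice else -weight
    return score > 0


def simple_majority(votes):
    """Return True if strictly more voters chose 'yes' than 'no'."""
    yes = sum(1 for choice, _ in votes if choice)
    return yes > len(votes) - yes


# One highly reliable voter says yes; three mediocre voters say no.
votes = [(True, 0.9), (False, 0.6), (False, 0.6), (False, 0.6)]
print("weighted:", weighted_vote(votes))
print("majority:", simple_majority(votes))
```

The hard part in practice is the input this sketch takes for granted: estimating each voter's reliability, and keeping votes sincere, is exactly what the set-up has to solve.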

Replies from: army1987, VipulNaik

↑ comment by A1987dM (army1987) · 2014-04-13T09:10:44.195Z · LW(p) · GW(p)

if groups are not systematically biased

That's a big if.

↑ comment by VipulNaik · 2014-04-10T14:35:00.454Z · LW(p) · GW(p)

Thanks! This distinction is important, and I was somewhat careless in not clearly highlighting the distinction. I will edit the post later to clarify this.

comment by ChristianKl · 2014-04-09T20:48:17.063Z · LW(p) · GW(p)

If you advise high school students to go on Quora, I would add a recommendation to think about what the student writes. Quora is very public, and depending on the career that the student wants to pursue later, there might be content that a student shouldn't post on Quora.

The fact that Quora makes sure that people who are linked with you and are also on Quora get to know what you write can make it more relevant for your life than a forum like LessWrong.

Replies from: VipulNaik

↑ comment by VipulNaik · 2014-04-10T01:31:02.864Z · LW(p) · GW(p)

Thanks, this is an important point.

We have written up some general recommendations on maintaining one's online presence here. Do you have thoughts on the advice presented there?

Replies from: ChristianKl

↑ comment by ChristianKl · 2014-04-10T10:38:35.870Z · LW(p) · GW(p)

When it comes to picking handles or nicknames, I would add that numbers should not appear in them. John123 does not look professional. In my days of forum moderation, numbers in a nickname correlated with the person writing a post being a spammer.

There are plenty of cases where it's reasonable to say: I have a friend who did X and then Y happened. Technically that violates the standard of not giving out information about real-life friends.

In those cases it's important to ask yourself two questions: (A) Would the friend be okay with me sharing this story? (B) Is all the information I share relevant for the point I'm making? Can I share less information about the identity of the friend and still keep the story intact?

At my last NLP seminar I interacted with a journalist. The journalist afterwards described me as someone who studies natural sciences. That's broad enough that I'm not identifiable. If she had written "someone who studies bioinformatics", that would have made it easier to identify me.

When registering for an anonymous forum account, don't use an email address that's linked to your real identity. When trying to find out who someone was in forum moderation, I frequently just put the email address into the Facebook search and got the real identity of people who thought they were fairly anonymous.

If anonymity is important to you, don't share the city in which you are living.

I'm personally someone who purposefully made a decision to put out information that might make a conservative employer reject me. I'm living in a mental sphere where I don't really think about whether I might offend someone.

I think most high schoolers have a poor idea of what constitutes online behavior that might offend a person who makes hiring decisions. It might be worthwhile to interview a few people who make hiring decisions at more conservative companies to get their standards and explain those standards in a practical way to high school students.

comment by CronoDAS · 2014-04-11T00:24:46.634Z · LW(p) · GW(p)

Has anyone at MIRI tried bringing up any FAI-related math problems on MathOverflow or a similar site?

Replies from: Punoxysm

↑ comment by Punoxysm · 2014-04-11T03:43:33.595Z · LW(p) · GW(p)

CSTheory.stackexchange is probably also relevant for anything in the vein of Solomonoff induction.

comment by John_Maxwell (John_Maxwell_IV) · 2014-05-11T23:55:16.493Z · LW(p) · GW(p)

If your question is specific enough, searching scholar.google.com will sometimes let you read the abstract of a study or meta-analysis that purports to answer it.

comment by Benvie · 2014-04-15T20:37:58.294Z · LW(p) · GW(p)

If we look at using the internet as a method of offloading both memory and processing from our brain, we see it as a tool that allows our brain to reach higher-level abstractions and better reasoned and tested hypotheses. By deferring to the hive mind, we remove the need to actually sift through information ourselves. Once we trust our own ability to filter the signal from the hive mind's noise, then we can defer to it more and more to make decisions for us, freeing our minds to make higher-level decisions.

The more filtered you get, the better it works. I rarely read articles nowadays. I read the comments first and only read the article if multiple comments say "wow must read" and if the subject matter is something I really want more depth in. Even better, don't read many comments. Using sites that order comments by popularity, you can get the absolute most distilled wisdom by reading a few of the highest rated comments and moving on.

If you have developed your own ability to efficiently meta-aggregate information from all these different types of information aggregators, and you can resist the drive toward curiosity and learning more in depth, then you can essentially turn yourself into a Competent Man/Woman solely using your intuitive information aggregation capabilities.

comment by HungryHobo · 2014-04-11T11:20:52.845Z · LW(p) · GW(p)

Combining them can be invaluable.

For the first 2, not knowing the correct terms for a niche area can slow your progress or lead you to the least informed articles, because you're likely to use the same terms as an amateur journalist or writer talking about the subject.

For 3, simply asking a few simple questions on forums can significantly augment other searches by teaching you the correct terms and starting you off in the correct node clusters.

comment by Lumifer · 2014-04-09T21:02:30.577Z · LW(p) · GW(p)

Sorry, ignore.

comment by ChristianKl · 2014-04-09T19:58:55.405Z · LW(p) · GW(p)

A fair number of people use Facebook as a hive mind for personal problems (such as opinions on a restaurant) but it's not used for academic or research-related questions as much as it could be. Moreover, its use in this respect is generally not encouraged and not considered high-status. I'll talk more about this in a subsequent post.

I don't see how it's not high status to ask research questions on Facebook. If you want, you can use groups to filter your audience well enough that people outside of the subject won't be reached.

I would value a person more when I find a bunch of research discussions on his Facebook feed than when I find lolcats.

Replies from: VipulNaik

↑ comment by VipulNaik · 2014-04-09T21:30:56.227Z · LW(p) · GW(p)

I don't think posting research questions to Facebook is low-status, as much as it's something that people don't consider doing, and part of the reason is that Facebook is associated with low-status entertainment-type stuff, so people miss out on the possibility of using the Facebook hive mind effectively for "research" type purposes.

Replies from: ChristianKl, Stefan_Schubert

↑ comment by ChristianKl · 2014-04-10T14:25:04.935Z · LW(p) · GW(p)

One thing with Facebook that would be interesting to know is whether, if you tag a message in a way where it's shown to 10 people instead of 1000, those ten people are more likely to see the message.

I see you cc people in a comment to make them aware of the question. I might copy that approach the next time I have a good question for that format.

Replies from: VipulNaik

↑ comment by VipulNaik · 2014-04-10T14:36:32.792Z · LW(p) · GW(p)

Everybody who gets tagged does receive a notification about the Facebook status, when they next log in to Facebook. So they are highly likely to see it (if they check their notifications).

Moreover, if a few people are tagged, and if some of them comment, that makes it more likely that Facebook will show the post to other people as well (because posts that attract more likes and comments do better on Facebook's news feed algorithms).

↑ comment by Stefan_Schubert · 2014-04-10T15:12:02.908Z · LW(p) · GW(p)

I think it is in certain circles seen as low-status, which is a shame, really. It's an interesting point you bring up, not least since it connects to the issue of how deeply social an enterprise science should be. Even though the number of authors of scientific papers has gone up, etc., science is still not a deeply collaborative effort. People don't publish data that would be useful for people to get hold of, they fudge their inferences to suit their interests, they defend unreasonable hypotheses simply because they themselves came up with them, etc.

This happens in large part because scientists aren't sufficiently incentivized to be more collaborative. The question is how to do that. I think that one way could be this.

1) Let a community of researchers discuss a certain question openly.

2) Once the group, or a sub-group of it, has reached a consensus on what the correct solution to the question is, you assign someone to write the whole stuff up.

3) A community of peers assigns credit to the people contributing to the discussion (and, possibly, to the person who writes it up).

I'm aware that this is not without problems (e.g. how are you going to allocate the credit without people feeling unfairly treated?). Nevertheless I think it's an idea worth exploring.

Replies from: asr

↑ comment by asr · 2014-04-10T19:05:09.428Z · LW(p) · GW(p)

This does sometimes happen. A recent very impressive example was the collective effort to improve the bound on gaps between primes.

Replies from: Stefan_Schubert

↑ comment by Stefan_Schubert · 2014-04-11T08:30:03.696Z · LW(p) · GW(p)

Thanks! Someone tipped me off about that before, in fact, but I had half forgotten about it.

However, I think this could be done outside of mathematics, too. Also, I think that one could debate how people are to be given credit for their work on the collaborative project. Polymath doesn't give explicit credit to individual contributors, but in my system, you would. The details of this are very important, since you need to give time-pressed researchers incentives to participate in a system like this.

Replies from: ChristianKl

↑ comment by ChristianKl · 2014-04-11T09:41:30.188Z · LW(p) · GW(p)

Time-pressured researchers do things like reviewing papers for journals without anyone paying them to do so. Being a reviewer means having power over a peer. It's a kind of social status.

Replies from: itaibn0

↑ comment by itaibn0 · 2014-04-11T20:51:41.966Z · LW(p) · GW(p)

I don't think researchers review papers because they want to have power over their peers. I think they do it because it is a community norm and beneficial to their community. This is similar to why people avoid littering. Status games may still enter into it because how often someone litters or reviews papers affects their reputation.
