2022 Less Wrong Census/Survey: Request for Comments

post by Screwtape · 2023-01-25T20:57:30.853Z · LW · GW · 29 comments

The first draft of the 2022 Less Wrong Census is complete, and viewable here. Please take a look at it! Please do not actually take it yet!

I have included all the standard questions such as age, country, and probability estimates on whether we live in a simulation. I have also included many of the bonus questions from previous years, and have in fact compiled a list of the questions from every previous survey I could freely find. I intend to take the Do Not Take This Survey Yet warning off and make another post asking people to take it on Monday the 30th of January.

Yes, this is a bit later than usual. I'm comfortable counting people answering this in late January or early February as part of the previous year.

Three things I would like from you:

First, please look over this draft. Let me know if any questions are poorly phrased or pointless. Tell me if there are any dumb typos or copy and paste errors. 

Second, if you work for Less Wrong or CFAR, I didn't add any of your questions. (Ben Pace in particular had a very lovely document full of possible questions, all of which I read and very few of which I used.) If you want me to add some questions, let me know and I'm happy to add entire sections for you! If you don't work for Less Wrong or CFAR, you can also suggest additions and I'll add as many as make sense to the bonus section.

Third, the data links for many previous surveys are now dead links. If you have a copy of the public data for the 2009, 2011, 2012, 2013, 2017, or 2020 Less Wrong surveys, please pass them on to me (or just publish a working link in the comments below). As far as I know no Less Wrong Census was run in 2010, 2015, 2018, 2019, or 2021. Yes, the only Less Wrong data sets I've recovered are 2014 and 2016. If I can get copies of more of them, I intend to bundle them together into one .zip file that can be backed up in lots of places.

Edit: Census post is up here. [LW · GW]

29 comments

Comments sorted by top scores.

comment by gjm · 2023-01-25T21:11:10.709Z · LW(p) · GW(p)

The "Profession" question's text presupposes that respondents work in some "academic field". I think deleting the word "academic" would be an improvement, since it would e.g. avoid the risk that someone who works as a farmer thinks "oh, this question is not for the likes of me" rather than just putting in "Farming" or "Agriculture" in the write-in slot at the end (which I assume would be more helpful).

Replies from: Screwtape
comment by Screwtape · 2023-01-25T21:16:57.537Z · LW(p) · GW(p)

I agree and it's been updated. I read it as "what academic field does your work fall under?" but that does make the computer trifecta a little odd. This is one of the ones I got from previous surveys, so we have a little risk of throwing answers off by changing the wording, but I think it's safe enough.

comment by gjm · 2023-01-25T21:09:06.895Z · LW(p) · GW(p)

The "Profession" question says "If more than one, please choose most important" ... but then, unlike most other questions, has non-mutually-exclusive checkboxes rather than mutually-exclusive radio buttons. I think it should either enforce "choose only one" or else have text that embraces the option of choosing more than one.

Replies from: Screwtape
comment by Screwtape · 2023-01-25T21:17:56.450Z · LW(p) · GW(p)

I agree and it's been updated. That was purely me missing a dropdown when adding a lot of questions to google forms one after another.

comment by gjm · 2023-01-25T21:47:09.090Z · LW(p) · GW(p)

There are a few double-spaces, which are completely harmless but look kinda ugly. I see three in the "human biodiversity" question and one in the "Great Stagnation" question. (Those are the only questions where I noticed them.)

Replies from: Screwtape
comment by Screwtape · 2023-01-25T22:20:44.778Z · LW(p) · GW(p)

I have fixed those two questions. I will fix any more that get pointed out or that I happen to spot.

comment by gjm · 2023-01-25T21:43:18.993Z · LW(p) · GW(p)

Perhaps the "SRS" question should have finer-grained answers. "Every day", "Frequently", "Occasionally", "Never" or something. Or perhaps the question should have a word like "regularly" or "at least once a week" in it. (Every couple of years, I remember the existence of Anki and make a desultory effort to learn some stuff using it, but the habit never really sticks. You probably want a "no" answer from me, but strictly speaking my answer to the question as written is "yes".)

Replies from: Screwtape
comment by Screwtape · 2023-01-25T22:23:06.634Z · LW(p) · GW(p)

Updated with finer grained answers, which should be easy to convert into the old "Yes, No."

comment by gjm · 2023-01-25T21:39:58.202Z · LW(p) · GW(p)

There is a thing saying "Section 9: More complicated probability questions", after which there is only one probability question, and actually even that one is not exactly a probability question; after that are a bunch of not-probability questions. I suspect that it should be something more like "Section 9: Other questions", and if so then probably either the words that follow should be tweaked or else they should be moved to the previous section heading.

Replies from: Screwtape
comment by Screwtape · 2023-01-25T21:46:56.491Z · LW(p) · GW(p)

I trust the new heading, "Section 9: Other Traditional Less Wrong Census Questions, Which Used To Be Called More Complicated Probability Questions," is perfectly satisfactory to all concerned.

Thank you for the second set of eyes by the way!

Replies from: gjm
comment by gjm · 2023-01-25T21:55:26.138Z · LW(p) · GW(p)

You're welcome. My apologies if you would have preferred one long comment with many observations in it, rather than one comment per thing I spotted. :-) [EDITED to add:] It was a deliberate decision, on the rather dubious grounds that maybe some people would want to be able to upvote/downvote things on a per-comment basis, though in fact I doubt there's much need for voting in this thread unless something is super-stupid or super-insightful, which I'm pretty sure none of my comments here are.

Replies from: Screwtape
comment by Screwtape · 2023-01-25T22:10:09.680Z · LW(p) · GW(p)

One observation per comment is actually preferable to me! It means I can reply to each observation to argue with it, state that it's been done, or say that I've considered it and decided to keep it, forming a nice little to-do list for me.

comment by gjm · 2023-01-25T21:37:35.514Z · LW(p) · GW(p)

I don't know how common it is for answers to questions in the "Probabilities" section to be misleading because answerers give probabilities rather than percentages despite the nice clear instructions. If it's not uncommon, I wonder whether it would be an improvement to make the instructions say explicitly that one should include the percentage sign. Presumably someone who does so can reasonably confidently be assumed to be giving a percentage rather than a probability, and someone who doesn't is at least known not to have read the instructions carefully, so if nothing else those responses could be discarded.

BUT maybe that would throw off whatever automated analysis Google Forms does? My hazy recollection is that whatever automated analysis Google Forms does is pretty terrible anyway, but I'm not very sure about that. (... I just checked the ACX2022 survey results and yes, it's terrible; it just lists all the responses and doesn't turn them into summary statistics or histograms or anything. So including percent signs doesn't seem like it would do any harm to speak of.)

Replies from: Screwtape
comment by Screwtape · 2023-01-25T21:55:10.050Z · LW(p) · GW(p)

I have taken the detailed description format (copied more or less verbatim from previous censuses) as a warning from those who have gone before me. Besides, while Google Forms doesn't really know how to handle the percents in either format, I know how to sort numbers into buckets in Google Sheets easily so I could find out "half the respondents answered between 80% and 90%" or the like.

I just checked, and Google Sheets thinks the average of "50, 40" is 45 and that the average of "50, 40%" is 25.2. I plan to stick with the current instructions for giving probabilities.
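The arithmetic behind those two averages can be reproduced with a minimal Python sketch. It assumes the spreadsheet treats a cell ending in a percent sign as the number divided by 100; the helper names `sheets_like_value` and `average` are made up for illustration and are not part of any Google API.

```python
def sheets_like_value(cell: str) -> float:
    """Assumed spreadsheet convention: a trailing % divides the number by 100."""
    text = cell.strip()
    if text.endswith("%"):
        return float(text[:-1]) / 100
    return float(text)


def average(cells: list[str]) -> float:
    """Plain arithmetic mean of the converted cell values."""
    values = [sheets_like_value(cell) for cell in cells]
    return sum(values) / len(values)


print(average(["50", "40"]))   # both cells read as plain numbers
print(average(["50", "40%"]))  # "40%" is read as 0.4, so the mean is about 25.2
```

Mixing the two formats in one column silently drags the mean down, which is why a single stray percent sign matters.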

Replies from: gjm
comment by gjm · 2023-01-25T22:28:15.385Z · LW(p) · GW(p)

To be clear, I wasn't proposing that the detailed description be removed! I was proposing something more like changing

Each of these questions will ask you for a probability. Please answer on a scale from 0% (definitely false) to 100% (definitely true). Do not include the percent sign in your answer. For your convenience, 0% will be interpreted as "epsilon" and 100% as "100 minus epsilon". Do NOT give your answer in the form of a decimal between 0 and 1 unless you deliberately mean for it to be interpreted as a very small percent. For example, 0.5 will be interpreted as 0.5%, that is, a one in two hundred chance, NOT as 50%.

to

Each of these questions will ask you for a probability. Please answer on a scale from 0% (definitely false) to 100% (definitely true). Do include the percent sign in your answer. For your convenience, 0% will be interpreted as "epsilon" and 100% as "100 minus epsilon".

Answers like "0.2%" (meaning a 1/500 chance) are fine. Do not write just 0.2, whether you mean 1/500 or 1/5. Answers without a final percent sign will be ignored as ambiguous. (Even if they are bigger than 1, in order to avoid bias.)

I do, however, take your point about wanting it to be easy to paste things into Google Sheets or whatever. But I don't think there's any avoiding the need to check for bogus answers. If you ask for no trailing percentage signs, you just know that some people will write them anyway, and then you have to do something with them so that they don't mess up your calculations. (To be clear, this isn't an advantage of my proposal; it's also true that if you ask for trailing percentage signs, some people will miss them out. But "just copy and paste the whole lot into Google Sheets" isn't a reliable approach either way, and if you're already having to remove/repair answers where people have done the wrong thing I don't think all-percent is either better or worse than no-percent for Google Sheets processing. The average of 50% and 40% is 0.45, which is fine.)
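Either way the cleanup step looks much the same. Here is a rough Python sketch of it, assuming the raw answers arrive as strings; `parse_percent` and its `require_sign` flag are hypothetical names for illustration, not anything the survey tooling provides.

```python
from typing import Optional


def parse_percent(raw: str, require_sign: bool) -> Optional[float]:
    """Return the answer as a number on the 0-100 scale, or None if unusable.

    require_sign=True follows the "do include the percent sign" wording:
    answers without a trailing % are dropped as ambiguous.
    require_sign=False follows the "do not include the percent sign" wording:
    a stray trailing % is stripped and the number is kept.
    """
    text = raw.strip()
    if text.endswith("%"):
        text = text[:-1].strip()
    elif require_sign:
        return None
    try:
        value = float(text)
    except ValueError:
        return None  # not a number at all
    if not 0 <= value <= 100:
        return None  # out of range (this also rejects NaN)
    return value


answers = ["50", "40%", "0.2%", "maybe", "150"]
print([parse_percent(a, require_sign=False) for a in answers])
print([parse_percent(a, require_sign=True) for a in answers])
```

Under both conventions some rows need repair or removal; the only difference is which rows.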

Replies from: Screwtape
comment by Screwtape · 2023-01-25T22:42:02.244Z · LW(p) · GW(p)

Counterargument: previous surveys specified to remove the percent sign. Assume some people will add it when they were told not to, and some people will leave it off if told to add it. Keeping the same format means that we could do things like take the average of all surveys, past and present, and the instruction-following population's answers will work just fine.

I predict people are very roughly equally likely to make the mistake in either direction and currently plan to stay consistent with previous surveys.

(To be clear, if one format was better than the other, I'd just make a note to convert the data whenever we wanted to compare between years. Since both formats seem fine to me, avoiding the trivial inconvenience of conversion seems worth it.)

comment by gjm · 2023-01-25T21:31:08.219Z · LW(p) · GW(p)

It would be nice if the "Time in Community" question had a note saying something like "If you have been here since Less Wrong first got started, put X months" rather than making us look up when that was.

(According to Wikipedia, which as we all know is never wrong about Less Wrong, the first incarnation of LW-as-such was in 2009-02, which would mean it was just under 14 years = 168 months ago; Overcoming Bias started in 2006-11, so if you want to count the OB days too the longest possible interval would be 16y2m = 194 months.)
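For anyone who wants to check those month counts, a small Python sketch follows. It assumes the survey is answered in January 2023 and counts calendar months only; the day of the month is set to the 1st arbitrarily, and `months_between` is just an illustrative helper.

```python
from datetime import date


def months_between(start: date, end: date) -> int:
    """Number of calendar months from start's month to end's month."""
    return (end.year - start.year) * 12 + (end.month - start.month)


survey_month = date(2023, 1, 1)       # assumed answering date
lw_start = date(2009, 2, 1)           # first incarnation of LW, per Wikipedia
ob_start = date(2006, 11, 1)          # Overcoming Bias start, per Wikipedia

print(months_between(lw_start, survey_month))  # 167, i.e. just under 14 years
print(months_between(ob_start, survey_month))  # 194, i.e. 16 years 2 months
```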

Replies from: Screwtape
comment by Screwtape · 2023-01-25T22:08:20.075Z · LW(p) · GW(p)

Hint text has been added.

I also swapped the unit to years, which is easily compared to past answers.

comment by gjm · 2023-01-25T21:26:18.154Z · LW(p) · GW(p)

Some years ago, I think the "IQ" question (or maybe one of the various questions aiming to quantify brainpower) didn't discourage less-rigorous answers. Might there be value in asking both for properly measured IQ values and for estimates, results of crappy internet tests, etc.? It might maaaybe be possible to get some idea of the relationship between proper measurements and estimates by seeing what properly-measured IQ range any given estimated IQ range "looks like" in terms of answers to other questions.

But arguably the LW community is too focused on its own intelligence, and on IQ in particular as a measure thereof, and it would be better to have less of this rather than more :-).

Replies from: Screwtape
comment by Screwtape · 2023-01-25T22:17:39.780Z · LW(p) · GW(p)

There used to be lots of IQ questions, which seems to have culminated in this delightful subsection [LW · GW] back in 2013. (Search for "Can we finally resolve this IQ controversy that comes up every year?") I have included the basic five questions, because I defaulted strongly towards keeping any question that was asked on at least three previous surveys. Also, I find the reported average to be baffling, at least somewhat convincing, and really funny.

comment by gjm · 2023-01-25T21:22:39.210Z · LW(p) · GW(p)

The "Religious Denomination" question is arguably mistitled, since e.g. "Hindu" and "Traditional Chinese" are not denominations as usually understood.

More substantially, I wonder whether it would be a win to modify the question so that it's meaningful for some non-religious people: something like "If you are not now religious but were in the past, please answer as you would have done when most recently you were religious." It would still need an option for "not religious" in case someone who was never religious accidentally clicks on one of the radio buttons. Note that this would make the personal-religion questions more parallel to the family-religion questions. But I'm not sure whether this would actually provide sufficient extra value to be worth the extra complexity. 

Replies from: Screwtape
comment by Screwtape · 2023-01-25T21:33:42.073Z · LW(p) · GW(p)

I feel like it's fine for the question not to apply to non-religious people. Combined with the Religious Views question, it's a quick sequence of "are you religious? if yes, what kind? if no, keep going, Moral Views will still apply to you." If the question included what religion someone last identified with, then you could wind up with "Are you religious?" getting four Atheists and one Theist, then "What religion, including the last religion if you're now atheist?" getting five Catholics.

Hindu is not a denomination, you are correct, and that's a case of me copying a previous answer and not thinking about it carefully enough. My inclination is to look up the largest denominations of Hinduism, Buddhism, etc, and split those out as well. That way it's easy to compare to past surveys (since you can add the different kinds of Buddhism together) and also now is consistent about all being denominations. People with experience or knowledge of the non-Christian variations, I'm open to suggestions, otherwise I will likely go with what Wikipedia suggests.

Replies from: gjm
comment by gjm · 2023-01-25T21:54:45.372Z · LW(p) · GW(p)

I agree that it's fine for the denomination question not to apply to non-religious people. I was just pointing out that we could, if we wanted, collect a little more information that way. (But at the cost of making interpretation slightly harder work, since as you say it would be potentially misleading to just count up answers to the denomination question without cross-referencing them against the religion question. The family-religion questions already have this problem, if problem it be.)

Splitting up the non-Christian religions is probably a good idea, though I have the feeling that "denomination" isn't really the right term for many of the subdivisions you might want. E.g., you probably don't want to split up Islam any further than Shia versus Sunni versus Other Muslim, but those would generally be called "branches" or something rather than "denominations". This is all sheer nitpickery, of course :-).

Replies from: Screwtape
comment by Screwtape · 2023-01-25T22:02:13.833Z · LW(p) · GW(p)

Wikipedia calls the branches of Islam "Branches or Denominations" and the article on Judaism suggests "Commonly used terms are movements, as well as denominations, varieties, traditions, groupings, streams, branches, trends, and such." Nitpickery appreciated, I'm currently happy with the divisions.

The family-religion question does have this issue. If I was going to change it, I'd change it to "What is your family's religious background, as of when you were growing up?" That makes it fit the question above it, but risks making it harder to compare across years. I'm currently lightly leaning towards leaving it as-is, figuring the value of comparison is worth it.

comment by gjm · 2023-01-25T21:18:05.175Z · LW(p) · GW(p)

On the ACX 2022 survey, the "Politics" question deliberately adopted different wording for the "Neoreactionary" option from the one it's used in the past and that's also used here. Here's Scott, talking about the results of that survey:

The biggest effect is that many fewer people identify as neoreactionary, but I'm pretty sure that's because I changed the question wording. It was previously "neoreactionary, such as Singapore", but I thought that in real life nobody thinks of Singapore as NRx, and this was probably attracting a lot of Singapore fans who had no idea what it meant otherwise, so I changed it to "neoreactionary, such as the writings of Curtis Yarvin" and it dropped by more than half. Unfortunately this means I can't track actual variations in neoreaction popularity over time, which I'd be interested in knowing. There was no similar change to the definition of alt-right, but it dropped by about 33%.

I think that (1) he is correct that "neoreactionary" doesn't really mean "like Singapore" but also (2) the change makes inter-year comparisons difficult. I wonder whether the Right Thing would be for the LW survey to keep the old wording this year and then change next year to Scott's new wording. (Then if you want to compare LW 2020 with LW 2024, you can try to relate LW 2020-2022 to LW 2023-2024 despite the wording change by guessing that LW's 2022/2023 change in support for neoreaction here is comparable to ACX's change in the same pair of years, which you can see because ACX uses the same wording for those years.)

Replies from: Screwtape
comment by Screwtape · 2023-01-25T22:34:56.567Z · LW(p) · GW(p)

Your suggested Right Thing seems a decent idea. We could also put both definitions on that question, each with their own bullet points.

The other option is to use the 2020 questions, which did not give examples but covered a little wider of a spread. Looking at the different ways this question has been asked over the years, I'm tentatively leaning towards this option.

comment by gjm · 2023-01-25T21:07:10.603Z · LW(p) · GW(p)

The "Living With" question has among its answer-options "With family" and "With partner/spouse". This could be trying to draw two different distinctions: (1) with partner/spouse only versus e.g. with children, or (2) with partner/spouse and possibly also children versus with parents. I think 2 is more likely, and expect most others to interpret it that way, but I'm not sure it's so much more likely that no one will interpret it differently.

I wonder whether this might actually be better with a few not-mutually-exclusive tickyboxes: "Do you live with any other people? Check all that apply", and then "Parent(s)", "Partner/spouse", "Children", "Others". Or something like that.

Alternatively, and assuming I'm right about which meaning was intended: "Alone" / "With parents and/or siblings" / "With partner/spouse and/or children" / "With roommates" (though that last one feels a bit too specific to me).

Replies from: Screwtape
comment by Screwtape · 2023-01-25T21:24:30.817Z · LW(p) · GW(p)

I've updated it to be "Alone, With Parents/Siblings, With Partner/Spouse, With Roommates, or Other" and to allow checking multiple boxes. That does change the wording on a question we've used in the past, but I think it's safe.

Changing it from Pick One to Check All That Apply feels more likely to throw off comparisons to past data. It feels better, because it's obviously possible to live with a parent, your partner, and roommates all at once, but it is also not how we've done it in the past. I'm currently leaning toward making it Check All That Apply ("It's wrong but it's tradition" is not the defense I want to give for a decision made for the Less Wrong Census!) but haven't done it yet. Anyone reading this, feel free to weigh in?

comment by nim · 2023-01-31T16:57:53.737Z · LW(p) · GW(p)

What's meant by "life"? If the question is "something recognized by a majority of humans as life", I'd give a very different answer than if the question is "something recognized by any human as life".

There's also some e-prime-flavored breakage to "is" in the probability questions, perhaps intrinsic to all probability questions. For instance, there are some people who go through life with absolute certainty that the government is controlled by camouflaged reptiles, and other people who find reptiles the best available metaphor for predicting how the government will behave. I believe it's overwhelmingly likely that tissue samples and full-body scans of all politicians would show them to be biologically indistinguishable from other humans, yet it is simultaneously true that some people live in a world where the politicians are "really" reptilian and the tissue samples were faked somehow. What probability in that box would accurately communicate this view?