On coincidences and Bayesian reasoning, as applied to the origins of COVID-19

post by viking_math · 2024-02-19T01:14:06.772Z · LW · GW · 28 comments

Contents

  The use of coincidences in Bayesian reasoning
    None of this is special
  The coincidence of Covid-19 starting in Wuhan
    Timing
    Independence 
  Viral features
  Intermezzo: Bias in sources
  These sorts of arguments are easy to make
    I doubt anyone takes these numbers seriously
  Conclusion
  Addendum: Debate Results
  Appendix: Genetic features
28 comments

(Or: sometimes heuristics are no substitute for a deep dive into all of the available information). 

This post is a response to Roko's recent series of posts (Brute Force Manufactured Consensus is Hiding the Crime of the Century [LW · GW], The Math of Suspicious Coincidences [LW · GW], and A Back-Of-The-Envelope Calculation On How Unlikely The Circumstantial Evidence Around Covid-19 Is [LW · GW]); however, I made a separate post for a few reasons. 

  1. I think it's in-depth enough to warrant its own post, rather than making comments
  2. It contains content that is not just a direct response to these posts
  3. It's important, because those posts seem to have gotten a lot of attention and I think they're very wrong. 

Additional note: Much of this information is from the recent Rootclaim debate; if you've already seen that, you may be familiar with some of what I'm saying. If you haven't, I strongly recommend it. Miller's videos have fine-grained topic timestamps, so you can easily jump to sections that you think are most relevant. 

The use of coincidences in Bayesian reasoning

A coincidence, in this context, is some occurrence that is not impossible and does not violate some hypothesis, but is a priori unlikely because it involves 2 otherwise unrelated things actually occurring together or with some relationship. For example, suppose I claimed to shuffle a deck of cards, but when you look at it, it is actually in some highly specific order; it could be 2 through Ace of spades, then clubs, hearts, and diamonds. The probability of this exact ordering, like any specific ordering, is 1/52! from a truly random shuffle. Of course, by definition, every ordering is equally likely. However, there is a seeming order to this shuffle which should be rare among all orderings. 

In order to formalize our intuition, we would probably rely on some measure of "randomness" or some notion related to entropy, and note that most orderings have a much higher value on this metric than ours. Of course, a few other orderings are similarly rare (e.g. permuting the order of suits, or maybe having all 2s, then all 3s, etc. each in suit order) but probably only a few dozen or a few hundred. So we say that "the probability of a coincidence like this one" is < 1000/52!, which is still fantastically tiny, and thus we have strong evidence that the deck was not shuffled randomly. On the other hand, maybe I am an expert at sleight of hand and could easily sort the deck, say with probability 10%. Mathematically, we could say something like

And similarly for the alternative hypothesis, that I manipulated the shuffle. 
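In symbols, the comparison sketched above might be written as a likelihood ratio. This is only a sketch in assumed notation: E stands for "an ordering at least as patterned as the one observed", H_r for a fair random shuffle, and H_m for a manipulated shuffle. The 1000 and the 10% are the illustrative figures from the text, not measured quantities.

```latex
\[
P(E \mid H_r) < \frac{1000}{52!},
\qquad
P(E \mid H_m) \approx 0.1,
\qquad
\frac{P(E \mid H_m)}{P(E \mid H_r)} > \frac{0.1 \cdot 52!}{1000} = \frac{52!}{10^{4}} \approx 8 \times 10^{63}.
\]
```

With a ratio that large, almost any reasonable prior on manipulation ends up favoring H_m.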

On the other hand, we might have a much weaker coincidence. For example, we could see four cards of the same value in a row somewhere in the deck, which has probability about 1/425 (assuming https://www.reddit.com/r/AskStatistics/comments/m1q494/what_are_the_chances_of_finding_4_of_a_kind_in_a/ is correct). This is weird, but if you shuffled decks of cards on a regular basis, you would find such an occurrence fairly often. If you saw such a pattern on a single draw, you might be suspicious that the dealer was a trickster, but not enough to overcome strong evidence that the deck is indeed random (or even moderate evidence, depending on your prior).
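The 1/425 figure can be sanity-checked with a short exact calculation. The sketch below computes the expected number of "four equal values in a row" runs in a fairly shuffled deck, using linearity of expectation. Because the count of runs is a non-negative integer, this expected count is an upper bound on the probability of seeing at least one run, and it should be close to that probability since two such runs in one deck are far rarer. The function and constant names are my own, chosen for illustration.

```python
from fractions import Fraction
from math import factorial

RANKS = 13   # distinct card values in a standard deck
DECK = 52    # cards in the deck
RUN = 4      # four cards of the same value in a row


def expected_four_in_a_row_runs() -> Fraction:
    """Expected number of places where four equal-valued cards sit consecutively."""
    start_positions = DECK - RUN + 1  # 49 possible starting positions for the run
    # For one value and one start: 4! orders of those four cards, 48! for the rest.
    favourable = factorial(RUN) * factorial(DECK - RUN)
    return Fraction(RANKS * start_positions * favourable, factorial(DECK))


expected_runs = expected_four_in_a_row_runs()
print(expected_runs)  # 1/425
```

So the linked estimate matches the expected count exactly, and the true probability of at least one run is slightly below it.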

However, if we want to know the probability of some coincidence in general, that's more difficult, since we haven't defined what "some coincidence" is. For example, we could list all easily-describable patterns that we might find, and say that any pattern with a probability of at most 1/100 from a given shuffle is a strange coincidence. So if we shuffle the deck and find such a coincidence, what's the Bayes Factor in favor of a stacked deck? 100x, right? Or if we get 2 such coincidences, it's 10,000x, right? 

No! The probability of finding any such coincidence is higher than 1/100, possibly much higher. Figuring out exactly what the true probability is may be difficult. For one, the events may not be independent or exclusive. In particular, we can't just multiply probabilities if we have more than 1 coincidence: For example, suppose we have a lot of hearts and spades in the top half of the deck; can we then also note that we have a lot of clubs and diamonds in the bottom half, and say that we have 2 suspicious coincidences, and square P(many hearts and spades in the top half) to get the probability of this outcome? Of course not. And of course, P(any of N outcomes X_1...X_N) is generally larger than P(X_i) for any given i. 
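The hearts-and-spades example can be made concrete. A deck has 26 hearts and spades and 26 clubs and diamonds, so if the top half contains j hearts or spades, the bottom half necessarily contains exactly j clubs or diamonds: the two "coincidences" are one event. The sketch below computes the exact probability under a fair shuffle and compares it with the naive squared value. The cutoff of 18 for "a lot" is a hypothetical threshold picked only for illustration.

```python
from fractions import Fraction
from math import comb

HALF = 26  # cards per half; also the number of hearts+spades and of clubs+diamonds


def prob_top_has_at_least(k: int) -> Fraction:
    """P(at least k of the top 26 cards are hearts or spades) under a fair shuffle."""
    total = comb(52, HALF)
    # Choose j hearts/spades and 26 - j clubs/diamonds for the top half.
    ways = sum(comb(HALF, j) * comb(HALF, HALF - j) for j in range(k, HALF + 1))
    return Fraction(ways, total)


THRESHOLD = 18  # hypothetical cutoff for "a lot"

p_top = prob_top_has_at_least(THRESHOLD)
# "Many clubs/diamonds in the bottom half" is the same event as
# "many hearts/spades in the top half", so the joint probability is just p_top.
p_joint = p_top
p_naive = p_top ** 2  # what wrongly treating the two as independent would give

print(float(p_joint), float(p_naive))
```

Squaring understates the probability of the observation by a factor of 1/p_top, which is exactly the size of the "first" coincidence.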

The standard way to really be confident in our result is to consider all of the outcomes that would have surprised us a similar or greater amount, and ask about the probability of any such outcome, like we did above. And we also have to be careful about what we consider a "coincidence" (or multiple coincidences) since A) sometimes things are more likely than they seem; B) sometimes multiple coincidences are actually one coincidence; and C) the more things you look at, the more likely at least one of them is to be "unlikely" (essentially, avoid p-hacking). In practice of course this is quite difficult, especially in cases where there was very little pre-registration, but it should at least be considered. 
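Point C can be illustrated with a few lines of arithmetic. The sketch below assumes the candidate patterns are independent, which (as noted above) real patterns often are not, so treat the numbers as a rough guide rather than an exact answer. The function name is mine.

```python
def prob_at_least_one(p_each: float, n_patterns: int) -> float:
    """P(at least one of n independent patterns shows up), each with probability p_each."""
    return 1.0 - (1.0 - p_each) ** n_patterns


# Each pattern is a "1 in 100" coincidence; vary how many patterns we would have noticed.
for n in (1, 10, 100):
    print(n, round(prob_at_least_one(0.01, n), 3))
```

With 100 candidate patterns, finding at least one "1 in 100" coincidence is more likely than not, so it carries very little evidential weight on its own.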

None of this is special

There's not anything about a "coincidence" that impacts how we analyze this situation. We just care about how unlikely some outcome is under each hypothesis. It doesn't really matter if our justification for the mismatch between theory and evidence is coincidence, measurement error (which is just another form of bad luck), malicious manipulation of data, p-hacking/cherry picking, or something else. It only really matters what P(evidence|hypothesis) is for each hypothesis under consideration. Our usual tools for analyzing data will mostly apply here as they do in any other case. 

The coincidence of Covid-19 starting in Wuhan

Roko writes: 

How many times do you have to rerun history for a naturally occurring virus to randomly appear outside the lab that's studying it at the exact time they are studying it? I think it's at least 1000:1 against.

As stated literally, this question is rather difficult to answer for several reasons, including the variables not being independent, pandemics being rare, and the question being ambiguously worded. It would not be surprising if a lab studied pathogens that are likely to be found nearby. It also would not be surprising if they were to study those viruses over an extended period of time. 

The follow-up post looks at essentially the same variables: 

  1. Coincidence of Location: Wuhan is a particularly special place in China for studying covid-19; the WIV group was both the most important, most highly-cited group before 2020, and the only group that was doing GoF on bat sarbecoronaviruses as far as I know. Wuhan is about 0.5% of China's population. It's a suspicious coincidence that a viral pandemic would occur in the same city as the most prominent group that studies it. 
  2. Coincidence of timing: several things happened that presaged the emergence of covid-19. In December 2017, the US government lifted a ban on risky pathogen research, and in mid-2018 the Ecohealth group started planning how to make covid in the DEFUSE proposal. A natural spillover event could have happened at any time over either the last, say, 40 years or (probably) the next 40 years, though likely not much before that due to changing patterns of movement (I need help on exactly how wide this time interval is). 
  3. Warnings turning out to be accurate: Warnings were given in Nature specifically mentioning the WIV/Zhengli Shi group and no other group involved with coronaviruses, and only a few other groups involved with any viruses at all (in other articles). There were hundreds of groups that could have been warned about I think, but this article gives 59 as the number of BSL-4 labs around the world. This is a subtler point than those above because getting a warning is extra evidence for the lab leak hypothesis even conditional on the timing and location coincidence. Warnings were also given about WIV itself independent of the connection to coronaviruses too.

(I'm leaving out point 4 for now since Roko outsources the Bayes Factors; we'll get back to that). 

The first issue here is really just one of facts. Not everywhere in China is equally likely to be the source of a natural pandemic. Simulations indicate that pandemics like Covid are much more likely to start in urban areas (or to be specific, they are likely to go extinct if they start in rural areas), and historically this has been the case as well (e.g. SARS-1 started in a city). In addition, South and Central China are closer to wildlife like bats, civets, and raccoon dogs, and there is a thriving wildlife trade in many of these cities, while the same is not true in Northern China. We should also include other parts of Southeast Asia; Roko estimates 700 million people living within the distance of Wuhan from a plausible origin location. That seems reasonable to me; China is around 60% urban, so maybe somewhere around 400 million people are plausible candidates for patient 0. Wuhan's roughly 11 million residents then give 11/400 = 2.75%. 

Wuhan is a particularly big city, it's a transportation hub, and a lot of wildlife passes through it. The actual likelihood of a bat coronavirus starting in Wuhan may actually be even higher than this, say 5%. On the other hand, maybe there are reasons why Wuhan is less likely, but these have to be demonstrated in order to claim we have a highly suspicious coincidence. A claim of a strong Bayes Factor requires a very careful argument (see Confidence levels inside and outside an argument [LW · GW]). Do you think that the arguments in Roko's posts about location have less than a 1/200 chance of being wrong? 
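To see why the chance of the argument itself being wrong matters, here is a minimal sketch of the "inside versus outside the argument" arithmetic. It blends a claimed probability with an alternative, weighted by how likely the claim's reasoning is to be mistaken. The 0.5% and 2.75% come from the figures discussed above; the 20% chance of error is purely a hypothetical input, and all names are my own.

```python
WUHAN_POP_M = 11        # Wuhan population in millions (figure used in the text)
CANDIDATE_POP_M = 400   # rough urban population at risk, in millions (text's estimate)

p_population_share = WUHAN_POP_M / CANDIDATE_POP_M  # 11/400 = 0.0275


def effective_probability(p_if_argument_right: float,
                          p_if_argument_wrong: float,
                          p_argument_wrong: float) -> float:
    """Blend two estimates by the chance that the argument behind the first is wrong."""
    return ((1 - p_argument_wrong) * p_if_argument_right
            + p_argument_wrong * p_if_argument_wrong)


# Hypothetical inputs: the claimed 0.5% (Wuhan's share of China's population),
# the 2.75% alternative, and an assumed 20% chance that the 0.5% argument is mistaken.
blended = effective_probability(0.005, p_population_share, 0.20)
print(p_population_share, blended)
```

Even a modest chance that the stronger claim is wrong pulls the usable number well away from it, which is why a 1/200 figure needs an argument that is itself very unlikely to fail.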

Timing

The discussion of timing returns to the nominal topic of this post, coincidences. First, the "coincidence" of specific, relevant work being done at WIV just before the pandemic started is based on a rejected grant proposal from 2018. What is the probability of a lab having at least one proposal like this in the previous few years? From the information in any of these posts, it's impossible to tell. Maybe every big virology lab is regularly submitting grant proposals for similar research. And a rejected grant proposal doesn't mean that specific work has actually been done, so you should not only consider grant proposals, but published research, unpublished or in-progress research, and other weak evidence like conference talks, interviews, and maybe even social media posts or emails. No indication that any such search has been conducted appears in any analysis of the lab leak hypothesis of which I am aware, and so it is impossible to assign a meaningful value to P(bat coronavirus pandemic starts in city with a lab doing research on bat coronaviruses), since for all we know every city in China meets this criterion. 

Also, remember what we said about counting any of several different things as a coincidence? Let's take a look at why the timing is supposed to be so suspicious: 

But gain of function is a new invention - it only really started in 2011 and funding was banned in 2014, then the moratorium was lifted in 2017. The 2011-2014 period had little or no coronavirus gain of function work as far as I am aware. So coronavirus gain of function from a lab could only have occurred after say 2010 and was most likely after 2017 when it had the combination of technology and funding

If we had a similar pandemic in 2012, right when this sort of research became possible, would that also have been a suspicious coincidence? Unclear, but there's certainly some possibility. In fact, is the 2014 moratorium even relevant? Alina Chan says no, in which case 2019 isn't anything special, and you have something like 8 years since the relevant research apparently became possible, not 2. Maybe she's wrong--but you should discount the 2/80 number by however likely you think she is to be correct. 
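One way to do that discounting is to average the two readings, weighted by how likely you think it is that the moratorium is irrelevant. This is a rough sketch under that assumption; the 2, 8, and 80 come from the discussion above, while the weights tried at the end are arbitrary illustrations.

```python
WINDOW_YEARS = 80  # the 80-year window for a natural spillover used in the quoted estimate


def timing_probability(p_moratorium_irrelevant: float) -> float:
    """Chance a natural pandemic lands in the 'suspicious' years, averaged over both readings."""
    narrow = 2 / WINDOW_YEARS  # only the ~2 years after the 2017 lifting count
    wide = 8 / WINDOW_YEARS    # the ~8 years since the research became possible count
    return p_moratorium_irrelevant * wide + (1 - p_moratorium_irrelevant) * narrow


# Arbitrary example weights: moratorium certainly relevant, 50/50, certainly irrelevant.
print(timing_probability(0.0), timing_probability(0.5), timing_probability(1.0))
```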

Actually, we have to go further. The proposal emphasized work being done at UNC, which has a lot more experience manipulating viruses. So we have to consider a very wide range of types of work that would be considered "suspicious." I'm just going to link Miller's slides from day 1 (go to slide 41), which mention "suspicious-sounding" research in many cities in China, links to EcoHealth Alliance, and even adding an FCS to coronaviruses at other labs, all in the past few years. Once you start assuming that work was done without good evidence, it's easy to assert that just about any lab could be involved. With these standards, the answer to the question "How many times do you have to rerun history for a naturally occurring virus to randomly appear outside the lab that's studying it at the exact time they are studying it?" looks like it would actually be close to 1; it's certainly much more likely than just "how likely is it that Covid started in Wuhan specifically?" In any event, Roko gives 2 out of 80 (even though the yearly rate of natural virus pandemics out of South and Central China seems to be more like 3%);[1] again, does it seem like the chance that this argument is wrong is really much less than 2.5%? 

Independence 

You might note that I'm talking about location again, rather than timing. Partially this is because Roko included the grant proposal under the timing section, but also because I don't think these factors are independent, and so you can't just multiply numbers together. The DEFUSE grant proposal impacts not just the timing you should consider, but also the location. 

Viral features

The last major piece of evidence that Roko cites is based on molecular and genetic features. We can apply similar tests as above. 

First, how do we know what counts as suspicious? Nothing was pre-registered, so we really don't have a good sense for how many "suspicious coincidences" would be in a random virus. Or, really, how many there would be in a virus that caused a major human pandemic, because any such virus must be rare on some measure--most viruses don't cause human pandemics. Do you think that a motivated reasoner could find some suspicious patterns in a given virus if they really wanted to? Roko takes his source's 1/30 million and rounds down to 1/500, but do you really think that this exercise would result in a positive finding only a fraction of a percent of the time? 

Second, do we even think that any of these claims hold up? The source seems to mostly focus on food and drug related topics, not virology, so let's take a careful look at their justification. 

SARS-CoV-2 has a furin cleavage site positioned in the spike protein at the S1/S2 junction. The furin cleavage site supercharged the virus into the worst pandemic pathogen in a century. Virologists have yet to identify one in any other related coronavirus. 

Roko cites a tweet in one of his posts to the same effect, saying that none of the 800 other known sarbecoviruses have an FCS, so "p-value < 0.002." But again, Covid isn't a random virus. The whole reason the FCS is deemed to be relevant is because it impacts how it affects humans--if only 1 of 800 viruses has caused a pandemic, we would definitely expect it to have some features that are rare among those viruses. Otherwise they would have already caused other pandemics! 

Second, FCSs are fairly common in other coronaviruses. They're also spread throughout the tree, intermingled with non-FCS viruses, suggesting they evolved multiple times. I don't know how to assign a probability here, because there's a fair amount of arbitrariness in what counts as a "separate virus" or the "same family" and because evolution is a complicated dynamical system. But nothing remotely like 1 out of several hundred is justified without further argument. 

SARS-CoV-2 emerged highly infectious without evolving much in humans

Their main citation here seems to be... the Daily Mail, quoting a "Trump aide." Their other source is one of their own articles, but the evidence quoted is far too vague to be putting big numbers on. 

The genome of SARS-CoV-2 falls within the range of a 25 percent genetic difference from SARS.

I can't tell what this means, but Roko seems to ignore it. 

(Edit: I think this is saying that WIV was looking at viruses within 25% genetic distance of Sars-1, and so would have been a valid candidate for study.)

Their last claim is based on this preprint, which I don't have the ability to analyze, though even USRTK's summary admits other experts aren't convinced. Roko gives it 1/1000 before rounding the combined effect down to 1/500; again, I don't think an unpublished preprint like this justifies anything like that level of confidence. 

Intermezzo: Bias in sources

In a recent post (Most experts believe COVID-19 was probably not a lab leak [LW · GW]) many commenters, including Roko, expressed skepticism at taking the results of a survey of experts at face value. This is fair (apparently a substantial fraction of respondents had claimed to be familiar with a fake study, for example); however, scrutiny should be applied to all sources. Who are the 3 authors of the preprint linked above, and why should we trust them? What about USRTK? Is Richard H. Ebright less biased than whoever was surveyed? 

If we're going to be skeptical of potentially biased people as a source of information, then we have to apply that same standard to any person, even if we agree with them. It feels weird saying this in 2024 on a rationalist forum, since this seems like one of the most basic principles of rationality, period. But apparently we need a reminder. 

These sorts of arguments are easy to make

Without careful consideration, it's easy to come up with arguments that imply very strong Bayes factors. For example:

  1. What is the probability, with a lab leak, of all of the known early cases being located at or clustered around a market on the other side of town? A market that keeps and sells wild animals, which is the kind of place where SARS 1 started? And in fact, is the exact place that at least one virologist (Edward Holmes) identified as the likely start of a viral pandemic years ago! 
  2. In fact, there appear to have been 2 separate spillover events. No early cases cluster around any other location, such as the WIV, so this already suspicious event essentially happened twice!  

How many different public, well-trafficked indoor places are there in Wuhan, a city of 11 million people? Has to be at least 1,000? So that's 1/1000 against, maybe 2/1000 if you account for there being a few other small wet markets. But both early crossover events resulted in only having cases at the same market (i.e. we didn't have 1 cluster at each of 2 different markets), so maybe it really is 1/1000, and then square it for 1/1 million. And it could easily be a lot less than that if there are more than 1,000 possible spreading locations. Or if we note that 4 of the first 5 known cases worked at the market, rather than just visiting there, and use workers at market/Wuhan population = about 1/10,000. 

I doubt anyone takes these numbers seriously

There are plenty of obvious objections to the argument above (just as there are objections to the pro-lab-leak arguments). The biggest one is the fact that many cases were missed, especially early. This isn't a guess; given Covid's hospitalization rate, and the fact that no one was looking for it in mid-December 2019, it's pretty much a guarantee that many people were infected at the time but did not know it. However, the mere fact that some cases are unknown doesn't mean the clustering around the market is wrong. Cases spread outwards from the market over time, with no other clear center point. There are no other initial clusters of cases (most notably, nothing near the lab). And there is a limit to the number of "missing cases" there can be, because we know how fast it spreads (about 2 doublings a week with no mitigation). If there were actually 50 cases severe enough to be hospitalized on December 10th, instead of the handful we know about, then on January 23rd, when Wuhan was locked down, there would be about 50*2^(6*2) ≈ 200,000 hospitalizations and, given its death rate, around 40,000 deaths. There's some error in these numbers but the actual number of deaths was nowhere near that, even a month after that. You can miss cases, but the death count is not going to be 10x what you think it is. 
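The arithmetic above is easy to reproduce. The sketch below is a back-of-the-envelope check, not an epidemiological model: it assumes steady unmitigated doubling at the quoted rate, and it uses a 20% ratio of deaths to hospitalizations only because that is what the two figures above (200,000 and 40,000) imply. The names are mine.

```python
from datetime import date

DOUBLINGS_PER_WEEK = 2            # growth rate quoted in the text
DEATHS_PER_HOSPITALIZATION = 0.2  # assumption implied by 200,000 -> 40,000 in the text


def projected(initial: int, weeks: int) -> int:
    """Count after `weeks` weeks of unmitigated doubling."""
    return initial * 2 ** (DOUBLINGS_PER_WEEK * weeks)


# December 10th to the January 23rd lockdown is 44 days, i.e. 6 full weeks.
weeks_elapsed = (date(2020, 1, 23) - date(2019, 12, 10)).days // 7
hospitalizations = projected(50, weeks_elapsed)
deaths = hospitalizations * DEATHS_PER_HOSPITALIZATION
print(weeks_elapsed, hospitalizations, deaths)
```

The exact product is 204,800 hospitalizations, which rounds to the 200,000 used above; starting from a handful of cases instead of 50 shrinks every figure proportionally.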

(Also, although cases could be missing, there's no reason to expect that hospitalizations would be biased toward the market if cases are not).  

Now, this argument does clearly leave the possibility that someone from the WIV was patient 0 and brought Covid to the market. But it's still a very unlikely coincidence, probably much stronger than the coincidence of the pandemic starting in Wuhan to begin with.

...actually, if you have such strong skepticism of the case data, why do you believe the pandemic began in Wuhan at all? We know that covid can spread quickly from city to city, and there are likely to be a lot of missing cases. It could have spilled over in the countryside (Hubei has civet farms, for example) or in another city, and patient 0 could have hopped on a train to Wuhan. 

A second set of objections usually revolves around China manipulating the data in some capacity. Again, this is possible. But also, again, you have to ask why you believe anything at all? If China could fake all of the early case data, why couldn't they make it seem like it started somewhere else? Couldn't the early cases seeming to be in Wuhan itself be an attempt to draw attention away from somewhere even less likely to be the origin of a natural pandemic? If they want to make it seem like the market is the source, why not create a fake lab test of an animal? The term "conspiracy theory" is overused, but once you start speculating that some malevolent entity has both the motive and the capability to do whatever is convenient for your theory, it quickly becomes unfalsifiable.   

In addition, China doesn't seem to have had any particular motive to frame the HSM instead of the WIV. Back in December 2019 and January 2020, they seemed mostly interested in covering up the existence of a pandemic at all. But this failed miserably, with Chinese doctors reporting on this new serious pandemic even as the government arrested several of them (food for thought if you think China could hide all evidence of a lab leak). Their current position is that the virus is of American origin. Why would they try to make the market seem like the origin? If they could impart arbitrary bias onto the data, why not have it cluster around a hotel, train station, or airport, which would be more consistent with the idea that it came from somewhere else? 

Conclusion

It is always possible to come up with some explanation whereby the conspiracy just so happens to behave in exactly the way that prevents you from firmly disproving the theory. This pattern reminds me of the warning in Contaminated by Optimism [LW · GW]:

It is a fact of life that we hold ideas we would like to believe, to a lower standard of proof than ideas we would like to disbelieve [? · GW].  In the former case we ask "Am I allowed to believe it?" and in the latter case ask "Am I forced to believe it?" 

Is the evidence for a zoonotic origin sufficient to force you to believe it? No; that would probably have required an open and thorough investigation into the market back in December 2019. 

Is the evidence for a lab leak strong enough to allow you to believe it? Sure, if you believe that enough relevant evidence would be hidden, and that the pandemic starting in Wuhan is a really strong coincidence. 

Is it actually the case that lab leak is the most likely explanation? Not by far, in my opinion. 

Addendum: Debate Results

Since I started writing this post, the results of the Rootclaim debate have been announced. Both judges agreed with Miller's evaluation, that the zoonotic origin is substantially more likely. Judge Will's decision is at https://www.youtube.com/watch?v=YlxTztAkdGQ&ab_channel=PeterMiller and Eric's decision is at https://www.youtube.com/watch?v=OKwunTJ1b40&ab_channel=PeterMiller

Both videos contain links to the judges' decision making processes in the descriptions. I highly recommend looking at them (I'm still making my way through) as well as Rootclaim's response linked above, especially if you didn't want to watch the original videos (and Rootclaim is switching to a primarily written format in the future!). 

Appendix: Genetic features

Other than the pandemic starting location, the main lines of evidence cited (by either side) concern genetic features of the virus. As far as I can tell:

  1. Some aspects of the viral genome look somewhat weird, but are difficult to clearly identify as being strong evidence of either lab leak or zoonosis. For example, the CGG codon is rare in human coronaviruses, but it's not that rare and the reason for this (C and G provoking an immune response) may just not apply to this particular virus for unknown reasons. 
  2. We just don't know enough about viral evolution. For example, as I discussed above and in one of my comments [LW · GW], furin cleavage sites do not appear in other sarbecoviruses (the group of viruses that includes SARS-CoV-2) but do appear frequently in other slightly less-related groups of coronaviruses. I don't really know how to turn these facts into a probability, as evolution is a complex, dynamic process, and the FCS is related to its infectiousness in humans (and thus to its ability to create a pandemic). If I wanted to ignore this complexity, it looks at a glance like something like 1/2 of all betacoronaviruses have an FCS, so I could say the Bayes factor is actually only 1:2 in favor of lab leak, even before accounting for the fact that the FCS makes the virus more likely to infect humans, but that would be just as wrong, just with bias in the other direction.  
  3. There's nothing that conclusively identifies the virus as being engineered, such as a feature that appears nowhere else in nature but is common in engineering. 
  4. While some more detailed analysis, future data, etc. could possibly shed more light on this question, it is certainly not possible to take 1 or 2 easily-summarized soundbites and perhaps a paragraph or 2 of analysis and come to the conclusion that the viral genome shows clear signs of being engineered.  

In conclusion, it's difficult to assign a strong Bayes factor in either direction here, and going into all of the details would be a bit much. I recommend the Rootclaim debate, both in video form and the judges' conclusions, to provide more specifics on both this topic and the epidemiological evidence. 

  1. ^

    The time from the SARS-1 pandemic to Covid was 15-17 years, depending on how you count. A common technique here is to assume you are in the "middle" of the time period between events, so you double the gap to get 1 natural pandemic every 30-34 years. Going back further, there were 2 major flu pandemics starting in Hong Kong and Guizhou in the 50s and 60s. That's 3 pandemics in the 70 years prior to Covid. We could go back further and make the rate lower, but even rounding up to 90 years puts us at exactly 1 natural viral pandemic in this region per 30 years. The extent to which the flu pandemics should weigh on this question is an exercise for the reader, but even just looking at SARS-1 I find 1/80 per year to be too low. 
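    For reference, the yearly rates implied by each way of counting in this footnote can be tabulated directly. This is a crude sketch that treats pandemics as arriving at a constant average rate, which is an assumption on my part; the labels are only descriptive.

```python
# Yearly rates implied by each way of counting, assuming a constant average rate.
estimates = {
    "1 per 34 years": 1 / 34,           # SARS-1 gap of 17 years, doubled
    "1 per 30 years": 1 / 30,           # SARS-1 gap of 15 years, doubled
    "3 per 70 years": 3 / 70,           # SARS-1 plus the two flu pandemics
    "3 per 90 years": 3 / 90,           # same three events, window rounded up
    "1 per 80 years (quoted)": 1 / 80,  # the rate being criticized
}

for label, rate in estimates.items():
    print(f"{label}: {rate:.2%} per year")
```

    Every alternative count gives a rate of roughly 3% to 4% per year, against 1.25% for the quoted figure.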

28 comments

Comments sorted by top scores.

comment by Davidmanheim · 2024-02-19T13:53:55.928Z · LW(p) · GW(p)

Thank you for writing this.

I think most points here are good points to make, but I also think it's useful as a general caution against this type of exercise being used as an argument at all! So I'd obviously caution against anyone taking your response itself as a reasonable attempt at an estimate of the "correct" Bayes factors, because this is all very bad epistemic practice!  Public explanations and arguments are social claims, and usually contain heavily filtered evidence (even if unconsciously [LW · GW]). Don't do this in public.

That is, this type of informal Bayesian estimate is useful as part of a ritual for changing your own mind, when done carefully [? · GW]. That requires a significant degree of self-composure, a willingness to change one's mind, and a high degree of justified confidence in your own mastery of unbiased reasoning.

Here, though, it is presented as an argument, which is not how any of this should work. And in this case, it was written by someone who already had a strong view of what the outcome should be, repeated publicly frequently, which makes it doubly hard to accept the implicit necessary claim that it was performed starting from an unbiased point at face value! At the very least, we need strong evidence that it was not an exercise in motivated reasoning, that the bottom line wasn't written before the evaluation started - which statement is completely missing, though to be fair, it would be unbelievable if it had been stated.

Replies from: SDM, viking_math
comment by Sammy Martin (SDM) · 2024-02-19T15:45:38.053Z · LW(p) · GW(p)

This whole thing reminds me of Scott Alexander's Pyramid essay. That seems like a really good case where it seems like there's a natural statistical reference class, seems like you can easily get a giant Bayes factor that's "statistically well justified", and to all the counterarguments you can say "well the likelihood is 1 in 10^5 that the pyramids would have a latitude that matches to the speed of light in m/s". That's a good reductio for taking even fairly well justified sounding subjective bayes factors at face value.

And I think that it's built into your criticism that because the problem is social and there is hidden evidence filtering going on, there will also tend to be an explanation on the meta-level too for why my coincidence finding is different from your coincidence finding.

comment by viking_math · 2024-02-20T17:12:47.442Z · LW(p) · GW(p)

To make sure I understand your point... the "Bayes Factors" I give like 1/1 million aren't meant to be taken literally. Rather they're to show how easy it is to get a high BF in this case, if you do a very quick analysis that doesn't account for details. I don't expect this post, on its own, to convince anyone of the zoonotic origin hypothesis. 

Replies from: Davidmanheim
comment by Davidmanheim · 2024-02-20T22:12:27.135Z · LW(p) · GW(p)

Yeah, but I think that it's more than not taken literally, it's that the exercise is fundamentally flawed when being used as an argument instead of very narrowly for honest truth-seeking, which is almost never possible in a discussion without unreasonably high levels of trust and confidence in others' epistemic reliability.

comment by Ilio · 2024-02-19T15:14:27.874Z · LW(p) · GW(p)

Yup. Thanks for trying, but these beliefs seem to form a local minimum, like a trap for rational minds, even very bright ones. Do you think you understand how an aspiring rationalist could 1) recover and get out of this trap and 2) avoid falling for it in the first place?

To be clear, my problem is not with the possibility of a lab leak itself, it’s with the evaluation that the present evidence is anything but post hoc rationalization fueled by unhealthy levels of tunnel vision. If bright minds can fall for that on this topic specifically, how do I know I’m not making the same mistake on something else?

Replies from: RussellThor
comment by RussellThor · 2024-02-19T18:44:09.832Z · LW(p) · GW(p)

Some advice I heard for investing was, when committing to a purchase, to write a story of what you think is most likely to make you lose your money. Perhaps you could identify your important beliefs that also perhaps are controversial and each year write down the most likely story you can think of that would make it be wrong? I also believe that you can only fully learn from your own experience, so building up a track record is necessary.

Replies from: Ilio
comment by Ilio · 2024-02-22T04:05:12.356Z · LW(p) · GW(p)

Perhaps you could identify your important beliefs

That part made me think. If I see bright minds falling in this trap, does blindness go with the importance of the belief for that person? I would say yes, I think. As if that’s where we tend to make more « mistakes that can behave as ratchets of the mind ». Thanks for the insight!

that also perhaps are controversial

Same exercise: if I see bright minds falling in this trap, does blindness go with controversial beliefs? Definitely! Almost by definition actually.

each year write down the most likely story you can think of that would make it be wrong

I don’t feel I get this part as well as the former ones. Suppose I hold the lab leak view, then notice it’s both controversial (« these morons can’t update right »), and much more important to me (« they don’t get how important it is for the safety of everyone »). What should I write?

Replies from: RussellThor
comment by RussellThor · 2024-02-22T06:34:09.126Z · LW(p) · GW(p)

"most likely story you can think of that would make it be wrong" - that can be the hard part. For investments it's sometimes easy: they just fail to execute, their competitors get better, or their disruption is itself disrupted.
Before the debate I put lab leak at, say, 65-80%; now more like <10%. The most likely story/reason I had for natural origin being correct (before I saw the debate) was that the host was found, and the suspicious circumstances were a result of an incompetent coverup and general noise/official lies, mostly by the CCP, around this.

Well, I can't say for sure that LL was wrong of course, but I changed my mind for a reason I didn't anticipate, i.e. a high-quality debate that was sufficiently within my understanding.

For some other things it's hard to come up with a credible story at all; e.g. for AGW being wrong, I would really struggle to do so.

comment by RussellThor · 2024-02-19T06:17:25.221Z · LW(p) · GW(p)

Good article. I listened to all of the Rootclaim debate and found it informative. After that debate, I have a lot less belief in the credibility of giving accurate Bayes estimates for complicated events; e.g. both debaters attempted it, but their estimates differed by a huge factor, like >1e20 I think.

I think this applies even more to P(doom) for AI; after all, it's about something that hasn't even happened yet. I agree with the criticism that P(doom) is more a feeling than the result of rationality.

Replies from: viking_math, romeostevensit
comment by viking_math · 2024-02-22T18:12:32.218Z · LW(p) · GW(p)

At one point Miller gave a likelihood against LL by a factor of 1e20 or 1e25, I think during the second debate, on genetic evidence. I don't think he intended this number to be an actual Bayes factor, but rather to show how easy it is to get a big BF by multiplying many small numbers together (see also https://arbital.com/p/multiple_stage_fallacy/). 
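
A minimal sketch of that effect, with made-up numbers: the factors below are purely illustrative assumptions, not figures from the debate. Twenty individually modest "10x less likely" judgments multiply out to 1e-20, and a small bias in each one compounds just as quickly.

```python
import math

# Hypothetical per-item likelihood ratios (illustrative only, not from the debate).
# Each one says "this observation is 10x less likely under hypothesis H".
factors = [0.1] * 20

# Naively multiplying them treats every item as independent evidence.
combined = math.prod(factors)
print(f"combined Bayes factor: {combined:.0e}")

# If each factor is overconfident by only 2x, the product is off by 2**20.
overconfidence = 2 ** len(factors)
print(f"error in the product: {overconfidence:,}x")
```

The point is only about the arithmetic: the size of the final number says little unless each factor, and the independence between them, can be defended.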

comment by romeostevensit · 2024-02-22T16:37:52.340Z · LW(p) · GW(p)

Is there a name for these sorts of errors of conjunction and disjunction in super high dimension parameter spaces? I usually just refer to it as 'cold reading yourself.'

Replies from: viking_math
comment by viking_math · 2024-02-22T18:12:51.379Z · LW(p) · GW(p)

https://arbital.com/p/multiple_stage_fallacy/

Replies from: romeostevensit
comment by romeostevensit · 2024-02-22T20:52:47.314Z · LW(p) · GW(p)

Interesting that conjunctive fallacy is a broadly used term but disjunctive fallacy is not.

Replies from: viking_math
comment by viking_math · 2024-02-23T02:12:32.444Z · LW(p) · GW(p)

What would the disjunctive fallacy be? Failing to account for the fact that P(A or B) >= max(P(A), P(B))?

Replies from: romeostevensit
comment by romeostevensit · 2024-02-23T02:45:55.189Z · LW(p) · GW(p)

assumption of independence
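
A small sketch of why the independence assumption matters, using made-up probabilities (every number here is a hypothetical chosen for illustration). If two observations tend to occur together, multiplying their marginal probabilities understates how likely the pair is.

```python
# Two observations, each with probability 0.1 under some hypothesis (made-up numbers).
p_a = 0.1
p_b = 0.1

# Treating them as independent: P(A and B) = P(A) * P(B).
p_both_independent = p_a * p_b

# If B almost always accompanies A (hypothetical P(B | A) = 0.9),
# the true joint probability is P(A) * P(B | A), which is much larger.
p_b_given_a = 0.9
p_both_dependent = p_a * p_b_given_a

print(p_both_independent, p_both_dependent)

# How much the independence assumption overstates the "coincidence".
print(p_both_dependent / p_both_independent)
```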

comment by avturchin · 2024-02-19T11:23:32.431Z · LW(p) · GW(p)

If I have an uninformed 1-to-1 prior on natural vs lab leak origin, and update on a 5 per cent coincidence that the origin place is near the lab, I will get around 95 per cent for lab leak.
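
The arithmetic behind that figure can be sketched in odds form. The function name is hypothetical, and the likelihood ratio of 20 rests on an added assumption that the coincidence has probability 1 under lab leak and 0.05 under natural origin; the alternative priors in the loop are also illustrative.

```python
def posterior_probability(prior_odds: float, likelihood_ratio: float) -> float:
    """Posterior probability from prior odds and a likelihood ratio (odds form of Bayes' rule)."""
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)

# Assumed: P(coincidence | lab leak) = 1 and P(coincidence | natural) = 0.05,
# so the likelihood ratio is 1 / 0.05 = 20.
lr = 1.0 / 0.05
print(round(posterior_probability(1.0, lr), 3))  # 0.952

# The same update starting from hypothetical, more skeptical prior odds.
for prior_odds in (1.0, 0.1, 0.01):
    print(prior_odds, round(posterior_probability(prior_odds, lr), 3))
```

The 95 per cent result depends entirely on the 1-to-1 prior and on accepting 20 as the likelihood ratio; with the same evidence and lower prior odds, the posterior drops accordingly.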

Replies from: Davidmanheim
comment by Davidmanheim · 2024-02-19T14:28:53.309Z · LW(p) · GW(p)

  1. What is the relevance of the "posterior" that you get after updating on a single claim that's being chosen, post-hoc, as the one that you want to use as an example?
  2. Using a weak prior biases towards thinking the information you have to update with is strong evidence. How did you decide on that particular prior? You should presumably have some reference class for your prior. (If you can't do that, you should at least have equipoise between all reasonable hypotheses being considered. Instead, you're updating "Yes Lableak" versus "No Lableak" - but in fact [LW · GW], "from a Bayesian perspective, you need an amount of evidence roughly equivalent to the complexity of the hypothesis just to locate the hypothesis in theory-space. It’s not a question of justifying anything to anyone.")
  3. How confident are you in your estimate of the bayes factor here? Do you have calibration data for roughly similar estimates you have made? Should you be adjusting for less than perfect confidence?

Replies from: avturchin
comment by avturchin · 2024-02-21T18:24:08.889Z · LW(p) · GW(p)

My point was that in some cases the update can be so strong that it overrides all reasonable uncertainties in priors and personal estimates. 

And exactly this makes Bayes' theorem a useful and strong instrument.

The fact that the virus was found 2 miles from the facility which was supposed to research them must make our bells ring.

To override this we need some mental equilibristics (I think of meme here but I don't want to be rude)

Replies from: Davidmanheim, viking_math
comment by Davidmanheim · 2024-02-22T15:26:35.129Z · LW(p) · GW(p)

To start, the claim that it was found 2 miles from the facility is an important mistake, because WIV is 8 miles from the market. For comparison to another city people might know better, in New York, that's the distance between World Trade Center and either Columbia University, or Newark Airport. Wuhan's downtown is around 16 miles across. 8 miles away just means it was in the same city. 

And you're over-reliant on the evidence you want to pay attention to. For example, even restricting ourselves to "nearby coincidence" evidence, the Huanan market is the largest in central China - so what are the odds that a natural spillover event occurs immediately surrounding the largest animal market? If the disease actually emerged from WIV, what are the odds that the cases centered around the Huanan market, 8 miles away, instead of the Baishazhou live animal market, 3 miles away, or the Dijiao market, also 8 miles away?

So I agree that an update can be that strong, but this one simply isn't.

Replies from: avturchin
comment by avturchin · 2024-02-23T12:30:18.222Z · LW(p) · GW(p)

Yes, my mistake on the distance. I confused it with the local CDC, which is about 600 meters from the market.

The place where most human cases are concentrated is the place where human-to-human transmission started - or there were multiple events of animal-to-human transmission in this place. The second thing would be surprising as if the virus can so often jump to humans from animals it will happen closer to its origin in Laos.

Alternative explanation is following: as the market is one of the most crowded place in the city (not sure, heard about it somewhere) it worked as an amplification of a single transmission event which could happen elsewhere. 

If we assume that a worker of WIV was infected at work, this will be completely unspectacular until he started infecting other people. Such person can commute all around the city including to CDC near wet market.

My point: 8 miles or 2 miles is not a big difference here, as the virus came to the market not by air but with a commuting person, and an 8-mile daily commute is pretty normal. The market being big is also not strong evidence, as the number of animals in smaller markets all over China will outweigh the number of animals in one big market.

Replies from: viking_math
comment by viking_math · 2024-02-23T16:48:42.038Z · LW(p) · GW(p)

The second thing would be surprising as if the virus can so often jump to humans from animals it will happen closer to its origin in Laos.

Spillover events probably did happen elsewhere, but not all spillover events lead to a pandemic, and covid is usually so mild that it's not surprising we can't find any such cases. (I also don't know if some final important mutation didn't happen until much closer to the actual pandemic start). 

Alternative explanation is following: as the market is one of the most crowded place in the city

This is discussed in the Rootclaim debate. There are many different types of places which served as superspreader events early on, the evidence we have shows the growth rate in the market was the same as outside of it, and overall growth didn't seem to slow down when they closed the market.

If we assume that a worker of WIV was infected at work, this will be completely unspectacular until he started infecting other people. Such person can commute all around the city including to CDC near wet market.

This is also addressed. It would be a fantastic coincidence--much stronger than the one you posited at the start of this thread--if the only place they brought the disease was one of only a handful of other places in the city that a pandemic could actually start. Like, if all the early cases clustered around the WIV, and I said that a HSM worker could have brought it to the lab, would anyone take that seriously? 

This, by the way, is exactly the kind of thing that annoys me and which is one of the main issues I made this thread to address. If you make enough favorable assumptions, you can make any hypothesis look good. This is clearly not the best explanation for the available evidence. Merely because you have successfully epicycled your way into a version of the theory which is not obviously impossible doesn't mean anyone has any reason to think it is even remotely likely. Your arguments aren't even consistent, as you seem surprised that there were no spillovers between Wuhan and Laos, but then don't seem at all skeptical of the idea that a sick person would commute all over the city and only bring it to 1 place. 

I mean, I could point out that the first non-Wuhan case was in Beijing on December 17th (I think, going off memory here) and that someone could have gotten sick in a different city, and then just hopped on a train and immediately went to the HSM, and the WIV isn't relevant at all. Is this story convincing? Is there any evidence to support it? Does it feel like I am engaging in truth-seeking, or just throwing shit at a wall and seeing what sticks so I can prop up my pet theory? 

Replies from: Richard_Kennaway, avturchin
comment by Richard_Kennaway · 2024-02-23T17:11:48.768Z · LW(p) · GW(p)

covid is usually so mild

It is mild now. It was not mild in the early stages. ICUs in many places were overwhelmed.

Replies from: viking_math
comment by viking_math · 2024-02-23T20:31:59.758Z · LW(p) · GW(p)

ICUs were overwhelmed because Covid spread so much. Its hospitalization rate is a few percent and its fatality rate is 1% or so. This is in contrast to diseases like SARS 1 (9.5% fatality rate) or MERS (34% fatality rate). Sure, it's not mild compared to seasonal flu, but it is much more mild than the obvious things you would compare it to.  

comment by avturchin · 2024-02-24T20:10:25.333Z · LW(p) · GW(p)

Thanks for explaining your position, which is interesting and consistent.

I can suggest that the connection between WIV and the wet market can be explained by the idea that some criminals sold lab animals from WIV at the wet market, e.g. bats.

Obviously this looks like an ad hoc theory. But the travel of the virus to the market from the Laos caves also seems to be tricky and may include some steps, like an intermediate carrier. Both look equally unlikely, yet one of them happened.

So my idea is to ignore all the details and small theories; instead, just update on the distances to the two possible origin points: 8 miles and 900 miles. This is a 100-fold difference, and if we count the areas, it is a 10000-fold difference. In the latter case we can make such a powerful update in the direction of WIV as the source that it overrides all other evidence.
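
For reference, the arithmetic behind the "100 times" and "10000 times" figures, using only the two distances quoted above (variable names are illustrative). This only checks the ratios; whether a ratio of distances or areas can be treated as a likelihood ratio is an assumption of the comment, not something the code establishes.

```python
# Distances quoted in the comment above, in miles.
near = 8.0
far = 900.0

# Ratio of the two distances.
distance_ratio = far / near

# Area scales with the square of distance, so the area ratio is the ratio squared.
area_ratio = distance_ratio ** 2

print(distance_ratio, area_ratio)  # 112.5 12656.25
```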

Replies from: viking_math
comment by viking_math · 2024-02-25T00:38:32.208Z · LW(p) · GW(p)

They're not equally unlikely. You haven't provided any actual evidence for this claim. 

Also, why on Earth would we just take the ratio of distances or areas as the probability factor? That's not how pandemics work. 

comment by viking_math · 2024-02-21T21:07:12.306Z · LW(p) · GW(p)

What facility? WIV and HSM are at least 6 miles apart as the crow flies, with a big river between that forces anyone traveling from one to the other to go even further than that. 

To override this we need some mental equilibristics (I think of meme here but I don't want to be rude)

No, you just need stronger evidence. 1/20 isn't that strong, especially for a complex situation with a high number of possible parameters to check.  
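
A rough sketch of why a 1/20 coincidence is weak when there are many things one could check. It assumes the checks are independent and each has the same 5% chance of looking suspicious by accident; both are simplifying assumptions, and the function name is hypothetical.

```python
def prob_at_least_one(p: float, n: int) -> float:
    """Chance that at least one of n independent checks shows a coincidence of probability p."""
    return 1.0 - (1.0 - p) ** n

# How the chance of finding some 1-in-20 coincidence grows with the number of checks.
for n in (1, 5, 20, 50):
    print(n, round(prob_at_least_one(0.05, n), 3))
```

Under these assumptions, with 20 possible parameters to look at, finding at least one 1-in-20 coincidence is more likely than not.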

comment by Mike P (mike-p) · 2024-03-26T08:33:30.912Z · LW(p) · GW(p)

***In fact, there appear to have been 2 separate spillover events. No early cases cluster around any other location, such as the WIV, so this already suspicious event essentially happened twice! ***

Note these claims have been seriously challenged in the past few months:

  1. Spatial statistics experts Stoyan and Chiu (2024) dispute the analysis that Huanan Seafood Market was necessarily the early epicenter. https://academic.oup.com/jrsssa/advance-article-abstract/doi/10.1093/jrsssa/qnad139/7557954

  2. Lv et al (2024) find the multiple spillover theory is unlikely. A single point of emergence is more likely with lineage A coming first. So market cases are not the primary cases (all market linked cases were lineage B). Their findings are consistent with Caraballo-Ortiz (2022), Bloom (2021). t.co/50kFV9zSb6

  3. Jesse Bloom showed again the available market samples don't support market origin. t.co/rorquFs1wm

  4. Michael Weissman uses a mathematical argument to show ascertainment bias in early case data. (George Gao, the Chinese CDC head at the time, acknowledged this to the BBC last year - they focused too much on and around the market and may have missed cases on the other side of the city).

arxiv.org/abs/2401.08680 (now published but paywalled https://academic.oup.com/jrsssa/advance-article/doi/10.1093/jrsssa/qnae021/7632556)

  5. The account that identified errors in Pekar et al., leading to an erratum last year, has found another significant error. Single spillover again looks more likely. t.co/GAPihZu51P

  6. Weissman's Bayesian analysis provides a thorough overview and is probably as good a case for lab origin as any. https://michaelweissman.substack.com/p/an-inconvenient-probability

Replies from: viking_math
comment by viking_math · 2024-03-27T05:31:57.212Z · LW(p) · GW(p)

Brand new account, reposting old arguments? Not suspicious at all. 

Stoyan and Chiu (2024)

"Just because the market was the epicenter doesn't mean the pandemic started there," while technically true, is fairly meaningless. If the center were at the lab every lab leak proponent would be shouting at the top of their lungs this conclusively proves the lab leak theory. Debating one particular statistical analysis doesn't disprove the very elementary technique of "look at the data, it's obvious" aka https://xkcd.com/2400/.

The multiple spillover theory might be wrong. But then again, so might all of the analyses that Roko cited in his initial post, including the paper about genetic engineering, the Richard Ebright tweet, the RTK estimates, etc. The point of that part was to show that it's very easy to generate high Bayes factors if you highball favorable pieces of information, ignore unfavorable ones, make convenient assumptions, and multiply numbers together. 

https://michaelweissman.substack.com/p/an-inconvenient-probability

This analysis is obviously heavily biased. No Bayes factor at all for the cases being at the market? Again, no LL supporter would seriously say the BF would be one if the cases were clustered near the WIV. This is the exact same sort of highly motivated reasoning that Rootclaim applied, and neither of the judges bought it, for the same reason. The CGG analysis is just wrong, etc.