comment by Spiracular · 2020-09-15T18:55:20.546Z
A Chinese virology researcher (Yan) released a report claiming that SARS-2 might be genetically manipulated after all. After assessing it, I'm not really convinced of the GMO claims, but the RaTG13 story definitely seems to have something weird going on.
Yan claims that the RaTG13 genome release was a cover-up (it does look like something's fishy with RaTG13, although it might be different than Yan thinks). Yan also claims ZC45 and/or ZXC21 was the actual backbone (I'm feeling super-skeptical of this bit, but it has been hard for me to confirm either way).
https://zenodo.org/record/4028830#.X2EJo5NKj0v (aka Yan Report)
RaTG13 Looks Fishy
Looks like something fishy happened with RaTG13, although I'm not convinced that genetic modification was involved. This is an argument built on pre-prints, but they appear to offer several different lines of evidence that something weird happened here.
Simplest story (via R&B): It looks like people first sequenced this virus in 2016, under the name "BtCoV/4991", using mine samples from 2013. And for some reason, WIV re-released the sequence as "RaTG13" at a later date?
(edit: I may have just had a misunderstanding. Maybe BtCoV/4991 is the name of the virus as sequenced from miner-lungs, and RaTG13 is the name of the virus as sequenced from floor droppings? But in that case, why is the "fecal" sample reading so weirdly low-bacteria? And they probably are embarrassed that it took them that long to sequence the fecal samples, and should be.)
A paper by Indian researchers Rahalkar and Bahulikar ( https://doi.org/10.20944/preprints202005.0322.v1 ) notes that BtCoV/4991, sequenced in 2016 by the same Wuhan Virology Institute researchers (and taken from 2013 samples of a mineshaft that gave miners deadly pneumonia), was very similar to, and likely the same as, RaTG13.
A preprint by Rahalkar and Bahulikar (R&B) ( doi: 10.20944/preprints202008.0205.v1 ) notes that the fraction of bacterial genomes in the RaTG13 "fecal" sample was ABSURDLY low ("only 0.7% in contrast to 70-90% abundance in other fecal swabs from bats"). Something's weird there.
A more recent weird datapoint: a pre-print Yan referenced ( https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7337384/ ) found (in its graphs; the wording left it unclear) that a RaTG13 protein didn't competently bind their bat ACE2 samples, but did bind their rat, mouse, human, and pig ACE2. It's supposedly a horseshoe bat virus (sequenced by the Wuhan lab), so this seems hecka fishy to me.
(Sure, their bat samples weren't precisely the same species, but they tried 2 species from the same genus. SARS-2 DID bind for their R. macrotis bat sample, so it seems extra-fishy to me that RaTG13 didn't.)
((...oh. According to the R&B paper about the mineshaft, it was FILTHY with rats, bats, poop, and fungus. And the CoV genome showed up in only one of ~280 samples taken. If it's like that, who the hell knows if it came from a rat or a bat?))
At this point, RaTG13 is genuinely looking pretty fishy to me. It might actually take evidence of a conspiracy theory in the other direction for me to go back to neutral on that.
E-Protein Similarity? Meh.
I'm not finding the Protein-E sequence similarity super-convincing in itself, because while the logic is fine, it's very multiple-hypothesis-testing flavored.
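To make the multiple-hypothesis-testing worry concrete, here is a rough sketch. The numbers are purely illustrative and not taken from any of the papers; the independence of comparisons is an assumption, and `prob_at_least_one_match` is a made-up helper name.

```python
def prob_at_least_one_match(p_single: float, n_comparisons: int) -> float:
    """Chance that at least one of n independent comparisons
    turns up a striking match purely by coincidence."""
    return 1.0 - (1.0 - p_single) ** n_comparisons


# Illustrative only: if any single protein-vs-virus comparison had a 5%
# chance of looking "suspiciously identical" by coincidence, then checking
# 20 such pairs would make a coincidental hit more likely than not.
print(prob_at_least_one_match(0.05, 1))
print(prob_at_least_one_match(0.05, 20))
```

The point is only that a single striking match counts for less when nobody reports how many protein/virus pairs were checked before finding it.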
I'm still looking into the ZC45 / ZXC21 claim, which I'm currently feeling skeptical of. Here's the paper that characterized those: doi: 10.1038/s41426-018-0155-5 . It's true that it was by people working at "Research Institute for Medicine of Nanjing Command." However, someone on Twitter used BLAST on the E-protein sequence, and found a giant pile of different highly-related SARS-like coronaviruses. I'm trying to replicate that analysis using BLAST myself, and at a skim the 100% results are all more SARS-CoV-2, and the close (95%) results are damned diverse. ...I don't see ZC in them; it looks like it wasn't uploaded. Ugh. (The E-protein is only 75 amino acids long anyway. https://www.ncbi.nlm.nih.gov/protein/QIH45055.1 )
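For a sense of how coarse percent-identity gets on a protein this short, here is a minimal sketch. It assumes an ungapped, equal-length comparison (real BLAST identity is computed over the aligned region and can include gaps), and the sequences are made-up placeholders, not the real E protein.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions that match between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this sketch only handles equal-length, ungapped sequences")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def max_mismatches(length: int, min_identity_pct: int) -> int:
    """Largest number of mismatches that keeps identity at or above the threshold."""
    return length * (100 - min_identity_pct) // 100


E_LENGTH = 75  # length of the E protein, per the comment above

# Placeholder sequences: all-alanine vs. the same with one residue changed.
toy_a = "A" * E_LENGTH
toy_b = "A" * (E_LENGTH - 1) + "G"

print(percent_identity(toy_a, toy_b))   # identity after a single substitution
print(max_mismatches(E_LENGTH, 95))     # mismatches allowed at >= 95% identity
```

Under these assumptions, a "95%" hit on a 75-residue protein differs by at most 3 residues, so many distinct viruses can crowd into that bucket.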
A different paper mentions extreme S2-protein similarity of early COVID-19 to ZC45, but that protein is highly conserved. That makes this a less surprising or meaningful result. (E was claimed to be fast-evolving, so its being identical would have been more surprising, but I couldn't confirm it.) https://doi.org/10.1080/22221751.2020.1719902
I think Yan offers a reasonable argument that a method could have been used that avoids obvious genetic-modification "stitches," instead using methods that are hard to distinguish from natural recombination events (ex: recombination in yeast). Sounds totally possible to me.
The fact that the early SARS-CoV-2 samples were already quite adapted to human ACE2, and didn't show the rapid evolution you'd expect from a fresh zoonotic infection, is something a friend of mine had previously noted, probably after reading the following paper (recommended): https://www.biorxiv.org/content/10.1101/2020.05.01.073262v1 (Zhan, Deverman, Chan). This fact does seem fishy, and had already pushed me a bit towards the "Wuhan lab adaptation & escape" theory.