How can I reconcile these COVID test false-negative numbers?

post by Optimization Process · 2020-10-27T23:56:44.544Z · LW · GW · 5 comments

[effort level: thinking out loud, plus a couple hours' googling]

There's been a lot of press lately around Costco selling an “AZOVA” at-home COVID test...

...with a sensitivity of 98% (meaning 98% of positive tests are correct) and a specificity of 99% (meaning 99% of negative tests are correct).

(IIUC, they're getting their terms wrong here: "sensitivity" means "P(positive test | sick)", not "P(sick | positive test)" as their parenthetical claims. Same flip for "specificity." I'd guess that they mean "P(positive test | sick)", and that some copywriter mis-translated, but not sure.)

That is, they claim a false negative rate of 2%.

Compare that to this study of RT-PCR COVID test false negative rates:

Over the 4 days of infection before the typical time of symptom onset (day 5), the probability of a false-negative result in an infected person decreases from 100% (95% CI, 100% to 100%) on day 1 to 67% (CI, 27% to 94%) on day 4. On the day of symptom onset, the median false-negative rate was 38% (CI, 18% to 65%). This decreased to 20% (CI, 12% to 30%) on day 8 (3 days after symptom onset) then began to increase again, from 21% (CI, 13% to 31%) on day 9 to 66% (CI, 54% to 77%) on day 21.

That is, they claim false negative rates ten times higher than AZOVA's, even if you nail the timing.

How can these false negative rates be so different?

Hypothesis 1: the study with the >20% false negative rates was from April-May, and the state of the art has moved on since then.

(Counterpoint: in five months, we reduced false negatives by a factor of ten? Seems unlikely.)

Hypothesis 2: AZOVA doesn't actually mean "sensitivity" i.e. "P(positive test | sick)", they truly mean "P(sick | positive test) = 98%" -- which might be achievable through some clever definition of base rates.

(Counterpoint: I think this would have to drive their "P(healthy | negative test)" numbers into the toilet.)
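To make that counterpoint concrete, here is a rough sketch of how the base rate trades "P(sick | positive test)" against "P(healthy | negative test)" under Bayes' rule. The inputs are assumptions for illustration only: 80% sensitivity (the study's best-case 20% false negative rate), 99% specificity (AZOVA's advertised figure), and a few made-up prevalence values. The function name `predictive_values` is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (P(sick | positive test), P(healthy | negative test))."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Assumed inputs, not measured values: 80% sensitivity, 99% specificity,
# and illustrative prevalences among the people being tested.
for prevalence in (0.01, 0.10, 0.50):
    ppv, npv = predictive_values(0.80, 0.99, prevalence)
    print(f"prevalence={prevalence:.0%}  "
          f"P(sick|+)={ppv:.1%}  P(healthy|-)={npv:.1%}")
```

Under these assumptions, pushing P(sick | positive test) up to roughly 98% takes a tested population that is about half infected, at which point P(healthy | negative test) falls to roughly 83%.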
Hypothesis 3: AZOVA's "98%" and "99%" are just benchmarked against some other test -- so "P(positive test | sick)" should actually read "P(positive AZOVA test | positive gold standard test)" -- which just means that their test is about as good as the gold standard.

(Counterpoint: in the "Contrived Clinical Study" section of their Emergency Use Authorization summary, they build "known positive" / "known negative" samples by spiking negative samples with viral RNA, and their test gets every one(!) correct (n=31 for +, n=11 for -). Naively, this seems hard to fake, unless their spiking concentration is ridiculously high, which I don't think it is. (Their spiking concentrations are on the order of their "LoD" = "Limit of Detection" = "10 copies/μL" -- for comparison, at least after symptom onset, saliva carries thousands of copies/μL [1, 2].))

None of these hypotheses seems to hold water. I'm inclined to think that the pessimistic study is closer to the true false negative rate, since those authors aren't trying to sell me something, but I'm still distressed that I can't see how AZOVA is (presumably) tricking me.

5 comments

Comments sorted by top scores.

comment by Optimization Process · 2020-10-27T23:59:02.218Z · LW(p) · GW(p)

Further point of confusion: the Emergency Use Authorization summary mentions n=31 positive samples and n=11 negative samples in the "Analytical Specificity" section -- how do you get "98%" or "99%" out of those sample sizes? Shouldn't you need at least n=50 to get 98%? Heck, why do they have any ~~positive~~ (edit: negative) samples in a "Specificity" section?
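A quick sketch of the arithmetic behind that sample-size worry. The first function finds the smallest n at which a single miss still leaves the observed rate at or above a target, which is where "n=50 for 98%" comes from. The second gives a one-sided exact (Clopper-Pearson) lower confidence bound when every sample is called correctly; the 95% confidence level is my assumption, and both function names are hypothetical.

```python
from fractions import Fraction


def smallest_n_with_one_error(rate):
    """Smallest n such that (n - 1) / n >= rate, using exact arithmetic."""
    n = 1
    while Fraction(n - 1, n) < rate:
        n += 1
    return n


def lower_bound_all_correct(n, confidence=0.95):
    """One-sided exact lower bound on the true rate when all n samples
    are correct: solve p**n = alpha for p."""
    alpha = 1 - confidence
    return alpha ** (1 / n)


print(smallest_n_with_one_error(Fraction(98, 100)))
for n in (31, 11):
    print(f"n={n}: true rate is at least {lower_bound_all_correct(n):.1%} "
          f"(95% one-sided)")
```

With 31 of 31 correct, the bound comes out near 91%, and with 11 of 11 near 76%, so samples of this size cannot by themselves support a figure as tight as 98% or 99%.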

Replies from: Teerth Aloke

↑ comment by Teerth Aloke · 2020-10-28T06:48:38.101Z · LW(p) · GW(p)

Please follow up on what you find.

comment by atlas · 2020-10-28T13:36:22.216Z · LW(p) · GW(p)

Yeah, this has been confusing me as well. There's an antigen test that Roche claims has 96.52% sensitivity (so <4% false negatives), which seems both surprisingly high (since even PCR tests seem to have far lower sensitivity, as per the study you linked) and suspiciously precise.

Replies from: danohu

↑ comment by danohu · 2020-10-31T10:35:28.509Z · LW(p) · GW(p)

With antigen tests, I believe the figures are "P(positive test | positive gold standard test)".

Look at the documentation for another antigen test -- section 14 gives sensitivity compared to PCR. The figure they promote (which is also used, e.g., for the German government approval) is 97.56% sensitivity. That figure is both measured against PCR and restricted to specimens with a high viral load (Ct 20-30).

BTW, that German site is a good collection of documentation links for all the antigen tests currently approved.
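A minimal sketch of what "sensitivity compared to PCR" implies for sensitivity against actual infection. It rests on a simplifying assumption that is mine, not the manufacturers': the antigen test never catches an infection that PCR misses, so the two probabilities multiply. The 80% PCR sensitivity is the best-case figure from the study quoted in the post, and the function name is hypothetical.

```python
def sensitivity_vs_infection(sens_vs_reference, reference_sensitivity):
    """P(positive test | infected), assuming the test is positive only
    when the reference test would also be positive (simplification)."""
    return sens_vs_reference * reference_sensitivity


# 97.56% agreement with PCR-positive specimens, times an assumed 80%
# PCR sensitivity (best case in the study quoted in the post).
overall = sensitivity_vs_infection(0.9756, 0.80)
print(f"sensitivity vs. infection: {overall:.1%}")
print(f"false negative rate: {1 - overall:.1%}")
```

On this reading, a headline sensitivity near 98% is compatible with a false negative rate above 20% against true infection, and the high-viral-load filter would widen the gap further.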

Replies from: atlas

↑ comment by atlas · 2020-10-31T22:21:00.383Z · LW(p) · GW(p)

Ah, gotcha, that makes more sense. And thanks for the awesome antigen test table you linked there!