How can I reconcile these COVID test false-negative numbers?

[effort level: thinking out loud, plus a couple hours’ googling]

There’s been a lot of press lately around Costco selling an “AZOVA” at-home COVID test...

...with a sensitivity of 98% (meaning 98% of positive tests are correct) and a specificity of 99% (meaning 99% of negative tests are correct).

(IIUC, they’re getting their terms wrong here: “sensitivity” means “P(positive test | sick)”, not “P(sick | positive test)” as their parenthetical claims. Same flip for “specificity.” I’d guess that they mean “P(positive test | sick)”, and that some copywriter mis-translated, but not sure.)
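
To make the distinction concrete, here is a minimal Python sketch that computes all four quantities from confusion-matrix counts. The cohort (100 sick, 9,900 healthy) and the function name `rates` are made up for illustration; only the 98% and 99% figures come from AZOVA's claim.

```python
def rates(tp, fn, fp, tn):
    """Return the four conditional probabilities from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # P(positive test | sick)
        "specificity": tn / (tn + fp),  # P(negative test | healthy)
        "ppv": tp / (tp + fp),          # P(sick | positive test)
        "npv": tn / (tn + fn),          # P(healthy | negative test)
    }


# Hypothetical cohort (an assumption, not data): 100 sick and 9,900 healthy
# people, tested with a 98%-sensitive, 99%-specific test.
r = rates(tp=98, fn=2, fp=99, tn=9801)

for name, value in r.items():
    print(f"{name}: {value:.3f}")
```

In this made-up cohort the sensitivity is 98%, but only about half of the positive results belong to sick people, so the two readings of "98%" are very different claims.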

That is, they claim a false negative rate of 2%.

Compare that to this study of RT-PCR COVID test false negative rates:

Over the 4 days of infection before the typical time of symptom onset (day 5), the probability of a false-negative result in an infected person decreases from 100% (95% CI, 100% to 100%) on day 1 to 67% (CI, 27% to 94%) on day 4. On the day of symptom onset, the median false-negative rate was 38% (CI, 18% to 65%). This decreased to 20% (CI, 12% to 30%) on day 8 (3 days after symptom onset) then began to increase again, from 21% (CI, 13% to 31%) on day 9 to 66% (CI, 54% to 77%) on day 21.

That is, the study’s best-case false negative rate (20%, on day 8) is ten times AZOVA’s claimed 2%, even if you nail the timing.

How can these false negative rates be so different?

Hypothesis 1: the study with the >20% false negative rates was from April-May, and the state of the art has moved on since then.

(Counterpoint: in five months, we reduced false negatives by a factor of ten? Seems unlikely.)

Hypothesis 2: AZOVA doesn’t actually mean “sensitivity”, i.e. “P(positive test | sick)”; they truly mean “P(sick | positive test) = 98%”, which might be achievable through some clever choice of base rates.

(Counterpoint: I think this would have to drive their “P(healthy | negative test)” numbers into the toilet.)
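
One way to check this counterpoint is to apply Bayes' rule directly. The sketch below assumes a true sensitivity of 80% (the study's best case, a 20% false negative rate) and a specificity of 99%. Both are assumptions for illustration, as are the function name `ppv_npv` and the prevalence values.

```python
def ppv_npv(sensitivity, specificity, prevalence):
    """Bayes' rule: return (P(sick | positive), P(healthy | negative))."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Assumed test characteristics (not AZOVA's published numbers).
ASSUMED_SENSITIVITY = 0.80
ASSUMED_SPECIFICITY = 0.99

# Hypothetical base rates among the people being tested.
for prevalence in (0.01, 0.05, 0.20, 0.38):
    ppv, npv = ppv_npv(ASSUMED_SENSITIVITY, ASSUMED_SPECIFICITY, prevalence)
    print(f"prevalence={prevalence:.2f}  "
          f"P(sick|+)={ppv:.3f}  P(healthy|-)={npv:.3f}")
```

Under these assumptions, P(sick | positive test) only reaches 98% once roughly 38% of the people tested are infected, and at that point P(healthy | negative test) has fallen to about 89%. A different assumed specificity would shift these numbers.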

Hypothesis 3: AZOVA’s “98%” and “99%” are just benchmarked against some other test, so “P(positive test | sick)” should actually read “P(positive AZOVA test | positive gold standard test)”, which just means that their test is about as good as the gold standard.

(Counterpoint: in the “Contrived Clinical Study” section of their Emergency Use Authorization summary, they build “known positive” / “known negative” samples by spiking negative samples with viral RNA, and their test gets every one(!) correct (n=31 positive, n=11 negative). Naively, this seems hard to fake, unless their spiking concentration is ridiculously high, which I don’t think it is: their spiking concentrations are on the order of their “LoD” (“Limit of Detection”) of 10 copies/μL, whereas, at least after symptom onset, saliva carries thousands of copies/μL [1, 2].)
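
It is also worth asking how much a perfect score on samples this small can show. Here is a minimal sketch, assuming the exact (Clopper-Pearson) two-sided binomial interval, whose lower bound reduces to (alpha/2)^(1/n) when all n samples are called correctly. The function name is hypothetical; the sample sizes are the ones quoted above.

```python
def lower_bound_all_correct(n, confidence=0.95):
    """Exact (Clopper-Pearson) lower confidence bound on the true success
    rate when all n of n trials succeeded."""
    alpha = 1 - confidence
    return (alpha / 2) ** (1 / n)


# Sample sizes from the contrived study as quoted in the text.
sens_lower = lower_bound_all_correct(31)  # 31 of 31 positives detected
spec_lower = lower_bound_all_correct(11)  # 11 of 11 negatives cleared

print(f"sensitivity lower bound: {sens_lower:.3f}")
print(f"specificity lower bound: {spec_lower:.3f}")
```

So 31 out of 31 is statistically compatible with a sensitivity of roughly 89% or higher, and 11 out of 11 with a specificity of roughly 72% or higher. That is a wide range, though still not wide enough on its own to cover a 20% false negative rate.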

None of these hypotheses seems to hold water. I’m inclined to think that the pessimistic study is closer to the true false negative rate, since those authors aren’t trying to sell me something, but I’m still distressed that I can’t see how AZOVA is (presumably) tricking me.