I can’t speak to San Francisco specifically. But if it’s anything like many other locations in the US, the problem isn’t malice or indifference: it’s that generating this data is vastly harder than you realize. The politicians get the data at the same time you do: as soon as it’s ready.
Here’s one tiny true example, from one part of the pipeline in one particular location. A substantial amount of data enters the system as faxes. The faxes go to a room full of National Guard members, who manually enter the data into computers, where it begins a complicated process of validation and de-duplication before it enters the main pipeline. You can imagine that this system doesn’t scale particularly well as case counts rise.
At a broad scale, what’s happening is that an immense amount of data is trying to enter a legacy system that was designed for less than one percent of its current load. Some of the data comes from sleek modern hospitals with state-of-the-art medical informatics systems. And some comes from computer-illiterate rural doctors, and some comes from nursing homes that had never reported lab results before Covid, and some comes from employers who test their employees, and some comes from private labs, and some comes from sovereign tribes that have complicated data-sharing agreements with the state, and...
If I can find the time, I might write a post explaining in more detail how surveillance data is generated and processed. But for now, I assure you this problem is incredibly hard. Update: here’s the post
Important disclaimer: my opinions are mine alone and I don’t speak for any government agency.
I understand that data collection is difficult and empathize with the people responsible for doing the work.
The thing is, SF used to publish everything as soon as they could! We accepted that numbers could be revised up or down as data was fully coded. This 5-day lag is IMO far on the wrong side of the timeliness vs. correctness tradeoff.