data scientists / statisticians mostly need access to computing power, which is fairly cheap these days.
This is true for each marginal data scientist. But there’s a catch: those folks need data. Collecting and promulgating that data, in the application domains we care about, can sometimes be very costly. You might want to count some of those costs as part of the cost of the data science.
For example, many countries are spending a huge amount of money on electronic health records, in part to allow better data mining. The health records aren’t primarily for scientific purposes, but making them researcher-friendly is a big indirect cost. Similarly, the census is a very expensive data-collection process that enables a lot of “cheap” analytics downstream.
While each data scientist might be cheap, there was a big up-front investment, at the national level, to enable them.