With the scaling in compute, it will not take long until small groups or even a single individual can train or fine-tune an open-source model to reach o1's level (and beyond). So I am wondering about the data. Does o1's training set in these subjects, for instance, contain data that is very hard to come by, or is it mostly publicly available data? If it is the former, the limiting factor is access to data, and it should be reasonably easy to contain the risks. If it is the latter… Oh boy...