the number of queries to the model / specification required to obtain worst-case guarantees is orders of magnitude more than the number of queries needed to train the model, and this ratio gets worse the more complex your environment is
Not clear to me whether this is true in general—if the property you are specifying is in some sense “easy” to satisfy (e.g. it holds of the random model, holds for some model near any given model), and the behavior you are training is “hard” (e.g. requires almost all of the model’s capacity) then it seems possible that verification won’t add too much.
Not clear to me whether this is true in general—if the property you are specifying is in some sense “easy” to satisfy (e.g. it holds of the random model, holds for some model near any given model), and the behavior you are training is “hard” (e.g. requires almost all of the model’s capacity) then it seems possible that verification won’t add too much.
Yeah, I agree it is not always true and so isn’t a clear obstruction.