In general, I would go with the first one. The first theory has proved itself on test data, while all the second has done is fit a model to data it was already given. There is always a risk that the second theory has overfitted, producing a model with no worthwhile generalization accuracy. Even if the second theory is simpler than the first, the fact that the first has been confirmed on unseen data makes it a slam-dunk winner. Of course, further experiments may cause us to update our beliefs, particularly if theory 2 proves just as accurate on new data.
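To make the overfitting risk concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the ground truth `y = 2x + 1`, the alternating noise, and the two stand-in "theories" (`fit_line` and `fit_memorizer` are hypothetical helpers, not anything from the question). The memorizer is a deliberately extreme case of a model that fits the given data perfectly but has nothing useful to say about unseen points.

```python
# Illustrative sketch only: the data and both "theories" are hypothetical.

def mse(predict, xs, ys):
    """Mean squared error of a prediction function on the points (xs, ys)."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns a prediction function."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return lambda x: slope * x + intercept

def fit_memorizer(xs, ys):
    """Lookup table over the given data; falls back to the mean for unseen x."""
    table = dict(zip(xs, ys))
    fallback = sum(ys) / len(ys)
    return lambda x: table.get(x, fallback)

# Assumed ground truth y = 2x + 1, observed with alternating +/-0.5 noise.
train_x = [0, 1, 2, 3, 4, 5, 6, 7]
train_y = [2 * x + 1 + (0.5 if x % 2 == 0 else -0.5) for x in train_x]

# Unseen points that neither model was fitted on (noise-free for simplicity).
test_x = [0.5, 2.5, 4.5, 6.5]
test_y = [2 * x + 1 for x in test_x]

line = fit_line(train_x, train_y)
memorizer = fit_memorizer(train_x, train_y)

line_train = mse(line, train_x, train_y)
line_test = mse(line, test_x, test_y)
memorizer_train = mse(memorizer, train_x, train_y)
memorizer_test = mse(memorizer, test_x, test_y)

print(f"line:      train MSE = {line_train:.4f}, test MSE = {line_test:.4f}")
print(f"memorizer: train MSE = {memorizer_train:.4f}, test MSE = {memorizer_test:.4f}")
```

In this toy setup the memorizer has the better fit on the given data but does far worse on the unseen points, which is why a good fit alone says little about a theory until it has been checked on data it was not built from.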