Lo and behold, this poor choice of ontology doesn’t work very well: the modeler needs a huge amount of complexity to represent the real-world system decently in it. For instance, maybe they need a ridiculously large decision tree or random forest to represent a neural net to decent precision.
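As a toy illustration of that blow-up (my own sketch, with a random two-layer ReLU network standing in for the “real-world system”), you can watch how fast a decision tree’s leaf count has to grow as you demand more precision:

```python
# Illustrative sketch, not from the thread: approximate a random ReLU
# network with decision trees of increasing depth, and compare how the
# leaf count grows against how slowly the training error shrinks.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(50_000, 8))

# A stand-in "neural net": two random ReLU layers built directly in numpy,
# so the target function is genuinely wiggly rather than near-constant.
W1 = rng.normal(size=(8, 64))
W2 = rng.normal(size=(64, 64))
w3 = rng.normal(size=64)
y = np.maximum(np.maximum(X @ W1, 0) @ W2, 0) @ w3
y /= y.std()  # normalize so the MSE numbers are comparable across runs

for depth in (4, 8, 12, 16):
    tree = DecisionTreeRegressor(max_depth=depth).fit(X, y)
    mse = np.mean((tree.predict(X) - y) ** 2)
    print(f"depth={depth:2d}  leaves={tree.get_n_leaves():6d}  mse={mse:.3f}")
```

The leaf count roughly doubles with each extra level of depth, while the error falls off much more slowly, which is the “huge amount of complexity for decent precision” failure mode in miniature.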
That can happen because your choice of ontology was bad, but it can also be the case that representing the real-world system with “decent” precision in any ontology requires a ridiculously large model. Concretely, I expect this is true of human language: for the Hutter Prize, I don’t expect it’s possible to get a lossless compression ratio better than 0.08 on enwik9, no matter what ontology you choose.
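For reference, the ratio here is just compressed bytes over original bytes (the actual prize also counts the size of the decompressor, which this ignores). A crude baseline sketch using Python’s built-in compressors, assuming a local copy of enwik9 (the filename is hypothetical; enwik9 itself is 10^9 bytes):

```python
# Crude baseline sketch: lossless compression ratio of a file using
# off-the-shelf general-purpose compressors. Note: compressing 1 GB
# in memory this way is slow and RAM-hungry; it's just for illustration.
import bz2
import lzma
from pathlib import Path

def ratio(data: bytes, compress) -> float:
    """Compressed size divided by original size (smaller is better)."""
    return len(compress(data)) / len(data)

data = Path("enwik9").read_bytes()  # assumes a local copy of enwik9
print(f"bz2:  {ratio(data, bz2.compress):.4f}")
print(f"lzma: {ratio(data, lzma.compress):.4f}")
```

General-purpose compressors typically land somewhere around 0.2–0.25 on this file, and even the dedicated prize entries sit well above 0.08, which is what makes that number a plausible floor.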
It would be nice, though, if we had a better way of distinguishing “intrinsically complex domain” from “skill issue” than “have a bunch of people dedicate years of their lives to trying a bunch of different approaches.”