I was thinking along similar lines:

Given that computation is costly and memory is limited, making the best possible predictions with fixed resources requires using the computationally least expensive method.

Assuming that building a mathematical model is (at least on average) harder for more complex theories, wasting time constructing models that are ultimately equivalent, but forced to incorporate epiphenomenal concepts, leads to predictions that are worse in practice.

So rejecting the strong Occam's razor would lead to worse results.

And since we bring our moral concerns along with us: not using the best available method would even be morally bad, because with our limited resources we could not look as far into the future and would have less accurate predictions at our disposal, losing information we need to optimize our moral behavior.
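The budget argument above can be sketched as a toy example. Everything here is an illustrative assumption (unit costs per update, the particular epiphenomenal variable), not part of the original argument: two models make identical predictions, but one also maintains a variable that never affects its output, so under the same computation budget it can predict fewer steps into the future.

```python
def horizon_simple(budget):
    """Prediction steps a minimal model completes within the budget."""
    x, spent, steps = 0, 0, 0
    while spent + 1 <= budget:   # each prediction update costs 1 unit
        x += 1                   # the actual prediction update
        spent += 1
        steps += 1
    return steps

def horizon_bloated(budget):
    """Same predictions, but with epiphenomenal bookkeeping (cost 2/step)."""
    x, epi, spent, steps = 0, 1, 0, 0
    while spent + 2 <= budget:
        epi = (epi * 3) % 7      # epiphenomenal variable: never affects x
        x += 1                   # identical prediction update
        spent += 2
        steps += 1
    return steps

print(horizon_simple(100))   # 100 steps of lookahead
print(horizon_bloated(100))  # only 50 steps for the same budget
```

With the same budget the bloated model sees only half as far ahead, which is the practical sense in which carrying epiphenomenal concepts degrades prediction.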
ETA: The main difference from your post above is that this still holds for a perfect Bayesian superintelligence, and should be invariant across different computational substrates.