I wonder if this can’t be considered more pragmatically? There was a passage in the Logic entry of the MIT Encyclopedia of the Cognitive Sciences that seems relevant:
Johnson-Laird and Byrne (1991) have argued that postulating more imagelike MENTAL MODELS makes better predictions about the way people actually reason. Their proposal, applied to our sample argument, might well help to explain the difference in difficulty in the various inferences mentioned earlier, because it is easier to visualize “some people” and “at least three people” than it is to visualize “most people.” Cognitive scientists have recently been exploring computational models of reasoning with diagrams. Logicians, with the notable exceptions of Euler, Venn, and Peirce, have until the past decade paid scant attention to spatial forms of representation, but this is beginning to change (Hammer 1995).
This made me think a bit differently about how we might choose between two abstract models with the same explanatory power. It seems that the rational thing to do is to choose the one that allows you to reason the most fluently so as to minimize the likelihood of fallacious reasoning.
In fact, it seems that we should expect the cognitive sciences to provide clues about how we could adjust formal systems with a view to ease of understanding and technical fluency when reasoning about and with them.
Taking this view, if we assume we had finished physics, all future work would be about tweaking the formalisms toward the most intuitive ones possible, given the knowledge we have of human reasoning. What would be important is that they be as easy to understand as possible. That way we could hope to ensure more efficiency in technological development as well as better general understanding among the public.
I was thinking along similar lines:
Given that computation has costs and memory is limited, making the best possible predictions with a given amount of resources requires using the computationally least expensive method.
Assuming that building a mathematical model is (at least on average) harder for more complex theories, wasting time constructing (ultimately equivalent) models that have to incorporate epiphenomenal concepts leads to practically worse predictions.
So not using the strong Occam’s razor would lead to worse results.
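To make the resource point concrete, here is a minimal sketch (the models, function names, and numbers are mine, purely illustrative): two predictively equivalent models, one of which drags along an epiphenomenal variable that has to be updated on every step but never affects the observable output. Both give identical predictions, yet the more complex formulation costs more per prediction, so a bounded reasoner gets less out of the same compute budget.

```python
import timeit

# Toy illustration (hypothetical; not from the comment above): two models that
# agree on every observable prediction. The second additionally tracks an
# "epiphenomenal" hidden quantity that must be updated on each step but never
# influences the prediction itself.

def predict_simple(x):
    """Minimal model: the observable simply evolves as 2*x + 1."""
    return 2 * x + 1

def predict_with_epiphenomenon(x, hidden=0.0):
    """Equivalent model that carries extra bookkeeping with no predictive payoff."""
    hidden = (hidden + x) ** 2 % 97  # epiphenomenal update: costs work, changes nothing observable
    return 2 * x + 1, hidden

# Predictive equivalence: both models make the same observable prediction.
assert all(predict_simple(x) == predict_with_epiphenomenon(x)[0] for x in range(100))

# Resource difference: the formulation with the epiphenomenal variable is
# strictly more expensive per prediction, so with a fixed compute budget the
# simpler formulation yields more (or faster) predictions.
t_simple = timeit.timeit(lambda: [predict_simple(x) for x in range(1000)], number=200)
t_complex = timeit.timeit(lambda: [predict_with_epiphenomenon(x) for x in range(1000)], number=200)
print(f"simple model:       {t_simple:.4f} s")
print(f"with epiphenomenon: {t_complex:.4f} s")
```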
And because we are taking moral issues with us: not using the best possible method would even be morally bad, since we would lose important information for optimizing our moral behavior; with our limited resources we could not look as far into the future and would have less accurate predictions at our disposal.
ETA: The main difference from your post above is that this still holds true for a perfect Bayesian superintelligence, and should be invariant across different computational substrates.