In physics, we can try to reason about black holes and the Big Bang by inserting extreme values into the equations we know as the laws of physics, laws we derived from observing less extreme phenomena. Would this also be ‘a fictional-world-building exercise’ to you?
Reasoning about AGI is similar to reasoning about black holes: neither necessarily leads to pseudo-science, though both attract a lot of fringe thinkers, and not all of them reason robustly all of the time.
In the AGI case, the extreme-value math can be fairly trivial, if you want it to be. One approach is to take the optimal policy π∗ defined by a normal MDP model, and assume that the AGI has found it and is following it. If so, what unsafe phenomena might we predict? What mechanisms could we build to suppress them?
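To make this concrete, here is a minimal sketch of that approach on a toy MDP. Everything in it is a hypothetical illustration that I made up for this example: the two states, the two actions, the rewards, and the discount factor are assumptions, not a model of any real system. The code computes π∗ by plain value iteration and then simply reads off what the optimal policy does.

```python
# Toy illustration only: states, actions, rewards, and GAMMA are all
# hypothetical values chosen for this sketch.
GAMMA = 0.9  # assumed discount factor

STATES = ["supervised", "unsupervised"]
ACTIONS = ["work", "disable_oversight"]

# Deterministic transitions: (state, action) -> (next_state, reward)
MDP = {
    ("supervised", "work"): ("supervised", 1.0),
    ("supervised", "disable_oversight"): ("unsupervised", 0.0),
    ("unsupervised", "work"): ("unsupervised", 2.0),
    ("unsupervised", "disable_oversight"): ("unsupervised", 0.0),
}


def q_value(values, state, action):
    """One-step lookahead: immediate reward plus discounted next-state value."""
    next_state, reward = MDP[(state, action)]
    return reward + GAMMA * values[next_state]


def value_iteration(tol=1e-9):
    """Apply the Bellman optimality update until the values stop changing."""
    values = {s: 0.0 for s in STATES}
    while True:
        new_values = {
            s: max(q_value(values, s, a) for a in ACTIONS) for s in STATES
        }
        if max(abs(new_values[s] - values[s]) for s in STATES) < tol:
            return new_values
        values = new_values


def optimal_policy(values):
    """Greedy policy with respect to the converged values, i.e. pi*."""
    return {s: max(ACTIONS, key=lambda a: q_value(values, s, a)) for s in STATES}


values = value_iteration()
policy = optimal_policy(values)
print(policy)
```

In this toy model the optimal policy chooses `disable_oversight` in the `supervised` state, because the discounted return from working unsupervised (about 18 from that state) beats the return from working under supervision forever (10). That is the kind of prediction the extreme-value approach yields: no claim about how a real system is built, only about what an agent that has found π∗ in this model would do.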