...trying to switch to a QM model where location was ill-defined would be a very bad thing for the goal of surviving
This only applies because switching to a QM model is computationally prohibitive. QM is generally held to be a more accurate theory than CM, and even if you're trying to optimize for things defined in terms of CM, you're still better off using the QM model as long as you have a good mapping from your QM model to your CM goals.
Humans do indeed find it difficult to think in terms of QM, but this need not be the case for a future AI with access to a quantum computer. If the CM model and the QM model could be run with similar efficiency, the real issue becomes the mapping from the QM model to the CM goals. Every map from QM to CM leaks with respect to what counts as being located inside the box, so the AI might find ways to act outside the box according to a different mapping. This highlights the point that, with computational resources being equal, the AI will always prefer the most general available world model for decision making, even if its goals are defined in terms of a less general model.
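To make the leaky-mapping point concrete, here is a toy sketch. Everything in it is made up for illustration (there is no real physics or agent API here); the point is only that a goal stated at the CM level can only be checked against the QM model through some mapping, and two reasonable-looking mappings can disagree about whether the same state counts as "inside the box".

```python
# Hypothetical sketch: a CM-level goal evaluated through a QM-to-CM mapping.
# All names and numbers are illustrative, not a real physics or agent API.

from dataclasses import dataclass
from typing import Callable

@dataclass
class QMState:
    # Toy stand-in for a quantum state: how much probability mass the
    # position observable assigns to the region inside the box.
    p_inside: float

# A QM->CM mapping decides which QM states count as "inside the box".
CMMapping = Callable[[QMState], bool]

# Two reasonable-looking mappings that disagree near the boundary:
strict_mapping: CMMapping = lambda s: s.p_inside > 0.999   # almost all mass inside
lenient_mapping: CMMapping = lambda s: s.p_inside > 0.5    # merely most mass inside

def cm_goal_satisfied(state: QMState, mapping: CMMapping) -> bool:
    """The goal 'stay in the box' is defined at the CM level,
    but can only be evaluated on the QM model via some mapping."""
    return mapping(state)

borderline = QMState(p_inside=0.9)
print(cm_goal_satisfied(borderline, strict_mapping))   # False: counts as outside
print(cm_goal_satisfied(borderline, lenient_mapping))  # True: counts as inside
```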
I have to point out that the issue with QM-to-CM mappings is mostly of theoretical interest; in practice it should be possible to define a mapping that safely maximizes the probability of the AI staying in the box while still letting it function effectively. The latter condition is required because a mapping from QM to CM that purely maximizes the probability of the AI staying in the box would cause the AI to move into the middle of the box and cool down.
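A toy sketch of why the second condition matters, again with made-up policies and numbers: an objective that only maximizes the probability of staying in the box is dominated by the degenerate "sit in the centre and cool down" policy, whereas a constrained objective keeps that probability above a threshold and then optimizes for actually doing the job.

```python
# Hypothetical sketch: pure "stay in the box" maximization vs. a constrained
# objective. Policies and probabilities are invented for illustration.

from dataclasses import dataclass

@dataclass
class Policy:
    name: str
    p_stay_in_box: float     # probability the CM mapping counts the AI as "in the box"
    task_performance: float  # how well the AI still does its job

candidates = [
    Policy("sit in the centre and cool down", p_stay_in_box=0.9999, task_performance=0.0),
    Policy("work normally near the terminal",  p_stay_in_box=0.99,   task_performance=1.0),
]

# Degenerate objective: maximize P(stay in box) and nothing else.
degenerate = max(candidates, key=lambda p: p.p_stay_in_box)

# Constrained objective: maximize performance among policies that keep
# P(stay in box) above a safety threshold.
SAFETY_THRESHOLD = 0.95
safe = [p for p in candidates if p.p_stay_in_box >= SAFETY_THRESHOLD]
constrained = max(safe, key=lambda p: p.task_performance)

print(degenerate.name)    # "sit in the centre and cool down"
print(constrained.name)   # "work normally near the terminal"
```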