The problem is that in order to do anything useful, the AI must be able to learn. This means that even if you deliberately initialize it with a false belief, the learning process may later update that belief once the AI finds evidence that it is false. If AI safety relies on that false belief, you have a problem.
A possible solution would be to encode the false belief in a way that can’t be updated by learning, but doing so is a non-trivial problem.
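To make the point concrete, here is a minimal sketch, not taken from the discussion itself: a toy Bayesian learner where `bayes_update` and the likelihood values are hypothetical stand-ins for "learning". It shows how a deliberately false prior gets washed out by accumulating evidence unless the belief is explicitly exempted from updating.

```python
# Toy illustration (hypothetical): an agent initialized with a deliberately
# false belief about a binary proposition. Ordinary Bayesian updating washes
# the false prior out; only a belief frozen outside the learning rule survives.

def bayes_update(p_true: float, likelihood_if_true: float, likelihood_if_false: float) -> float:
    """Posterior probability that the proposition is true after one observation."""
    numerator = p_true * likelihood_if_true
    denominator = numerator + (1.0 - p_true) * likelihood_if_false
    return numerator / denominator

# Deliberately initialize the agent to believe the proposition is almost
# certainly false (p_true = 0.01), even though the world keeps producing
# evidence that fits "true" far better than "false".
p_true = 0.01
frozen_p_true = 0.01   # same initial belief, but exempted from updating

for step in range(20):
    # Each observation is 5x more likely if the proposition is true.
    p_true = bayes_update(p_true, likelihood_if_true=0.5, likelihood_if_false=0.1)
    # The "frozen" belief ignores the evidence entirely.

print(f"learned belief after 20 observations: {p_true:.4f}")    # approaches 1.0
print(f"frozen belief after 20 observations:  {frozen_p_true}")  # still 0.01
```

The frozen variant is exactly the non-trivial part: in a realistic learner the belief is not a single named variable you can exempt, and the system may route around the frozen value by learning equivalent knowledge elsewhere.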
Isn’t that what simulations are for? By “lie” I mean lying about how reality works. The AI will make its decisions based on its best data, so we should make sure that data is initially harmless. Even if it later figures out that the data is wrong, we’ll still have the decisions it made at the start, and those are by far the most important.