Biased AI heuristics
Heuristics have a bad rep on Less Wrong, but some people are keen to point out how useful they can sometimes be. One major critique of the “Superintelligence” thesis is that it presents an abstract, Bayesian view of intelligence that ignores the practicalities of bounded rationality.
This trend of thought raises some other concerns, though. What if we could produce an AI of extremely high capability, but riven with huge numbers of heuristics? If these were human heuristics, then we might have a chance of understanding and addressing them, but what if they weren’t? What if the AI had an underconfidence bias, and tended to change its views too fast? Now, that one is probably quite easy to detect (unlike many that we would not have a clue about), but what if it wasn’t consistent across areas and types of new information?
In that case, our ability to predict or control what the AI does may be very limited. We can understand human biases and heuristics pretty well, and we can understand idealised agents, but differently biased agents might be a big problem.
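To make that worry a bit more concrete, here is a minimal sketch, purely illustrative and with all the specifics (the log-odds form, the domains, and the overweight factors) being my own assumptions: an agent that scales its Bayesian updates by a domain-dependent factor. A constant factor could be measured once and corrected for; a factor that varies with the type of evidence is much harder to pin down from outside.

```python
import math

def biased_posterior(prior: float, likelihood_ratio: float, overweight: float) -> float:
    """Bayesian update in log-odds form, with the evidence term scaled by 'overweight'.

    overweight == 1.0 is the ideal Bayesian update; values above 1.0 mean the
    agent changes its views too fast, values below 1.0 mean too slowly.
    """
    log_odds = math.log(prior / (1 - prior)) + overweight * math.log(likelihood_ratio)
    return 1 / (1 + math.exp(-log_odds))

# Hypothetical domain-dependent over-updating: calibrating the agent on one
# domain tells us little about how hard it will update on the others.
overweight_by_domain = {"physics": 1.0, "politics": 1.8, "self-assessment": 0.6}

for domain, w in overweight_by_domain.items():
    posterior = biased_posterior(prior=0.5, likelihood_ratio=3.0, overweight=w)
    print(f"{domain}: posterior {posterior:.3f} (ideal Bayesian answer is 0.750)")
```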
An idealized or fully correct agent’s behavior is too hard to predict (i.e. to implement) in a complex world. That’s why you introduce the heuristics: they are easier to calculate. Can’t that also be used to make them easier for a third party to predict?
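One way to see the intuition behind that question, as a toy sketch of my own rather than anything from the discussion (the knapsack setting and the greedy rule are just an illustrative stand-in for “heuristic”): a cheap decision rule is also cheap for an outside observer to simulate, whereas anticipating a fully optimizing agent means effectively redoing its optimization.

```python
from itertools import combinations

# (name, value, weight) for a toy knapsack problem
items = [("a", 60, 10), ("b", 100, 20), ("c", 120, 30)]
capacity = 50

def greedy_choice(items, capacity):
    """Cheap heuristic: take items by value density until the knapsack is full.
    Because the rule is short and cheap, an observer who knows it can predict
    the agent's choice at the same low cost."""
    chosen, used = [], 0
    for name, value, weight in sorted(items, key=lambda x: x[1] / x[2], reverse=True):
        if used + weight <= capacity:
            chosen.append(name)
            used += weight
    return chosen

def exact_choice(items, capacity):
    """Idealized agent: exhaustive search over every subset. Predicting its
    choice from outside means effectively redoing the whole search."""
    best, best_value = [], 0
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            if sum(w for _, _, w in combo) <= capacity:
                value = sum(v for _, v, _ in combo)
                if value > best_value:
                    best, best_value = [n for n, _, _ in combo], value
    return best

print(greedy_choice(items, capacity))  # ['a', 'b']
print(exact_choice(items, capacity))   # ['b', 'c']
```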
Separately from this, the agent might learn or self-modify to have new heuristics. But what does the word “heuristic” mean here? What’s special about it that doesn’t apply to all self-modifications and all learned models, if you can’t predict their behavior without actually running them?
Possibly. We need to be closer to the implementation for this.
Does it matter if we aren’t able to recognize its biases? Humans are able to function with biases.
We are also able to recognize and correct for our own biases. And we can’t even look at, let alone rewrite, our own source code.
I’m assuming that it can function at a high level despite/because of its biases. And the problem is not that it might not work effectively, but that our job of ensuring it behaves well just got harder, because we just got worse at predicting its decisions.
If we programmed it with human heuristics, wouldn’t we assume that it would have similar biases?
We may not have programmed these in at all; it could just be efficient machine learning. And even if it started with human heuristics, it might modify those away rapidly.