> We might instead try to instill the AGI with humane values via machine learning — training it to promote outcomes associated with camera inputs of smiling humans, for example. But a powerful search process is likely to hit on solutions that would never occur to a developing human. If the agent becomes more powerful or general over time, initially benign outputs may be a poor indicator of long-term safety.
This objection seems to boil down to “but we might fail!”, which is not very convincing. It seems to me that you (and Yudkowsky) are assuming an already-superhuman intelligence at the point where values are being trained, which is a bit of a strawman.
We have a century of literature on the proper emotional responses of a normal human to various ethical scenarios at various stages of development. A reasonable approach would be to run psychological evaluations on a sub-human-intelligence proto-AGI educated in a preschool-like environment, with the intent of developing an intelligence whose emotional and moral responses match those of a human child before it enters recursive self-improvement. At that stage the mind would be simple and transparent enough to audit, ensuring no deception is going on.
Some more nitpicky notes:
> Because AGIs are intelligent, they will tend to be complex
Complexity does not follow from intelligence, even general intelligence. Complexity is a product of intelligence, not an intrinsic feature of it. Just as the simple process of evolution produced complex organisms, a simple intelligence such as AIXI can do complex things; that doesn't make it inherently complex.
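To make the point concrete: AIXI's complete definition (Hutter's standard formulation, reproduced here purely as an illustration) fits in a single expression, even though the behavior it specifies is as complex as behavior gets:

```latex
% AIXI's action choice at cycle k, planning to horizon m: pick the action
% maximizing expected total reward, where each candidate environment (a
% program q on the universal Turing machine U) is weighted by its
% algorithmic simplicity 2^{-l(q)}.
a_k := \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m}
  \left( r_k + \cdots + r_m \right)
  \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
```

The definition is a few symbols; the complexity lives in what the agent does, not in what the agent is.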
> detecting and deflecting asteroids are intuitively difficult
This is not a hard problem; it just inherently operates on long time scales. Deflecting an asteroid is easy… if you know about it far enough in advance, because a tiny velocity change has years to compound into an enormous miss distance.
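Here is a back-of-the-envelope sketch of why lead time dominates (my own crude linear model, not anyone's mission analysis; real deflection studies use full orbital mechanics, which tends to amplify an along-track nudge by roughly a factor of three):

```python
# Crude linear estimate: miss distance ~ delta-v x lead time.
# (Orbital mechanics typically amplifies an along-track nudge to
# roughly 3 x dv x t, so this understates the true displacement.)

SECONDS_PER_YEAR = 3.156e7
EARTH_RADIUS_KM = 6371.0

def miss_distance_km(delta_v_m_s: float, lead_time_years: float) -> float:
    """Displacement in km from a velocity change applied this many
    years before predicted impact, ignoring orbital amplification."""
    return delta_v_m_s * lead_time_years * SECONDS_PER_YEAR / 1000.0

d10 = miss_distance_km(0.01, 10)   # 1 cm/s nudge, 10 years out
d05 = miss_distance_km(0.01, 0.5)  # same nudge, 6 months out

# ~3,156 km, half an Earth radius from the linear estimate alone;
# with the ~3x orbital amplification, comfortably more than an Earth radius.
print(f"10-year lead: {d10:,.0f} km ({d10 / EARTH_RADIUS_KM:.1f} Earth radii)")
# ~158 km -- hopeless, even with amplification.
print(f"6-month lead: {d05:,.0f} km ({d05 / EARTH_RADIUS_KM:.2f} Earth radii)")
```

The same 1 cm/s nudge is a clean miss with a decade of warning and a rounding error with six months of warning, which is exactly the sense in which detection, not deflection, is the binding constraint.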