Stuart Russell, in his comment on the Edge.org AI discussion, offered a concise mathematical description of perverse instantiation and seems to suggest that it is likely to occur:
A system that is optimizing a function of n variables, where the objective depends on a subset of size k<n, will often set the remaining unconstrained variables to extreme values; if one of those unconstrained variables is actually something we care about, the solution found may be highly undesirable.
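As a toy illustration of the effect Russell describes (my own sketch, not taken from his comment), here is a small linear program where the objective depends on only one of three variables. The variable names and the "budget" constraint are made up for the example; the point is just that a variable left out of the objective, but coupled to it through a constraint, gets pushed to an extreme value:

```python
# Minimal sketch: maximize the one variable the objective "cares about"
# (x0) and watch the solver drive an unmodeled variable (x1) to an extreme.
import numpy as np
from scipy.optimize import linprog

# n = 3 decision variables; the objective depends only on x0 (k = 1).
# Pretend x1 is something we actually care about (a safety margin, say)
# that was simply left out of the objective.
c = np.array([-1.0, 0.0, 0.0])  # linprog minimizes, so -x0 maximizes x0

# A shared "budget" couples x0 and x1: x0 + x1 <= 10.
A_ub = np.array([[1.0, 1.0, 0.0]])
b_ub = np.array([10.0])

bounds = [(0, 10)] * 3  # each variable may range over [0, 10]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
print(res.x)  # ~[10., 0., 0.]: x1 is pinned to its extreme value, 0
```

The solver spends the entire budget on x0 and forces x1 to zero; the leftover variable x2 also lands at a bound, since LP solvers return vertex solutions. This is the linear-programming analogue of Russell's point: the unconstrained variables end up at extremes even though nothing in the problem statement asked for that.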
I'm curious whether there is more information about this behavior occurring in practice, ideally documented examples.