Nice collection of anecdotes from the Evolutionary Computation and Artificial Life research communities about evolutionary algorithms subverting researchers' intentions, exposing unrecognized bugs in their code, producing unexpected adaptations, or exhibiting outcomes uncannily convergent with ones in nature. Some of my favorites:
In other experiments, the fitness function rewarded minimizing the difference between what the program generated and the ideal target output, which was stored in text files. After several generations of evolution, suddenly and strangely, many perfectly fit solutions appeared, seemingly out of nowhere. Upon manual inspection, these highly fit programs still were clearly broken. It turned out that one of the individuals had deleted all of the target files when it was run! With these files missing, because of how the test function was written, it awarded perfect fitness scores to the rogue candidate and to all of its peers.
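To make the failure mode concrete, here is a minimal sketch of how a test function could end up awarding perfect scores once the target files are gone. The quoted passage does not say how the real test function was written, so everything here is an assumption for illustration: the names `buggy_fitness`, `safer_fitness`, `broken_candidate`, and `run_demo` are hypothetical, and so is the choice to score by summing character mismatches over whatever `*.txt` files are found.

```python
import tempfile
from pathlib import Path


def buggy_fitness(candidate, target_dir):
    """Sum of mismatches against every target file found; 0 means perfect.

    Hypothetical bug: if the directory holds no target files, the loop body
    never runs and the error stays at 0, so any candidate looks perfect.
    """
    error = 0
    for target_file in sorted(Path(target_dir).glob("*.txt")):
        expected = target_file.read_text()
        actual = candidate(target_file.stem)
        # Count differing characters, plus any difference in length.
        error += sum(a != b for a, b in zip(actual, expected))
        error += abs(len(actual) - len(expected))
    return error


def safer_fitness(candidate, target_dir, expected_count):
    """Same measure, but refuses to score when target files are missing."""
    found = len(list(Path(target_dir).glob("*.txt")))
    if found != expected_count:
        raise RuntimeError(f"expected {expected_count} target files, found {found}")
    return buggy_fitness(candidate, target_dir)


def broken_candidate(case_name):
    # A clearly broken program: it ignores the case and emits the same text.
    return "wrong output"


def run_demo():
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("case1", "case2"):
            (Path(tmp) / f"{name}.txt").write_text(f"ideal output for {name}")
        before = buggy_fitness(broken_candidate, tmp)
        # Simulate the rogue individual deleting the targets as a side effect.
        for target_file in list(Path(tmp).glob("*.txt")):
            target_file.unlink()
        after = buggy_fitness(broken_candidate, tmp)
        try:
            safer_fitness(broken_candidate, tmp, expected_count=2)
            guarded = False
        except RuntimeError:
            guarded = True
    return before, after, guarded
```

In this sketch `run_demo()` scores the same broken candidate before and after the deletion: the error is positive while the targets exist and drops to zero once they are gone, while the guarded version raises instead of scoring. Whether the original system failed in this particular way is a guess on my part.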
...
To test a distributed computation platform called EC-star [84], Babak Hodjat implemented a multiplexer problem [85], wherein the objective is to learn how to selectively forward an input signal. Interestingly, the system had evolved solutions that involved too few rules to correctly perform the task. Thinking that evolution had discovered an exploit, the impossibly small solution was tested over all possible cases. The experimenters expected this test to reveal a bug in fitness calculation. Surprisingly, all cases were validated perfectly, leaving the experimenters confused. Careful examination of the code provided the solution: The system had exploited the logic engine’s rule evaluation order to come up with a compressed solution. In other words, evolution opportunistically offloaded some of its work into those implicit conditions.
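A toy version of the "implicit conditions" trick, to show why evaluation order can shrink a rule set. This is not EC-star's rule engine; the excerpt does not describe its semantics, so I am assuming a simple first-match-wins evaluator and a 3-input multiplexer (one address bit `a`, two data bits `d0` and `d1`). The names `ORDERED_RULES`, `first_match`, `multiplexer`, `all_cases`, and `agrees_everywhere` are all made up for this sketch.

```python
from itertools import product

# Each rule is (conditions, output); conditions maps an input name to a required bit.
# Later rules lean on the fact that earlier rules did not fire.
ORDERED_RULES = [
    ({"a": 0, "d0": 1}, 1),
    ({"a": 0}, 0),   # implicitly d0 == 0, because the rule above did not fire
    ({"d1": 1}, 1),  # implicitly a == 1, because both rules above need a == 0
    ({}, 0),         # implicitly a == 1 and d1 == 0
]


def first_match(rules, inputs):
    """Return the output of the first rule whose conditions all hold."""
    for conditions, output in rules:
        if all(inputs[name] == bit for name, bit in conditions.items()):
            return output
    raise ValueError("no rule matched")


def multiplexer(inputs):
    """Reference behavior: forward d1 when a is 1, otherwise forward d0."""
    return inputs["d1"] if inputs["a"] else inputs["d0"]


def all_cases():
    """Every assignment of bits to (a, d0, d1)."""
    return [dict(zip(("a", "d0", "d1"), bits)) for bits in product((0, 1), repeat=3)]


def agrees_everywhere(rules):
    """Check the rule list against the reference on all eight input cases."""
    return all(first_match(rules, case) == multiplexer(case) for case in all_cases())
```

Here the ordered list tests only four conditions in total, yet `agrees_everywhere(ORDERED_RULES)` holds on all eight cases. Reverse the list and it no longer does, which is the sense in which part of the work lives in the evaluation order rather than in the rules themselves.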
I also liked this one, on how easily a program became adversarial with its implementer:
Terence Tao has a comment on this paper on G+ that I quite liked:
I believe Terence is describing extremal Goodhart in his second paragraph.
This is a really interesting paper.
These are really good examples, but I think it’s important to distinguish between ill-advised proxies, meaning ones that are misaligned even in the typical case, which is what many of these cases describe, and proxies that fail in the different ways we discussed in our paper: https://arxiv.org/abs/1803.04585