It depends on how you draw your system box. Evolutionary search is equivalent to a self-modifying program if you treat the whole search process as the program. The same issues apply.
I think the Sequences do a good job of demolishing the idea that human testers can judge friendliness directly, so long as the AI operates as a black box. If you have a debug view into the operation of the AI, that is a different story, but then you don’t need friendliness anyway.
If I draw a box around the selection algorithm and find there is nothing self-modifying inside… where’s the circularity?