So, to make the metaphor explicit …
..the Ape Constraint encodes “Be Nice to Apes + do not question the Ape Constraint”
...so in this story, the Human CEV = [Real-Life human CEV + Be Nice to Apes + do not modify the Ape Constraint]
Professor Insanitus proposes to modify the Ape Constraint, leading to various versions of the “Be Nice to Apes” code (some of which might not actually be that nice for apes, and some of which might in fact be nicer for apes)
But by what metric will the humans measure the “success” of the novel “Be Nice to Apes” variants?
Wouldn’t the metric be the original “Be Nice to Apes” instinct? So what would this experiment actually tell them?