I’d expect a designed thing to have much cleaner, much more comprehensible internals. If you gave a human a compromise utility function and told them it was a perfect average of their desires (or their tribe’s desires) and their opponents’ desires, they would not be able to verify this. They wouldn’t recognise their own utility function; they might not even individually possess it (again, human values seem to be somewhat distributed); and they would be inclined to reject a fair deal, since humans tend to see their other only in extreme shades, as more foreign than they really are.
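(For concreteness, here is a minimal sketch of what “a perfect average of their desires and their opponents’ desires” could mean formally, assuming equal-weight averaging over a shared outcome space; the functions, names, and weights below are invented for illustration and aren’t taken from anything above.)

```python
# A minimal sketch, not from the original comment: a "compromise" utility
# defined as an equal-weight average of two parties' utilities over outcomes.
# The utility functions and the outcome space here are invented for illustration.

def compromise_utility(u_a, u_b, outcome, weight=0.5):
    """Weighted average of two utility functions evaluated at an outcome."""
    return weight * u_a(outcome) + (1 - weight) * u_b(outcome)

# Toy utilities over a one-dimensional outcome: each party wants a different point.
u_self = lambda x: -abs(x - 10)      # "my" side prefers outcomes near 10
u_opponent = lambda x: -abs(x + 10)  # the opposing side prefers outcomes near -10

# The averaged function is symmetric between the two parties, but nothing about
# its value at any single outcome lets either party *verify* that symmetry.
print(compromise_utility(u_self, u_opponent, 0.0))   # -10.0
print(compromise_utility(u_self, u_opponent, 10.0))  # -10.0
```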
Do you not believe that an AGI is likely to be self-comprehending? I wonder, sir, do you still not anticipate foom? Is it connected to that disagreement?