I skimmed the article. First, good idea. I would never have thought of that. But I do think there is a flaw. Given evolution, we would expect humans to have fairly complex utility functions, not simple ones. The complexity penalty for [evolution + a simple utility function] could actually be higher than that for [evolution + a complicated utility function], depending on precisely how complex each function is. For example, I assert that the complexity penalty for [evolution + a utility function with only one value (e.g. paperclips or happiness)] is higher than the complexity penalty for [evolution + any reasonable approximation to our current values].
This is only to say that a more complicated utility function for an evolved agent doesn’t necessarily imply a high complexity penalty. You could still be right in this particular case, but I’m not sure without actually being able to evaluate the relevant complexity penalties.
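To make the comparison concrete, here is one way the claim could be formalized (a rough sketch; I'm using Kolmogorov complexity $K$ as the complexity measure, and the additive decomposition is only approximate):

$$K(\text{evolution} + U) \;\approx\; K(\text{evolution}) + K(U \mid \text{evolution}).$$

Under this reading, the point is that even if $K(U_{\text{simple}}) < K(U_{\text{complex}})$ unconditionally, we may still have

$$K(U_{\text{simple}} \mid \text{evolution}) \;>\; K(U_{\text{complex}} \mid \text{evolution}),$$

because it takes extra bits to specify that an evolved agent ended up with a single-value utility function, whereas a messy, many-valued utility function is roughly what evolution produces by default.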
That’s a good point, and I’ll have to think about it.