I think it all boils down to this quote at the end (emphasis mine):
We are better than the Pebblesorters, because we care about sentient lives, and the Pebblesorters don’t.
I agree with you that this claim is confusing (I am confused about it as well). I don’t think, however, that he’s trying to justify it as objective. He’s merely stating what it is and deferring the justification to a later time.
We are better than the Pebblesorters, because we care about sentient lives, and the Pebblesorters don’t.
Translated:
Humans are preferable to Pebblesorters according to the human utility function, because humans care about maximizing the human utility function, and the Pebblesorters don’t.
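To make the indexical reading concrete, here is a minimal sketch (illustrative Python; the utility functions and the world representation are made-up stand-ins, not anyone’s actual model): “better” only gets a meaning once you fix whose utility function does the scoring.

```python
# Illustrative only: "better" is always relative to some utility function.

def human_utility(world):
    # Humans (per the quote) score worlds by how well sentient lives go.
    return world["sentient_wellbeing"]

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def pebblesorter_utility(world):
    # Pebblesorters score worlds by how many heaps hold a prime number of pebbles.
    return sum(1 for heap in world["heaps"] if is_prime(len(heap)))

def better(world_a, world_b, utility):
    # The comparison is undefined until a specific utility function is supplied.
    return utility(world_a) > utility(world_b)

world_a = {"sentient_wellbeing": 10, "heaps": [[1] * 4, [1] * 6]}  # no prime heaps
world_b = {"sentient_wellbeing": 3,  "heaps": [[1] * 3, [1] * 7]}  # two prime heaps

better(world_a, world_b, human_utility)         # True  -- "better" for humans
better(world_a, world_b, pebblesorter_utility)  # False -- "better" for pebblesorters
```

The same two worlds come out ranked in opposite orders depending on which utility function is passed in, which is the sense in which “better” in the quote is indexed to humans.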
But that’s not what “better” means at all, any more than “sorting pebbles into prime heaps” means “doing whatever pebblesorters care about”.
How specifically are these two things different? I can imagine some differences, but I am not sure which one you meant.
For example, if you meant that sorting pebbles is what they do, but that it is not their terminal value and certainly not their only value (just as humans build houses, but building houses is not our terminal value), then you are fighting the hypothetical.
If you meant that in a different universe pebblesorter-equivalents would evolve differently and wouldn’t care about sorting pebbles into prime heaps, then those pebblesorter-equivalents wouldn’t be pebblesorters. Analogously, there could be some human-equivalents in a parallel universe with inhuman values; but they wouldn’t be humans.
Or perhaps you meant the difference between extrapolated values and “what now feels like a reasonable heuristic”. Or...
What I meant is that “prime heaps” are not about pebblesorters. There are exactly zero pebblesorters in the definitions of “prime”, “pebble” and “heap”.
If I told you to sort pebbles into prime heaps, the first thing you’d do is calculate some prime numbers. If I told you to do whatever pebblesorters care about, the first thing you’d do is find one and interrogate it to find out what it valued.
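Here is that asymmetry as a toy sketch (illustrative Python; all function names are made up for illustration, and `interrogate` is a hypothetical placeholder for actually going and asking a pebblesorter):

```python
# "Sort pebbles into prime heaps" is fully specified by mathematics:
# nothing in this check mentions, or needs to consult, a pebblesorter.
def is_prime_heap(heap):
    n = len(heap)
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

# "Do whatever pebblesorters care about" is underspecified until you actually
# find some pebblesorters and ask them; this stub marks that empirical step.
def interrogate(pebblesorter):
    raise NotImplementedError("requires access to an actual pebblesorter")

def do_what_pebblesorters_care_about(pebblesorters):
    learned_values = [interrogate(ps) for ps in pebblesorters]
    return learned_values
```

The first function can be written and checked without a single pebblesorter in sight; the second cannot even get started without one.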
If I gave you the source code of a Friendly AI, all you’d have to do would be to run the code.
If I told you to do whatever the human CEV is, you’d have to find and interrogate some humans.
The difference is that by analysing the source code of the Friendly AI you could probably learn some facts about humans, whereas by learning about prime numbers you don’t learn anything about the pebblesorters. But that’s a consequence of humans caring about humans, and pebblesorters not caring about pebblesorters. Our values are more complex than prime numbers and include caring about ourselves… which is likely to happen to a species created by evolution.