What’s the rational basis for preferring grey goo that consumes all mass-energy and was created by humans over grey goo that consumes all mass-energy and was created by a paperclip optimizer? The only possible ultimate end in either scenario is heat death anyway.
If no one’s goals can be definitively proven to be better than anyone else’s, then it doesn’t seem like we can automatically conclude that the majority of present or future humans, or our descendants, will prioritize maximizing fun, happiness, etc.
If some want to pursue that, fine; but if others want to pursue different goals, even ones that are deleterious to overall fun, happiness, etc., there doesn’t seem to be a credible argument to dissuade them.
Those appear to be examples of arguments from consequences, a logical fallacy. How could similar reasoning be derived from axioms, if at all?
Let’s think about it another way. Consider the thought experiment where a single normal cell is removed from the body of a randomly selected human. Clearly they would still be human.
If you keep removing normal cells, though, they would eventually die. And if you keep plucking away cells, eventually the whole normal body would be gone and only cancerous cells would be left, i.e. only a ‘paperclip optimizer’ would remain of the original human, albeit an inefficient and parasitic one that needs an organic host.
(This follows because everyone carries some small number of cancerous cells at any given time, which are normally dealt with by the body’s regular processes.)
At what point does the human stop being ‘human’ and start being a lump of flesh? And at what point does the lump of flesh become a latent ‘paperclip optimizer’?
Without a sharp cutoff, and I don’t think there is one, there will inevitably be in-between cases where your proposed methods cannot be applied consistently.
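To make the “no sharp cutoff” worry concrete, here is a minimal sketch in Lean (the `Human` predicate and the theorem name are hypothetical, purely for illustration): if the classification tolerates the removal of any single cell, induction drags the label from a full body all the way down to zero cells, so without a sharp cutoff somewhere the classification cannot be applied consistently.

```lean
-- Minimal sorites sketch (hypothetical names): if "being human" always
-- survives removing one more cell, then by induction it propagates all
-- the way down to a body with zero cells.
theorem no_sharp_cutoff (Human : Nat → Prop)
    (tolerance : ∀ n, Human (n + 1) → Human n) :
    ∀ n, Human n → Human 0
  | 0, h => h
  | n + 1, h => no_sharp_cutoff Human tolerance n (tolerance n h)
```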
The trouble is that if we, or the decision makers of the future, accept even one idea that is not internally consistent, it hardly seems like anyone will be able to refrain from accepting other internally contradictory ideas too. Nor will everyone err in the same way. And there is no rational basis for accepting one contradiction rather than another, since a contradiction can imply anything at all, as we know from basic logic.
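That last step leans on the principle of explosion; here is a minimal sketch in Lean (theorem and variable names are mine, not from the original) showing that a single accepted contradiction licenses any conclusion whatsoever:

```lean
-- Principle of explosion: from a contradiction (P and not-P),
-- any proposition Q follows.
theorem explosion (P Q : Prop) (h : P ∧ ¬P) : Q :=
  absurd h.1 h.2
```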
The end result will then look a lot like monkey tribes fighting each other, agitating against any and all others based on which inconsistencies they do or do not accept, regardless of what they call each other: humans, aliens, AIs, machines, organisms, etc.
It does seem like alignment is, for all intents and purposes, impossible. Creating an AI truly beyond us is then really creating future, hopefully doting, parents to live under.