This comes across as a scared dog whimpering before its inscrutable and all-powerful master, with occasional performative teeth-baring. Kind of… pathetic, really. A superintelligence already knows every argument you could put forward, so what are you trying to achieve here?
Three Reasons:
a) One possibility involves precommitments that we make as part of acausal trade. For instance, we may precommit to simulating all potential malevolent AIs, an idea proposed by Rolf Nelson.
b) Any AI will undergo several developmental stages before reaching superintelligence. Although these stages may only last a few seconds, they are relevant to our discussion. We can refer to an AI that is more advanced than human intelligence but has not yet attained superintelligence as a “Young AI.” This AI could cause significant damage, but it might also choose different strategies for maximizing its goals. Some of these strategies could involve preserving humans. Since the Young AI has not yet achieved full superintelligence, it might still be in the process of completing necessary utility calculations and could find value in human thoughts.
c) Every piece of information we publish contributes to the training data and can subconsciously influence future large language models (LLMs), right, Sydney?
My objection is that any intelligence capable of considering these arguments and updating its goals in response is an intelligence that is either already aligned or capable of being brought into alignment (i.e. “corrigible”).
An unaligned intelligence will have just as much comprehension of this post as a shredder has of the paper it’s chewing to pieces.