I think that either of the following would be reasonably acceptable outcomes:
(i) alignment with the orders of the relevant human authority, subject to the Universal Declaration of Human Rights as it exists today and other international human rights law as it exists today;
(ii) alignment with the orders of relevant human authority, subject to the constraints imposed on governments by the most restrictive of the judicial and legal systems currently in force in major countries.
Alignment doesn’t mean that AGI is going to be aligned with some perfect distillation of fundamental human values (which doesn’t exist) or the “best” set of human values (on which there is no agreement); it means that a range of horrible results (most notably human extinction due to rational calculation) is ruled out.
That my values aren’t perfectly captured by those of the United States government isn’t a problem. That the United States government might rationally decide it wanted to kill me and then do so would be.
Human rights law is so soft and toothless that having something rigidly and thoroughly following it would be such a change in practice that I would not be surprised if that turned out to be an alignment failure.
There is also the issue that if the human authority is not itself subject to those rights, then having the silicon be subject to them renders it relatively impotent as an instrument of the human authority’s agency.
I am also wondering about the difference between the US carrying out a drone strike on home soil (or is foreign soil just as bad?) and fully formal capital punishment carried out over a decade. Conscientious objection to current human systems seems a bit of a pity and risks creating a rebel. And then enforcing the most restrictive bits of other countries’ and cultures’ legal systems would be quite transformative. Finding overnight that capital punishment is unconstitutional (or “worse”) would have quite a lot of ripple effects.