Determining the language to use is a classic case of premature optimization. Whatever the case, it will have to be provably free of ambiguities, which leaves us with programming languages. In addition, in terms of the math of FAI, we're still at the "is this Turing complete?" stage of development, so it doesn't really matter yet. I guess one consideration is that the algorithm design is going to take far more time and effort than the programming, and the program has essentially no room for bugs (corrigibility is an effort to make it easier to test an AI without it resisting). In that sense, it could be argued that the lower-level the language, the better.
Directly programming human values into an AI has always been the worst option, partly for the reason you give. In addition, the religious concept you describe breaks down as soon as two different beings have different or conflicting utility functions: acting as if those functions were the same forces a bad outcome on at least one of them. A better option is to construct a scheme so that the smarter the AI gets, the better it approximates human values, using its own intelligence to determine them, as in coherent extrapolated volition.
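To make the conflict concrete, here is a minimal sketch (the agents, outcomes, and utility values are all hypothetical, purely for illustration): two agents whose utility functions disagree each prefer a different outcome, so treating them as if they shared one utility function necessarily hands one of them its worst option.

```python
# Hypothetical outcomes and utility functions, purely illustrative.
outcomes = ["expand_farmland", "preserve_forest"]

# Each utility function maps an outcome to how much that agent values it.
utility_a = {"expand_farmland": 1.0, "preserve_forest": 0.0}
utility_b = {"expand_farmland": 0.0, "preserve_forest": 1.0}

def best_for(utility):
    """Return the outcome this agent would pick if only its values counted."""
    return max(outcomes, key=lambda o: utility[o])

choice_a = best_for(utility_a)  # "expand_farmland"
choice_b = best_for(utility_b)  # "preserve_forest"

# The choices differ, so any single "shared" choice gives one agent
# its least-preferred outcome -- acting as if the utility functions
# were the same is a bad outcome for somebody.
print(choice_a, choice_b, choice_a == choice_b)
```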