I don’t see any way to entirely avoid the ‘mountain of temptation’ on the path to an inherently kind AI that can’t be altered even by its creators. Maybe such a path exists, if you set up siloed teams of developers each working on separate parts of the system, with lots of oversight. But I can’t imagine that being anywhere near competitive with a strategy that doesn’t do this.
So I think this is a lovely dream which is simply infeasible. I made a prediction market which tries to point out this infeasibility:
I like the way you’ve operationalized the question.