Thank you! I was hoping that someone was aware of some clever solution to this problem.
I believe that AI is at least as inherently unsafe as HI ('human intelligence'). I do think our track record of managing the dangers of HI is pretty good, in that we are still here, which gives me hope for AI safety.
I wonder how humans would react to a superintelligent AI stating: 'I have developed an idea harmful to humans, and I am incentivized to publicize that idea. I don't want to do harm to humans; can you please take a look at my incentives and tell me whether my read of them is correct? I'll stick with inaction until the analysis is complete.'
Is that a best-case scenario for a friendly superintelligence?