Hi Rohin, thank you so much for your feedback. I agree with everything you said and will try to update the post for clarity.
I don’t follow.
Sorry, that part was not well written (or well thought out), so I’ll try to clarify:
What I meant by ‘is the NAH true for ethics?’ is ‘do sufficiently intelligent agents tend to converge on the same goals?’, which, now that I think about it, is just the negation of the orthogonality thesis.
I’m not sure I understand the tree realism post beyond its point that ‘tree’ is a fuzzy category. While I’m also fuzzy on the question of ‘what are my values?’, that’s not the argument I’m trying to make.
I definitely think GPT-N will be able to answer questions about how humans would make ethical decisions, and wouldn’t be surprised if GPT-3 already performs fairly well at this.
Thanks for pointing that out, I hadn’t read that comment.
I object to the implication that the linked post argues for this claim: the “without specific countermeasures” part of that post does a lot of work.
Hm, yeah, sorry for the poor reasoning there; I should qualify that claim more. I do think the default right now is that sufficient countermeasures are unlikely to be deployed, but that point definitely deserves more scrutiny on my part.
What I meant by ‘is the NAH true for ethics?’ is ‘do sufficiently intelligent agents tend to converge on the same goals?’, which, now that I think about it, is just the negation of the orthogonality thesis.
Ah, got it, that makes sense. The reason I was confused is that NAH applied to ethics would only say that the AI system has a concept of ethics similar to the ones humans have; it wouldn’t claim that the AI system would be motivated by that concept of ethics.