Hi Rohin, thank you so much for your feedback. I agree with everything you said and will try to update the post for clarity.
I don’t follow.
Sorry, that part was not well written (or well thought out), so I’ll try to clarify:
What I meant by ‘is the NAH true for ethics?’ is ‘do sufficiently intelligent agents tend to converge on the same goals?’, which, now that I think about it, is just the negation of the orthogonality thesis.
I’m not sure I understand the tree realism post beyond its point that ‘tree’ is a fuzzy category. While I’m also fuzzy on the question of ‘what are my values?’, that’s not the argument I’m trying to make.
I definitely think GPT-N will be able to answer questions about how humans would make ethical decisions, and wouldn’t be surprised if GPT-3 already performs fairly well at this.
Thanks for pointing that out, I hadn’t read that comment.
I object to the implication that the linked post argues for this claim: the “without specific countermeasures” part of that post does a lot of work.
Hm, yeah, sorry for the poor reasoning there; I should qualify that claim more. I do think the default right now is that sufficient countermeasures are unlikely to be deployed, but that point definitely deserves more scrutiny on my part.
What I meant by ‘is the NAH true for ethics?’ is ‘do sufficiently intelligent agents tend to converge on the same goals?’, which, now that I think about it, is just the negation of the orthogonality thesis.
Ah, got it, that makes sense. The reason I was confused is that NAH applied to ethics would only say that the AI system has a concept of ethics similar to the ones humans have; it wouldn’t claim that the AI system would be motivated by that concept of ethics.