Nice. My main issue is that just because humans have values a certain way, doesn’t mean we want to build an AI that way, and so I’d draw pretty different implications for alignment. I’m pessimistic about anything that even resembles “make an AI that’s like a human child,” and more interested in “use a model of a human child to help an inhuman AI understand humans in the way we want.”
I pretty much agree with this sentiment. I don’t literally think we should build AGI like a human and expect it to be aligned. Humans themselves are far from aligned enough for my taste! However, trying to understand how human values and the human value-learning system work is extremely important, and it undoubtedly holds lessons for how to align brain-like AGI systems, which I think are what we will end up with in the near term.