Like Wei, I’m in favor of research in this direction. I suspect we need, for example, an adequate theory of human values so that we can construct and, more importantly, verify aligned AI, but right now we are so confused about human values that I’m not sure we could even tell whether an AI was aligned.
I have a lot of developing thoughts in this area that have moved beyond what I was thinking the last time I tried to write them up a couple of years ago. I’m not sure what I’ll find time for in the coming months, or whether I’ll solidify my ideas enough for them to be shareable, but I’m happy to talk more if you’re interested in pursuing this direction.
Sure, I’m happy to read/discuss your ideas about this topic.