I agree that the “typical tools developed around the study of human psychology are vastly less accurate than necessary to do the job”, but it still seems like figuring out what humans value is a problem of human psychology. I don’t see how theoretical physics has anything to do with it.
Whether it’s a “problem of human psychology” is a question of which area-of-study label we assign to the problem, and that label doesn’t particularly help in finding methods appropriate for solving it. So I propose to focus on the problem’s other characteristics, namely the necessary rigor in an acceptable solution and the potential difficulty of the concepts necessary to formulate the solution (in the study of a real-world phenomenon). These characteristics match mathematics and physics best (probably more mathematics than physics).
I would expect all FAI team members to have strong math skills in addition to whatever other background they may have, and I expect them to approach the psychological aspects of the problem with greater rigor than is typical of mainstream psychology; their math backgrounds will contribute to this. Still, I think mainstream psychology would be of some use to them, even if just to provide concepts to be explored more rigorously.
“the potential difficulty of the concepts necessary to formulate the solution”
As I see it, there might be considerable conceptual difficulty in formulating even the exact problem statement. For instance, given that we want a ‘friendly’ AI, the problem statement depends very much on our notion of friendliness; hence the necessity of including psychology.
Going further, since SI aims to minimize AI risk, we need to be clear on exactly which AI behavior constitutes a ‘risk’. If I remember correctly, the AI in the movie “I, Robot” concludes that seizing control of humanity, even at the cost of human lives, is the only way to protect humanity from itself. Defining risk in such a scenario is a very delicate problem.
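To make that delicacy concrete, here is a minimal toy sketch (the action names, payoff numbers, and both “risk” definitions are invented purely for illustration; this is not any formalism SI actually uses). Two candidate definitions of risk, differing only in whether harm to humans is counted, lead the same optimizing agent to opposite choices:

```python
# Toy illustration only: actions, payoffs, and both "risk" definitions
# below are invented for this example, not taken from any real FAI work.

actions = ["do nothing", "restrict humanity"]

# Hypothetical consequences of each action, scored on two axes in [0, 1].
consequences = {
    "do nothing":        {"planet_damage": 0.9, "human_harm": 0.0},
    "restrict humanity": {"planet_damage": 0.1, "human_harm": 0.9},
}

def risk_planet_only(outcome):
    # "Risk" counts only damage to the planet.
    return outcome["planet_damage"]

def risk_including_humans(outcome):
    # "Risk" also counts harm done to humans.
    return outcome["planet_damage"] + outcome["human_harm"]

# The same agent, minimizing "risk", behaves oppositely under the two
# definitions -- the I, Robot failure mode in miniature.
for risk in (risk_planet_only, risk_including_humans):
    best = min(actions, key=lambda a: risk(consequences[a]))
    print(f"{risk.__name__} -> chooses: {best}")

# Output:
# risk_planet_only -> chooses: restrict humanity
# risk_including_humans -> chooses: do nothing
```

The point is not the toy numbers but that the agent’s behavior is entirely downstream of how ‘risk’ was formalized, which is why the problem statement itself is where much of the difficulty lives.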