The threats problem seems like a specific case of the problems that might arise from putting real intelligence into the agents in the system. Especially if this moral theory were being run on a superintelligent AI, it seems like the agents might be able to come up with all sorts of creative, unexpected stuff. And I’m doubtful that creative, unexpected stuff would make the parliament’s decisions more isomorphic to the “right answer”.
One way to solve this problem might be to drop any notion of “intelligence” in the delegates and instead specify a deterministic algorithm that each individual delegate follows in deciding which “deals” it accepts. Or take the same idea even further and specify a deterministic algorithm for resolving moral uncertainty that is merely inspired by the way parliaments function, in the same sense that the stable marriage problem and the algorithms for solving it could have been inspired by the way people decide whom to marry.
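As a very rough sketch of what a deterministic acceptance rule for a delegate might look like (the names and the scoring interface here are hypothetical illustrations, not anything from the original parliament proposal): the delegate accepts a proposed deal if and only if its own theory scores the deal at least as highly as the status quo, with no negotiation or cleverness involved.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Outcome = Dict[str, bool]            # e.g. {"policy_a": True, "policy_b": False}
Theory = Callable[[Outcome], float]  # maps an outcome to that theory's moral score

@dataclass
class Delegate:
    theory: Theory

    def accepts(self, deal: Outcome, status_quo: Outcome) -> bool:
        # No negotiation, no threats, no creativity: a fixed comparison.
        return self.theory(deal) >= self.theory(status_quo)

# Toy usage: a delegate whose theory only cares about policy_a passing.
d = Delegate(theory=lambda o: 1.0 if o.get("policy_a") else 0.0)
print(d.accepts({"policy_a": True}, {"policy_a": False}))  # True
```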
Eliezer’s notion of a “right answer” sounds appealing, but I’m a little skeptical. In computer science, it’s possible to prove that a particular algorithm, when run, will always achieve the maximal “score” on the criterion it’s attempting to optimize. But in this case, if we could formalize a score we wanted to optimize for, that would be equivalent to solving the problem! That’s not to say this is a bad angle of approach, however… it may be useful to take the idea of a parliament, use it to formalize a scoring system that captures our intuitions about how different moral theories trade off, and then maximize that score using whatever method seems to work best. For example (waves hands), perhaps we could score the total regret of our parliamentarians and minimize that.
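To make the hand-waving slightly more concrete, here is a minimal sketch of that regret idea. It assumes each theory can score options on a common scale and that we weight theories by our credence in them; both are substantial assumptions, and none of this is from the original proposal.

```python
from typing import Callable, List, Tuple

# A "theory" here is just a credence paired with a function scoring options.
Theory = Tuple[float, Callable[[str], float]]

def total_regret(option: str, options: List[str], theories: List[Theory]) -> float:
    # Credence-weighted regret: for each theory, how far `option` falls
    # short of that theory's best available option.
    return sum(
        credence * (max(score(o) for o in options) - score(option))
        for credence, score in theories
    )

def least_regret_option(options: List[str], theories: List[Theory]) -> str:
    return min(options, key=lambda o: total_regret(o, options, theories))

# Toy usage: two theories with a 60/40 credence split.
theories = [
    (0.6, lambda o: {"a": 1.0, "b": 0.4}[o]),
    (0.4, lambda o: {"a": 0.0, "b": 0.9}[o]),
]
print(least_regret_option(["a", "b"], theories))
```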
Another approach might be to formalize a set of criteria that a good solution to the problem of moral uncertainty should satisfy, and then set out to design an algorithm that achieves all of those criteria. In other words, write a formal problem description that’s more like that of the stable marriage problem and less like that of the assignment problem.
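For a flavor of what “criteria first, algorithm second” might look like, here is a toy sketch. The specific criteria are placeholders I made up, not proposed desiderata; the point is only that, as with “no blocking pair” in stable marriage, each criterion is a property we can check a candidate aggregation rule against.

```python
from typing import Callable, Dict, List

# An aggregation rule takes per-theory scores for each option plus our
# credences in the theories, and returns the chosen option.
Scores = Dict[str, List[float]]              # option -> score under each theory
Rule = Callable[[Scores, List[float]], str]  # (scores, credences) -> option

def respects_unanimity(rule: Rule) -> bool:
    # Placeholder criterion: if every theory prefers "a" to "b", never pick "b".
    scores = {"a": [1.0, 1.0], "b": [0.0, 0.5]}
    return rule(scores, [0.5, 0.5]) == "a"

def ignores_zero_credence(rule: Rule) -> bool:
    # Placeholder criterion: a theory with zero credence shouldn't sway the outcome.
    scores = {"a": [1.0, 0.0], "b": [0.0, 100.0]}
    return rule(scores, [1.0, 0.0]) == "a"

# One candidate rule: pick the option with the highest credence-weighted score.
def weighted_sum_rule(scores: Scores, credences: List[float]) -> str:
    return max(scores, key=lambda o: sum(c * s for c, s in zip(credences, scores[o])))

print(respects_unanimity(weighted_sum_rule), ignores_zero_credence(weighted_sum_rule))
```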
So one plan of attack on the moral uncertainty problem might be:
1. Generate a bunch of “problem descriptions” for moral uncertainty that specify a set of criteria to satisfy/optimize.
2. Figure out which “problem description” best fits our intuitions about how moral uncertainty should be solved.
3. Find an algorithm that provably solves the problem as specified in its description.