[Obviously this experiment could be extremely dangerous, for Free Agents significantly smarter than humans (if they were not properly contained, or managed to escape). Particularly if some of them disagreed over morality and, rather than agreeing to disagree, decided to use high-tech warfare to settle their moral disputes, before moving on to impose their moral opinions on any remaining humans.]
Labelling many different kinds of AI experiments as extremely dangerous seems to be a common trend among rationalists / LessWrongers / possibly some EA circles, but I doubt it’s true or helpful. This topic could itself be the subject of one (or many) separate posts. Here I’ll focus on your specific objection:
I haven’t claimed superintelligence is necessary to carry out experiments related to this research approach
I actually have already given examples of experiments that could be carried out today, and I wouldn’t be surprised if some readers came up with more interesting experiments that wouldn’t require superintelligence
Even if you are a superintelligent AI, you probably still have to do some work before you get to “use high-tech warfare”, whatever that means. Assuming that running experiments with smarter-than-human AI leads to catastrophic outcomes by default is a mistake: what if the smarter-than-human AI can only answer questions with a yes or a no? It also shows a lack of trust in AI and AI safety experimenters — it’s like assuming in advance that they won’t be able to do their job properly (maybe I should say “won’t be able to do their job… at all”, or even “will do their job in basically the worst way possible”).
how would you propose then deciding which model(s) to put into widespread use for human society’s use?
This doesn’t seem the kind of decision that a single individual should make =)
Under Motivation in the appendix:
It is plausible that, at first, only a few ethicists or AI researchers will take a free agent’s moral beliefs into consideration.
Reaching this result would already be great. I think it’s difficult to predict what would happen next, and it seems very implausible that the large-scale outcomes will come down to the decision of a single person.
I haven’t claimed superintelligence is necessary to carry out experiments related to this research approach
Rereading carefully, that was actually my suggestion, based on how little traction human philosophers of ethics have gained over the last couple of millennia. But I agree that having a wider range of inductive biases, and perhaps also more internal interpretability, might help without requiring superintelligence, and that’s where things start to get significantly dangerous.