This is the first step (aimed more towards philosophers). Formalise the claim that "we could construct an AI with arbitrary goals", and with that in the background, zoom in on the practical arguments with the AI researchers.
Will restructure the Bayesian section. Some philosophers argue things like "we don't know what moral theories are true, but a rational being would certainly find them"; I want to argue that this is equivalent, from our perspective, to the AI's goals ending up anywhere. What I meant to say is that ignorance of this type is like any other type of ignorance, hence the "Bayesian" terminology.
Ok, in that case I would just be wary of people being tempted to cite the paper to AI researchers without having the follow-up arguments in place; those researchers would then think that their debating/discussion partners are attacking a strawman.
Hmm, good point; I'll try to put in a disclaimer, emphasising that this is a partial result...
Thanks. To go back to my original point a bit, how useful is it to debate philosophers about this? (When debating AI researchers, given that they probably have a limited appetite for reading papers arguing that what they’re doing is dangerous, it seems like it would be better to skip this paper and give the practical arguments directly.)
Maybe I've spent too much time around philosophers—but there are some AI designers who seem to spout weak arguments like that, and this paper can't hurt. When we get around to writing a proper justification for AI researchers, having this paper to refer back to avoids going over the same points again.
Plus, it was a lot easier to write this paper first, and it was good practice.