I recognize some very impressive names on this list of AI researchers.
According to reports, xAI will seek to create a “maximally curious” AI, and this also seems to be the main new idea for how to approach safety: “If it tried to understand the true nature of the universe, that’s actually the best thing that I can come up with from an AI safety standpoint,” Musk said. “I think it is going to be pro-humanity from the standpoint that humanity is just much more interesting than not-humanity.”
“Maximally curious” sounds somewhat similar to the open-ended AI work of Ken Stanley and Joel Lehman.
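For concreteness, here is a minimal sketch of novelty search, the algorithm at the core of Lehman and Stanley’s open-endedness line of work: instead of optimizing a fixed objective, candidates are selected for how behaviorally different they are from an archive of previously seen behaviors. Everything concrete in the sketch (the toy behavior_descriptor, the archive size, the mutation scheme) is an illustrative assumption on my part, not anything from xAI’s announcement.

```python
import random

# Minimal sketch of novelty search (Lehman & Stanley), with a toy 1-D
# "behavior descriptor". The point: selection rewards behavioral novelty
# relative to an archive, not progress on a fixed goal.

def behavior_descriptor(genome):
    # Hypothetical stand-in: in real settings this summarizes what the
    # agent actually did (e.g., its final position in a maze).
    return sum(genome)

def novelty(descriptor, archive, k=5):
    # Novelty = mean distance to the k nearest previously seen behaviors.
    if not archive:
        return float("inf")
    dists = sorted(abs(descriptor - a) for a in archive)
    return sum(dists[:k]) / min(k, len(dists))

def mutate(genome, sigma=0.1):
    return [g + random.gauss(0, sigma) for g in genome]

def novelty_search(generations=50, pop_size=20, genome_len=8):
    population = [[random.uniform(-1, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    archive = []
    for _ in range(generations):
        scored = [(novelty(behavior_descriptor(g), archive), g)
                  for g in population]
        scored.sort(key=lambda x: x[0], reverse=True)
        # Archive the most novel behaviors; they define what "new" means next.
        archive.extend(behavior_descriptor(g) for _, g in scored[:3])
        # Reproduce the most novel individuals -- no task objective anywhere.
        parents = [g for _, g in scored[:pop_size // 2]]
        population = [mutate(random.choice(parents)) for _ in range(pop_size)]
    return archive

if __name__ == "__main__":
    print(len(novelty_search()), "archived behaviors")
```

The relevant point for the discussion below is that nothing in this loop pushes toward a pre-specified goal; “interestingness” relative to what has already been seen is the whole signal.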
Traditional “AI alignment” methods are unlikely to work for these approaches to AI.
But our true goal is not “AI alignment”; it is something like “AI existential safety” plus a good future.
Still, it is likely that AI existential safety in this case would require more than just relying on humanity being “much more interesting than not-humanity” to a maximally curious AI.
We need a “maximally curious AI” to be careful about “the fabric of reality” and not to destroy it (and itself, together with us). We also need it to take the “interests of sentient beings” into account in a proper way (in particular, not to torture them in “interesting” ways). One can see how being “maximally curious” could go very wrong in these respects...
So, this approach does require a lot of work by “AI existential safety researchers”, both with respect to X-risk and with respect to S-risk.
An entirely new set of “AI existential safety” methods would be needed: one cannot hope to “control” a “maximally curious” AI, nor can one hope to impose a particular set of arbitrarily selected goals and values on it, but one might still be able to create a situation in which a “maximally curious” AI properly takes certain things into account and properly cares about certain things.
Returning to the similarity with the open-ended AI work of Ken Stanley and Joel Lehman: arguably the most curious thing such an AI can do is to self-modify in interesting ways and see how it feels.
So, this does sound like open-ended recursive self-modification (which, under reasonable assumptions, implies recursive self-improvement, but without a narrow-minded focus on squeezing out as much “movement toward a well-defined goal” as possible).
So, yes, the safety problems here are formidable: one has to explicitly address all the issues associated with open-ended recursive self-modification.
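To make the shape of that requirement concrete, here is a toy illustration (not a proposal, and certainly not xAI’s method): an open-ended loop that proposes modifications to its own configuration, ranks them by novelty, but only adopts candidates that pass explicit, hard safety checks. Every name here (propose_modification, safety_invariants, and so on) is a hypothetical placeholder; the point is only that the safety checks have to be explicit and load-bearing, because the curiosity signal by itself will not supply them.

```python
import random

# Toy illustration of "curiosity with explicit constraints": candidate
# self-modifications are ranked by novelty, but a candidate is adopted
# only if it satisfies hard safety invariants. All names are hypothetical
# placeholders; nothing here is a real safety mechanism.

def propose_modification(config):
    # Hypothetical: perturb the current configuration at random.
    return {k: v + random.gauss(0, 0.05) for k, v in config.items()}

def novelty(candidate, history):
    # Mean distance to previously adopted configurations.
    if not history:
        return float("inf")
    def dist(a, b):
        return sum(abs(a[k] - b[k]) for k in a)
    return sum(dist(candidate, h) for h in history) / len(history)

def safety_invariants(candidate):
    # Stand-in for the hard requirements discussed above (don't damage
    # "the fabric of reality", don't harm sentient beings). In reality,
    # these are exactly the things we do not yet know how to state.
    return all(abs(v) < 1.0 for v in candidate.values())

def open_ended_loop(steps=100):
    config = {"a": 0.0, "b": 0.0, "c": 0.0}
    history = [dict(config)]
    for _ in range(steps):
        candidates = [propose_modification(config) for _ in range(16)]
        # Hard filter first: curiosity never overrides the invariants.
        safe = [c for c in candidates if safety_invariants(c)]
        if not safe:
            continue
        config = max(safe, key=lambda c: novelty(c, history))
        history.append(dict(config))
    return config

if __name__ == "__main__":
    print(open_ended_loop())
```

The hard part, of course, is that safety_invariants is precisely what nobody currently knows how to write down for a real system; the sketch only shows where such a check would have to sit in an open-ended self-modification loop.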