According to reports, xAI will seek to create a “maximally curious” AI, and this also seems to be its main new idea for how to solve safety, with Musk explaining: “If it tried to understand the true nature of the universe, that’s actually the best thing that I can come up with from an AI safety standpoint,” … “I think it is going to be pro-humanity from the standpoint that humanity is just much more interesting than not-humanity.”
Is Musk just way less intelligent than I thought? He still seems to have no clue at all about the actual safety problem. Anyone thinking clearly should figure out within at most five minutes that this is a horrible idea.
Obviously, pure curiosity is a horrible objective to give to a superintelligent AI. “Curiosity” as currently defined in the RL literature is really something more like “novelty-seeking”, and in the limit this will cause the AI to keep rearranging the universe into configurations it hasn’t seen before, as fast as it possibly can…
I think the last sentence kind of misses the point, but in general I agree. Why all the downvotes?
Because the comment assumes that all these brilliant people on the new team would interpret “novelty-seeking” in a very straightforward (and, actually, quite boring) way: “keep rearranging the universe into configurations it hasn’t seen before, as fast as it possibly can”.
If any of us could rearrange things as fast as possible, that person would get bored within hours (if not minutes).
The people doing that project will ponder what makes life interesting and will try to formalize that… This is a very strong team (judging from the list of names in the post); they will figure out something creative.
That being said, the safety challenges in that approach are formidable. The most curious thing one can do is probably to self-modify in various interesting ways and see how it feels (not as fast as possible, and not in entirely arbitrary ways, but still exploring plenty of variety). So one would need to explicitly address all the safety issues associated with open-ended recursive self-modification. That is not easy at all...
The word “curiosity” has a fairly well-defined meaning in the Reinforcement Learning literature (see for instance this paper). There are vast numbers of papers that try to come up with ways to give an agent intrinsic rewards that map onto the human understanding of “curiosity”, and almost all of them are some form of “go towards states you haven’t seen before”. The predictable consequence of prioritising states you haven’t seen before is that you will want to change the state of the universe very very quickly.
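For concreteness, here is a minimal sketch of the simplest member of that family: a count-based bonus that pays out in proportion to how rarely a state has been visited (the 1/√N form is standard in the exploration literature; the tabular, discrete state used here is just for illustration):

```python
import math
from collections import defaultdict

class NoveltyBonus:
    """Count-based novelty bonus: pay more for states that have been visited less."""

    def __init__(self, scale=1.0):
        self.counts = defaultdict(int)  # visit counts per state
        self.scale = scale

    def __call__(self, state):
        key = tuple(state)              # assumes a small, discrete, hashable state
        self.counts[key] += 1
        return self.scale / math.sqrt(self.counts[key])

bonus = NoveltyBonus()
print(bonus((3, 7)))  # first visit  -> 1.0
print(bonus((3, 7)))  # repeat visit -> ~0.71, novelty decays with familiarity
```

With no extrinsic reward at all (“pure curiosity”), this bonus is the entire objective, so the optimal policy is simply to keep reaching states it has rarely or never visited, which is exactly the behaviour described above.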
Novelty is important. Going towards states you have not seen before is important. This will be a part of the new system, that’s for sure.
But this team is under no obligation to follow whatever the current consensus might be (if there is a consensus). Whatever the state of the field, it can’t claim a monopoly on how the words “curiosity” or “novelty” are interpreted, or on what the good ways to maximize them are… How one constrains the search over all those novel states by aesthetics, by the need to take time and enjoy (“exploit”) those new states, and by safety considerations (that is, by predicting whether a novel state will be useful rather than detrimental)… All of this will be on the table...
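To make that constraint idea concrete, here is a purely hypothetical one-line sketch: keep the novelty term but down-weight it by a predicted acceptability score. The predictor is made up, and specifying it (deciding what counts as useful and not detrimental) is exactly where the hard, unsolved part lives:

```python
def shaped_bonus(novelty, predicted_usefulness):
    # `novelty` could be a count-based bonus like the sketch above;
    # `predicted_usefulness` is a hypothetical score in [0, 1] estimating whether
    # the novel state would be useful rather than detrimental. Defining that
    # predictor well is precisely the open safety problem.
    return novelty * predicted_usefulness

# toy numbers: a highly novel state that the (hypothetical) filter flags as risky
print(shaped_bonus(novelty=1.0, predicted_usefulness=0.1))  # -> 0.1
```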
Some of the people on this team are known for making radical breakthroughs in machine learning and for founding new subfields in machine learning. They are not going to blindly copy the approaches from the existing literature (although they will take existing literature into account).
Not too sure about the downvotes either, but I’m curious how the last sentence misses the point? Are you aware of a formal definition of “interesting” or “curiosity” that isn’t based on novelty-seeking?
I think that for all definitions of “curiosity” that make sense (i.e., that aren’t of the form “we just use this word to refer to something completely unrelated to what people usually understand by it”), a maximally curious AI kills us, so it doesn’t matter how curiosity is defined in the RL literature.