a research AI that will train agents based on its understanding of described benchmarks does sound interesting. Although how it would get ahold of things like ‘human feedback’, and craft setups for that, isn’t clear.*
Trying to create a setup so AIs can learn from other AIs (crafting rewards—seems unlikely. Expert demonstrations—might be doable. Whether this would be more or less useful than a human demonstration, I’m not sure. There actually might be the option to ask ‘what would _ do in this situation’ if you can actually run _.)
Edited to add:
*First I was imagining Skynet. Now I’m imagining a really weird license agreement: you may use this AI in exchange for [$ + billing schedule], but with the data that’s trained on (even if seemingly totally unrelated), you must also have it try working on this AI benchmark, (frequency based on usage), and publicly share the score (but not necessarily pictures of outputs—there’s the risk of user data being leaked, by means of being recreated in minecraft, and the user retains full responsibility for the security of such data, and is encouraged but not required to run it ‘offline’, i.e. not on Minecraft servers).
In general:
a research AI that will train agents based on its understanding of described benchmarks does sound interesting. Although how it would get ahold of things like ‘human feedback’, and craft setups for that, isn’t clear.*
Trying to create a setup so AIs can learn from other AIs (crafting rewards—seems unlikely. Expert demonstrations—might be doable. Whether this would be more or less useful than a human demonstration, I’m not sure. There actually might be the option to ask ‘what would _ do in this situation’ if you can actually run _.)
Edited to add:
*First I was imagining Skynet. Now I’m imagining a really weird license agreement: you may use this AI in exchange for [$ + billing schedule], but with the data that’s trained on (even if seemingly totally unrelated), you must also have it try working on this AI benchmark, (frequency based on usage), and publicly share the score (but not necessarily pictures of outputs—there’s the risk of user data being leaked, by means of being recreated in minecraft, and the user retains full responsibility for the security of such data, and is encouraged but not required to run it ‘offline’, i.e. not on Minecraft servers).