Going from zero to “produce an AI that learns the task entirely from demonstrations and/or natural language description” is really hard for the modern AI research hive mind. You instead have to give it a shaped reward: easier breadcrumbs along the way (such as allowing handcrafted heuristics, and allowing knowledge of a particular target task) to get the hive mind started making progress.
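As a concrete illustration of the shaped-reward idea (a minimal sketch; the gridworld, goal, and constants here are hypothetical, not from any particular benchmark), one standard form is potential-based shaping, which adds breadcrumb rewards without changing which policies are optimal (Ng, Harada & Russell, 1999):

```python
import numpy as np

GOAL = np.array([9, 9])  # hypothetical goal cell in a 10x10 gridworld

def sparse_reward(next_state):
    """The hard version: reward only on full task completion."""
    return 1.0 if np.array_equal(next_state, GOAL) else 0.0

def potential(state):
    """Handcrafted heuristic (a 'breadcrumb'): negative distance to the goal."""
    return -np.linalg.norm(state - GOAL)

def shaped_reward(state, next_state, gamma=0.99):
    """Sparse reward plus the potential-based term gamma*phi(s') - phi(s),
    which provably leaves the optimal policy unchanged."""
    return sparse_reward(next_state) + gamma * potential(next_state) - potential(state)
```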
To clarify, by hive mind, do you mean humans?
Yes. I’m being a bit whimsical here; I’m tickled by the analogy to training neural nets.
In general:
A research AI that will train agents based on its understanding of described benchmarks does sound interesting, although how it would get hold of things like ‘human feedback’, and craft setups for that, isn’t clear.*
Trying to create a setup so AIs can learn from other AIs: crafting rewards seems unlikely; expert demonstrations might be doable. Whether those would be more or less useful than a human demonstration, I’m not sure. There might also be the option to ask ‘what would _ do in this situation?’ if you can actually run _.
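One way to cash out asking ‘what would _ do in this situation’, assuming you can actually run _, is to query it as an expert on the states your learner visits, DAgger-style. A minimal sketch, where env, learner_policy, and expert_policy are hypothetical stand-ins rather than any real API:

```python
def collect_expert_labels(env, learner_policy, expert_policy, n_steps=1000):
    """Roll out the learner, but record what the expert *would* have done,
    then train the learner by supervised learning on the resulting pairs."""
    dataset = []
    state = env.reset()
    for _ in range(n_steps):
        expert_action = expert_policy(state)        # "what would _ do here?"
        dataset.append((state, expert_action))
        state, _, done, _ = env.step(learner_policy(state))  # follow the learner
        if done:
            state = env.reset()
    return dataset
```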
Edited to add:
*First I was imagining Skynet. Now I’m imagining a really weird license agreement: you may use this AI in exchange for [$ + billing schedule], but with the data it’s trained on (even if seemingly totally unrelated), you must also have it try working on this AI benchmark (frequency based on usage) and publicly share the score, though not necessarily pictures of the outputs: there’s the risk of user data being leaked by means of being recreated in Minecraft. The user retains full responsibility for the security of such data, and is encouraged but not required to run it ‘offline’, i.e. not on Minecraft servers.
It’s not “from zero”, though; I think we already have ML techniques that should be applicable here.