Going from zero to “produce an AI that learns the task entirely from demonstrations and/or natural language description” is really hard for the modern AI research hive mind. You instead have to give it a shaped reward: easier breadcrumbs along the way (such as allowing handcrafted heuristics, and allowing knowledge of a particular target task) to get the hive mind started making progress.
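As a concrete illustration of the shaped-reward idea (a minimal sketch; the gridworld, goal, and constants here are hypothetical, not from any particular benchmark), one standard form is potential-based shaping, which adds breadcrumb rewards without changing which policies are optimal (Ng, Harada & Russell, 1999):

```python
import numpy as np

GOAL = np.array([9, 9])  # hypothetical goal cell in a 10x10 gridworld

def sparse_reward(next_state):
    """The hard version: reward only on full task completion."""
    return 1.0 if np.array_equal(next_state, GOAL) else 0.0

def potential(state):
    """Handcrafted heuristic (a 'breadcrumb'): negative distance to the goal."""
    return -np.linalg.norm(state - GOAL)

def shaped_reward(state, next_state, gamma=0.99):
    """Sparse reward plus the potential-based term gamma*phi(s') - phi(s),
    which provably leaves the optimal policy unchanged."""
    return sparse_reward(next_state) + gamma * potential(next_state) - potential(state)
```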
To clarify, by hive mind, do you mean humans?
Yes. I’m being a bit whimsical here; I’m tickled by the analogy to training neural nets.
In general:
A research AI that will train agents based on its understanding of described benchmarks does sound interesting, although how it would get hold of things like ‘human feedback’, and craft setups for that, isn’t clear.*
Trying to create a setup so AIs can learn from other AIs: crafting rewards seems unlikely; expert demonstrations might be doable. Whether those would be more or less useful than a human demonstration, I’m not sure. There might also be the option to ask ‘what would _ do in this situation?’ if you can actually run _.
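One way to cash out asking ‘what would _ do in this situation’, assuming you can actually run _, is to query it as an expert on the states your learner visits, DAgger-style. A minimal sketch, where env, learner_policy, and expert_policy are hypothetical stand-ins rather than any real API:

```python
def collect_expert_labels(env, learner_policy, expert_policy, n_steps=1000):
    """Roll out the learner, but record what the expert *would* have done,
    then train the learner by supervised learning on the resulting pairs."""
    dataset = []
    state = env.reset()
    for _ in range(n_steps):
        expert_action = expert_policy(state)        # "what would _ do here?"
        dataset.append((state, expert_action))
        state, _, done, _ = env.step(learner_policy(state))  # follow the learner
        if done:
            state = env.reset()
    return dataset
```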
Edited to add:
*First I was imagining Skynet. Now I’m imagining a really weird license agreement: you may use this AI in exchange for [$ + billing schedule], but with the data it’s trained on (even if seemingly totally unrelated), you must also have it try working on this AI benchmark (frequency based on usage) and publicly share the score, though not necessarily pictures of the outputs: there’s the risk of user data being leaked by means of being recreated in Minecraft. The user retains full responsibility for the security of such data, and is encouraged but not required to run it ‘offline’, i.e. not on Minecraft servers.
It’s not “from zero”, though; I think we already have ML techniques that should be applicable here.