To be meaningful, this requires whole-process feedback: we need to judge thoughts by their entire chain of origination. (This is technically challenging, because the easiest way to implement process-level feedback is to create a separate meta-level which oversees the rest of the system; but then this meta-level would not itself be subject to oversight.)
I thought you were going to say it’s technically challenging because you need transparency / interpretability … At least in human cognition (and logical induction too, right?) thoughts-about-stuff and thoughts-about-thoughts-about-stuff and thoughts-about-thoughts-about-thoughts-about-stuff and thoughts-about-all-levels and so on are all mixed together in a big pot, and they share the same data type, and they’re all inside the learned black box.
Well, transparency is definitely a challenge. I’m mostly saying this is a technical challenge even if you have magical transparency tools, and I’m kind of trying to design the system you would want to use if you had magical transparency tools.
But I don’t think it’s difficult for the reason you say. I don’t think multi-level feedback or whole-process feedback should be construed as requiring the levels to be sorted out nicely. Whole-process feedback in particular just means that you can give feedback on the whole chain of computation; it’s basically against sorting into levels.
Multi-level feedback means, to me, that if we have an insight about, EG, how to think about value uncertainty (which is something like a 3rd-level thought: 1st level is information about object level; 2nd level is information about the value function; 3rd level is information about how to learn the value function), we can give the system feedback about that. So the system doesn’t need to sort things out into levels; it just needs to be capable of accepting feedback of each type.
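To make the distinction concrete, here is a minimal sketch (all names and types are hypothetical, not from any real system): thoughts at every level share one data type and record their chain of origination, and whole-process feedback attaches to that entire chain rather than to a sorted-out level.

```python
from dataclasses import dataclass, field

# Hypothetical sketch: thoughts share one data type regardless of "level",
# and feedback applies to a whole chain of origination, not to a level.

@dataclass
class Thought:
    content: str
    parents: list["Thought"] = field(default_factory=list)  # chain of origination

    def chain(self) -> list["Thought"]:
        """Collect the whole chain of computation that produced this thought."""
        seen: set[int] = set()
        out: list[Thought] = []
        def walk(t: "Thought") -> None:
            if id(t) in seen:
                return
            seen.add(id(t))
            for p in t.parents:
                walk(p)
            out.append(t)
        walk(self)
        return out

@dataclass
class Feedback:
    target: Thought
    signal: float  # e.g. endorse (+) or reject (-)
    note: str = ""

def whole_process_feedback(fb: Feedback) -> list[tuple[str, float]]:
    """Judge every thought in the chain, whatever its level happens to be."""
    return [(t.content, fb.signal) for t in fb.target.chain()]

# The three "levels" from the example above, all the same data type:
obj = Thought("the stove is hot")                              # 1st: object level
val = Thought("burns are bad", parents=[obj])                  # 2nd: value function
meta = Thought("how to handle value uncertainty", parents=[val])  # 3rd: learning values

fb = Feedback(meta, signal=+1.0, note="endorse this way of thinking")
print(whole_process_feedback(fb))
```

Note that nothing in the code sorts thoughts into levels; the feedback simply propagates over whichever chain of computation produced the thought, which is the sense in which whole-process feedback is "against sorting into levels."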