I’m not sure why your default assumption is that the AGI’s understanding of the world is at a “low level”. My default assumption would be that it would develop a predictive world-model with entities that are at many different levels at once, sorta like humans do. (Or is that just a toy example to illustrate what you’re talking about?)
I do expect that systems trained with limited information/compute will often learn multi-level models. That said, there are a few reasons why the low level is still the right translation target to think about.
First, there’s the argument from the beginning of the OP: in the limit of abundant information & compute, there’s no need for multi-level models; just directly modelling the low-level will have better predictive power. That’s a fairly general argument, which applies even beyond AI, so it’s useful to keep in mind.
But the main reason to treat low-level as the translation target is: assuming an AI does use high-level models, translating into those models directly will only be easier than translating into the low level to the extent that the AI’s high-level models are similar to a human’s high-level models. We don’t have any reason to expect an AI to use abstraction levels similar to a human’s except to the extent that those abstraction levels are determined by the low-level structure. In studying how to translate our own high-level models into low-level structure, we also learn when and to what extent an AI is likely to learn similar high-level structures, and what the correspondence looks like between ours and theirs.
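To make “a high-level abstraction determined by the low-level structure” a bit more concrete, here’s a minimal toy sketch (my own illustration, not anything from the post): a “low-level” state of simulated particles, plus two summary variables that are deterministic functions of it. Any two models that agree on the low-level state automatically agree on these summaries, whether or not they represent them explicitly.

```python
# Toy illustration (hypothetical system, not from the original discussion):
# a "low-level" state and two "high-level" summaries fully determined by it.
import numpy as np

rng = np.random.default_rng(0)

# Low-level state: positions and velocities of N particles.
N = 1000
positions = rng.normal(size=(N, 3))
velocities = rng.normal(size=(N, 3))

def temperature(vel):
    """High-level variable: mean kinetic energy per particle (unit mass)."""
    return 0.5 * np.mean(np.sum(vel ** 2, axis=1))

def center_of_mass(pos):
    """Another high-level variable: average position."""
    return pos.mean(axis=0)

# Both summaries are functions of the low-level state alone, so any predictor
# that models the low-level state accurately pins them down too -- the
# high-level abstraction is determined by the low-level structure.
print("temperature:", temperature(velocities))
print("center of mass:", center_of_mass(positions))
```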