This requires hitting a narrow window: the data must be good enough that the system can tell it should use human values as a proxy, yet bad enough that the system cannot infer the specifics of the data-collection process well enough to model that process directly. Such a window may not exist at all.
Are there any real-world examples of this, not necessarily in the human-values setting?