One way to reject this case for HRAD work is to argue that imprecise theories of rationality are insufficient for helping to align AI systems. This is what Rohin does in this comment, where he argues that imprecise theories cannot be used to build things "2+ levels above".
I should note that there are some things in world 1 that I wouldn’t reject this way—e.g. one of the examples of deconfusion is “anyhow, we could just unplug [the AGI].” That is directly talking about AGI safety, and so deconfusion on that point is “1 level away” from the systems we actually build, and isn’t subject to the critique. (And indeed, I think it is important and great that this statement has been deconfused!)
It is my impression though that current HRAD work is not “directly talking about AGI safety”, and is instead talking about things that are “further away”, to which I would apply the critique.