Connor Leahy comments on Testing The Natural Abstraction Hypothesis: Project Intro

Connor Leahy 8 Apr 2021 11:47 UTC
4 points
I am so excited about this research, good luck! I think it’s almost impossible that this won’t turn up at least some interesting partial results, even if the strong versions of the hypothesis don’t work out (my guess is that you would run into some kind of incomputability or incoherence result when trying to find an algorithm that works for every environment).
This is one of the research directions that make me the most optimistic that alignment might really be tractable!