I don’t think the failure of evolution is evidence that alignment is impossible or even hard.
Evolution wasn’t smart enough to realize that alignment was a problem. It put the retina in backwards. Evolution can do really dumb things. The failure of evolution is consistent with a world where alignment takes two lines of Python and is obvious to any smart human who gives the problem a few hours’ thought.
“Smart humans haven’t solved it yet” gives a much stronger lower bound on difficulty than evolution’s failure does, at least if alignment is the sort of problem best solved with simple general principles (where humans are better) as opposed to piling on spaghetti code (where evolution can sometimes beat humans).