He’s talking about “modern AI training” i.e. “giant, inscrutable matrices of floating-point numbers”. My impression is that he thinks it is possible (but extremely difficult) to build aligned ASI, but nearly impossible to bootstrap modern DL systems to alignment.
Would you agree that calling it “poorly defined” instead of “aligned” is an accurate phrasing of his argument, or not? I’ve edited the post.