I am very vaguely in the Natural Abstractions area of alignment approaches. I'll give this paper a closer read tomorrow (because I promised myself I wouldn't try to get work done today), but my quick take is: it'd be huge if true, but there isn't much more than that there yet. It also offers no argument that, even if representations are converging for now, they'll keep converging; nothing rules out that (say) adding a whole bunch more effectively-usable compute lets the AI stop chunking objectspace into subtypes and just understand every individual object directly.