Thanks for the update! I have a few questions:
In last year’s update, you suspected that alignment was gradually converging towards a paradigm. What do you think is the state of the paradigmatic convergence now?
Also, as @Chris_Leong asked, does using sparse autoencoders to find monosemantic neurons help find natural abstractions? Or is that still Choosing The Ontology? If not these types of concepts, what do you think natural abstractions are, or will turn out to be?