Brain-like AGI safety
Shard Theory
Iterated Amplification
Much of interpretability work
Possibly Pragmatic AI Safety, idk much about it.
The selection theorems branch of research
The particular selection theorem case of modularity
Thanks! By interpretability work, you mean in the vein of Colah and the like?
Yes