Pattern comments on DL towards the unaligned Recursive Self-Optimization attractor

Pattern 2 Jan 2022 6:26 UTC
2 points
I’m not actually convinced that interpretability is doomed—in the OP I was exploring something of a worst case possibility.
It might be useful to mark this in the post, perhaps with a note at the beginning saying that the post explores a model and its implications.