My take is:
I think making this post was a good idea. I’m personally interested in deconfusing the topic of universality (which should basically capture what “learning everything the model knows” means), and you brought up a good “simple” example to build intuition on.
What I would call your mistake is mostly an 8, with a bit of the related ones (so 3 and 4?). Phrasing it as “can we do that” is a mistake in my opinion, because the topic is very confused (as shown by the comments). On the other hand, I think asking what it would mean is a very exciting problem. It also gives a more concrete form to the problem of deconfusing universality, which AFAIK is important to Paul’s approaches to alignment.