I’m less interested in robustness to scaling down than scaling up; if we can verify empirically that a relevant component is (well) above the capability level at which the overall system would fail, then I don’t see a strong reason for concern.
I’m ambivalent about robustness to relative scale, since it seems like you can often have strong empirical evidence about relative capabilities + pretty good theoretical reasons. As a particularly extreme example, my current scheme probably requires some assumption like “The model after N+1 gradient updates isn’t that much better than model after N gradient updates.” which I think is likely to be OK.
While I’m not an AI researcher so not sure which concerns are most relevant, I’ll say that “risk of scaling down” and “risk to relative scale” hadn’t even been on my radar as things to pay attention to, so having them succinctly given handles to refer to seemed handy.
I am worried that if you train both sides of playing an asymmetric game, you run into problems where you scale up at playing one side faster than playing the other side. This makes me think “The model after N+1 gradient updates isn’t that much better than model after N gradient updates.” is not enough of an assumption if you operationalize it in a way that ensures you are using the model to do the same thing in both cases, and if you don’t operationalize it in that way, it seem like too strong of an assumption.
I’m less interested in robustness to scaling down than scaling up; if we can verify empirically that a relevant component is (well) above the capability level at which the overall system would fail, then I don’t see a strong reason for concern.
I’m ambivalent about robustness to relative scale, since it seems like you can often have strong empirical evidence about relative capabilities + pretty good theoretical reasons. As a particularly extreme example, my current scheme probably requires some assumption like “The model after N+1 gradient updates isn’t that much better than model after N gradient updates.” which I think is likely to be OK.
While I’m not an AI researcher so not sure which concerns are most relevant, I’ll say that “risk of scaling down” and “risk to relative scale” hadn’t even been on my radar as things to pay attention to, so having them succinctly given handles to refer to seemed handy.
It does seem nice to group them and have clear handles.
ML researchers more often think about risks from scaling down and relative scale, since those come up more frequently (and are harder to fix) today.
I am worried that if you train both sides of playing an asymmetric game, you run into problems where you scale up at playing one side faster than playing the other side. This makes me think “The model after N+1 gradient updates isn’t that much better than model after N gradient updates.” is not enough of an assumption if you operationalize it in a way that ensures you are using the model to do the same thing in both cases, and if you don’t operationalize it in that way, it seem like too strong of an assumption.
I agree that asymmetric games are the interesting case, and it’s rare you can use an assumption this weak.