I think nets are usually scaled up in depth as well as width, so the NTK limit doesn't apply. Convergence to the NTK is controlled by the ratio of depth to width: the kernel only becomes deterministic as this ratio approaches 0.
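To make the depth-to-width point concrete, here is a rough numerical sketch rather than a proof. It samples random ReLU MLPs at initialization and measures how much the empirical NTK value K(x, x) fluctuates from one random draw to the next. Everything specific here is my own assumption for illustration: the function names `ntk_diagonal` and `ntk_relative_spread`, the He-style `sqrt(2 / fan_in)` scaling, the scalar output, the fixed input, and the two (depth, width) settings. It also only looks at a single diagonal kernel entry at initialization, not the kernel's evolution during training.

```python
import numpy as np


def ntk_diagonal(x, depth, width, rng):
    """Empirical NTK entry K(x, x) = ||df/dtheta||^2 for one random ReLU MLP.

    Assumed setup (illustrative): weights ~ N(0, 1), each hidden layer computes
    z = sqrt(2 / fan_in) * W @ a, and the scalar output is v @ a / sqrt(width).
    """
    acts = [x]                      # acts[l] is the input to hidden layer l
    weights, scales, masks = [], [], []
    a = x
    for _ in range(depth):
        fan_in = a.shape[0]
        W = rng.standard_normal((width, fan_in))
        s = np.sqrt(2.0 / fan_in)
        z = s * (W @ a)
        a = np.maximum(z, 0.0)
        weights.append(W)
        scales.append(s)
        masks.append(z > 0)         # ReLU derivative for this layer
        acts.append(a)

    v = rng.standard_normal(width)
    out_scale = 1.0 / np.sqrt(width)

    # Contribution of the output weights: df/dv = out_scale * a.
    k = out_scale**2 * (a @ a)

    # Backpropagate delta = df/dz through the hidden layers.
    delta = out_scale * v * masks[-1]
    for l in range(depth - 1, -1, -1):
        # df/dW_l = scales[l] * outer(delta, acts[l]); add its squared norm.
        k += scales[l] ** 2 * (delta @ delta) * (acts[l] @ acts[l])
        if l > 0:
            delta = scales[l] * (weights[l].T @ delta) * masks[l - 1]
    return k


def ntk_relative_spread(depth, width, trials=200, d_in=16, seed=0):
    """Std / mean of K(x, x) over independent random initializations."""
    rng = np.random.default_rng(seed)
    x = np.ones(d_in) / np.sqrt(d_in)   # fixed unit-norm input
    samples = np.array(
        [ntk_diagonal(x, depth, width, rng) for _ in range(trials)]
    )
    return samples.std() / samples.mean()


# Small depth/width ratio versus depth comparable to width.
shallow_wide = ntk_relative_spread(depth=2, width=256)
deep_narrow = ntk_relative_spread(depth=32, width=32)
print(f"depth/width = 2/256:  relative spread {shallow_wide:.3f}")
print(f"depth/width = 32/32:  relative spread {deep_narrow:.3f}")
```

If the claim holds, the relative spread should be noticeably larger in the second setting, where depth is comparable to width, than in the first, where the ratio is close to 0. A kernel that still varies a lot across random initializations is not well described by a single deterministic limit.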