I was wondering how the models perform on the multiplication test by default. If they perform better when incentivized to do well, that might mean they are not using their full capabilities by default.