Whatever is happening, I’m really concerned about the current “sufficiently big model starts to exhibit <weird behaviour A>. I don’t understand, but also don’t care, here is a dirty workaround and just give it more compute lol” paradigm. I don’t think this is very safe.
If I could get people to change that paradigm, you bet I would.