My understanding of capabilities training is that there are a lot of knobs and fiddly bits and characteristics of your data, and if you screw them up then the thing doesn’t work right. But you can tinker with them until you get them right and fix the issues, and if you have the experience and intuition you can do a huge ‘YOLO run’ where you guess at all of them and have a decent chance of that part working out.
Pressman is almost certainly not referring to YOLO runs, but rather stuff like frankenmerges, where you can take random bits from completely different neural networks, stick them together in a way that looks plausible, and it just works. For a while the top open source model was Goliath, a model created in this way. It’s also frequently the case that researchers discover they failed to correctly implement some aspect of a model, and yet it still trained just fine.
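To make the frankenmerge idea concrete, here is a toy sketch of the passthrough (layer-stacking) recipe loosely resembling how Goliath was assembled from two 70B fine-tunes: you take overlapping layer ranges from each source model and simply concatenate them into one deeper model. The layer labels, slice boundaries, and the `passthrough_merge` helper are all illustrative assumptions, not the actual recipe.

```python
def passthrough_merge(model_a, model_b, slices):
    """Stack layer ranges from two models into one deeper model.

    model_a, model_b: lists standing in for each model's layer stack
    (real merges copy weight tensors; labels suffice for the sketch).
    slices: list of (source, start, end) half-open layer ranges.
    """
    merged = []
    for source, start, end in slices:
        layers = model_a if source == "a" else model_b
        merged.extend(layers[start:end])
    return merged


# Two toy 80-layer "models", represented only by layer labels.
model_a = [f"A{i}" for i in range(80)]
model_b = [f"B{i}" for i in range(80)]

# Alternate overlapping slices from each model -- the kind of recipe
# that merely "looks plausible" yet can yield a working, much deeper model.
slices = [("a", 0, 16), ("b", 8, 24), ("a", 16, 32), ("b", 24, 40),
          ("a", 32, 48), ("b", 40, 56), ("a", 48, 64), ("b", 56, 72),
          ("a", 64, 80)]

merged = passthrough_merge(model_a, model_b, slices)
print(len(merged))  # 144 layers: deeper than either 80-layer source
```

The surprising part, which I take to be Pressman’s point, is that nothing in this procedure retrains or reconciles the stacked layers; the seams between slices are just left there, and the result often works anyway.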