One failure mode is that the modification makes the model very dumb in all instances.
Thanks James!
Yea, good point. Perhaps an extra condition we’d need to include is that the “difficulty of meta-level questions” should be the same before and after the modification, e.g. the distribution over stuff it’s good at and stuff it’s bad at should be just as complex before and after (not just good at everything or bad at everything).