Being able to unlearn/overwrite/forget bad unhelpful abstractions and heuristics is probably a very useful ability.
Are you imagining the training process doing this, or humans doing it after training?
I was imagining humans deliberately designing and testing a training process that did this automatically. I haven’t thought about how to define, much less automatically detect, unhelpful abstractions or heuristics though.
Are you imagining the training process doing this, or humans doing it after training?
I was imagining humans deliberately designing and testing a training process that did this automatically. I haven’t thought about how to define, much less automatically detect, unhelpful abstractions or heuristics though.