There is already a sizable body of research in this direction, the so-called BERTology. I believe the methodology being developed is useful, but detailed knowledge of specific models is probably superfluous. In a few months or years we will have new models, and what you know about the current ones will not generalize.