Has anyone tried something like this?
Create a conlang with very simple grammar and a small vocabulary (not Toki Pona small, more like xkcd Thing Explainer small).
Use LLMs to translate a large amount of text into this conlang (a rough sketch of this step is below).
Train a new LLM on these translations.
Do interpretability research on this LLM.
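The translation step seems like the easiest part to prototype. Here is a minimal sketch, assuming the OpenAI Python client; the model name, the conlang_spec.txt grammar file, and the file layout are placeholders for whatever setup you actually use, not a tested pipeline.

```python
# Rough sketch of step 2: batch-translating a text corpus into the conlang.
# Assumes the OpenAI Python client; model name, prompt, and file layout are
# placeholders, not a worked-out pipeline.
import json
from pathlib import Path

from openai import OpenAI

client = OpenAI()

# A short grammar/vocabulary reference for the conlang, kept in the prompt so
# the translator model stays inside the restricted language.
CONLANG_SPEC = Path("conlang_spec.txt").read_text()


def translate_to_conlang(text: str) -> str:
    """Ask the LLM to rewrite one document using only the conlang."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[
            {
                "role": "system",
                "content": "Translate the user's text into the constructed "
                           "language described below, using only its grammar "
                           "and vocabulary.\n\n" + CONLANG_SPEC,
            },
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content


def build_corpus(src_dir: str, out_path: str) -> None:
    """Translate every .txt file in src_dir into a JSONL training corpus."""
    with open(out_path, "w") as out:
        for path in sorted(Path(src_dir).glob("*.txt")):
            translated = translate_to_conlang(path.read_text())
            out.write(json.dumps({"source": path.name, "text": translated}) + "\n")


if __name__ == "__main__":
    build_corpus("raw_texts", "conlang_corpus.jsonl")
```

The resulting JSONL could then feed a small-model training run, which is where the interpretability work (step 4) would start.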
There is a Simple English Wikipedia with over 200 000 articles, which is not exactly what you want, but seems to be a thing that already exists and is somewhat in that direction.
I don’t recall any interpretability experiments with TinyStories offhand, but I’d be surprised if there aren’t any.
I agree that this sounds interesting and that I haven't heard of anyone doing this yet. I have heard of some interpretability experiments with TinyStories, as Zac mentioned. I think the more interesting thing would be a dataset enriched with synthetic data showing inherently logical content: deductive symbolic logic and math problems worked out (correctly!) step by step. You could have a dataset of this, plus simplified-language versions of middle school through undergrad science textbooks. I expect the resulting model would be more logical and cohesive. It would be interesting to see whether this made it fundamentally more interpretable.
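To make that concrete, here is a minimal sketch of what such synthetic data could look like: programmatically generated arithmetic and syllogism problems, so every worked step is correct by construction. The templates, nonce words, and JSONL format are illustrative assumptions, not an actual dataset spec.

```python
# Minimal sketch: synthetic worked math and deductive-logic examples.
# Templates and output format are illustrative only.
import json
import random


def worked_addition(rng: random.Random) -> str:
    """Two-digit addition, worked column by column."""
    a, b = rng.randint(10, 99), rng.randint(10, 99)
    ones = a % 10 + b % 10
    carry = ones // 10
    tens = a // 10 + b // 10 + carry
    return (
        f"Problem: {a} + {b} = ?\n"
        f"Step 1: add the ones digits: {a % 10} + {b % 10} = {ones}.\n"
        f"Step 2: write {ones % 10}, carry {carry}.\n"
        f"Step 3: add the tens digits plus the carry: {a // 10} + {b // 10} + {carry} = {tens}.\n"
        f"Answer: {a} + {b} = {tens * 10 + ones % 10}."
    )


def worked_syllogism(rng: random.Random) -> str:
    """A simple all-A-are-C syllogism with the deduction spelled out."""
    a, b, c = rng.sample(["wugs", "blicks", "fepps", "zorks"], 3)
    return (
        f"Premise 1: all {a} are {b}.\n"
        f"Premise 2: all {b} are {c}.\n"
        f"Step 1: take any {a.rstrip('s')}; by premise 1 it is a {b.rstrip('s')}.\n"
        f"Step 2: by premise 2, every {b.rstrip('s')} is a {c.rstrip('s')}.\n"
        f"Conclusion: all {a} are {c}."
    )


def build_dataset(n: int, out_path: str, seed: int = 0) -> None:
    """Write n worked examples as a JSONL file for pretraining mix-in."""
    rng = random.Random(seed)
    with open(out_path, "w") as out:
        for _ in range(n):
            text = rng.choice([worked_addition, worked_syllogism])(rng)
            out.write(json.dumps({"text": text}) + "\n")


if __name__ == "__main__":
    build_dataset(100_000, "synthetic_logic.jsonl")
```

Because the generator controls the answers, the "correctly!" requirement is automatic, and you can scale the mix of logic versus textbook-style prose however you like.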