For Satoshi-style scenarios, where you have a very small corpus or the corpus is otherwise problematic (in this case, you can't easily get new Satoshi text that was held out from training), you could instead use similarity/distance metrics: https://www.lesswrong.com/posts/dLg7CyeTE4pqbbcnp/language-models-model-us?commentId=MNk22rZeELjoh7bhW
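To make "similarity/distance metrics" concrete, here is a minimal sketch of one common stylometric option: cosine similarity over character n-gram counts. This is my own illustration and not necessarily the method in the linked comment; the function names (`char_ngrams`, `cosine_similarity`, `rank_candidates`) and the choice of trigrams are assumptions made for the example.

```python
import math
from collections import Counter


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two n-gram count profiles (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(reference: str, candidates: dict[str, str], n: int = 3) -> list[tuple[str, float]]:
    """Rank candidate authors by similarity of their text to the reference text."""
    ref_profile = char_ngrams(reference, n)
    scores = {
        name: cosine_similarity(ref_profile, char_ngrams(text, n))
        for name, text in candidates.items()
    }
    # Highest similarity first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy, made-up strings purely for illustration; not real corpus data.
    reference = "the quick brown fox jumps over the lazy dog"
    candidates = {
        "candidate_a": "the quick brown dog jumps over the lazy fox",
        "candidate_b": "zzzz qqqq xxxx",
    }
    for name, score in rank_candidates(reference, candidates):
        print(f"{name}: {score:.3f}")
```

The scores are only meaningful relative to each other, and with a very small corpus they will be noisy, so you would want to compare against a pool of plausible alternative authors rather than read much into any single number.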