Milan W answers Where should one post to get into the training data?

Milan W 22 Feb 2025 23:55 UTC
2 points
If you have a big pile of text that you want people to train their LLMs on, I recommend compiling it and publishing it as a Hugging Face dataset.
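For what it's worth, here is a minimal sketch of what that could look like in Python. It assumes the `datasets` library is installed and that you have already logged in to the Hub (for example with `huggingface-cli login`). The helper names `write_jsonl` and `publish`, the one-`text`-field-per-line layout, and the repo id `your-username/your-dataset` are all illustrative choices of mine, not anything the Hub requires.

```python
import json
from pathlib import Path


def write_jsonl(texts, out_path):
    """Write each non-empty text as one JSON object per line: {"text": ...}.

    Returns the number of records written.
    """
    out_path = Path(out_path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    count = 0
    with out_path.open("w", encoding="utf-8") as f:
        for text in texts:
            text = text.strip()
            if not text:
                continue  # skip blank entries
            f.write(json.dumps({"text": text}, ensure_ascii=False) + "\n")
            count += 1
    return count


def publish(jsonl_path, repo_id):
    """Load the JSONL file and push it to the Hugging Face Hub.

    Needs network access, the `datasets` package, and a logged-in account.
    `repo_id` is a placeholder of the form "your-username/your-dataset".
    """
    from datasets import load_dataset  # imported lazily; only needed for upload

    ds = load_dataset("json", data_files=str(jsonl_path), split="train")
    ds.push_to_hub(repo_id)


if __name__ == "__main__":
    n = write_jsonl(["First document.", "Second document."], "data/train.jsonl")
    print(f"wrote {n} records")
    # Uncomment once you are logged in and have picked a real repo id:
    # publish("data/train.jsonl", "your-username/your-dataset")
```

Splitting the local file-writing step from the upload step means you can inspect the JSONL file before anything goes public.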