You want the task to be unambiguous to the LLM (multiplying n-digit numbers is clear, but “doing hard math questions” is vague). You also want a range of difficulty levels, so that results vary both within a single LLM and between LLMs, plus a high ceiling so the strongest models don't max it out. I think this would take at least some iteration.
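As a rough illustration of the n-digit multiplication idea, here is a minimal sketch of a task generator where difficulty is set by the digit count. The function names (`make_multiplication_task`, `make_task_suite`), the prompt wording, and the choice of levels are all my own assumptions for illustration, not an established benchmark.

```python
import random


def make_multiplication_task(n_digits, rng):
    """Return (prompt, answer) for multiplying two random n-digit integers."""
    lo = 10 ** (n_digits - 1)   # smallest n-digit number (1 when n_digits == 1)
    hi = 10 ** n_digits - 1     # largest n-digit number
    a = rng.randint(lo, hi)
    b = rng.randint(lo, hi)
    # Hypothetical prompt wording; the point is that the task is fully specified.
    prompt = f"What is {a} * {b}? Reply with only the integer."
    return prompt, a * b


def make_task_suite(digit_levels, per_level, seed=0):
    """Build {n_digits: [(prompt, answer), ...]} across a range of difficulties."""
    rng = random.Random(seed)  # fixed seed so every model sees the same tasks
    return {
        n: [make_multiplication_task(n, rng) for _ in range(per_level)]
        for n in digit_levels
    }


# Levels from 1 to 20 digits: easy at the bottom, with room at the top
# so the ceiling stays high. Python ints are arbitrary precision, so the
# reference answers are exact at any digit count.
suite = make_task_suite(digit_levels=range(1, 21), per_level=5)
```

Scoring each model per digit level would then show where accuracy falls off for that model and how that point differs between models. Whether these particular levels actually spread the models out is exactly the part that would need iteration.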