>If I thought large language models were already capable of doing simple plug-and-chug problems, I’m not sure why I’d update much on this development.
I suppose I just have different intuitions on this. Let’s just make a second bet. I imagine you can find another element for your list you will be comfortable adding—it doesn’t necessarily have to be a dataset, just something in the same spirit as the other items in the list.
I think I’ll pass up an opportunity for a second bet for now. My mistake was being too careless in the first place—and I’m not currently too interested in doing a deeper dive into what might be a good replacement for MATH.
You could just drop MATH and make a bet at different odds on the remaining items.