Bogdan Ionut Cirstea comments on Thoughts on sharing information about language model capabilities

Bogdan Ionut Cirstea 18 Jul 2024 23:41 UTC
1 point
E.g. I’d bet on increasingly wide gaps between each of LM snap judgments, chain of thought, and tool-use.
Some evidence in favor: https://x.com/YangjunR/status/1793681241398788319 (for an increasingly wide gap between an LM's single forward pass (‘snap judgment’) and CoT), and https://xwang.dev/mint-bench/ (for tool use becoming increasingly useful with both model scale and the number of tool-use turns).