Big-Bench would appear to provide another instance of this in the latest PaLM inner-monologue paper, “Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them”, Suzgun et al 2022: they select a subset of the hardest feasible-looking BIG-Bench tasks and benchmark PaLM on them. No additional training, just better prompting on a benchmark designed to be as hard as possible. Inner-monologue prompting, unsurprisingly by this point, yields considerable improvement… and it also changes the scaling for several of the benchmarks—what looks like a flat scaling curve with the standard obvious 5-shot benchmark prompt turns out to be a much steeper curve as soon as they use the specific chain-of-thought prompt. (For example, “Web of Lies” goes from a consistent random ~50% at all model sizes to scaling smoothly from ~45% to ~100% performance.) And I don’t know any reason to think that CoT is the best possible inner-monologue prompt for PaLM, either.
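To make the contrast concrete, here is a minimal sketch of the two prompting styles on a Web of Lies-style question (each person asserts whether the previous person tells the truth). The exemplars and names are hypothetical illustrations of the BBH task format, not the paper’s exact prompts:

```python
# Illustrative sketch of the two prompting styles on a Web of Lies-style
# question (hypothetical exemplars, not the paper's exact prompts).

question = ("Fidel tells the truth. Jerry says Fidel tells the truth. "
            "Vina says Jerry tells the truth. Does Vina tell the truth?")

# Standard few-shot style: exemplars show only the final answer.
standard_prompt = (
    "Q: Ka tells the truth. Delbert says Ka lies. "
    "Does Delbert tell the truth?\nA: No\n\n"
    f"Q: {question}\nA:"
)

# Chain-of-thought style: exemplars spell out the intermediate reasoning,
# cueing the model to emit its own reasoning before answering.
cot_prompt = (
    "Q: Ka tells the truth. Delbert says Ka lies. "
    "Does Delbert tell the truth?\n"
    "A: Ka tells the truth. Delbert's claim contradicts that, "
    "so Delbert lies. The answer is No.\n\n"
    f"Q: {question}\nA: Let's think step by step."
)

# The task itself reduces to propagating one boolean down the chain:
# a speaker is truthful iff their claim about the previous speaker is correct.
def web_of_lies(first_truthful: bool, claims: list[bool]) -> bool:
    honest = first_truthful
    for says_previous_is_truthful in claims:
        honest = (says_previous_is_truthful == honest)
    return honest
```

The boolean propagation in `web_of_lies` is exactly the kind of one-statement-per-step computation a chain-of-thought prompt elicits, which is plausibly why the task flips from random-chance to near-perfect once the model is cued to reason serially.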
“Sampling can show the presence of knowledge but not the absence.”