Do you mean as a (presumably RL?) training method to make LLMs themselves superhuman in that area, or that the combined system can be superhuman? I think AlphaCode is some evidence for the latter, with the compiler in the role of proof-checker.
The former
Do you mean as a (presumably RL?) training method to make LLMs themselves superhuman in that area, or that the combined system can be superhuman? I think AlphaCode is some evidence for the latter, with the compiler in the role of proof-checker.
The former