Jason Hoelscher-Obermaier comments on How “Discovering Latent Knowledge in Language Models Without Supervision” Fits Into a Broader Alignment Scheme

Jason Hoelscher-Obermaier 27 Feb 2023 12:34 UTC
"Show that we can recover superhuman knowledge from language models with our approach"
Maybe applying CCS in a scientific context would be one way to extend the evaluation?
For example, one could collect scientific statements that are currently undecided but that we expect to be resolved fairly soon, record CCS predictions for them now, and compare those predictions with the outcomes once the statements are resolved.