My read was that for systems where you have rock-solid checking steps, you can throw arbitrary amounts of compute at searching for things that check out and trust them, but if there’s any crack in the checking steps, then things that ‘check out’ aren’t trustworthy, because the proposer may have searched an unimaginably large space (from the rater’s perspective) to find them. [And from the proposer’s perspective, the checking steps are the real spec, not whatever’s in your head.]
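To make that failure mode concrete, here’s a toy sketch (my own illustration, hypothetical names and all, not something from the original discussion): a proposer blindly searches a large space for anything a slightly-cracked checker accepts, and what it finds reliably passes the check while missing the actual intent.

```python
# Toy sketch (my own, not from the original post). The "true spec" is what
# the rater actually wants; the "cracked checker" is the checking step that
# actually gets run; the proposer just searches until something checks out.
import random

def true_spec(inp, out):
    # What the rater has in mind: out is the input, sorted.
    return sorted(inp) == out

def cracked_checker(inp, out):
    # The checking step as implemented: only verifies sortedness, never that
    # out contains the same elements as inp. That omission is the crack.
    return out is not None and all(a <= b for a, b in zip(out, out[1:]))

def proposer(inp, search_budget=100_000):
    # Blind search over a large candidate space, keeping the first candidate
    # the checker accepts. The checker, not the intent, is its real target.
    for _ in range(search_budget):
        candidate = random.choices(range(10), k=len(inp))
        if cracked_checker(inp, candidate):
            return candidate
    return None

inp = [3, 1, 4, 1, 5]
out = proposer(inp)
print(out, cracked_checker(inp, out), true_spec(inp, out))
# Typical result: a sorted list that passes the checker but is not a
# permutation of the input, so the true spec almost never holds.
```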
In general, I think we can get a minor edge from “checking AI work” instead of “generating our own work” and that doesn’t seem like enough to tackle ‘cognitive megaprojects’ (like ‘cure cancer’ or ‘develop a pathway from our current society to one that can reliably handle x-risk’ or so on). Like, I’m optimistic about “current human scientists use software assistance to attempt to cure cancer” and “an artificial scientist attempts to cure cancer” and pretty pessimistic about “current human scientists attempt to check the work of an artificial scientist that is attempting to cure cancer.” It reminds me of translators who complained pretty bitterly about being given machine-translated work to ‘correct’; they basically still had to do it all over again themselves in order to determine whether or not the machine had gotten it right, and so it wasn’t nearly as much of a savings as hoped.
Like, the value of ‘DocBot attempts to cure cancer’ is that DocBot can think larger and wider thoughts than humans, and natively manipulate an opaque-to-us dense causal graph of the biochemical pathways in the human body, and so on; if you insist on DocBot only thinking legible-to-human thoughts, then it’s not obvious it will significantly outperform humans.