while this paradigm of ‘training a model that’s an agi, and then running it at inference’ is one way we get to transformative agi, i find myself thinking that it probably WON’T be the first transformative AI, because my guess is that there are lots of tricks using lots of compute at inference to get from not-quite-transformative ai to transformative ai.
Agreed that this is far from the only possibility. We have some discussion of increasing inference time to make the final push up to generality in the bit beginning “If general intelligence is achievable by properly inferencing a model with a baseline of capability that is lower than human-level...”. We did some further thinking around this topic that we didn’t think was quite core to the post, so Connor has written it up on his blog here: https://arcaderhetoric.substack.com/p/moravecs-sea
and i doubt that these tricks can funge against train-time compute, as you seem to be assuming in your analysis.
Our method 5 is intended for this case—we’d use an appropriate ‘capabilities per token’ multiplier to account for needing extra inference time to reach human level.
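For concreteness, here's a minimal sketch of how such a multiplier might enter the arithmetic. The function name and every number below are illustrative assumptions for this comment, not figures from the post:

```python
# Illustrative sketch: folding a 'capabilities per token' multiplier into
# an inference-compute estimate. All quantities here are hypothetical.

def adjusted_inference_flop(flop_per_token: float,
                            tokens_per_task: float,
                            capability_multiplier: float) -> float:
    """Estimate FLOP per task for a below-human-level model that needs
    extra inference (more generated tokens) to reach human-level output.

    capability_multiplier > 1 means the model must generate that many
    times more tokens per task than a human-level model would need.
    """
    return flop_per_token * tokens_per_task * capability_multiplier

# Example: 1e12 FLOP/token, 1e4 tokens per task at human level,
# and an assumed 10x token overhead for the weaker model.
print(f"{adjusted_inference_flop(1e12, 1e4, 10.0):.2e} FLOP per task")
```

The point of the sketch is just that extra inference shows up as a linear scaling on per-task compute, which is why a single multiplier suffices in this kind of estimate.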