This seems like an underestimate, because you don’t consider whether the first “AGI” will indeed make it so we only get one chance. If it can only self-improve via more gradient steps, then humanity has a greater chance than if it self-improves via prompt engineering or direct modification of its weights or latent states. Shard theory seems to have nonzero opinions on the fruitfulness of the non-data methods.
What does self-improvement via gradients vs. prompt engineering vs. direct modification have to do with how many chances we get? I guess we have at least a modicum more control over the gradient feedback loop than over the other loops?
Shard theory seems to have nonzero opinions on the fruitfulness of the non-data methods.
Can you say more?