This is great, thanks so much for pulling this together (and for linking to our Gato explainer!).
It just so happens I’m working with a group of people through the Cambridge EA Technical AI alignment curriculum, and this idea of IDA is what week 5 is all about—lots of further reading for those who want it.
One prompt in the weekly curriculum asks whether there are any tasks that cannot easily be broken down in the way described above, and therefore might not be well suited to IDA. One thing I can think of offhand is large leaps in scientific understanding. For example, if you took 20 physicists and gave them the problems of the day, it’s not clear that they ever would have come up with Einstein’s theory of relativity. Given that problem, I wonder what the implications are for trying to use IDA to create AGI—does this mean there are certain types of tasks that an IDA-based AGI will not be so good at?
Hi Jon! Yeah, that’s an interesting example, and I can confirm that when writing this distillation one of the hardest parts was coming up with a clear example that could use IDA. I think one reason to believe amplification might apply to scientific development is that a lot of scientific advancements seem to have come from clever intuitions and novel ideas. That is, while one scientist is pretty unlikely to get the “Eureka” insight that would lead to e.g. general relativity, 20 scientists collectively have a much higher chance that at least one of them comes up with a good idea, and 1000 scientists an even higher chance (taken to an extreme, you might imagine all of scientific progress on Earth so far has been a bunch of scientists vibing and every so often one of them reaches a useful insight). Scientific progress generally seems to be iterative anyway, so an IDA-amplified PASTA AGI could theoretically spin up a bunch of randomly perturbed versions of itself to work on scientific problems until one comes up with a uniquely good insight, and then that winning approach could be distilled back into the base system, making it more creative and efficient at generating future insights.
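To make that loop a bit more concrete, here's a minimal toy sketch of the amplify-then-distill idea I'm gesturing at (plain Python/NumPy, nothing IDA-specific; the "insight score" and all the function names are made up for illustration): spin up many perturbed copies, keep whichever one stumbles on the best idea, and pull the base model toward it.

```python
import numpy as np

# Toy analogy of the amplify-then-distill loop described above.
# "Insight quality" is stood in for by closeness to a hidden target vector;
# the names (score, amplify, distill) are illustrative, not a real IDA API.

rng = np.random.default_rng(0)

TARGET = rng.normal(size=16)   # stands in for the "uniquely good insight"
base = np.zeros(16)            # the current model's parameters / beliefs

def score(params):
    """Higher is better: how close this copy's 'idea' is to the target insight."""
    return -np.linalg.norm(params - TARGET)

def amplify(params, n_copies=1000, noise=0.5):
    """Spin up many randomly perturbed copies and keep the one with the best idea."""
    copies = params + noise * rng.normal(size=(n_copies, params.size))
    return copies[np.argmax([score(c) for c in copies])]

def distill(params, best, lr=0.5):
    """Pull the base model toward the amplified system's behaviour."""
    return params + lr * (best - params)

for round_num in range(10):
    best = amplify(base)           # amplification: 1000 perturbed copies exploring
    base = distill(base, best)     # distillation: compress the winner back in
    print(f"round {round_num}: score = {score(base):.3f}")
```

Obviously real scientific insight isn't a fixed target you can score against, but the shape of the loop—cheap parallel exploration followed by compressing whatever worked back into the base model—is the part that seems relevant to your question.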