It’s unclear what sort of insight you might want to learn. In any case, it sounds to me like capability research that’s more likely to be net harmful for humanity rather than safety research.
My reasoning is partly that we know that large AGI outfits do not necessarily publish their insights into the capabilities of their systems and architectures. But it seems to me quite important to develop a strong understanding of these capabilities.
Given that I would use existing techniques in a toy scenario, I think it’s very unlikely that I would create new capabilities. Maybe I would discover unknown capabilities, but these would have existed in similar systems anyway. And of course which discoveries I would decide to publish is a separate question altogether.
I also wouldn’t call this “safety research”, though I think such a model might be useful downstream for prosaic alignment. My motivation is mostly to understand whether AGI is 5 years away or 30, and to know which breakthroughs fill the remaining gaps and which don’t.