Thanks for asking. The reason artificial general intelligence is an existential risk is that agentic systems that construct predictive models of their environment can use those models to compute what actions will best achieve their goals (and most possible goals kill everyone when optimized hard enough, because people are made of atoms that can be used for other things).
The “compute what actions will best achieve goals” trick doesn’t work when the models aren’t accurate! This continues to be the case when the agentic system is made out of humans. So if our scientific institutions systematically produce less-than-optimally-informative output due to misaligned incentives, that’s a factor that makes the “human civilization” AI dumber, and therefore less good at not accidentally killing itself.
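(A minimal sketch of the point, not part of the original exchange: a toy planner that ranks actions by the outcomes a predictive model assigns to them. The thermostat scenario and the names `plan`, `accurate_model`, and `wrong_model` are illustrative assumptions. With an accurate model, the argmax lands on the right action; feed the same machinery a miscalibrated model and it confidently picks a worse one.)

```python
# Toy illustration: an agent that plans by argmax over a predictive model.
# The scenario and names here are made up for illustration.

def plan(actions, model, utility):
    """Pick the action whose *predicted* outcome scores highest."""
    return max(actions, key=lambda a: utility(model(a)))

# The agent wants the room warm (utility = closeness to 20 degrees C).
utility = lambda temp: -abs(temp - 20)

# What each action actually does to the temperature.
true_outcome = {"heater_on": 20, "heater_off": 5, "window_open": 0}

# An accurate model picks the right action...
accurate_model = true_outcome.get
print(plan(true_outcome, accurate_model, utility))   # -> "heater_on"

# ...but a model with a wrong belief about the heater picks a worse one.
wrong_model = {"heater_on": 40, "heater_off": 5, "window_open": 0}.get
print(plan(true_outcome, wrong_model, utility))      # -> "heater_off"
```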
I see. In that case, I don’t think it makes much sense to model scientific institutions, or human civilization as a whole, as an agent. You can’t hope to achieve unanimity in a world as big as ours.
I mean, yes, but we still usually want to talk about collections of humans (like a “corporation” or “the economy”) producing highly optimized outputs, like pencils, even if no one human knows everything that must be known to make a pencil. If someone publishes bad science about the chemistry of graphite, which results in the people in charge of designing a pencil manufacturing line making a decision based on false beliefs about the chemistry of graphite, that makes the pencils worse, even if the humans never achieve unanimity and you don’t want to use the language of “agency” to talk about this process.