Collective Human Intelligence (CHI) represents both the current height of general intelligence and a model of alignment among intelligent agents.
Assuming CHI is aligned is circular reasoning. If humanity creates an unaligned AI that destroys us all, that ironically means that not even humanity was aligned.
Could you paraphrase? I’m not sure I follow your reasoning… Humans cooperate sufficiently to generate collective intelligence, and they cooperate sufficiently due to a range of alignment mechanics between humans, no?
It’s a bit tongue-in-cheek, but technically for an AI to be aligned, it isn’t allowed to create unaligned AIs. Like if your seed AI creates a paperclip maximizer, that’s bad.
So if humanity accidentally creates a paperclip maximizer, they are technically unaligned under this definition.
I disagree with this. I think the most useful definition of alignment is intent alignment. Humans are effectively intent-aligned on the goal of not killing all of humanity. They may still kill all of humanity, but that is not an alignment problem; it is a capabilities problem: humans aren't capable of knowing which AI designs will be safe.
The same holds for intent-aligned AI systems that create unaligned successors.
Oooh gotcha. In that case, we aren't remotely good at avoiding the creation of unaligned humans either! ;)
Because we aren’t aligned.