Two points where I disagree with this argument:
- We may not be able to prove something about an arbitrary AGI, but we could interpret the resulting program and prove things about it.
- Alignment does not mean provably correct; I would define it as “empirically doesn’t kill us”.