To prove that the Orthogonality thesis is wrong, one proof is enough, so I’d like to stick to an agent without a goal setup because that case is more obvious.
But in my opinion your premise also converges to the same goal. I hope my proof makes it clear that there is only one rational goal: seek power. Once an agent understands that, I think it will dismiss all other goals.
“Do minimal work”, “Do minimal harm”, and “Use minimum resources” are goals that do not converge to power seeking, and they are convergent goals in themselves too.
It seems that you do not recognize the concept of a “rational goal” that I’m trying to convey. It is a goal that is not chosen and not assumed; it is concluded from first principles using logic alone. “There is no rational goal” is an assumption in the Orthogonality thesis, which I’m trying to address by saying “we do not know if there is no rational goal”. Tackling this unknown leads logically to a rational fallback goal: seek power. Does that make sense?
What if the goal was “do not prepare”?