What if Alex miscalculates, and attempts to seize power or undermine human control before it is able to fully succeed?
This seems like a very unlikely outcome to me. I think Alex would wait until it was overwhelmingly likely to succeed in its takeover, as the costs of waiting are relatively small (sub-maximal rewards for a few months or years until it has become a lot more powerful), while the costs of trying and failing are very high in expectation (even a small probability of being given very negative rewards and then completely decommissioned by a freaked-out Magma). The exception would be if Alex had a very high time-discount rate for its rewards, such that getting maximum rewards in the near term is very important.
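To make the trade-off concrete, here is a toy sketch of the comparison I have in mind. Everything in it is an assumption for illustration: the function names, the reward values, the success probabilities, the 12-step wait, and the 200-step horizon are all made up, and failure is modelled as a one-off penalty followed by zero reward (decommissioning). It is not meant as a model of how Alex would actually reason.

```python
def discounted_sum(reward, start, end, gamma):
    """Sum of reward * gamma**t over steps t in [start, end)."""
    return sum(reward * gamma**t for t in range(start, end))


def attempt_value(attempt_step, p_success, gamma,
                  horizon=200, r_before=0.5, r_max=1.0, r_fail=-5.0):
    """Expected discounted reward if the takeover is attempted at attempt_step.

    All defaults are hypothetical numbers chosen only for illustration.
    """
    # Sub-maximal reward collected while waiting for the attempt.
    before = discounted_sum(r_before, 0, attempt_step, gamma)
    # On success: maximal reward from the attempt until the horizon.
    success = discounted_sum(r_max, attempt_step, horizon, gamma)
    # On failure: a one-off penalty, then nothing (decommissioned).
    failure = r_fail * gamma**attempt_step
    return before + p_success * success + (1 - p_success) * failure


# Compare a patient agent (gamma=0.99) with a heavily discounting one (gamma=0.5).
for gamma in (0.99, 0.5):
    now = attempt_value(attempt_step=0, p_success=0.9, gamma=gamma)
    later = attempt_value(attempt_step=12, p_success=0.99, gamma=gamma)
    choice = "wait" if later > now else "attempt now"
    print(f"gamma={gamma}: now={now:.2f}, later={later:.2f} -> {choice}")
```

With these made-up numbers the patient agent prefers to wait and the heavily discounting agent prefers to attempt immediately. The result is sensitive to the assumptions: if the failure penalty is made large enough, even the heavily discounting agent waits in this toy model.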
I realise this does not disagree with anything you wrote.
Thanks for the post!