There is surprisingly little incentive for selfish AI writers to tilt the friendliness towards themselves.
For normal humans, and at the end of the game, I agree with you. However, there are two situations where people may want tilt:
Narcissists seem to have an unlimited appetite for adoration from others, which might translate into a desire to get the AI tilted as far as possible in their favor. The abnormal-psychology literature puts them at about 1% of the population, but in my experience a much larger fraction is subclinically narcissistic enough to be a problem.
If there’s a slow takeoff, the AI will be weak for some period of time, and during that period the argument that it controls enough resources to satisfy everyone doesn’t hold. If the organization building it has no other currency available to pay people for help, it might pay in tilt. If the tilt decays toward zero at some rate, we could end up with something that is fair in the long run (a toy sketch of such decay follows this list). I don’t know how to reconcile that with the scheme described in another comment for dealing with utility monsters by tilting away from them.
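To make "decays toward zero at some rate" concrete, here is a minimal sketch, assuming the AI weights each person's utility by a baseline of 1 plus a tilt term that decays exponentially. The names (`tau_i`, `lam`, `weight`) and the exponential form are my illustrative assumptions, not anything specified above:

```python
import math

def weight(tau_i: float, lam: float, t: float) -> float:
    """Weight the AI gives person i's utility at time t: a baseline of 1,
    plus a tilt tau_i that decays exponentially at rate lam."""
    return 1.0 + tau_i * math.exp(-lam * t)

# A helper paid with tilt tau_i = 0.5, decaying at rate lam = 0.1:
for t in (0, 10, 50):
    print(t, round(weight(0.5, 0.1, t), 3))  # 1.5, then ~1.184, then ~1.003
```

Under this assumption an early helper gets a real but temporary advantage over the baseline, which is what "paying in tilt" would buy, while everyone's weight converges to 1 eventually.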
It also creates a time window during which, if the creator dies or suffers brain damage, the AI ends up unfriendly to everyone (including the creator).
I agree that there will be windows like that. To avoid them, we would need a committee taking the lead, with well-defined procedures that allow a member to be replaced if the others judge him insane or deceptive. Given how poorly committee decision making works, I don’t know whether that presents more or less risk than simply having one leader and accepting the risk of him going insane. The size of the window depends on whether the takeoff is hard or soft, and I don’t know which of those to expect.
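As a toy illustration of the kind of well-defined replacement procedure meant here (my sketch, not a proposal from the discussion): a member is replaced when a supermajority of the other members vote against them. The 2/3 threshold is an arbitrary assumption:

```python
def should_replace(votes_against: int, committee_size: int,
                   threshold: float = 2 / 3) -> bool:
    """True if enough of the *other* members judged this one unfit.
    The member under review doesn't vote on their own removal."""
    others = committee_size - 1
    return others > 0 and votes_against / others >= threshold

print(should_replace(votes_against=4, committee_size=6))  # True: 4/5 >= 2/3
print(should_replace(votes_against=2, committee_size=6))  # False: 2/5 < 2/3
```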