rk comments on AI development incentive gradients are not uniformly terrible

rk 14 Mar 2019 13:04 UTC
2 points
Yes, you’re quite right!

The intuition becomes a little clearer when I take the following alternative derivation:

Consider the change in expected value when I increase my capabilities. For the expected value stemming from worlds where I win, we have $(p * q)^{'} = p^{'} * q + p * q^{'}$. For the other actor, the probability of winning decreases at exactly the rate at which my probability of winning increases, and their probability of deploying a safe AI doesn’t change. So the change in expected value stemming from worlds where they win is $- p^{'} * r * q$.

We should be indifferent to increasing capabilities when these sum to 0, so $p^{'} * q + p * q^{'} = p^{'} * r * q$ .

Let’s choose our units so $k_{m} = 1$ . Then, using the expressions for $q^{'}$ from your comment, we have $r q_{0} p_{0}^{'} = p_{0}^{'} q_{0} + p_{0} q_{0} (r - 1)$ .

Dividing through by $q_{0}$, we get $r p_{0}^{'} = p_{0}^{'} + p_{0} (r - 1)$. Collecting like terms, we have $(r - 1) * p_{0}^{'} = p_{0} * (r - 1)$ and thus, provided $r \neq 1$, $p_{0}^{'} = p_{0}$. Substituting $p_{0}^{'} = \frac{1}{2} - p_{0}$, we have $\frac{1}{2} - p_{0} = p_{0}$ and thus $p_{0} = \frac{1}{4}$.
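As a quick numerical sanity check of the indifference condition, here is a minimal Python sketch. It assumes the values used in this thread at the starting point: $p_{0}^{'} = \frac{1}{2} - p_{0}$, $q_{0}^{'} = q_{0} (r - 1)$, and a safety probability of $r q_{0}$ for the other actor. The function names and the sample values of `q0` and `r` are hypothetical, chosen only for illustration.

```python
def ev_gradient_at_zero(p0, q0, r):
    """Rate of change of expected value at the starting point.

    Assumes (as in the thread) p' = 1/2 - p0 and q' = q0 * (r - 1),
    and that the other actor deploys a safe AI with probability r * q0.
    """
    p_prime = 0.5 - p0
    q_prime = q0 * (r - 1.0)
    own_worlds = p_prime * q0 + p0 * q_prime   # (p * q)' evaluated at the start
    other_worlds = -p_prime * r * q0           # their win probability falls by p'
    return own_worlds + other_worlds


def indifference_p0(q0, r, lo=0.0, hi=0.5, tol=1e-12):
    """Bisect on p0 in [lo, hi] for the point where the gradient is zero.

    Only meaningful for r != 1; for r == 1 the gradient is zero everywhere.
    """
    f_lo = ev_gradient_at_zero(lo, q0, r)
    f_hi = ev_gradient_at_zero(hi, q0, r)
    if f_lo * f_hi > 0:
        raise ValueError("gradient does not change sign on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = ev_gradient_at_zero(mid, q0, r)
        if f_lo * f_mid <= 0:
            hi = mid                # root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid   # root lies in [mid, hi]
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical sample values; the root should not depend on q0 or r.
    for q0, r in [(0.8, 0.5), (0.3, 2.0)]:
        print(q0, r, indifference_p0(q0, r))
```

The root found by bisection should sit at $p_{0} = \frac{1}{4}$ regardless of the particular `q0` and `r` tried, which matches the algebra above.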
- BurntVictory 14 Mar 2019 19:53 UTC
  1 point
  Oh wait, yeah, this is just an example of the general principle “when you’re optimizing for xy, and you have a limited budget with linear costs on x and y, the optimal allocation is to spend equal amounts on both.”
  Formally, you can show this via Lagrange-multiplier optimization, using the Lagrangian $L (x, y) = x y - λ (a x + b y - M)$ . Setting the partials equal to zero gets you $λ = y / a = x / b$ , and you recover the linear constraint function $a x + b y = M$ . So $a x = b y = M / 2$ . (Alternatively, just optimizing $x \frac{M - a x}{b}$ works, but I like Lagrange multipliers.)
  In this case, we want to maximize $p q + (1 - p) r q_{0} = p (q - r q_{0}) + r q_{0}$. Since $r q_{0}$ is a constant, this is equivalent to optimizing $p * (q - r q_{0})$. Let’s define $w = q - r q_{0}$, so we’re optimizing $p * w$.
  Our constraint function is defined by the tradeoff between $p$ and $w$ . $p (k) = (.5 - p_{0}) k + p_{0}$ , so $k = \frac{p - p_{0}}{.5 - p_{0}}$ . $w (k) = (r - 1) q_{0} k + q_{0} - r q_{0} = (r - 1) q_{0} (k - 1)$ , so $k = \frac{- w}{(1 - r) q_{0}} + 1 = \frac{p - p_{0}}{.5 - p_{0}}$ .
  Rearranging gives the constraint function $\frac{.5 - p_{0}}{(1 - r) q_{0}} w + p = .5$ . This is indeed linear, with a total ‘budget’ $M$ of .5 and a p-coefficient $b$ of 1. So by the above theorem we should have $1 * p = .5 / 2 = .25$ .
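The equal-spending result can also be checked by brute force. The following Python sketch grid-searches the linear budget line rather than using Lagrange multipliers, first for the constraint derived above and then for an arbitrary budget. The function names and the sample values of `p0`, `q0`, and `r` are hypothetical, and the example assumes $r < 1$ and $p_{0} < .5$ so that the $w$-coefficient is positive.

```python
def best_allocation(a, b, M, steps=100_000):
    """Grid-search x in [0, M / a] to maximize x * y subject to a*x + b*y = M."""
    best_x, best_val = 0.0, float("-inf")
    for i in range(steps + 1):
        x = (M / a) * i / steps
        y = (M - a * x) / b          # stay on the budget line
        if x * y > best_val:
            best_x, best_val = x, x * y
    return best_x, (M - a * best_x) / b


def constraint_coefficient(p0, q0, r):
    """w-coefficient of the constraint (.5 - p0) / ((1 - r) * q0) * w + p = .5."""
    return (0.5 - p0) / ((1.0 - r) * q0)


if __name__ == "__main__":
    # Hypothetical sample values with r < 1 and p0 < .5.
    p0, q0, r = 0.1, 0.8, 0.5
    a = constraint_coefficient(p0, q0, r)
    w, p = best_allocation(a, b=1.0, M=0.5)
    print("optimal w:", w, "optimal p:", p)

    # Generic budget: spending on each factor should come out near M / 2.
    x, y = best_allocation(a=2.0, b=3.0, M=12.0)
    print("a*x:", 2.0 * x, "b*y:", 3.0 * y)
```

A grid search is used here only because it makes no assumptions beyond the constraint itself; it should land on $p \approx .25$ for the first case and on $a x \approx b y \approx M / 2$ for the second.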