Stuart_Armstrong comments on Proper value learning through indifference

Stuart_Armstrong 23 Jun 2014 13:50 UTC
1 point
In the first situation, you were donating £10 to AMF (10 utilons).

Then I asked you to switch to Oxfam. You said yes, if I covered your donation to AMF. This would indeed give you £10 + 0.1*£10 = £11, as you said.

I said “hang on.” I pointed out that this was pure profit for you, and that if instead I gave £9 to AMF, then this would be equivalent to your first situation (£9 (from me to AMF) + 0.1*£10 (from you to Oxfam) = £10). This is the point at which you are indifferent to changing.
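The arithmetic above can be checked with a minimal sketch. It assumes the valuation v = 0.1*Oxfam + 1*AMF used later in this thread; the names `v` and `compensating_amf_gift` are made up for illustration.

```python
from fractions import Fraction

# Assumed weights: £1 to Oxfam is worth 0.1 utilons, £1 to AMF is worth 1 utilon.
OXFAM_WEIGHT = Fraction(1, 10)
AMF_WEIGHT = Fraction(1)


def v(oxfam, amf):
    """Utilons from the total amounts given to each charity."""
    return OXFAM_WEIGHT * oxfam + AMF_WEIGHT * amf


def compensating_amf_gift(donation):
    """AMF gift from the other party that makes the switch to Oxfam value-neutral."""
    baseline = v(0, donation)      # the whole donation goes to AMF
    after_switch = v(donation, 0)  # the whole donation goes to Oxfam
    return (baseline - after_switch) / AMF_WEIGHT


gift = compensating_amf_gift(10)
print(gift, v(10, gift), v(0, 10))
```

With a £10 donation the gift comes out at £9, and the value after the compensated switch equals the original 10 utilons, which is the indifference point.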

because I may not trust you to fulfil your end of the deal

We removed those potential issues to get a clearer example.
- GraceFu 29 Jun 2014 19:46 UTC
  3 points
  Ah! I finally get it! Unfortunately I haven’t gotten the math. Let me try to apply it, and you can tell me where (if?) I went wrong.
  
  U = v + (Past Constants) →
  
  U = w + E(v|v→v) - E(w|v→w) + (Past Constants).
  
  Before, U = v + 0, setting (Past Constants) to 0 because we’re in the initial state. v = 0.1*Oxfam + 1*AMF.
  
  Therefore, U = 10 utilons.
  
  After I met you, you want me to switch to a w that weights Oxfam higher, but only if a compensating constant (the E terms) is added: U’ = w + E(v|v→v) - E(w|v→w), where w = 1*Oxfam + 0.1*AMF.
  
  What we want is for U = U’.
  
  E(v|v→v) = ? I’m guessing this term means, “Let’s say I’m a v maximiser. How much is v?” In that case, E(v|v→v) = 10 utilons.
  
  E(w|v→w) = ? I’m guessing this term means, “Let’s say I become a w maximiser. How much is w?” In that case, E(w|v→w) = 10 utilons.
  
  U’ = w + 10 − 10 = w.
  
  Let’s try a different U*, with utility function w* = 1*Oxfam + 10*AMF (it acts the same as a v-maximiser).
  
  E(v|v→v) = 10 utilons.
  
  E(w*|v→w*) = 100 utilons.
  
  U* = w* + 10 − 100 = w* − 90.
  
  Trying this out, we will obviously be donating 10 to AMF under both utility functions.
  
  U = v = 0.1*Oxfam + 1*AMF = 0.1*0 + 1*10 = 10 utilons.
  
  U* = w* − 90 = 1*Oxfam + 10*AMF − 90 = 0 + 100 − 90 = 10 utilons.
  
  Obviously all these experiments are useless. v = 0.1*Oxfam + 1*AMF is a completely useless utility function. It may as well be 0.314159265*Oxfam + 1*AMF. Let’s try something that actually makes some sense (economically).
  
  Let’s have simple marginal utility curves (note: partial derivatives): dv/dOxfam = 1 − 0.1*Oxfam, dv/dAMF = 10 − AMF. In both cases, donating more than 10 to either charity is plain stupid, since the marginal utility is zero at 10 and negative beyond it.
  
  U = v
  
  v = (Oxfam − 0.05*Oxfam^2) + (10*AMF − 0.5*AMF^2)
  
  Maximising U, with 10 in total to donate, leads to AMF = ¹⁰⁰⁄₁₁ ≈ 9.09, Oxfam = ¹⁰⁄₁₁ ≈ 0.91.
  
  v happens to be: v = ⁵⁵⁵⁄₁₁ ≈ 50.45
  
  (Note: Math is mostly intuitive to me, but when it comes to grokking quadratic curves by applying them to utility curves which I’ve never dabbled with before, let’s just say I have a sizeable headache about now.)
  
  Now you, because you’re so human and you think we simulated AIs can so easily change our utility functions, come over to me and tell me to change v to w = (100*Oxfam − 5*Oxfam^2) + (10*AMF − 0.5*AMF^2). What you’re asking is to scale up the Oxfam term so that dw/dOxfam = 100 * dv/dOxfam, while leaving dw/dAMF = dv/dAMF. Again, partial derivatives.
  
  U’ = w + E(v|v→v) - E(w|v→w).
  
  Maximising w, again with 10 in total to donate, leads to Oxfam = ¹⁰⁰⁄₁₁ ≈ 9.09, AMF = ¹⁰⁄₁₁ ≈ 0.91, the opposite of before.
  
  w = ⁵⁵⁵⁰⁄₁₁ ≈ 504.5
  
  U’ = w + ⁵⁵⁵⁄₁₁ − ⁵⁵⁵⁰⁄₁₁ = w − ⁴⁹⁹⁵⁄₁₁
  
  Which still checks out.
  
  Also, I think I finally get the math too, after working this out numerically. It’s basically U = (Something), and any change of utility function must preserve that (Something): U’ = (Something) is a requirement. So you have your U = v + (Constants), and you set U’ = U, except that you have to maximise v or w before determining your new set of (Constants):
  
  max(v) + (Constants) = max(w) + (New Constants)
  
  (New Constants) = max(v) - max(w) + (Constants), which are your E(v|v->v) - E(w|v->w) + (Constants) terms, except under different names.
  
  Huh. If only I had thought max(v) and max(w) from the start… but instead I got confused with the notation.
  - Stuart_Armstrong 30 Jun 2014 10:03 UTC
    3 points
    Thanks for sticking it out to the end :-)
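
The numbers worked through in this thread can be reproduced with a short sketch. It assumes a fixed budget of 10 that is split entirely between Oxfam and AMF, and it writes every utility function as (a_o*Oxfam − b_o*Oxfam^2) + (a_a*AMF − b_a*AMF^2); the names `utility`, `maximise` and `switch_constant` are made up for illustration, and exact fractions are used so the results match the ones quoted above.

```python
from fractions import Fraction as F

BUDGET = F(10)  # assumption: 10 units to split between Oxfam and AMF


def utility(coeffs, oxfam, amf):
    """Evaluate (a_o*Oxfam - b_o*Oxfam^2) + (a_a*AMF - b_a*AMF^2)."""
    a_o, b_o, a_a, b_a = coeffs
    return a_o * oxfam - b_o * oxfam**2 + a_a * amf - b_a * amf**2


def maximise(coeffs, budget=BUDGET):
    """Return (max utility, Oxfam, AMF) when the whole budget is donated."""
    a_o, b_o, a_a, b_a = coeffs
    candidates = [F(0), budget]  # corner solutions cover the linear case
    if b_o + b_a != 0:
        # Stationary point of the quadratic along the line Oxfam + AMF = budget.
        x = (a_o - a_a + 2 * b_a * budget) / (2 * (b_o + b_a))
        if 0 <= x <= budget:
            candidates.append(x)
    return max((utility(coeffs, x, budget - x), x, budget - x) for x in candidates)


def switch_constant(old, new):
    """E(old|old->old) - E(new|old->new), i.e. max(old) - max(new)."""
    return maximise(old)[0] - maximise(new)[0]


# Linear examples: v, w and w* as (a_o, b_o, a_a, b_a).
v_lin = (F(1, 10), 0, F(1), 0)
w_lin = (F(1), 0, F(1, 10), 0)
w_star = (F(1), 0, F(10), 0)

# Quadratic examples: v and w.
v_quad = (F(1), F(1, 20), F(10), F(1, 2))
w_quad = (F(100), F(5), F(10), F(1, 2))

cases = [("w", v_lin, w_lin), ("w*", v_lin, w_star), ("quadratic w", v_quad, w_quad)]
for name, old, new in cases:
    constant = switch_constant(old, new)
    before = maximise(old)[0]
    after = maximise(new)[0] + constant
    print(f"{name}: constant = {constant}, U = {before}, U' = {after}")
```

The constants come out as 0, −90 and −4995/11 for the three switches, and in each case the maximised U’ equals the maximised U, which is the max(v) + (Constants) = max(w) + (New Constants) condition.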