Congrats to Eliezer and Marcello on the writeup! It has helped me understand Benja’s “parametric polymorphism” idea better.
There’s a slightly different angle that worries me. What happens if you ask an AI to solve the AI reflection problem?
1) If an agent A_1 generates another agent A_0 by consequentialist reasoning, possibly using proofs in PA, then the future descendants of A_0 also count as consequences. So at least A_0 should not face the problem of “telomere shortening”, because PA can already see the possible consequences of telomere shortening (see the sketch after point 2). But what will A_0 look like? That’s a mystery.
2) To figure out the answer to (1), it’s natural to try devising a toy problem where we could test different implementations of A_1. Benja made a good attempt, then Wei came up with an interesting quining solution to that. Eliezer has now formalized his objection to quining solutions as the “Vingean principle” (no, actually “naturalistic principle”, thx Eliezer), which is a really nice step. Now I just want a toy problem where we’re forced to apply that principle :-) Why such problems are hard to devise is another mystery.
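To unpack what “telomere shortening” means in point 1, here is the shape of the Löbian obstacle as I understand it from the writeup (my paraphrase and notation, not the paper’s exact formulation): for A_1 to license building A_0, A_1’s theory T_1 has to trust A_0’s theory T_0 on the relevant class of safety statements,

$$T_1 \vdash \square_{T_0}\ulcorner \phi \urcorner \rightarrow \phi \quad \text{(for the statements } \phi \text{ that } A_0 \text{ may prove before acting)},$$

and Löb’s theorem says T_1 cannot have this schema with T_0 = T_1 (each instance would let T_1 prove $\phi$ outright), so T_0 must be strictly weaker than T_1. Iterate over generations and the proof strength keeps dropping, which is the shortening. The hope in point 1 is that because A_1 treats all of A_0’s descendants as consequences, that dropping is visible to A_1 up front instead of being rediscovered one generation at a time.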
(Quick note: Wei’s quining violates the naturalistic principle, not the Vingean principle. Wei’s actions were still inside quantifiers but had separate forms for self-modification and action. So did Benja’s original proposal in the Quirrell game, which Wei modified—I was surprised and impressed when Benja’s polymorphism approach carried over to a naturalistic system.)
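For readers trying to keep the two principles apart, a rough schematic (my summary and notation, not a quote from the paper): the Vingean principle says A_1 should license A_0 by quantifying over whatever A_0 might do, rather than by predicting its specific choices; roughly, A_1 relies on a fact of the form

$$\forall b \; \big(\mathrm{Act}_{A_0}(b) \rightarrow \square_{T_0}\ulcorner \bar{b} \rightarrow (\mathrm{NULL} \vee G) \urcorner\big).$$

The naturalistic principle says one and the same criterion should cover every action A_1 takes, whether that action is “move a box” or “replace myself with A_0”, with no special-cased clause for self-modification. On this reading, Wei’s quining solution keeps actions under the quantifier (fine by the Vingean principle) but routes self-modification through its own separate form, which is the naturalistic violation described above.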
Was it UDT1.1 (as a solution to this problem) that violates the Vingean principle?
Also, I’m wondering if Benja’s polymorphism approach solves the “can’t decide whether or not to commit suicide” problem that I described here. Your paper doesn’t seem to address this problem: the criteria of action you use all talk about “NULL or GOAL”, and since suicide leads to NULL, an AI using your criterion of action has trouble deciding whether or not to commit suicide for an even more immediate reason. Do you have any ideas about how your framework might be changed to address this problem?
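To make the worry concrete, here is its shape as I read the paper’s criteria (schematic, not the exact formulation): an action b is licensed only if the agent has a proof that it ends in NULL or GOAL,

$$\mathrm{Act}_{A_1}(b) \text{ only if } \square_{T_1}\ulcorner \bar{b} \rightarrow (\mathrm{NULL} \vee G) \urcorner,$$

and since suicide provably leads to NULL, it meets that license trivially; the criterion permits it without ever weighing it against the alternatives, so nothing in the criterion settles whether to take it.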
Was it UDT1.1 (as a solution to this problem) that violates the Vingean principle?
As I remarked in that thread, there are many possible designs that violate the Vingean principle; AFAICT, UDT 1.1 is one of them.
Also, I’m wondering if Benja’s polymorphism approach solves the “can’t decide whether or not to commit suicide” problem that I described here. Your paper doesn’t seem to address this problem: the criteria of action you use all talk about “NULL or GOAL”, and since suicide leads to NULL, an AI using your criterion of action has trouble deciding whether or not to commit suicide for an even more immediate reason.
Suicide being permitted by the NULL option is a different issue from suicide being mandated by self-distrust. Benja’s TK gets rid of distrust of offspring. Work on reflective/naturalistic trust is ongoing.
Thanks! Corrected.