paulfchristiano comments on Aligning a toy model of optimization

paulfchristiano 2 Jul 2019 21:49 UTC
LW: 2 AF: 1
If dropping competitiveness, what counts as a solution? Is “imitate a human, but run it fast” fair game? We could try to hash out the details in something along those lines, and I think that’s worthwhile, but I don’t think it’s a top priority and I don’t think the difficulties will end up being that similar. I think it may be productive to relax the competitiveness requirement (e.g. to allow solutions that definitely have at most a polynomial slowdown), but probably not a good idea to eliminate it altogether.
- Wei Dai 3 Jul 2019 7:46 UTC
  LW: 3 AF: 2
  
  > If dropping competitiveness, what counts as a solution?
  
  I’m not sure, but mainly because I’m not sure what counts as a solution to your problem. If we had a specification of that, couldn’t we just remove the parts that deal with competitiveness?
  
  > Is “imitate a human, but run it fast” fair game?
  
  I guess not, because a human imitation might have selfish goals and not be intent aligned to the user?
  
  > We could try to hash out the details in something along those lines, and I think that’s worthwhile, but I don’t think it’s a top priority and I don’t think the difficulties will end up being that similar.
  
  What about my suggestion of hashing out the details of how to implement IDA/DEBATE using Opt and then seeing if we can decide whether or not it’s aligned?
  What links here?
  - Three Kinds of Competitiveness by Daniel Kokotajlo (31 Mar 2020 1:00 UTC; 36 points)
  - Three kinds of competitiveness by AI Impacts (EA Forum; 2 Apr 2020 3:46 UTC; 10 points)