Daniel Kokotajlo comments on A dilemma for prosaic AI alignment

Daniel Kokotajlo 18 Dec 2019 14:30 UTC
LW: 2 AF: 2
I like the first modification, but I'm not sure about the second. Wouldn’t that basically just destroy the conjecture? What exactly are you proposing?
- Ofer 18 Dec 2019 17:02 UTC
  LW: 1 AF: 1
  Whoops, (2) came out cryptic, and is incorrect, sorry. The (correct?) idea I was trying to convey is the following:
  If ‘the safety scheme’ in plan 1 requires anything at all that ruins competitiveness (for example, a human-in-the-loop process that recurs throughout training), then the reasoning in the OP goes through without any further assumptions (such as that conjecture), AFAICT.
  I no longer think this idea amounts to making the conjecture strictly weaker.