My guess is that at some point we’ll transition away from this “First we train, then we deploy” paradigm to one where systems are continually learning on the job. Insofar as powerful AIs play a role in a multipolar scenario, I expect they’ll be in this second paradigm. So in a sense they’ll be learning from each other, though perhaps early in their training (i.e. prior to deployment) they were trained against copies of themselves or something. Unfortunately I doubt your case #1 will happen unless we advocate strongly for it; I think by the time these agents are this powerful, their code will be closely guarded. These are all just guesses, though; other scenarios are certainly plausible too.
Makes sense. Though you could have deliberate, coordinated training even after deployment. For instance, I’m particularly interested in the question of “how will agents learn to interact in high-stakes circumstances that they will rarely encounter?” One could imagine the overseers of AI systems coordinating to fine-tune their systems in simulations of such encounters even after deployment. Not sure how plausible that is, though.
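To make that a bit more concrete, here is a minimal toy sketch of the kind of coordination I have in mind (the scenario names, the cooperate/defect payoff, and the fine-tune stub are all hypothetical illustrations, not a real proposal):

```python
# Toy sketch (hypothetical names throughout): two overseers agree on a shared set
# of rare, high-stakes scenarios and periodically fine-tune their already-deployed
# policies on simulated rollouts of those scenarios.
import random

RARE_SCENARIOS = ["mutual_shutdown_request", "resource_dispute", "treaty_violation"]

def simulate_encounter(scenario, policy_a, policy_b):
    """Stand-in for a real simulator: returns (joint outcome, per-agent trajectories)."""
    action_a = policy_a(scenario)
    action_b = policy_b(scenario)
    outcome = 1.0 if action_a == action_b == "cooperate" else 0.0
    return outcome, [(scenario, action_a, outcome), (scenario, action_b, outcome)]

def fine_tune(policy, trajectories):
    """Stand-in for an actual fine-tuning step on the collected trajectories."""
    # A real system would do gradient updates here; this just reports the data size.
    print(f"fine-tuning on {len(trajectories)} simulated trajectories")
    return policy

def coordinated_round(policy_a, policy_b, n_rollouts=100):
    """One post-deployment round: shared simulation, separate fine-tuning."""
    data_a, data_b = [], []
    for _ in range(n_rollouts):
        scenario = random.choice(RARE_SCENARIOS)
        _, (traj_a, traj_b) = simulate_encounter(scenario, policy_a, policy_b)
        data_a.append(traj_a)
        data_b.append(traj_b)
    # Each overseer fine-tunes only its own system, using the jointly simulated data.
    return fine_tune(policy_a, data_a), fine_tune(policy_b, data_b)

if __name__ == "__main__":
    cooperative = lambda s: "cooperate"
    cautious = lambda s: random.choice(["cooperate", "defect"])
    coordinated_round(cooperative, cautious)
```

The point of the sketch is just that the simulation step is shared while each party keeps control of its own weights, which is the only version of this I can imagine rivals agreeing to.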
I totally agree it could be done; I’m just saying I think it probably won’t happen without special effort on our part. Rivals are suspicious of each other, and would probably be suspicious of a proposal like this coming from their rival, if they are even concerned about the problem it is trying to fix at all.