Interesting. I’d love to hear more about the sorts of worlds you’re conditioning on in your (b). For my part, the worlds I described in the original post seem both the most likely and also not completely hopeless—maybe with a month of extra effort we can actually come up with a solution, or else a convincing argument that we need another month, etc. Or maybe we already have a mostly-working solution by the time The Talk happens and with another month we can iron out the bugs.
I just wanted to say that this is a good question, but I’m not sure I know the answer yet.
Worlds that appear most often in my musings (but I’m not sure they’re likely enough to count) are:
- An aligned group getting a decisive strategic advantage
- Safety concerns being clearly demonstrated and part of mainstream AI research
  - Perhaps general reasoning about agents and intelligence improves, and we can apply these techniques to AI designs
  - Perhaps things contiguous with alignment concerns cause failures in capable AI systems early on
- A more alignable paradigm overtaking ML
  - This seems like a fantasy
  - Could be because ML gets bottlenecked or a different approach makes rapid progress
Thanks, that was an illuminating answer. I feel like those three worlds are decently likely, but if they occur, purchasing additional expected utility in them will be hard, precisely because things will already be so much easier. For example, if safety concerns are part of mainstream AI research, then safety research won’t be neglected anymore.
You can purchase additional EU by pumping up their probability as well. EDIT: I know I originally said to condition on these worlds, but I guess that’s not what I actually do. Instead, I think I condition on not-doomed worlds.
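To make the decomposition explicit (just a rough sketch, with $w$ ranging over possible worlds, $P(w)$ their probabilities, and $U(w)$ how well things go in each):

$$\mathbb{E}[U] \;=\; \sum_{w} P(w)\,U(w),$$

so an intervention can buy expected utility either by raising $U(w)$ within a good world $w$, or by raising $P(w)$ of good worlds relative to doomed ones.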
Ah, that sounds much better to me. Yeah, maybe the cheapest EU lies in trying to make these worlds more likely. I doubt we have much control over which paradigms overtake ML, but I think the intervention I’m proposing might help make the first and second kinds of world more likely (because maybe with a month of extra time to analyze their system, the relevant people will become convinced that the problem is real).