Interesting. I’d love to hear more about the sorts of worlds you’re conditioning on in your (b). For my part, the worlds I described in the original post seem both the most likely and also not completely hopeless—maybe with a month of extra effort we can actually come up with a solution, or else a convincing argument that we need another month, etc. Or maybe we already have a mostly-working solution by the time The Talk happens and with another month we can iron out the bugs.
I just wanted to say that this is a good question, but I’m not sure I know the answer yet.
Worlds that appear most often in my musings (but I’m not sure they’re likely enough to count) are:
- An aligned group getting a decisive strategic advantage
- Safety concerns being clearly demonstrated and part of mainstream AI research
  - Perhaps general reasoning about agents and intelligence improves, and we can apply these techniques to AI designs
  - Perhaps things contiguous with alignment concerns cause failures in capable AI systems early on
- A more alignable paradigm overtaking ML
  - This seems like a fantasy
  - Could be because ML gets bottlenecked or a different approach makes rapid progress
Thanks, that was an illuminating answer. I feel like those three worlds are decently likely, but if they occur, purchasing additional expected utility in them will be hard, precisely because things will already be so much easier. For example, if safety concerns are part of mainstream AI research, then safety research won’t be neglected anymore.
You can purchase additional EU by pumping up their probability as well. EDIT: I know I originally said to condition on these worlds, but I guess that’s not what I actually do. Instead, I think I condition on not-doomed worlds.
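To make the decomposition explicit (just a rough sketch, with $w$ ranging over possible worlds, $P(w)$ their probabilities, and $U(w)$ how well things go in each):

$$\mathbb{E}[U] \;=\; \sum_{w} P(w)\,U(w),$$

so an intervention can buy expected utility either by raising $U(w)$ within a good world $w$, or by raising $P(w)$ of good worlds relative to doomed ones.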
Ah, that sounds much better to me. Yeah, maybe the cheapest EU lies in trying to make these worlds more likely. I doubt we have much control over which paradigms overtake ML, but I think the intervention I’m proposing might help make the first and second kinds of world more likely (because maybe with a month of extra time to analyze their system, the relevant people will become convinced that the problem is real).