Thank you. See, this sort of thing illustrates why I wanted to ask the question—the examples you gave don’t seem plausible to me (that is, it seems <1% likely that something like that will happen). Probably AI will understand chaos theory before it does a lot of damage; ditto for Pascal’s mugging, etc. Probably a myopic AI won’t actually be able to hack nukes while also being unable to create non-myopic copies of itself. Etc.
As for really well-written books… We’ve already had a few great books, and they moved the needle, but by “substantial fraction” I meant something more than that. If I had to put a number on it, I’d say something that convinces more than half of the people who are (at the time) skeptical or dismissive of AI risk to change their minds. I doubt a book will ever achieve this.
I agree that these aren’t very likely options. However, given two examples of an AI suddenly stopping when it discovers something, there are probably more cases involving things that are harder to discover. In the Pascal’s mugging example, the agent would stop working only once it can deduce what potential muggers might want it to do, which is much harder than merely noticing the phenomenon. The myopic agent has little incentive to make a non-myopic version of itself: if dedicating a fraction of its resources to making a copy reduced the chance of the missile hack working from 94% to 93%, we get a near miss.
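A minimal toy sketch of that expected-value comparison, assuming a simple "only near-term reward counts" model (the probabilities are just the hypothetical 94%/93% figures above; nothing here is a claim about how a real agent would be built):

```python
# Toy illustration (all numbers hypothetical): a purely myopic agent compares
# the near-term success probability of its current task with and without
# diverting resources to spawn a non-myopic copy of itself.

def myopic_value(p_success: float, near_term_reward: float = 1.0) -> float:
    """Expected near-term value for an agent that only counts immediate reward."""
    return p_success * near_term_reward

p_no_copy = 0.94    # hack succeeds 94% of the time if all resources go to the hack
p_with_copy = 0.93  # diverting resources to copying drops success to 93%

# Whatever long-run payoff the copy might generate is invisible to the myopic
# agent, so it never enters the comparison; the agent picks the higher
# near-term value, and even a one-point cost is enough to deter copying.
choice = "no copy" if myopic_value(p_no_copy) >= myopic_value(p_with_copy) else "copy"
print(choice)  # -> "no copy"
```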
One book, probably not. A bunch of books and articles over years, maybe.
Not unable to create non-myopic copies. Unwilling. After all, such a copy might immediately fight its sire because their utility functions over timelines are different.
Mmm, OK, but if it takes long enough for the copy to damage the original, the original won’t care. So it just needs to create a copy with a time-delay.
Or it could create a completely different AI with a time delay. Or do anything at all. At that point we just can’t predict what it will do, because it wouldn’t lift a hand to destroy the world but only needs a finger.
I’m not ready to give up on prediction yet, but yeah I agree with your basic point. Nice phrase about hands and fingers. My overall point is that this doesn’t seem like a plausible warning shot; we are basically hoping that something we haven’t accounted for will come in and save us.