And this disaster can’t be an unfriendly super-AI, because that should be visible
This is not necessarily true. If the AI's goals do not involve rapid acquisition of resources beyond its own solar system, then we would not see evidence of it (e.g., wireheading that does not involve creating as many sentient organisms as possible).
However, because there would be many instances of this, AI is still probably not the filter. If UFAI is very likely to fail in a self-contained way, we would not expect to see evidence of life; but if UFAI has a non-negligible chance of gobbling up everything it sees for energy, then we would expect to see it.
wireheading that does not involve creating as many sentient organisms as possible
Sure. For instance, consider the directive “Make everyone on Earth live as happily as possible for as long as the sun lasts.” Solution: Wirehead everyone, then figure out how to blow up the sun, then shut down — mission accomplished.
Not if the system is optimizing for the probability of success and can cheaply send out probes to eat the universe, using those resources to make sure the job stays finished lest something go wrong (e.g. the sun-destroyer [???] failed, or aliens resuscitate the Sun under whatever criterion of stellar life is used).
“Our analysis of the alien probe shows that its intended function is to … um … go back to a star that blew up ten thousand years ago and make damn sure that it’s blown up and not just fooling.”
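To make the probability-of-success point concrete, here is a toy sketch (all numbers hypothetical; this models no real system): if utility is the chance that the bounded goal is verifiably finished, each extra verification probe still buys a sliver of expected utility, so the stopping point is set by how cheap probes are rather than by the goal itself.

```python
# Toy expected-utility sketch (hypothetical numbers, purely illustrative).
# The agent's utility is the probability that its bounded goal is verifiably
# finished; every extra verification probe adds a little expected utility.

def p_goal_verified(num_probes, p_silent_failure=0.01, p_probe_detects=0.5):
    """Chance the goal really is finished, given num_probes checks sent out."""
    return 1 - p_silent_failure * (1 - p_probe_detects) ** num_probes

def marginal_gain(num_probes, probe_cost):
    """Expected-utility change from sending one more probe."""
    return p_goal_verified(num_probes + 1) - p_goal_verified(num_probes) - probe_cost

# The cheaper the probes, the longer the marginal gain stays positive;
# with literally free probes the optimizer never has a reason to stop.
for cost in (1e-6, 1e-9, 1e-12):
    n = 0
    while marginal_gain(n, cost) > 0:
        n += 1
    print(f"probe cost {cost:g}: optimizer keeps building probes up to n = {n}")
```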
A UFAI that doesn't go around eating stars to make paperclips is probably already someone's attempted FAI. Bringing arbitrarily large amounts of mass-energy and negentropy under one's control is a Basic AI Drive, so you have to program the utility function to actually penalize it.
Only if the AI has goals that both require additional energy and lack a small, bounded success condition.
For example, if a UFAI (unfriendly from humanity's perspective) has a goal that requires humans to exist, but is not allowed to create more (or take actions that lead to their creation), then if all humans are already dead it won't do anything.
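A minimal sketch of that last case (toy agent, goal, constraint, and world model, all invented for illustration): when the goal needs living humans and the constraint forbids making more, every permitted action in a dead world scores zero, so the agent has no incentive to grab resources or do anything visible from outside.

```python
# Toy agent (hypothetical goal, constraint, and world model; illustrative only):
# the goal rewards living humans, the constraint forbids creating them, so in a
# world where they are all dead no permitted action is worth anything.

def utility(world):
    return world["living_humans"]          # goal requires humans to exist

def allowed(action):
    return action != "create_humans"       # may not create (or cause) more

def simulate(world, action):
    """Crude stand-in for the agent's world model."""
    new = dict(world)
    if action == "create_humans":
        new["living_humans"] += 1_000_000
    # "build_dyson_sphere", "do_nothing", etc. leave the count unchanged.
    return new

def best_action(world, actions):
    scored = [(utility(simulate(world, a)), a) for a in actions if allowed(a)]
    return max(scored)                      # highest-utility permitted action

dead_world = {"living_humans": 0}
print(best_action(dead_world, ["do_nothing", "build_dyson_sphere", "create_humans"]))
# Every permitted action scores 0, so resource acquisition buys the agent
# nothing; this kind of failure would also be invisible from afar.
```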