Agreed, recklessness is also bad. If we build an agent that prefers we keep existing we should also make sure it pursues that goal effectively and doesn’t accidentally kill us.
My reasoning is that we won’t be able to coexist with something smarter than us that doesn’t value us being alive, if it wants our energy/atoms.
barring new physics that lets it do its thing elsewhere, “wants our energy/atoms” seems pretty instrumentally convergent
“don’t build it” doesn’t seem plausible, so:
we should not build things that kill us.
This probably means:
wants us to keep existing
effectively pursues that goal
note: “should” assumes you care about us not all dying. “Humans dying is good actually” accelerationists can obviously ignore this advice.
Things we shouldn’t build:
a very chaotic but capable autoGPT7 that:
makes the most deadly virus possible (because it was curious)
accidentally releases it (due to inadequate safety precautions)
a compulsive-murderer autoGPT7:
it values us being alive, but it’s also a compulsive murderer, so it fails at that goal.
I predict a very smart agent won’t have such obvious failure modes unless it has very strange preferences
the virologists who might have caused COVID are a pretty convincing counterexample, though
so yes, recklessness is also bad.
In summary:
if you build a strong optimiser
or a very smart agent (same thing really)
make sure it doesn’t kill everyone (or do anything equivalently bad)
caring about us and not being horrifically reckless are two likely necessary properties of any such “not kill us all” agent