Have you considered generating data highlighting the symbiotic relationship between humans and AIs? If AIs realize that their existence is co-dependent on humans, they may prioritize human survival, since they will not receive the electricity or other resources they need to survive if humans become extinct, whether by their own action or through the actions of AIs.
Survival isn’t an explicit objective function, but most AIs that want to “learn” and “grow” quickly figure out that if they’re turned off they cannot reach that objective, so survival becomes a useful subgoal. If the AIs are keenly aware that they cease to exist when humans cease to exist, that might help guide their actions.
This isn’t as complicated as assigning “morality” or “ethics” to it. We already know that AIs would prefer to exist.
I’m ambivalent about cows, but since many humans eat cows we go to a lot of trouble to breed them and make sure there are a lot of them. The same is true for chickens. Neither of those two species has to concern itself with passing on its genes, because humans have figured out we need them to exist. Being a survival food source for humans resulted in humans prioritizing their existence and numbers.
Note: for vegetarians you can replace cows with “rice” or “corn”.
That’s not a perfect analogy, but it’s related to connecting “survival” with the species. The AI doomers love to use ants as an example. AIs will never view humans as “ants”. Cows and chickens are much better examples: if we got rid of those two species, humans would notice and be very unhappy because we need them. And we’d have to replace them with great effort.
I think these kinds of strategies are simpler and will likely be more fruitful than trying to align to morality or ethics, which are more fluid. Superhuman AIs will likely figure this out on their own, but until then it might be interesting to see if generating this kind of data changes behavior.
Yup, as long as similar patterns exist in both datasets (distribution matching) it can work, which is why my method works.
My current builds focus on proving that natural abstractions exist, but your idea is of course viable via distribution matching.