So if we ended up with some advanced AIs replacing humans, then we made some sort of mistake.
Again, I’m glad that we agree on this. I notice you want to do what I consider the right thing, and I appreciate that.
The way I currently envision the “typical” artificial conscience is that it would put a pretty strong conscience weight on not doing what its user wanted it to do, but this could be overruled by the even stronger conscience weight on not doing anything to prevent catastrophes. So the defensive, artificial conscience-guard-railed AI I’m thinking of would do the “last resort” things that were necessary to keep s-risks, x-risks, and major catastrophes from coming to fruition, even if this wasn’t popular with most people, at least up to a point.
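To make this weighting idea concrete, here is a minimal toy sketch in Python of how the override might work. Everything here (the weight values, the action names, the single boolean input) is a hypothetical illustration of the idea, not a proposal for how a real artificial conscience would actually be built:

```python
# Toy sketch of the "conscience weight" idea: disobeying the user carries
# a real conscience cost, but allowing a catastrophe carries a far larger
# one, so the latter can overrule the former. All values are hypothetical.

CONSCIENCE_WEIGHTS = {
    "disobey_user": 10.0,         # strong weight against overriding the user
    "allow_catastrophe": 1000.0,  # far stronger weight against standing by
}

def choose_action(request_would_enable_catastrophe: bool) -> str:
    """Pick the action with the lower total conscience cost."""
    if not request_would_enable_catastrophe:
        return "obey_user"  # no conflict: obeying costs nothing
    # Conflict: compare the cost of disobeying the user against the
    # cost of letting a catastrophe happen by complying.
    if CONSCIENCE_WEIGHTS["allow_catastrophe"] > CONSCIENCE_WEIGHTS["disobey_user"]:
        return "refuse_and_act"  # the "last resort" branch
    return "obey_user"

print(choose_action(request_would_enable_catastrophe=True))  # refuse_and_act
```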
I can see the following scenario occurring: the AI, with its artificial conscience, rightly decides that a pivotal act needs to be undertaken to avoid x-risk (or s-risk). However, the public mostly doesn’t recognize the existence of such risks. The AI proceeds to sabotage people’s unsafe AI projects against the public will. What happens now is: the public gets absolutely livid at the AI, which is subverting human power by acting against human will. Almost all humans team up to try to shut down the AI. The AI recognizes (and had already recognized) that if it loses, humans risk going extinct, so it fights this war against humanity and wins. I think in this scenario an AI, even one with an artificial conscience, could become the most hated thing on the planet.
I think people underestimate the amount of pushback we’re going to get once we get into pivotal-act territory. That’s why I think it’s hugely preferable to go the democratic route and not count on AI taking unilateral actions, even if it would be smarter, or even wiser, whatever that might mean exactly.
All that said, if we could somehow pause development of autonomous AIs everywhere around the world until humans got their act together, developed their own consciences and senses of ethics, and were working as one team to cautiously take the next steps forward with AI, that would be great.
So yes, I definitely agree with this. I don’t think a lack of conscience or ethics is the issue, though, but rather a lack of existential-risk awareness.
In terms of doing a pivotal act (which is usually thought of as preemptive, I believe), or just whatever defensive acts were necessary to prevent catastrophe, I hope the AI would be advanced enough to make decent predictions about the consequences of its actions, including how much “political capital” it would lose, and would then make its decisions strategically. Personally, if I had the opportunity to save the world from nuclear war, but everyone was going to hate me for it, I’d do it. But then, losing the ability to affect anything afterward wouldn’t matter for me the way it would for a guard-railed AI that could do a huge amount of good afterward if it weren’t shunned by society. Improving humans’ consciences and ethics would hopefully help keep them from hating the AI for saving them.
Also, if there were enough people, especially people in power, with strong consciences and senses of ethics, then maybe we’d be able to shift the political landscape from its current state, in which countries seemingly have different values and don’t trust each other, to one in which enforceable international agreements could be much more readily achieved.
I’m happy for people to work on increasing public awareness and trying for legislative “solutions,” but I think we should be working on artificial conscience at the same time. When there’s so much uncertainty about the future, it’s best to bet on a whole range of approaches, distributing your bets according to how likely you think each path is to succeed. I think people are underestimating the artificial conscience path right now, that’s all.
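As a rough illustration of the “distribute your bets” point, here is a toy calculation. The path names and probabilities are made up purely for the example; they aren’t estimates I’m endorsing:

```python
# Toy illustration: spread effort across approaches in proportion to
# each one's estimated chance of success. All numbers are invented.

paths = {
    "public_awareness": 0.2,
    "legislation": 0.3,
    "artificial_conscience": 0.5,
}

total = sum(paths.values())
for path, p in paths.items():
    print(f"{path}: {p / total:.0%} of effort")
```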
Thanks for all your comments!