The big problem is excess aggregation inherent in the “AI” concept.
The world has a simple backbone of entities and ways to interact with them, and you can make software that unreflectively propagates activity from one part of the backbone to another. Most of the tasks currently being addressed could be solved by such software, though for the most part they haven't been yet. This software can be genuinely useful, but it is also extremely exploitable by adversaries. Let's call this an opportunity propagator.
Because it is exploitable, one task it cannot solve is providing security. To make something less exploitable, it needs not just to propagate things along the backbone but also to run wildly deep searches for the most effective and robust methods. To search deeply, you need some guiding principle for the search, i.e. a utility function. Utility maximizers have all the standard AI safety issues.
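A toy sketch of the contrast (my own construction, not from the post; all names and the number-line "world" are hypothetical): an opportunity propagator applies one fixed local rule, so its behavior is perfectly predictable and hence exploitable, while a utility maximizer searches over multi-step plans scored by a utility function.

```python
from itertools import product

ACTIONS = [-1, 0, +1]   # move left, stay, move right on a number line (toy "backbone")
GOAL = 3                 # hypothetical target state

def propagator_step(state):
    """Opportunity propagator: one fixed, unreflective local rule.
    It always steps toward the goal, so an adversary can predict
    (and exploit) its behavior exactly."""
    if state < GOAL:
        return state + 1
    if state > GOAL:
        return state - 1
    return state

def utility(state):
    """Guiding principle for the search: closer to the goal is better."""
    return -abs(state - GOAL)

def maximizer_plan(state, depth=4):
    """Utility maximizer: exhaustively search all action sequences up to
    `depth` steps and keep the one whose end state scores highest."""
    best_plan, best_score = None, float("-inf")
    for plan in product(ACTIONS, repeat=depth):
        end_state = state + sum(plan)
        if utility(end_state) > best_score:
            best_plan, best_score = plan, utility(end_state)
    return best_plan

if __name__ == "__main__":
    print(propagator_step(0))   # -> 1: the same rule, every time
    print(maximizer_plan(0))    # -> a 4-step plan reaching the goal, chosen by search
```

The point of the toy is only the structural difference: the propagator never evaluates alternatives, while the maximizer's behavior is whatever the search over plans says scores best, which is where the standard safety issues come from.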
Human society currently cares about human well-being because the opportunity propagators that have been arranged into an approximate utility maximizer to provide security (e.g. human military personnel arranged into NATO) depend on human thriving (even something as generous as liberty and equality lets military units respond more dynamically to threats than traditional top-down structures do), and that concern is then generalized in various ways to all of society. Artificial intelligence provides value by making it unnecessary to rely on humans for opportunity propagation, which breaks the natural attractor toward corrigibility and the promotion of human thriving that current systems have.
People intuit that there's something wrong with the utility maximizer framing because current AI seems to be evolving in a different direction. That's true in the sense that opportunity propagators are a real thing and constitute ~the fundamental atoms of agency. But it doesn't actually solve the alignment problem, because we still need utility maximizers to provide security.