For one, I don’t think organizations of humans, in general, have more computational power than the individual humans making them up. I mean, at some level, yes, they obviously do in an additive sense, but that power consists of human nodes, none of whom devotes their full capacity to the organization (they’re not just drones under centralized control), connected to each other only by low-bandwidth, noisy channels. The organization might have a simple officially stated goal, written on paper and spoken by the humans involved, but the actual incentive structure and selection pressure may not allow the organization to focus on that official goal. I do think, in general, there is some goal an observer could usefully say these organizations are, in practice, trying to optimize for, and some other set of goals each human in them is trying to optimize for.
Perhaps an intelligence (artificial or natural) cannot necessarily, or even typically, be described as an optimiser? Instead we could only model it as an algorithm, or as a collection of tools/behaviours executed in some pattern.
I don’t think the latter sentence distinguishes ‘intelligence’ from any other kind of algorithm or pattern. I think that’s an important distinction. There are a lot of past posts explaining how an AI doesn’t have code, the way a human holds instructions on paper, but rather is its code. I think you can make the same point within a human: a human has lots of tools/behaviors, which it will execute in some pattern given a particular environment, and the instructions we consciously hold in mind are only one part of what determines that pattern.
I contain subagents with divergent goals, some of which are smarter and have greater foresight and planning than others, and those aren’t always the ones that determine my immediate actions. As a result, I do a much poorer job optimizing for what the part-of-me-I-call-“I” wants my goals to be than I theoretically could.
That gap is decreasing over time as I use the degree of control my intelligence gives me to gradually shape the rest of myself. It may never disappear, but I am much more goal-directed now than I was 10 years ago, or as a child. In other words, in some sense I am figuring out what I want my utility function to be (aka what I want my life, local environment, and world to look like), and self-modifying to increase my ability to apply optimization pressure towards achieving that.
My understanding of all this is partially driven by Robert Kegan’s model of adult mental development (see this summary by David Chapman), in which, as we grow up, we shift our point of view so that different aspects of ourselves become things we have, rather than things we are. We start seeing our sensory experiences, our impulses, our relationships to others, and our relationships to the systems we use and operate in as objects we can manipulate in pursuit of goals, instead of as what we are, and doing this makes us more effective in achieving our stated goals. I don’t know if the idea would translate to any particular AI system, but in general, having explicit goals, and being able to redirect available resources towards those goals, makes a system more powerful. So if a system has any goals and self-modifying ability at all, then becoming more like an optimizer will likely be a useful instrumental sub-goal, in the same way that accumulating other resources and forms of power is a common convergent sub-goal. And a system that can’t, in any way, be said to have goals at all… either it doesn’t act at all and we don’t need to worry about it so much, or it acts in ways we can’t predict and is therefore potentially extremely dangerous if it gets more powerful tools and behaviors.