Intelligence is determined by the number of code orders, or levels of sign recursion, capable of being created and held in memory by an agent.
Well, depends on what you mean by “intelligence.”
Certainly you need to be pretty smart to even represent the idea of “human flourishing” in a way that we’d recognize. But this kind of “smartness” need not correspond to a behavioristic sort of smartness that’s part of a system that optimizes for any particular goal; it’s more of a predictive sort of smartness, where you internally represent the world in a predictively powerful way.
(The specifics of your postulated definition, especially that recursion->intelligence, seem like a not-very-useful way to define things, since Turing completeness probably means that once you clear a fairly low bar, your amount of possible recursion is just a measure of your hardware, when we usually want ‘intelligence’ to also capture something about your software. But the more standard information-theoretic notion of coding for a goal within a world-model would also say that bigger world models need (on average) bigger codes.)
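As a rough numerical illustration of that last sentence (a back-of-the-envelope sketch, not anyone’s actual formalism): if a final goal is, at minimum, a pointer to some preferred state inside the agent’s world model, then the shortest code that singles it out grows with the number of states the model distinguishes.

```python
import math

# Crude illustration: treat a goal as a pointer to one preferred state among the
# N states an agent's world model can distinguish. Picking out one state costs
# roughly log2(N) bits, so bigger world models need (on average) bigger goal codes.
# The state counts below are invented purely for illustration.
for n_states in [16, 1_000, 1_000_000, 10**12]:
    bits = math.log2(n_states)
    print(f"{n_states:>15,} distinguishable states -> ~{bits:5.1f} bits to code a goal")
```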
Its internal representation of the final step in its plan, its final goal, is now much more expensive than the simple shape and color somewhere in the environment that the less smart AI takes as its final goal; indeed, the cheese itself takes on new meaning as a digital asset.
Alright, we’ve established that the two axes of final goals and intelligence are connected in terms of complexity and sophistication.
You’re trying to do all your thinking from “inside the head” of the AI here. But what if we use a different, human-perspective way of talking about goals? I.e., rather than imagining a model of the world that gets more complex as the AI gets smarter, fix some outside perspective and evaluate the agent’s goal by doing intentional-stance-type reasoning within that outside perspective.
(Note that this has to confront the fact that any behavior is compatible with a wide range of possible goals, which you can only prune down through assumptions about what an “agent” is, and what it’s likely to do. Whether you say the dumb agent “wants to win the game” or “doesn’t want to win the game, just wants to have low distance to yellow pixels” depends on assumptions. But this is worth it because we, as humans, don’t actually care [for safety purposes] about the complexity of the agent’s goal in its own world-model, we care about the descriptions we’d use.)
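To make that underdetermination concrete, here is a hypothetical toy example of my own (the gridworld, the trajectory, and both reward functions are invented): the same observed behavior comes out optimal under a “wants to win” reading and under a “just wants low distance to yellow pixels” reading, so the behavior alone doesn’t settle which goal to ascribe.

```python
# Toy example: one observed trajectory, two goal ascriptions, both of which
# rate that trajectory as optimal among all physically possible alternatives.

YELLOW_TILE = 4                  # the goal square, rendered in yellow
trajectory = [0, 1, 2, 3, 4]     # observed agent positions on a 1-D track

def return_win(traj):
    """Reading #1: the agent 'wants to win' -- +1 only when on the goal square."""
    return sum(1.0 if pos == YELLOW_TILE else 0.0 for pos in traj)

def return_yellow(traj):
    """Reading #2: the agent 'just wants low distance to yellow pixels'."""
    return sum(-abs(pos - YELLOW_TILE) for pos in traj)

def legal_paths(start, n_steps, n_tiles=5):
    """All trajectories where the agent moves -1, 0, or +1 tile per step."""
    paths = [[start]]
    for _ in range(n_steps):
        paths = [p + [p[-1] + d] for p in paths for d in (-1, 0, 1)
                 if 0 <= p[-1] + d < n_tiles]
    return paths

alternatives = legal_paths(start=0, n_steps=4)
print(return_win(trajectory) == max(return_win(t) for t in alternatives))        # True
print(return_yellow(trajectory) == max(return_yellow(t) for t in alternatives))  # True
```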
When we’re taking the human perspective, it’s fine to say “the smarter agent has such a richer and more complex conception of its goal,” without that implying that the smarter agent’s goal has to be different than the dumber agent’s goal.
what I wish to dispel is the idea that it is likely that artificial agents will have alien values,
a) Actions like exploration or “play” could be derived (instrumental) behaviors, rather than final goals. The fact that exploration is given as a final goal in many present-day AI systems is certainly interesting, but isn’t very relevant to the abstract theoretical argument.
b) Even if you assume that every smart AI has “and also, explore and play” as part of its goals, that doesn’t mean the other stuff can’t be alien.
(The specifics of your postulated definition, especially that recursion->intelligence, seem like a not-very-useful way to define things, since Turing completeness probably means that once you clear a fairly low bar, your amount of possible recursion is just a measure of your hardware, when we usually want ‘intelligence’ to also capture something about your software. But the more standard information-theoretic notion of coding for a goal within a world-model would also say that bigger world models need (on average) bigger codes.)
So it might be a bit confusing, but by recursion here I did not mean how many loops you run in a program; I meant what order of signs you can create and store, which is a statement about software. Basically, how many signs you can meaningfully connect to one another. Not all hardware can represent higher-order signs; an easy example is a single-layer vs. a multilayer perceptron. Perhaps recursion was the wrong word, but at the time I was thinking about how a sign can refer to another sign that refers to another sign and so on, creating a chain of signifiers which is still meaningful so long as the higher-order signs refer to more than one lower-order sign.
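To make the perceptron example concrete (this is the standard textbook XOR illustration, not anything specific to the post): no single linear threshold unit can represent XOR, but stacking two layers of the same units can, because the hidden layer creates intermediate features that the output unit then refers to, which is roughly the “sign that refers to other signs” picture.

```python
# Toy sketch: XOR, a "higher-order" combination of its inputs, cannot be computed
# by one linear threshold unit, but a two-layer perceptron computes it by first
# forming intermediate features (OR, NAND) and then combining them.

def step(z):
    return 1 if z >= 0 else 0

def unit(inputs, weights, bias):
    """One linear threshold unit (a classic perceptron)."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)

def xor_two_layer(x1, x2):
    h_or   = unit((x1, x2), (1, 1),  -0.5)     # fires if x1 OR x2
    h_nand = unit((x1, x2), (-1, -1), 1.5)     # fires unless x1 AND x2
    return unit((h_or, h_nand), (1, 1), -1.5)  # fires if both hidden units fire

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", xor_two_layer(x1, x2))
# A brute-force search over single-unit weights would confirm that no choice of
# (w1, w2, bias) reproduces this table: XOR is not linearly separable.
```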
When we’re taking the human perspective, it’s fine to say “the smarter agent has such a richer and more complex conception of its goal,” without that implying that the smarter agent’s goal has to be different than the dumber agent’s goal.
The point of bringing semiotics into the mix here is to show that the meaning of a sign, such as a goal, is dependent on the things we associate with it. The human perspective is just a way of expressing that goal at one moment in time, with our specific associations attached.
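One very crude way to picture that claim as a data structure (a toy sketch of mine, not a theory of semiotics): read a sign’s meaning off the other signs it is linked to, so the “same” goal-sign means something richer to an agent with a larger association network, and the human-perspective description is a readout relative to our own network. All the associations below are invented.

```python
# Toy sketch: a sign's "meaning" as its neighborhood of associated signs.
# The same goal-sign ("cheese") reads differently against different networks.

human_associations = {
    "cheese": {"food", "reward", "yellow", "maze"},
}

smart_agent_associations = {
    "cheese": {"food", "reward", "yellow", "maze",
               "scarce resource", "tradable asset", "signal of success"},
}

def meaning(sign, associations):
    """Here, the 'meaning' of a sign is just the set of signs it is linked to."""
    return associations.get(sign, set())

print(meaning("cheese", human_associations))
print(meaning("cheese", smart_agent_associations))
# Same sign, richer neighborhood: "it wants the cheese" is a readout of the
# first network, even when the second network is the one doing the wanting.
```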
a) Actions like exploration or “play” could be derived (instrumental) behaviors, rather than final goals. The fact that exploration is given as a final goal in many present-day AI systems is certainly interesting, but isn’t very relevant to the abstract theoretical argument.
In my follow-up post I actually show the way in which it is relevant.
b) Even if you assume that every smart AI has “and also, explore and play” as part of its goals, that doesn’t mean the other stuff can’t be alien.
The argument about alien values isn’t the logical one but the statistical one: any AI situated in human culture will have values that are likely to be related to the signs created and used by that culture, although we can expect outliers.