more on predicting agents
Suppose you want to predict the behavior of an agent. I stand corrected. To make the prediction, as a predictor you need:
observations of the agent
the capacity to model the agent to a sufficient degree of accuracy
“Sufficient accuracy” here is a threshold on, for example, KL divergence or perhaps some measure that depends on utilities of predictions in the more complex case.
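As a rough illustration of what such a threshold could look like, here is a minimal Python sketch for the simplest case, where the agent's behavior and the predictor's model are both discrete distributions over the same actions. The function names, the example distributions, and the 0.05 cutoff are all assumptions made for illustration; nothing above fixes a particular threshold.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same outcomes."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # outcomes the agent never produces contribute nothing
        if qi == 0.0:
            return math.inf  # the model rules out something the agent actually does
        total += pi * math.log(pi / qi)
    return total

def sufficiently_accurate(agent_dist, model_dist, threshold=0.05):
    """Hypothetical test: the model counts as 'sufficient' if KL is under the threshold."""
    return kl_divergence(agent_dist, model_dist) <= threshold

# Illustrative numbers only: the agent's true action frequencies and two candidate models.
agent = [0.7, 0.2, 0.1]
good_model = [0.65, 0.25, 0.10]
poor_model = [1 / 3, 1 / 3, 1 / 3]

print(sufficiently_accurate(agent, good_model))
print(sufficiently_accurate(agent, poor_model))
```

A utility-weighted measure, as hinted at for the more complex case, would replace kl_divergence with a loss that depends on what the predictions are used for.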
When we talk about the intelligence of a system, or the relative intelligence between agents, one way to think of it is as the ability of one agent to predict another.
Consider a game where an agent, A, acts on the basis of an arbitrarily chosen polynomial function of degree k. A predictor, P, can observe A and build predictive models of it. Predictor P has the capacity to represent predictive models that are polynomial functions of degree j.
If j ≥ k, then predictor P will in principle be able to predict A with perfect accuracy. If j < k, then most of the time there will be cases where P predicts inaccurately. If we say (just for the sake of argument) that perfect predictive accuracy is the test for sufficient capacity, we could say that in the j < k case P does not have sufficient capacity to represent A.
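The polynomial game can be sketched in a few lines of Python. This is one possible setup, not the only one: it assumes P observes A at j + 1 distinct inputs and fits the unique polynomial of degree at most j through them (Lagrange interpolation), then is tested on inputs it has not seen. The helper names and the example polynomial are hypothetical. Exact rational arithmetic is used so that "perfect accuracy" is a meaningful test.

```python
from fractions import Fraction

def evaluate(coeffs, x):
    """Evaluate a polynomial with coefficients [c0, c1, ...] (lowest degree first)."""
    result = Fraction(0)
    for c in reversed(coeffs):
        result = result * x + c
    return result

def interpolate(points, x):
    """Value at x of the unique polynomial of degree <= len(points) - 1 through points."""
    total = Fraction(0)
    for i, (xi, yi) in enumerate(points):
        term = Fraction(yi)
        for m, (xm, _) in enumerate(points):
            if m != i:
                term *= Fraction(x - xm, xi - xm)
        total += term
    return total

def predictor_is_exact(agent_coeffs, j, test_xs):
    """True if a degree-j model fitted to j + 1 observations predicts A on every test input."""
    observations = [(x, evaluate(agent_coeffs, x)) for x in range(j + 1)]
    return all(interpolate(observations, x) == evaluate(agent_coeffs, x)
               for x in test_xs)

agent = [2, -1, 0, 3]      # A(x) = 2 - x + 3x^3, so k = 3 (illustrative choice)
test_xs = range(10, 15)    # inputs P has not observed

for j in (2, 3, 4):
    print(j, predictor_is_exact(agent, j, test_xs))
```

When P's degree is at least A's, the fitted model coincides with A everywhere. When it is lower, the two can agree only at a limited number of points, so P mispredicts on almost every unseen input.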
When we talk about the relative intelligence between agents in an adversarial context, this is one way to think about the problem. One way that an agent can have a decisive strategic advantage over another is if it has the capacity to predict the other agent and not vice-versa.
The expressive power of the model space available to P is only one of the ways in which P might have or not have capacity to predict A. If we imagine the prediction game extended in time, then the computational speed of P—what functions it can compute within what span of real time—relative to the computational speed of A could be a factor.
Note that these are ways of thinking about the relative intelligence between agents that do not have anything explicitly to do with “optimization power” or a utility function over outcomes. It is merely about the capacity of agents to represent each other.
One nice thing about representing intelligence in this way is that it does not require an agent’s utility function to be stable. In fact, it would be strange for an agent that became more intelligent to have a stable utility function, because the range of possible utility functions available to a more intelligent agent is greater. We would expect that an agent that grows in its understanding would change its utility function, if only because doing so would make it less predictable to adversarial agents that would exploit its simplicity.
By this definition, you can make it harder for someone to be smarter than you by being more random, even though this will hinder your ability to produce utility and thus make you less intelligent by a more common definition. In the most extreme case, nothing is smarter than a true random number generator, which it seems clear is not intelligent at all.
If you can predict what someone will do, then it seems like you must be at least as intelligent as them, since you can just do what they’d do. But you might underestimate the effectiveness of their strategies and overestimate the effectiveness of your own, and thus use your own, less effective strategy. For example, perhaps Alice favors teamwork and Bob favors independence, and this is common knowledge. Each of them will be able to predict the actions of the opponent, but will believe that their own actions would be more advantageous. Only one of them is more intelligent, and you’re not likely to figure out which without actually testing them to see which strategy works better.
It’s a trivial nitpick, but I feel it should be pointed out that there could be many reasons other than “the individual whose strategy worked is more intelligent” for one strategy to work better than another, especially in a single test.
If you test it multiple times in a variety of different circumstances, and one works better, then the person using it is more instrumentally rational.
Think of it like this: an intelligent being uses a variety of heuristics to figure out what to do. These heuristics need to be properly tuned to work well. It’s not that intelligent people are more capable of tuning their heuristics. It’s that tuning their heuristics is what makes them intelligent.
Game theory predicts that in some cases, an agent with a fixed utility function will randomize its actions (for example, the Nash equilibrium strategy for rock paper scissors is to randomize equally between all 3). If true randomness is unavailable, an agent may use its computational power to compute expensive pseudorandom numbers that other agents will have difficulty also computing. There is no need for the agent to change its utility function. Changing its utility function would be likely to cause the agent to optimize for different things than the previous utility function would optimize for; therefore, if the agent is acting according to the original utility function, changing the utility function is unlikely to be considered a good action.
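To make the rock paper scissors point concrete, here is a small Python sketch. The payoff convention (+1 win, 0 tie, -1 loss) and the "rock_heavy" opponent are assumptions chosen for illustration. It computes the best expected payoff an opponent can get against a known mixed strategy.

```python
from fractions import Fraction

MOVES = ("rock", "paper", "scissors")
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def payoff(mine, theirs):
    """+1 if my move wins, -1 if it loses, 0 on a tie."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1

def expected_payoff(my_mix, their_mix):
    """Expected payoff when both players draw independently from their mixes."""
    return sum(my_mix[a] * their_mix[b] * payoff(a, b)
               for a in MOVES for b in MOVES)

def best_response_value(their_mix):
    """Best expected payoff available to someone who knows their_mix exactly."""
    return max(sum(their_mix[b] * payoff(a, b) for b in MOVES) for a in MOVES)

uniform = {m: Fraction(1, 3) for m in MOVES}
rock_heavy = {"rock": Fraction(1, 2), "paper": Fraction(1, 4), "scissors": Fraction(1, 4)}

print(best_response_value(uniform))     # 0
print(best_response_value(rock_heavy))  # 1/4
```

Knowing the uniform mix perfectly gains a predictor nothing, while any skew away from it can be exploited. The agent gets this protection with the same utility function it started with.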
Given that changing your utility function is generally a bad thing for a utility maximizer, it does not seem like this will happen. Instead, it seems more likely that the agent’s modeling ability will improve and this will change its observed behavior, possibly making it less predictable. You can often change an agent’s behavior quite a lot by changing its beliefs.
There is certainly the important issue of deciding what the old utility function, defined relative to the agent’s old model of the world, means if the agent’s model of the world changes, as explored in this paper, but this does not lead to the agent taking on a fundamentally different utility function, only a faithful representation of the original one.
Note that the second requirement is a doozy. Many dynamical nonlinear systems are in fact formally unpredictable past a certain threshold. In other words, it’s quite common for ‘a sufficient degree of accuracy’ in modeling a system to simply not exist.
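A quick way to see this is with a toy chaotic system. The sketch below uses the logistic map with r = 4, which is my choice of example rather than anything from the post, and compares the true trajectory with a model whose initial measurement is off by one part in ten billion. The function name and the specific numbers are hypothetical.

```python
def logistic_trajectory(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x) and return the whole trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

# The "agent" and a predictor whose only flaw is a tiny measurement error.
true_path = logistic_trajectory(0.2, 60)
model_path = logistic_trajectory(0.2 + 1e-10, 60)
errors = [abs(a - b) for a, b in zip(true_path, model_path)]

# Print the prediction error at a few horizons.
for step in (0, 5, 20, 40, 60):
    print(step, errors[step])
```

The error stays negligible for the first several steps and then grows until the model's forecast is no better than a guess. Beyond that horizon, no realistic measurement precision buys a 'sufficient degree of accuracy'.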
In the domain of human-created agents, your system will often be predictable, because the human that created it had a specific goal. But extending this to humans themselves may end up being problematic for the aforementioned reason; in that case, you’re likely limited to probabilistic reasoning no matter how great the abilities of your predictor agent.
Adult humans are also very much human created. That’s what the educational system is trying to do.
When you ask a child for the answer to ‘2+2’ they might tell you ‘green’. An adult is a lot more predictable because he has learned the ‘right’ answer.
Our best efforts to teach people to fit the schema sometimes fail, but not always.
True, and even much wider than the educational system, but I would probably rephrase to say that this makes human intelligence predictable within narrow domains. A math class strives to make students predictable when attempting to solve math problems, a legal system hopes to make humans predictable in the domain of violent conflict resolution, a religion hopes to make humans predictable in metaphysical inquiry, etc.
But human intelligence itself is fully general (in that we defined ‘fully general’ to mean ‘like me’), so there’s not really any form of training or education that can make, or attempts to make, human intelligence predictable across all domains.
I miss some discussion of the effects of one agent predicting the other agent predicting itself.
I wonder if there are non-agents who appear hostile because they seem to predict your actions and thwart them.
Pathogens evolving resistance.
That’s not quite it. You can tell that they do not predict your actions, just react to them, if in rather versatile ways.