I… actually don’t know what myopia is supposed to mean in the AI context (I had previously commented that the post Defining Myopia doesn’t define myopia and am still kinda waiting on a more succinct definition)
Heh. I actually struggled to figure out which post to link there because I was looking for one that would provide a clear, canonical definition, and ended up just picking the tag page. Here are a couple definitions buried in those posts though:
We can think of a myopic agent as one that only considers how best to answer the single question that you give to it rather than considering any sort of long-term consequences
(from: Towards a mechanistic understanding of corrigibility)
I’ll define a myopic reinforcement learner as a reinforcement learning agent trained to maximise the reward received in the next timestep, i.e. with a discount rate of 0.
...
I should note that so far I’ve been talking about myopia as a property of a training process. This is in contrast to the cognitive property that an agent might possess, of not making decisions directly on the basis of their long-term consequences; an example of the latter is approval-directed agents.
(from: Arguments against myopic training)
So, a myopic agent is one that only considers the short-term consequences when deciding how to act. And a myopic learner is one that is only trained based on short-term feedback.
(And perhaps worth noting, in case it’s not obvious, I assume the name was chosen because myopia means short-sightedness, and these potential AIs are deliberately made to be short-sighted, s.t. they’re not making long-term, consequentialist plans.)
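To make the “discount rate of 0” definition concrete, here’s a minimal toy sketch (my own example, not taken from either post) using tabular Q-learning on a made-up two-state chain: with gamma = 0 the update ignores all future value, so the learner prefers a small immediate reward over a larger delayed one, while a standard discounted learner does the opposite.

```python
import random

def q_update(q, state, action, reward, next_state, gamma, alpha=0.1):
    # With gamma = 0 the bootstrap term vanishes: only the immediate reward
    # is learned, which is the "myopic reinforcement learner" described above.
    bootstrap = gamma * max(q[next_state].values())
    q[state][action] += alpha * (reward + bootstrap - q[state][action])

# Toy chain: from "start", "grab" pays 1 now; "wait" pays 0 but moves to
# "later", where "cash_in" pays 10 and returns to "start".
ACTIONS = ["grab", "wait", "cash_in"]

def train(gamma, episodes=3000, steps=2):
    q = {s: {a: 0.0 for a in ACTIONS} for s in ["start", "later"]}
    for _ in range(episodes):
        s = "start"
        for _ in range(steps):
            a = random.choice(ACTIONS)  # pure random exploration keeps the sketch short
            if s == "start":
                r, s_next = (1, "start") if a == "grab" else (0, "later")
            else:
                r, s_next = (10, "start") if a == "cash_in" else (0, "later")
            q_update(q, s, a, r, s_next, gamma)
            s = s_next
    return q

myopic = train(gamma=0.0)    # ends up preferring "grab" from "start" (Q ≈ 1 vs 0)
standard = train(gamma=0.9)  # ends up preferring "wait" (the delayed 10 dominates)
print(myopic["start"], standard["start"])
```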
My take on myopia is that it’s “shortsightedness” in the sense of only trying to do “local work”. If I ask you what two times two is, you say “four” because it’s locally true, rather than because you anticipate the consequences of different numbers, and say “five” because that will lead to a consequence you prefer. [Or you’re running a heuristic that approximates that anticipation.]
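For concreteness, a toy sketch of that distinction (my own illustration, with a made-up “world model”): the myopic function just does the local computation, while the consequentialist one scores candidate replies by their predicted downstream consequences.

```python
# Toy contrast: a myopic answerer does the "local work"; a consequentialist
# answerer picks whichever reply it predicts will have the best consequences.

def myopic_answer(question: str) -> str:
    """Answer because it's locally true, full stop."""
    return "four" if question == "what is two times two?" else "don't know"

def consequentialist_answer(question: str, predicted_value) -> str:
    """Choose the reply whose anticipated consequences score highest,
    which need not be the true answer."""
    candidates = ["four", "five"]
    return max(candidates, key=lambda reply: predicted_value(question, reply))

# A hypothetical world model that happens to prefer what saying "five" causes.
world_model = lambda q, reply: 1.0 if reply == "five" else 0.0

print(myopic_answer("what is two times two?"))                         # four
print(consequentialist_answer("what is two times two?", world_model))  # five
```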
If you knew that everyone in a bureaucracy were just “doing their jobs”, you’d have a sort of transparency guarantee: you’d just need to follow the official flow of information to see what’s happening. No one would design a shadow bureaucracy or attempt a takeover unless explicitly asked to.
However, training doesn’t give you this by default; people in the bureaucracy are incentivized to make their individual departments better, to say what the boss wants to hear, to share gossip at the water cooler, and so on. One of the scenarios people consider is the case where you’re training an AI to solve some problem, and at some point it realizes it’s being trained to solve that problem and so starts performing as well as it can on that metric. In animal reinforcement training, people often talk about how you’re training the animal to perform tricks for rewards and, at the same time, the animal is training you to reward it! The situation is subtly different here, but the basic figure-ground inversion holds.