Heh. I actually struggled to figure out which post to link there because I was looking for one that would provide a clear, canonical definition, and ended up just picking the tag page. Here are a couple definitions buried in those posts though:
We can think of a myopic agent as one that only considers how best to answer the single question that you give to it rather than considering any sort of long-term consequences
(from: Towards a mechanistic understanding of corrigibility)
I’ll define a myopic reinforcement learner as a reinforcement learning agent trained to maximise the reward received in the next timestep, i.e. with a discount rate of 0.
...
I should note that so far I’ve been talking about myopia as a property of a training process. This is in contrast to the cognitive property that an agent might possess, of not making decisions directly on the basis of their long-term consequences; an example of the latter is approval-directed agents.
(from: Arguments against myopic training)
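(A quick way to see what "discount rate of 0" means, using the standard discounted-return notation: the objective being maximised is $G_t = \sum_{k=0}^{\infty} \gamma^k r_{t+k+1}$, and setting $\gamma = 0$ collapses this to $G_t = r_{t+1}$, so the training signal is just the very next reward.)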
So, a myopic agent is one that only considers the short-term consequences when deciding how to act. And a myopic learner is one that is only trained based on short-term feedback.
(And perhaps worth noting, in case it’s not obvious, I assume the name was chosen because myopia means short-sightedness, and these potential AIs are deliberately made to be short-sighted, s.t. they’re not making long-term, consequentialist plans.)