Why should I agree that a boundedly rational agent’s goals need to be computationally tractable? Humans routinely have goals and desires they lack the capability to achieve. Sometimes they make plans to try to increase tractability, and sometimes those plans work, but there’s nothing odd about intractable goals. It might be a mistake in some sense to build such an agent, but that’s a different question.
“Computationally tractable” is Yudkowsky’s framing and might be too limited. The kind of thing I believe is, for example, that an animal without a certain level of brain complexity will tend not to be a social animal and is therefore unlikely to have the sort of values social animals have. And animals that can’t do math aren’t going to value mathematical aesthetics the way human mathematicians do.
Ah ok, that makes sense. That’s more about being able to understand what the goal is, not about the ability to compute which actions would achieve it.