I’m not sure why you seem to think that I think of optionality-empowerment estimates as requiring anything resembling omniscience.
If we assume omniscience, it allows a very convenient type of argument:
Argument I [invalid]: Suppose an animal has a generic empowerment drive. We want to know whether it will do X. We should ask: Is X actually empowering?
However, if we don’t assume omniscience, then we can’t make arguments of that form. Instead we need to argue:
Argument II [valid]: Suppose an animal has a generic empowerment drive. We want to know whether it will do X. We should ask: Has the animal come to believe (implicitly or explicitly) that doing X is empowering?
I have the (possibly false!) impression that you’ve been implicitly using Argument I sometimes. That’s how omniscience came up.
For example, has a newborn bat come to believe (implicitly or explicitly) that flapping its arm-wings is empowering? If so, how did it come to believe that? The flapping doesn’t accomplish anything, right? They’re too young and weak to fly, and don’t necessarily know that flying is an eventual option to shoot for. (I’m assuming that baby bats will practice flapping their wings even if raised away from other bats, but I didn’t check, I can look it up if it’s a crux.) We can explain a sporadic flap or two as random exploration / curiosity, but I think bats practice flapping way too much for that to be the whole explanation.
Back to play-fighting. A baby animal is sitting next to its sibling. It can either play-fight, or hang out doing nothing. (Or cuddle, or whatever else.) So why play-fight?
Here’s the answer I prefer. I note that play-fighting as a kid presumably makes you a better real-fighter as an adult. And I don’t think that’s a coincidence; I think it’s the main point. In fact, I thought that was so obvious that it went without saying. But I shouldn’t assume that—maybe you disagree!
If you agree that “child play-fighting helps train for adult real-fighting” not just coincidentally but by design, then I don’t see the “Argument II” logic going through. For example, animals will play-fight even if they’ve never seen a real fight in their life.
So again: Why don’t your dog & cat just ignore each other entirely? Sure, when they’re already play-fighting, there are immediately-obvious reasons that they don’t want to be pinned. But if they’re relaxing, and not in competition over any resources, why go out of their way to play-fight? How did they come to believe that doing so is empowering? Or if they are in competition over resources, why not real-fight, like undomesticated adult animals do?
maximizing optionality automatically learns all motor skills—even up to bipedal walking
I agree, but I don’t think that’s strong evidence that nothing else is going on in humans. For example, there’s a “newborn stepping reflex”—newborn humans have a tendency to do parts of walking, without learning, even long before their muscles and brains are ready for the whole walking behavior. So if you say “a simple generic mechanism is sufficient to explain walking”, my response is “Well it’s not sufficient to explain everything about how walking is actually implemented in humans, because when we look closely we can see non-generic things going on”.
Here’s a more theoretical perspective. Suppose I have two side-by-side RL algorithms, learning to control identical bodies. One has some kind of “generic” empowerment reward. The other has that same reward, plus a reward-shaping system that directly incentivizes learning to use a small number of key affordances that are known to work well for that particular body (e.g. standing).
I think the latter would do all the same things as the former, but it would learn faster and more reliably, particularly very early on. Agree or disagree? If you agree, then we should expect to find that in the brain, right?
(When I say “more reliably”, I’m referring to the trope that programming RL agents is really finicky, more so than other types of ML. I don’t really know if that trope is correct though.)
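To make the two-learner thought experiment concrete, here is a minimal sketch of what the second learner's reward could look like. It assumes the shaping takes the potential-based form, which is my choice for illustration and not a claim about what the brain does. The names `shaped_reward`, `standing_potential`, and the toy state numbering are all hypothetical, and the "generic" reward values are made-up placeholders rather than a real empowerment estimate.

```python
from typing import Callable, Sequence

State = int  # hypothetical toy state index; states >= 2 count as "standing"


def standing_potential(s: State) -> float:
    """Hypothetical potential: 1.0 for body-specific 'good' states, else 0.0."""
    return 1.0 if s >= 2 else 0.0


def shaped_reward(
    generic_reward: float,
    phi: Callable[[State], float],
    s: State,
    s_next: State,
    gamma: float,
) -> float:
    """Generic reward plus a potential-based shaping bonus.

    The bonus is gamma * phi(s_next) - phi(s). Shaping of this form is
    known to leave the optimal policy unchanged, so it can only affect
    how quickly the learner gets there.
    """
    return generic_reward + gamma * phi(s_next) - phi(s)


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Sum of gamma**t * r_t over one trajectory."""
    return sum(gamma**t * r for t, r in enumerate(rewards))


# One toy trajectory: the body goes from lying down (0) to standing (2, 3).
gamma = 0.9
states = [0, 1, 2, 3]
generic = [0.1, 0.0, 0.2]  # placeholder "generic" rewards, one per transition

shaped = [
    shaped_reward(r, standing_potential, s, s_next, gamma)
    for r, s, s_next in zip(generic, states[:-1], states[1:])
]

# The shaping terms telescope: the two returns differ only by
# gamma**T * phi(s_T) - phi(s_0), which depends on the endpoints alone.
gap = discounted_return(shaped, gamma) - discounted_return(generic, gamma)
```

The point of the sketch is that the second learner gets an early, dense signal on the transition into standing, while the long-run comparison between behaviors is left as it was. Whether a real brainstem-to-cortex shaping signal has this policy-preserving property is an open question as far as I know.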
Sure, but there is hardly room in the brainstem to reward-shape for the [10^100] different things humans can learn to do.
I hope we’re not having one of those silly arguments where we both agree that empowerment explains more than 0% and less than 100% of whatever, and then we’re going back and forth saying “It’s more than 0%!” “No way, it’s less than 100%!” “No way, it’s more than 0%!” … :)
Anyway, I think the brainstem “knows about” some limited number of species-typical behaviors, and can probably execute those behaviors directly without learning, and also probably reward-shapes the cortex into learning those behaviors faster. Obviously I agree that the cortex can also learn pretty much arbitrary other behaviors, like ballet and touch-typing, which are not specifically encoded in the brainstem.
Thanks!