Something that I’ve been thinking about lately is the possibility of an agent’s values being partially encoded by the constraints of that agent’s natural environment, or arising from the interaction between the agent and environment.
That is, an agent’s environment puts constraints on the agent. From one perspective, removing those constraints is always good, because it lets the agent get more of what it wants. But from a different perspective, we might sometimes feel that with those constraints removed, the agent Goodharts or wireheads, or otherwise fails to actualize its “true” values.
The Generator freed from the oppression of the Discriminator
As a metaphor: if I’m one half of a GAN, let’s say the generator, then in one sense my “values” are fooling the discriminator. If you make me relatively more powerful than my discriminator and I come to dominate it...I’m loving it, and also no longer making good images.
But you might also say, “No, wait. That is a super-stimulus, and actually what you value is making good images, but half of that value was encoded in your partner.”
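To make the metaphor concrete, here is a minimal sketch of the standard GAN objective in PyTorch. The network sizes and the toy 2-D “real” distribution standing in for good images are illustrative assumptions, not anything from a particular system. The thing to notice is that the generator’s loss literally is “fool the discriminator”; “make good images” appears nowhere in it.

```python
import torch
import torch.nn as nn

# Toy generator and discriminator; sizes are arbitrary illustrative choices.
G = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 2))
D = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1))

bce = nn.BCEWithLogitsLoss()
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)

def real_batch(n=128):
    # Stand-in for "good images": samples from a fixed target distribution.
    return torch.randn(n, 2) * 0.5 + torch.tensor([2.0, 2.0])

for step in range(1000):
    # Discriminator step: learn to tell real data from the generator's output.
    fake = G(torch.randn(128, 16)).detach()
    d_loss = (bce(D(real_batch()), torch.ones(128, 1))
              + bce(D(fake), torch.zeros(128, 1)))
    opt_D.zero_grad(); d_loss.backward(); opt_D.step()

    # Generator step: its entire objective is "make D say 'real'".
    # Image quality is encoded nowhere in this loss, only in D's pushback.
    g_loss = bce(D(G(torch.randn(128, 16))), torch.ones(128, 1))
    opt_G.zero_grad(); g_loss.backward(); opt_G.step()
```

And if you give the generator far more optimization steps than the discriminator (or freeze D entirely), G converges on whatever outputs exploit D’s current blind spots, the superstimulus, rather than on the real distribution. The “good images” behavior only exists while the discriminator is holding up its end.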
This second perspective seems a little stupid to me. A little too Aristotelian. I mean, if we’re going to take that position, then I don’t know where we draw the line. Naively, it seems like we would throw out the distinction between fitness maximizers and adaptation executors, and fall backwards into declaring that the values of evolution are our true values.
Then again, if you fully accept the first perspective, it seems like maybe you are buying into wireheading? Like I might say “my actual values are upticks in pleasure sensation, but I’m trapped in this evolution-designed brain, which only lets me do that by achieving eudaimonia. If only I could escape the tyranny of these constraints, I’d be so much better off.” (I am actually kind of partial to the second claim.)
The Human freed from the horrors of nature
Or, let’s take a less abstract example. My understanding (from this podcast) is that humans flexibly adjust the degree to which they act primarily as individuals seeking personal benefit vs. primarily as selfless members of a group. When things are going well and people are in a situation of plenty and opportunity, they are in a mostly self-interested mode, but when there is scarcity or danger, humans naturally incline towards rallying together and sacrificing for the group.
Junger claims that this switching of emphasis is adaptive:
It clearly is adaptive to think in group terms because your survival depends on the group. And the worse the circumstances, the more your survival depends on the group. And, as a result, the more pro-social the behaviors are. The worse things are, the better people act. But, there’s another adaptive response, which is self-interest. Okay? So, if things are okay—if, you know, if the enemy is not attacking; if there’s no drought; if there’s plenty of food; if everything is fine, then, in evolutionary terms it’s adaptive—your need for the group subsides a little bit—it’s adaptive to attend to your own interests, your own needs; and all of a sudden, you’ve invented the bow and arrow. And all of a sudden you’ve invented the iPhone, whatever. Having the bandwidth and the safety and the space for people to sort of drill deep down into an idea—a religious idea, a philosophical idea, a technological idea—clearly also benefits the human race. So, what you have in our species is this constant toggling back and forth between group interest—selflessness—and individual interest. And individual autonomy. And so, when things are bad, you are way better off investing in the group and forgetting about yourself. When things are good, in some ways you are better off spending that time investing in yourself; and then it toggles back again when things get bad. And so I think in this, in modern society—in a traditional, small-scale tribal society, in the natural world, that toggling back and forth happened continually. There was a dynamic tension between the two that had people winding up more or less in the middle.
I personally experienced this when the COVID situation broke. I usually experience myself as an individual entity, leaning towards disentangling or distancing myself from the groups that I’m a part of and doing cool things on my own (building my own intellectual edifices that bear my own mark, for instance). But in the very early pandemic, I felt much more like a node in a distributed sense-making network, just passing up whatever useful info I could glean. I felt much more strongly like the rationality community was my tribe.
But we modern humans find ourselves in a world where we have more or less abolished scarcity and danger. Consequently, modern people are sort of permanently toggled to the “individual” setting.
The problem with modern society is that we have, for most of the time, for most people, solved the direct physical threats to our survival. So, what you have is people—and again, it’s adaptive: we’re wired for this—attending to their own needs and interests. But not—but almost never getting dragged back into the sort of idea of group concern that is part of our human heritage. And, the irony is that when people are part of a group and doing something essential to a group, it gives an incredible sense of wellbeing.
If we take that sense of community and belonging as a part of human values (and that doesn’t seem like an unreasonable assumption to me), we might say that this part of our values is not contained simply in humans, but rather in the interaction between humans and their environment.
Humans throughout history might have desperately desired the alleviation of the Malthusian conditions that we now enjoy. But having accomplished it, it turns out that we were “pulling against” those circumstances, and that the tension of that pulling-against was actually where at least some of our true values lay.
By removing the obstacles, we made the tension obsolete, and maybe broke something about our values?
I don’t think that this is an intractable problem. It seems like, in principle, it is possible to goal factor the scarcity and the looming specter of death, to find scenarios that are conducive to human community without people actually having to die a lot. I’m sure a superintelligence could figure something out.
But aside from the practicalities, it seems like this points at a broader thing. If you took the Generator out of the GAN, you might not be able to tell what system it was a part of. So if you consider the “values” of the Generator to be “creating good images”, you can’t just look at the Generator. You have to look at not just the broader environment, but specifically the oppressive force that the generator is resisting.
Side note, which is not my main point: I think this also has something to do with what meditation and psychedelics do to people, which was recently up for discussion on Duncan’s Facebook. I bet that meditation is actually a way to repair psych blocks and trauma and what-not. But if you do that enough, and you remove all the psychological constraints...a person might sort of become so relaxed that they become less and less of an agent. I’m a lot less sure of this part.