I don’t think the way you split things up into Alpha and Beta quite carves things at the joints. If you take an individual human as Beta, then stuff like “eudaimonia” is in Alpha—it’s a concept in the cultural environment that we get exposed to and sometimes come to value. The vast majority of an individual human’s values are not new abstractions that we develop over the course of our training process (for most people at least).
Basically people tend to value stuff they perceive in the biophysical environment and stuff they learn about through the social environment.
So that reduces the complexity of the problem—it’s not a matter of designing a learning algorithm that both derives and comes to value human abstractions from observations of gas particles or whatever. That’s not what humans do either.
Okay then, why aren’t we star-maximizers or number-of-nation-states maximizers? Obviously it’s not just a matter of learning about the concept; something else has to hook the learned concept up to motivation. The details of how we get values hooked up to an AGI’s motivations will depend on the particular AGI design, but will probably involve reward, prompting, scaffolding, or the like.