All right! Thank you for the clarification.
Indeed, the altruistic part seems interestingly close to a broad ‘world empowerment’, but I have some doubts about a few elements surrounding this: “the short term component of utility is the easiest to learn via obvious methods”
It could be true, but there are worries that it might be hard, so I’m trying to find a way to resolve this.
If the rule/policy for choosing the utility function is a preference based on a model of humans/agents, then there might be ways to circumvent or miss what we would truly prefer, because the model underfits reality: the pull of maximization would push past the limited sharpness/completeness of the model, and the divergence would keep growing as the model updates along with the transformations performed by the AI.
In practice this would allow a sort of intrusion of the AI into agents to force mutations.
So intrusion could be instrumental.
Which is why I want to escape the ‘trap of modelling’ even further by indirectly targeting our preferences through a primal goal of non-myopic optionality (even more externally focused) before guessing utility.
If your #2 is a lesser concern, then indeed those worries aren’t as meaningful.
I’m also trying to avoid us becoming grabby aliens, but if altruism is naturally derived from a broad world empowerment, then it could be functional, because the features of the combination of worldwide utilities (empower all agencies) *are* altruism, sufficiently so to generalize in the ‘latent space of altruism’, which implies being careful about what you do to other planets.
The maximizer worry would also be tamed by design.
And in fact my focus on optionality would essentially be the same as a worldwide agency concern (but I’m thinking of a universal agency, to completely erase the maximizer issue).