The implied argument that “we cannot prove X, therefore X cannot be true or false” is not logically valid. I mentioned this recently when Caspar made a similar argument.
I think it is true, however, that humans do not have utility functions. I would not describe that by saying that humans are not rational; on the contrary, I think pursuing a utility function is the irrational thing.
In practice, “humans don’t have values” and “humans have values, but we can never know what they are” are not meaningfully different.
I also wouldn’t get too hung up on utility functions; a utility function just means that the values don’t go wrong when an agent tries to be consistent and avoid money pumps. If we want to describe human values, we need to find values that don’t go crazy when transformed into utility functions.
That seems misguided. If you want to describe human values, you need to describe them as you find them, not as you would like them to be.
I would add that values are probably not actually existing objects, but just useful ways to describe human behaviour. Thinking that they actually exist is the mind projection fallacy.
In the world of facts we have human actions, human claims about those actions, and some electric potentials inside human brains. It is useful to say that a person has some set of values in order to predict his behaviour or to punish him, but that doesn’t mean that anything inside his brain is “values”.
If we start to think that values actually exist, we run into all the problems of finding them, defining them, and copying them into an AI.
The problem with your “in practice” argument is that it would similarly imply that we can never know if someone is bald, since it is impossible to give a definition of baldness that rigidly separates bald people from non-bald people while respecting what we mean by the word. But in practice we can know that a particular person is bald despite the absence of that rigid definition. In the same way a particular person can know that he went to the store to buy milk, even if it is theoretically possible to explain what he did by saying that he has an abhorrence of milk and did it for totally different reasons.
Likewise, in practice we can avoid money pumps by avoiding them when they come up in practice. We don’t need to formulate principles which will guarantee that we will avoid them.
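To make the money-pump point concrete, here is a minimal sketch (the items, the fee, and the function name are illustrative assumptions, not anything from the discussion): an agent with cyclic preferences, one that prefers A to B, B to C, and C to A, will pay a small fee for every swap it regards as an upgrade, so a trader can walk it around the cycle and collect fees indefinitely.

```python
# Minimal money-pump sketch: an agent with cyclic (intransitive) preferences,
# preferring A to B, B to C, and C to A, pays a small fee for every swap it
# regards as an upgrade. Cycling it through the three items drains its money
# while leaving it holding the item it started with.

FEE = 1  # hypothetical amount the agent will pay for any preferred swap

# For each item, the item this agent strictly prefers to it.
preferred_upgrade = {"B": "A", "C": "B", "A": "C"}

def run_money_pump(item: str, money: int, trades: int) -> tuple[str, int]:
    for _ in range(trades):
        # The agent accepts each offer: its current item plus the fee,
        # in exchange for an item it strictly prefers.
        item, money = preferred_upgrade[item], money - FEE
    return item, money

print(run_money_pump("A", money=10, trades=9))  # -> ('A', 1): same item, less money
```

A transitive preference ordering, the kind a utility function represents, admits no such cycle; that is the sense in which consistent values “don’t go wrong” here.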
A person with less than 6% hair is bald. A person with 6–15% hair might be bald, but it is unknowable due to the nature of natural language. A person with 15–100% hair is not bald.
We can’t always say whether someone is bald, but more often than not we can. Baldness remains an applicable concept.
We can give similar answers about people’s intentions.
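As a toy sketch of that sort of three-way verdict (the 6% and 15% cut-offs are the rough figures from the comment above, and the function name is just an illustration):

```python
# Toy three-valued classifier for a vague predicate: clear cases on either
# side of a borderline band where natural language does not settle the answer.

def bald_verdict(hair_fraction: float) -> str:
    if hair_fraction < 0.06:
        return "bald"
    if hair_fraction <= 0.15:
        return "borderline"  # unknowable zone: no precise fact to recover
    return "not bald"

print(bald_verdict(0.02), bald_verdict(0.10), bald_verdict(0.60))
# -> bald borderline not bald
```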
Yes. Isn’t this fascinating? What is going on in human minds such that we can say things not only about our own values and rationality, but about those of other humans? And can we copy that into an AI somehow?
That will be the subject of subsequent posts.
What about a situation where a person says, and thinks, that he is going to buy milk, but actually buys milk plus some sweets? And does this often, but does not acknowledge his obsessive-compulsive behaviour towards sweets?
They don’t have to acknowledge obsessive-compulsive behavior. Obviously they want both milk and sweets, even if they don’t notice wanting the sweets. That doesn’t prevent other people from noticing it.
Also, they may be lying, since they might think that liking sweets is low status.