The discussion of human preferences has to be anthropomorphic, because human preferences are human. Phil is not anthropomorphizing the AI; he’s anthropomorphizing the humans it serves, which is OK.
FAI doesn’t serve humans. It serves human preference, which is an altogether different kind of thing, and not even something humans have had experience with.
An analogy: the atomic structure of a spoon is not spoon-morphic just because the atomic structure of a spoon is of a spoon.
I disagree that humans have no experience with human preference. It is true that “formal preference” is not identical to the verbalized statements that people pragmatically pursue in their lives, but I also think that the difference between the two is somewhat bounded by various factors, including the bound of personal identity: if you diverge so much from what you are today, you are effectively dead.
Formal preference and verbalized preference are completely different kinds of objects, with almost nothing in common. Verbalized preference talks about natural categories, clusters of situations found in actual human experience. You don’t have any verbalized preference about novel configurations of atoms that can’t be seen as instances of the usual things, modified by usual verbs. Formal preference, on the other hand, talks about all possible configurations of matter.
Personal identity, as I discussed earlier, is a referent in the world sought by our moral intuition, a concept in terms of which a significant part of our moral intuition is implemented. When the me-in-the-future concept fails to find a referent, this is a failure of verbalized preference, not formal preference. You’ll get a gap on your map, an inability to estimate the moral worth of future situations that lack you-in-the-future in many important respects. But this gap on the map doesn’t correspond to a gap on the moral territory, to these configurations automatically having equal or no moral worth.
This is also a point of difference between formal preference and verbalized preference: formal preference refers to a definition that determines the truth about the moral worth of all situations, that establishes the moral territory, while verbalized preference is merely a non-rigorous human-level attempt to glimpse the shape of this territory. Of course, even in a FAI, formal preference doesn’t make it possible to get all the answers, but it is the criterion for the truth of the imperfect answers that the FAI will be able to find.
The following sounds worryingly like moral realism:
formal preference refers to a definition that determines the truth about the moral worth of all situations, that establishes the moral territory, while verbalized preference is merely a non-rigorous human-level attempt to glimpse the shape of this territory
Of course if you meant it in an antirealist sense, then
while verbalized preference is merely a non-rigorous human-level attempt to glimpse the shape of this territory
is problematic, because the map (verbalized preference) contributes causally to determining the territory, because (the brainware that creates) your verbalized preferences determines (in part) your formal preference.
Compare with your beliefs being implemented as patterns in your brain, which is a part of the territory. The fact that your beliefs are a certain way, apart from their meaning and the truth of that meaning, is a truth in its own right, but this doesn’t shatter the conceptual framework of the map-territory distinction. You just need to be careful about which subject matter you are currently considering.
I don’t know what you read into the realist/anti-realist distinction; for me, there is a subject matter, and a truth of that subject matter, in all questions. The questions of how that subject matter came to be established, in what way it is being considered, and who considers it, are irrelevant to the correctness of statements about the subject matter itself. Here, we consider “formal preference”. How it is defined, and whether the way it came to be defined was influenced by verbalized preference, is irrelevant to what it actually asserts, once it’s established what we are talking about.
If I consider “the program that was written in file1.c this morning”, this subject matter doesn’t change if the file was renamed in the afternoon, or was deleted without anyone knowing its contents, or modified, even perhaps self-modified by compiling and running the program determined by that file. The fact of “contents of file1.c” is a trajectory, a history of change, but it’s a fact separate from “contents of file1.c this morning”, and neither fact can be changed, though the former fact (the trajectory of change of the content of the file) can be determined by one’s actions, through doing something with the file.
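To make the file analogy concrete, here is a minimal Python sketch, offered only as an illustration. The `Snapshot` and `FileHistory` names, the time labels, and the C one-liners are all made up for this example. A snapshot is a frozen record, and the history is an append-only list of snapshots, so later actions extend the trajectory without touching the morning fact.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Snapshot:
    """The state of the file at one moment. Frozen: it cannot be modified."""
    time: str
    name: str
    contents: Optional[str]  # None stands for "the file was deleted"


@dataclass
class FileHistory:
    """The trajectory of the file: an append-only list of snapshots."""
    snapshots: List[Snapshot] = field(default_factory=list)

    def record(self, time: str, name: str, contents: Optional[str]) -> None:
        # Acting on the file extends the trajectory; it never rewrites it.
        self.snapshots.append(Snapshot(time, name, contents))

    def at(self, time: str) -> Snapshot:
        # Look up the snapshot recorded under the given time label.
        return next(s for s in self.snapshots if s.time == time)


history = FileHistory()
history.record("morning", "file1.c", "int main(void) { return 0; }")
morning = history.at("morning")

# Later actions: the file is renamed and modified, then deleted.
history.record("afternoon", "renamed.c", "int main(void) { return 1; }")
history.record("evening", "renamed.c", None)

# The subject matter "the program in file1.c this morning" is unchanged.
assert history.at("morning") == morning
```

What one’s actions determine here is which snapshots get appended, that is, the shape of the trajectory; the snapshot labelled "morning" stays what it was.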
verbalized preference is merely a non-rigorous human-level attempt to glimpse the shape of this territory
I think that one should add that verbalized preference is both an attempt to glimpse the “territory” that is formalized preference, and the thing that causes formalized preferences to exist in the first place. It is at one and the same time a map and a foundation.
On the other hand, your beliefs are only a map of how the world is. The world remains even if you don’t have beliefs about it.
I think that one should add that verbalized preference is both an attempt to glimpse the “territory” that is formalized preference, and the thing that causes formalized preferences to exist in the first place. It is at one and the same time a map and a foundation.
Verbalized preferences don’t specify formal preference. People use their formal preference through arriving at moral intuitions in specific situations they understand, and can form verbalized preferences as heuristic rules describing what kinds of moral intuitions are observed to appear upon considering what situations. Verbalized preferences are plain and simple summaries of observations, a common-sense understanding of the hidden territory of the machinery in the brain that produces the moral intuition in an opaque manner. While verbalized preferences are able to capture the important dimensions of what formal preference is, they no more determine formal preference than Newton’s laws, as written in a textbook, determine the way the real world operates. They merely describe.
EDIT: And the conclusion to draw from this is that we can use our axiological intuitions to predict our formal preferences, in certain cases. In some cases, we might even predict perfectly: if you verbalize that you have some very simple preference, such as “I want there to be a brick on this table, and that’s all I want”, then your formal preference is just that.
Human preferences are too big and unwieldy to predict this simply. We each have many, many preferences, and they interact with each other in complex ways. But I still claim that we can make educated guesses. As I said, verbalized preferences are the foundation for formal preferences. A foundation does not, in a simple way, determine the building on it. But if you see a 12-foot by 12-foot foundation, you can probably guess that the building on top of it is not going to be the Eiffel Tower.
This is closer. Still, verbalized preference is observation, not the reality of formal preference itself. The goal of preference theory is basically to come up with a better experimental set-up than moral intuition for studying formal preference. This is like moving on from studying physics by observing natural phenomena with the naked eye to lab experiments with rulers, clocks, microscopes and so on. Moral intuition, as experienced by humans, is too fuzzy and limited an experimental apparatus, even if you use it to observe the outcomes of carefully constructed experiments.
On the other hand, your beliefs are only a map of how the world is. The world remains even if you don’t have beliefs about it.
Your beliefs shape the world though, if you allow high-level concepts to affect low-level ones. Aside from being made of the world in the first place, your actions will follow in part from your beliefs. If you don’t allow high-level concepts to affect low-level ones, then verbalized preference does not cause formalized preferences to exist.