That would only prove that you think you want to do that.
Isn’t that when I throw up my hands and say, “congratulations, your hypothesis is unfalsifiable, the dragon is permeable to flour”? What experimental setup would you suggest? Would you say any statement about one’s preferences is moot? It seems that we’re always under bounded thinking-time constraints. Maybe the paperclipper really wants to help humankind and be moral, and mistakenly thinks otherwise. Who would know? It optimized its own actions under resource constraints, and then there’s the ‘Löbstacle’ to consider.
Is saying “I like vanilla ice cream” FAI-complete, something that must never be uttered or relied upon by anyone?
it’s fairly common for amateur philosophers to argue themselves into thinking they “should” be perfectly selfish egoists, or hedonistic utilitarians, because logic or rationality demands it
Or argue themselves into thinking that there is some subset of preferences that every other (human?) agent should voluntarily choose to adopt, against their better judgment (edit: i.e., against what they perceive, after thorough introspection, to be their own preferences)? You can add “objective moralists” to the list.
What could be present in every single human’s brain architecture, throughout human history, that would be compatible with some fixed ordering over actions called “morally good”? (Any human lacking it would be an immediate counterexample.) The notion seems so obviously ill-defined and misguided (hence my first comment asking Cousin_It).
It’s fine (to me) to espouse preferences that aim to change other humans (say, towards being more altruistic, or towards being less altruistic, or whatever), but appealing to some objective guiding principle based on “human nature” (which constantly evolves in different strands) or on some nice-sounding ev-psych applause-light is just a new substitute for the good old Abrahamic heavenly father.
Would you say any statement about one’s preferences is moot? It seems that we’re always under bounded thinking-time constraints. Maybe the paperclipper really wants to help humankind and be moral, and mistakenly thinks otherwise. Who would know? It optimized its own actions under resource constraints, and then there’s the ‘Löbstacle’ to consider.
Is saying “I like vanilla ice cream” FAI-complete, something that must never be uttered or relied upon by anyone?
I wouldn’t say any of those things. Obviously paperclippers don’t “really want to help humankind”, because they don’t have any human notion of morality built-in in the first place. Statements like “I like vanilla ice cream” are also more trustworthy on account of being a function of directly observable things like how you feel when you eat it.
The only point I’m trying to make here is that it is possible to be mistaken about your own utility function. It’s entirely consistent for the vast majority of humans to have a large shared portion of their built-in utility function (built-in by their genes), even though many of them seemingly want to do bad things, and that’s because humans are easily confused and not automatically self-aware.
It is possible to be mistaken about your own utility function.
For sure.
It’s entirely consistent for the vast majority of humans to have a large shared portion of their built-in utility function (built-in by their genes), even though many of them seemingly want to do bad things
I’d agree if humans were like dishwashers. There are templates for dishwashers, ways they are supposed to work. If you came across a broken dishwasher, there would be a case for repairing it, for bringing it back to “what it’s supposed to be”.
However, that is because there is some external authority (exasperated humans who want to fix their damn dishwasher; dirty dishes are piling up) conceiving of and enforcing such a purpose. The fact that genes and the environment shape utility functions in similar ways is a description, not a prescription. It would give no “broken” human a reason to go back to “what his genes would want him to be doing”, just as it is no argument against brain uploading.
Some of this discussion sounds to me like saying that “deep down in every flawed human there is ‘a figure of light’, or, in our community’s terms, ‘a rational agent following uniform human values, with slight deviations to account for ice-cream taste’, and we just need to dig it up”. There is only your brain, with its values. There is no external standard by which to call those values flawed. There are external standards (rationality = winning) for improving its epistemic and instrumental rationality, but those can help the serial killer and the GiveWell activist equally. (Also, both of those can be ‘mistaken’ about their values.)