This is deserving of a much longer answer which I have not had the time to write and probably won’t any time soon, I’m sorry to say. But in short summary human drives and morals are more behaviorist that utilitarian. The utility function approximation is just that, an approximation.
Imagine you have a shovel, and while digging you hit a large rock and the handle breaks. What that shovel designed to break, in sense that its purpose was to break? No, shovels are designed to dig holes. Breakage, for the most part, is just an unintended side-effect of the materials used. Now in some cases things are intended to fail early for safety reasons, e,g, to have the shovel break before your bones will. But even then this isn’t some underlying root purpose. The purpose of the shovel is still to dig holes. The breakage is more a secondary consideration to prevent undesirable side effects in some failure modes.
Does learning that the shovel breaks when it exceeds normal digging stresses tell you anything about the purpose / utility function of the shovel? Pedantically, a little bit if you accept the breaking point being a designed-in safety consideration. But it doesn’t enlighten us about the hole digging nature at all.
Would you rather put dust in the eyes of 3^^^3 people, or torture one individual to death? Would you rather push one person onto the trolley tracks to save five others? These are failure mode analysis of edge cases. The real answer is I’d rather have dust in no one’s eyes and nobody tortured, and nobody hit by trolleys. Making an arbitrary what-if tradeoff between these scenarios doesn’t tell us much about our underlying desires because there isn’t some consistent mathematical utility function underlying our responses. At best it just reveals how we’ve been wired by genetics and upbringing and present environment to prioritize our behaviorist responses. Which is interesting, to be sure. But not very informative, to be honest.
This is deserving of a much longer answer which I have not had the time to write and probably won’t any time soon, I’m sorry to say. But in short summary human drives and morals are more behaviorist that utilitarian. The utility function approximation is just that, an approximation.
Imagine you have a shovel, and while digging you hit a large rock and the handle breaks. What that shovel designed to break, in sense that its purpose was to break? No, shovels are designed to dig holes. Breakage, for the most part, is just an unintended side-effect of the materials used. Now in some cases things are intended to fail early for safety reasons, e,g, to have the shovel break before your bones will. But even then this isn’t some underlying root purpose. The purpose of the shovel is still to dig holes. The breakage is more a secondary consideration to prevent undesirable side effects in some failure modes.
Does learning that the shovel breaks when it exceeds normal digging stresses tell you anything about the purpose / utility function of the shovel? Pedantically, a little bit if you accept the breaking point being a designed-in safety consideration. But it doesn’t enlighten us about the hole digging nature at all.
Would you rather put dust in the eyes of 3^^^3 people, or torture one individual to death? Would you rather push one person onto the trolley tracks to save five others? These are failure mode analysis of edge cases. The real answer is I’d rather have dust in no one’s eyes and nobody tortured, and nobody hit by trolleys. Making an arbitrary what-if tradeoff between these scenarios doesn’t tell us much about our underlying desires because there isn’t some consistent mathematical utility function underlying our responses. At best it just reveals how we’ve been wired by genetics and upbringing and present environment to prioritize our behaviorist responses. Which is interesting, to be sure. But not very informative, to be honest.