If an agent with goal G1 acquires sufficient “philosophical ability” that it concludes that goal G is the right goal to have, that means it has decided that the best way to achieve goal G1 is to pursue goal G. For that to happen, I find it unlikely that goal G is anything other than a clarification of goal G1 in light of some confusion revealed by the “philosophical ability”, and I find it extremely unlikely that there is some universal goal G that works for any goal G1.
Offbeat counter: You’re assuming that this ontology that privileges “goals” over e.g. morality is correct. What if it’s not? Are you extremely confident that you’ve carved up reality correctly? (Recall that EU maximizers haven’t been shown to lead to AGI, and that many philosophers who have thought deeply about the matter hold meta-ethical views opposed to your apparent meta-ethics.) I.e., what if your above analysis is not even wrong?
You’re assuming that this ontology that privileges “goals” over e.g. morality is correct.
I don’t believe that goals are ontologically fundamental. I am reasoning (at a high level of abstraction) about the behavior of a physical system designed to pursue a goal. If I understood what you mean by “morality”, I could reason about a physical system designed to use that and likely predict different behaviors than for the physical system designed to pursue a goal, but that doesn’t change my point about what happens with goals.
Recall that EU maximizers haven’t been shown to lead to AGI
I don’t expect EU maximizers to lead to AGI. I expect EU maximizing AGIs, whatever has led to them, to be effective EU maximizers.
Sorry, I meant “ontology” in the information science sense, not the metaphysics sense; I simply meant that you’re conceptually (not necessarily metaphysically) privileging goals. What if you’re wrong to do that? I suppose I’m suggesting that carving out “goals” might be smuggling in conclusions that make you think universal convergence is unlikely. If you conceptually privileged rational morality instead, as many meta-ethicists do, then your conclusions might change, in which case it seems you’d have to be unjustifiably confident in your “goal”-centric conceptualization.
I think I am only “privileging” goals in a weak sense, since by talking about a goal driven agent, I do not deny the possibility of an agent built on anything else, including your “rational morality”, though I don’t know what that is.
Are you arguing that a goal driven agent is impossible? (Note that this is a stronger claim than it being wiser to build some other sort of agent, which would not contradict my reasoning about what a goal driven agent would do.)
(Yeah, the argument would have been something like, given a sufficiently rich and explanatory concept of “agent”, goal-driven agents might not be possible—or more precisely, they aren’t agents insofar as they’re making tradeoffs in favor of local homeostatic-like improvements as opposed to traditionally-rational, complex, normatively loaded decision policies. Or something like that.)
Let me try to strengthen your point. If an agent with goal G1 acquires sufficient “philosophical ability” that it concludes that goal G is the right goal to have, that means that it decided that the best way to achieve goal G1 is to pursue what it thinks is the “right goal to have”. This would require it to take a kind of normative stance on goal fulfillment, which would require it to have normative machinery, which would need to be implemented in the agent’s mind. Is it impossible to create an agent without normative machinery of this kind? Does philosophical ability depend directly on normative machinery?
Let G1 = “Figure out the right goal to have”.