There would still exist approximate Solomonoff inductors compressing sense-data, creating meta-self-aware world-representations using the visual system and other modalities (“sight”), optimizing towards certain outcomes in a way that tracks progress using signals integrated with other signals (“happiness”)...
Maybe this isn’t what is meant by “happiness” etc.; I’m not really sure how to define “happiness”. One way to define it would be as whatever plays a specific role in a functionalist theory of mind: there are particular mind designs that have indicators for, e.g., progress up a utility gradient, which are factored into an RL-like optimization system. The fact that we have a system like this is evidence that it’s to some degree a convergent target of evolution, although there likely exist alternative cognitive architectures with no direct analogue, because they use a different set of cognitive organs to fill that role in the system.
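To make the functional role being gestured at here concrete, here is a purely illustrative toy in Python: a reward-prediction-error signal produced by a standard TD(0) learner on a tiny chain environment, then mixed with a second internal signal into one integrated scalar. This is not anyone’s actual model of happiness; the environment, weights, and variable names are invented for the sketch.

```python
import random

# Purely illustrative sketch (not anyone's actual theory of happiness): a
# reward-prediction-error signal from a standard TD(0) learner on a tiny
# chain environment, mixed with a second internal signal into one scalar.
# Environment, weights, and names are invented for the example.

random.seed(0)

N_STATES, GOAL = 6, 5          # states 0..5, reward only on reaching the end
gamma, alpha = 0.9, 0.2        # discount factor and learning rate
V = [0.0] * N_STATES           # value estimates: "how well things are going"
integrated_trace = []          # log of the combined signal

for episode in range(30):
    s = 0
    while s != GOAL:
        s_next = s + random.choice([0, 1])       # noisy progress toward the goal
        r = 1.0 if s_next == GOAL else 0.0
        v_next = 0.0 if s_next == GOAL else V[s_next]

        # TD error: how much better or worse this step went than expected,
        # i.e. the "indicator of progress up a utility gradient".
        td_error = r + gamma * v_next - V[s]

        # Integrate it with some other internal signal into a single scalar.
        other_signal = random.gauss(0.0, 0.05)
        integrated_trace.append(0.8 * td_error + 0.2 * other_signal)

        V[s] += alpha * td_error                 # RL-style update driven by it
        s = s_next

print("learned values:", [round(v, 2) for v in V])
print("mean integrated signal, recent steps:",
      round(sum(integrated_trace[-10:]) / 10, 3))
```

The `td_error` line is the part playing the role the quoted phrase points at; everything else is scaffolding around it.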
One could draw a spectrum whose parameter is the degree to which one believes that mind architectures different from one’s own are valuable. The most egoist point on the spectrum would be believing that only the cognitive system one metaphysically occupies at this very moment is valuable; the least egoist would be a “whatever works” attitude on which any cognitive architecture able to pursue convergent instrumental goals effectively is valuable; intermediate points would be “individualist egoism”, “cultural parochialism”, “humanism”, “terrestrialism”, or “evolutionism”. I’m not really sure how to philosophically resolve value disagreements along this axis, although even granting irreconcilable differences, there are still opportunities to analyze the implied ecosystem of agents and locate trade opportunities.
I think that people who imagine that “tracking progress using signals integrated with other signals” feels anything like happiness feels to them from the inside—while holding onto that imagination and also loudly insisting that it will be very alien happiness or much simpler happiness or whatever—are simply making a mistake-of-fact, and I am just plain skeptical that there is a real values difference that would survive their learning what I know about how minds and qualia work. I of course fully expect that these people will loudly proclaim that I could not possibly know anything they don’t, despite their own confusion about these matters, which they lack the skill to reflect on as confusion, and that they will exchange some wise smiles about those silly people who think that people disagree because of mistakes rather than values differences.
Trade opportunities are unfortunately ruled out by our inability to model those minds well enough that, if some part of them decided to seize an opportunity to Defect, we would’ve seen it coming in the past and counter-Defected. If we Cooperate, we’ll be nothing but CooperateBot, and they, I’m afraid, will be PrudentBot, not FairBot.
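For readers who don’t have the references loaded: CooperateBot, FairBot, and PrudentBot are the modal agents from the “Robust Cooperation in the Prisoner’s Dilemma” work (CooperateBot always cooperates; FairBot cooperates iff it can prove you cooperate with it; PrudentBot additionally demands a proof, in PA+1, that you defect against DefectBot). Here is a small toy evaluation of how the pairings shake out, using the usual trick of treating “provable” as “true at every lower Kripke level”; the agent definitions follow the paper as I understand it, but the code itself is just my own illustration, not anything from the original authors.

```python
from functools import lru_cache

# Toy evaluation of the modal agents from the Robust Cooperation paper.
# "Provable" (the modal box) is approximated the standard way: a statement is
# provable at Kripke level k iff it holds at every level below k. For agents
# this simple, the answers stabilize after a few levels, and the stable value
# is the actual outcome of the matchup. Illustration only.

LEVELS = 5  # comfortably more than the modal depth of any agent below


def box(pred, level):
    """'It is provable that pred' at this level: pred holds at all lower levels."""
    return all(pred(k) for k in range(level))


@lru_cache(maxsize=None)
def coop(agent, opponent, level):
    """Does `agent` cooperate with `opponent`, evaluated at Kripke `level`?"""
    if agent == "CooperateBot":
        return True
    if agent == "DefectBot":
        return False
    if agent == "FairBot":
        # Cooperate iff it's provable that the opponent cooperates with me.
        return box(lambda k: coop(opponent, "FairBot", k), level)
    if agent == "PrudentBot":
        # Cooperate iff provably the opponent cooperates with me, AND provably
        # in PA+1 (hence the box-of-False disjunct) it defects against DefectBot.
        nice_to_me = box(lambda k: coop(opponent, "PrudentBot", k), level)
        not_a_sucker = box(
            lambda k: box(lambda j: False, k) or not coop(opponent, "DefectBot", k),
            level,
        )
        return nice_to_me and not_a_sucker
    raise ValueError(agent)


for a in ["CooperateBot", "FairBot", "PrudentBot"]:
    for b in ["CooperateBot", "FairBot", "PrudentBot"]:
        moves = "".join(
            "C" if c else "D" for c in (coop(a, b, LEVELS), coop(b, a, LEVELS))
        )
        print(f"{a:>12} vs {b:<12} -> {moves}")
```

Running it prints mutual cooperation for FairBot with CooperateBot, FairBot, and PrudentBot, and for PrudentBot with FairBot and itself, but PrudentBot defects against CooperateBot. That asymmetry is the point being leaned on above: an unconditional cooperator facing PrudentBot simply gets exploited, where FairBot would have cooperated with it.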
This shouldn’t matter for anyone besides me, but there’s something personally heartbreaking about seeing the one bit of research for which I feel comfortable claiming a fraction of a point of dignity, being mentioned validly to argue why decision theory won’t save us.
(Modal bargaining agents didn’t turn out to be helpful, but given the state of knowledge at that time, it was worth doing.)
Sorry.
It would be dying with a lot less dignity if everyone on Earth—not just the managers of the AGI company making the decision to kill us—thought that all you needed to do was be CooperateBot, and had no words for any sharper concepts than that. Thank you for that, Patrick.
But sorry anyways.
To clarify, do you mean “mistake-of-fact” in the sense that the same people would perhaps use for other high-level concepts? Because at low enough resolution, happiness is like “tracking progress using signals integrated with other signals”, and so it is at least not inconsistent to preserve this part of your utility function at such low resolution.