I think what matters is not so much the probability of goal clusters, but something like the expectation of the amount of resources that AIs that have a particular goal cluster have access to. An AI might think that some specific goal cluster only has a 1:1000 chance of occurring anywhere, but if it does then there are probably a million instances of it. I think this is the same as being certain that there are 1,000 (1million/1,000) AIs with that goal cluster. Which seems like enough to ‘dilute’ the preferences of any given AI.
If the universe is pretty big then it seems like it would be pretty easy to get large expectations even with low probabilities. (let me know if I’m not making sense)
The “million instances” is the size of the cluster, and yes, that would impact its weight, but I think it’s arithmetically erroneous to suggest the density matters more than the probability. It depends entirely on what those densities and probabilities are, and you’re just plucking numbers straight out of the air. Why not go the whole hog and suggest a goal cluster that happens nine times out of ten, with a gajillion instances?
I believe the salient questions are:
Do such clusters even exist? Can they be inferred from a poverty of evidence just by thinking about possible agents that may or may not arise in our universe with enough confidence to actually act upon? This boils down to whether, if I’m smart enough, I can sit in an empty room, think “what if...” about examples of something I’ve never seen before from an enormous space of possibilities, and come up with an accurate collection of properties for those things, weighted by probability. There are some things we can do that with and some things we can’t. What category do alien goal systems fall into?
If they do exist, will they be specific enough for an AI to act upon? Even if it does deduce some inscrutable set of alien factors that we can’t make sense of, will they be coherent? Humans care a lot about methods of governance, the moral status of unborn children, and who people should and shouldn’t have sex with, but they don’t agree on these things.
If they do exist, are there going to be many disparate clusters, or will they converge? If they do converge, how relatively far away from the median is humanity? If they’re disparate, are they completely disjoint goals, or are they overlap and/or conflict with each other? More to the point, are they going to overlap and/or conflict with us?
I can’t say how much we’d need to worry about a superintelligent TDT-agent implementing alien goals. That’s a fact about the universe for which I don’t have a lot of evidence However there’s more than enough uncertainty surrounding the question for me to not lose any sleep over it.
I think what matters is not so much the probability of goal clusters, but something like the expectation of the amount of resources that AIs that have a particular goal cluster have access to. An AI might think that some specific goal cluster only has a 1:1000 chance of occurring anywhere, but if it does then there are probably a million instances of it. I think this is the same as being certain that there are 1,000 (1million/1,000) AIs with that goal cluster. Which seems like enough to ‘dilute’ the preferences of any given AI.
If the universe is pretty big then it seems like it would be pretty easy to get large expectations even with low probabilities. (let me know if I’m not making sense)
The “million instances” is the size of the cluster, and yes, that would impact its weight, but I think it’s arithmetically erroneous to suggest the density matters more than the probability. It depends entirely on what those densities and probabilities are, and you’re just plucking numbers straight out of the air. Why not go the whole hog and suggest a goal cluster that happens nine times out of ten, with a gajillion instances?
I believe the salient questions are:
Do such clusters even exist? Can they be inferred from a poverty of evidence just by thinking about possible agents that may or may not arise in our universe with enough confidence to actually act upon? This boils down to whether, if I’m smart enough, I can sit in an empty room, think “what if...” about examples of something I’ve never seen before from an enormous space of possibilities, and come up with an accurate collection of properties for those things, weighted by probability. There are some things we can do that with and some things we can’t. What category do alien goal systems fall into?
If they do exist, will they be specific enough for an AI to act upon? Even if it does deduce some inscrutable set of alien factors that we can’t make sense of, will they be coherent? Humans care a lot about methods of governance, the moral status of unborn children, and who people should and shouldn’t have sex with, but they don’t agree on these things.
If they do exist, are there going to be many disparate clusters, or will they converge? If they do converge, how relatively far away from the median is humanity? If they’re disparate, are they completely disjoint goals, or are they overlap and/or conflict with each other? More to the point, are they going to overlap and/or conflict with us?
I can’t say how much we’d need to worry about a superintelligent TDT-agent implementing alien goals. That’s a fact about the universe for which I don’t have a lot of evidence However there’s more than enough uncertainty surrounding the question for me to not lose any sleep over it.