If humans ever manage to build systems which are properly consequentialist—organizations or automations which are capable of expanding because it is instrumentally useful—we should not expect natural selection to discriminate at all on the basis of those systems’ values.
You seem to be making several more assumptions for your “median future” that you haven’t made explicit here:
1) Humans will soon manage to build such properly consequentialist systems not subject to value drift, before too much further evolution has taken place.
2) We will succeed in imbuing such systems with the altruistic values that we still have at that point.
3) Such properly consequentialist systems will be able to either out-compete other entities that are subject to short-range consequentialism and value drift, or at least survive into the far future in an environment with such competitors.
Have you discussed (or seen good arguments made elsewhere) why these are likely to be the case?
I agree. I argued that values about the long term will dominate in the long term, and I suggested that our current long term values are mostly altruistic. But in the short term (particularly during a transition to machine intelligences) our values could change in important ways, and I didn’t address that.
I expect we’ll handle this (“expect” as in probability >50%, not probability 90%) primarily because we all want the same outcome, and we don’t yet see any obstacles clearly enough to project confidently that the obstacles are too hard to overcome. But like I said, it seems like an important thing to work on, directly or indirectly.
I don’t quite understand your point (3), which seems like it was addressed. A competitor who isn’t able to reason about the future seems like a weak competitor in the long run. It seems like the only way such a competitor can win (again, in the long run) is by securing some irreversible victory like killing everyone else.
I expect we’ll handle this (“expect” as in probability >50%, not probability 90%) primarily because we all want the same outcome, and we don’t yet see any obstacles clearly enough to project confidently that the obstacles are too hard to overcome.
When you say “we all want the same outcome”, do you mean we all want consequentialist systems, with our values and not subject to value drift, to be built before too much evolution has taken place? But many AGI researchers seem to prefer working on “heuristic soup” type designs (which makes sense if those AGI researchers are not themselves “properly consequentialist” and don’t care strongly about long range outcomes).
I don’t quite understand your point (3), which seems like it was addressed.
What I mean is that the kind of value-stable consequentialist that humans can build in the relevant time frame may be too inefficient to survive under competitive pressure from other cognitive/organizational architectures that will exist (even if it can survive as a singleton).