Actually, Mike Travers has a whole sequence of excellent posts on ascribing agency to non-human systems over at Ribbonfarm. See here. I particularly recommend the post Patterns of Refactored Agency.
I don’t think ascribing agency to systems like institutions and collections of institutions is too forced. In fact, institutions seem to exist precisely for preserving and propagating values in the face of changing individuals.
I’m completely fine with ascribing agency to institutions. I’m not fine with sticking in emotionally-loaded terms and implying that e.g. AI researchers should work on fixing the financial system.
But I don’t think that his point is that AI researchers, in general, should be working on fixing the financial system.
I think his point is that the people at MIRI have chosen AI research because they think that AI is a significant source of threat to human well-being/existence from non-human value systems (possibly generated by humans). His claim seems to be that AI may only be a very small part of the problem. Instead, there already exist non-human value systems generated by humans threatening human well-being/existence and we don’t know how to fix that.
So, I guess the counter-argument from someone at MIRI would go something like: “while it is true that human institutions can threaten human well-being, no human institution seems to have the power in the near future to threaten human existence. But the technology of self-improving AI can FOOM and threaten human existence. Thus, we choose to work on preventing this outcome.”
Instead, there already exist non-human value systems generated by humans threatening human well-being/existence and we don’t know how to fix that.
First, I am unaware of evidence (though I am aware of a lot of loud screaming) that human institutions pose an existential risk to humanity. I think the closest we come to that is the capability of the US and Russia to launch an all-out nuclear exchange.
Second, the whole notion of “non-human value systems” is much too fuzzy for my liking. Is self-preservation a human value? Let’s take an entity, say a large department within a governmental bureaucracy, the major values of which are self-preservation and the accrual of benefits (of various kinds) to its leadership. Is that a “non-human value system”? Should we call it “hostile AI” and be worried about it?
Third, the global financial system (or “industrial capitalism”) is not an institution. It’s an ecosystem where many different entities coexist, fight, live, and die. I am not sure ecosystems have agency.
Fourth, it looks to me like his argument would shortcut to either a revolution or more malaria nets.
OK, fine, unfriendly AIs occupy only a small part of the space of possible non-human agents arising from human action and having value systems different from ours and enough power to do a lot of harm as a result; and businesses and nations and so forth are other possible examples.
Furthermore, non-human agents arising [etc.] occupy only a small part of the space of Bad Things.
It doesn’t follow from the latter that people investigating how to arrange for businesses and nations and whatnot to do good rather than harm are making a mistake; and it doesn’t follow from the former that people investigating how to arrange for superhuman AIs (if and when they show up) to do good rather than harm are making a mistake.
Why not? Because in each case the more restricted class of entities has particular features that are (hopefully) amenable to particular kinds of study, and that (we fear) pose particular kinds of threat.
A large and important fraction of AI-space is occupied by entities with the following interesting features. They are created deliberately by human researchers; they operate according to clear and explicit (but perhaps monstrously complex) principles; their behaviour is, accordingly, in principle amenable to quite rigorous (but perhaps intractably difficult) analysis. Businesses and nations and religions and sports clubs don’t have these features, and there’s some hope of developing ways of understanding and/or controlling AIs that don’t apply to those other entities.
It is possible (very likely, according to some) that a large fraction of the probability of a superhuman AI turning up in the nearish future comes from scenarios in which the AI goes from being distinctly subhuman and no threat to anyone, to being vastly superhuman and potentially controlling everything that happens on earth, in so short a time that it’s not feasible for anyone (including businesses, nations, etc.) to stop it. Businesses and nations and religions and sports clubs mostly have [EDIT: oops, I meant “don’t have”] this feature (though you might argue that nuclear war is a bit like a Bad Singularity in some respects), and there might accordingly be a need for much tighter control over potentially superhuman AIs than over businesses and nations and the like.
So one can agree that there are interesting analogies between the danger from unfriendly superhuman AI and the danger from an out-of-control financial system / government / business cartel / religion / whatever, while also thinking that a bunch of people whose interests and expertise lie in the domain of software engineering and pure mathematics might be more effectively used by having them concentrate on AI rather than the financial system.
(There might also be a need for experts in software engineering and pure mathematics to help make the financial system safer—by keeping an eye on the potential for runaway algorithmic trading systems at banks and hedge funds, for instance. But that’s not what Mike Travers is talking about, and actually it’s not a million miles away from friendly AI work—though probably a lot easier because the systems involved are simpler and more limited in power.)
Upvoted, but I think you’re missing a negation in “Businesses and nations and religions and sports clubs mostly have this feature...”.
Yup, I was. Edited. Thanks!
[EDITED to fix an inconsequential thinko.]
Excellent summary. Thanks.