I’m pretty new to this, although I’ve read Kurzweil’s book and Bostrom’s Superintelligence, plus a couple of years’ worth of mostly lurking on LW, so if there’s already a shitload of thinking about this I hope to be corrected civilly.
If friendly AI is to be not just a substitute for but our guardian against unfriendly AI, won’t we end up thinking of all sorts of unfriendly AI tactics and putting them into the friendly AI so it can anticipate and thwart them? If so, is there any chance of self-modification in the friendly AI turning all that against us? Ultimately, we’d count on the friendly AI itself trying to imagine and develop countermeasures against unfriendly AI tactics that are beyond our imagination, but then maybe the same problem arises there too.
I’ve been pondering for some time, especially prompted by the book Boyd: The Fighter Pilot Who Changed the Art of War by Robert Coram, how one might distinguish qualities of possible knowledge that make it more or less likely to be of general benefit to humankind. Conflict knowledge seems to have a general problem. It is often developed under the optimistic assumption that it will give “us”, who are well-intentioned, the ability to make everybody else behave—or, what is close to the same thing, it is developed under existential threat such that it is difficult to think a few years out: we need it or the evil ones will annihilate us. Hence the US and the atom bomb from 1945-49. Note that this kind of situation also motivates some (who have anticipated where I’m going) to insist, “We have this advantage today—we probably won’t have it a few years from now—let’s maximize our advantage while we have it” (i.e. bomb the hell out of the USSR in 1946).
Another kind of knowledge might be called value-added knowledge—knowledge that disproves assumptions about economics being a zero-sum game: better agriculture, house construction, health measures… One can always come up with counterexamples, and some are quite non-trivial—the Internet facilitates the formation of terrorist groups and other “echo chambers” of people with destructive or otherwise non-benign belief systems. Maybe media development indeed falls in some middle ground between value-added and conflict-oriented knowledge. Almost anything that can be considered “beneficial to humankind” might just advantage one supremely evil person, but I still think we can meaningfully speak of its general tendency to be beneficial, while the tendency of conflict knowledge seems, in the long run, to be neutral at best.
Boyd, while developing a radically new philosophy of war-fighting, got few rewards in the way of promotion and was always embattled in the military establishment, but he collected around him a few strong acolytes and did really, if inadequately, affect the design of fighter planes and their tactics. His thinking grew more and more ambitious until it embraced the art of war generally, and Coram strongly suggests he was at the side of the planners of the first Gulf War and had a huge impact on how that war was waged.
Unfortunately, by the time the book was being written several years later, there was speculation that groups like Al Qaeda had incorporated some of the lessons of the new warfare doctrines developed by the US.
It is generally problematic to predict where knowledge construction is going, because by definition we are making predictions about stuff whose nature we don’t understand, since it hasn’t been thought up yet. Yet it seems we had better try, and Moore’s law gives one bit of encouragement: it is a trend in knowledge production that has held up reasonably well even though nobody could predict the specific inventions behind it. MIRI seems to be in part a huge exercise in this problematic sort of thinking.
If I have anything to offer beyond “food for thought”, it may be the suggestion to look for general tendencies (perhaps unprovable tendencies, like Moore’s law) in the way different kinds of knowledge affect conflict.