A couple of points. In certain limited areas of human international relations, we do know what the decision-making process behind major decisions was, such as the drafting and signing of the Geneva Conventions, some of humanity's best and most successful policy documents. This is because the UK and US decision-making processes were very well documented, archived, and have since been declassified.
In other words, we can actually deconstruct human decision-making principles (probably conforming to instrumentalism in the case of the Geneva Conventions) and then use them to ground or verify other principles.
Coherent diamonds are more likely to be obtained from humanity's utopian-esque sci-fi than from anywhere else. This is because decision-making in international relations (which is what is at stake here, considering that one country will develop the AGI first and that an agential ASI may well be a sovereign agent) is transparent in plots but usually opaque in real life. Finding diamonds can, and likely must, be approached in a modular fashion, so as to coherently construct Yudkowsky's friendly AI via coherent extrapolated volition, with multiple options transparent to human decision makers.
So, for example, if we want humanity to be served by an AGI/ASI in line with Asimov's vision of an AGI Nanny overseeing humanity's development, we deconstruct his Multivac stories, determine what principles Multivac runs on, and then express them as programmable principles. If we want a certain problem approached via the Star Trek: TNG ethos, then we need to break down what ethical principles are in play in a way we can program into a nascent AGI. At a higher level of technology, and perhaps a more ruthless level of interstellar relations, the ethics module of Banks' Culture series kicks in, and so on.
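To make the "modular" point concrete, here is a minimal sketch (purely illustrative; every module name and principle below is a placeholder of my own, not an actual deconstruction of the source texts) of how such sci-fi-derived ethos modules might be represented so that the choice among them stays transparent and auditable for human decision makers:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EthosModule:
    """A bundle of programmable principles derived from one source corpus."""
    name: str
    source_corpus: str
    principles: tuple[str, ...]  # ordered, human-readable constraints

# Illustrative modules only; the real deconstruction work is the hard part.
MODULE_LIBRARY = {
    "multivac_nanny": EthosModule(
        name="multivac_nanny",
        source_corpus="Asimov's Multivac stories",
        principles=(
            "minimise aggregate harm to humanity",
            "defer irreversible decisions to human oversight",
        ),
    ),
    "tng_federation": EthosModule(
        name="tng_federation",
        source_corpus="Star Trek: TNG",
        principles=(
            "non-interference with developing societies",
            "prefer negotiated, rights-respecting outcomes",
        ),
    ),
    "banks_culture": EthosModule(
        name="banks_culture",
        source_corpus="Banks' Culture series",
        principles=(
            "post-scarcity provision for all sentients",
            "intervene only via minimally coercive means",
        ),
    ),
}

def select_modules(problem: str, chosen: list[str]) -> list[EthosModule]:
    """Return the modules human decision makers explicitly chose for this problem,
    logging the selection so it stays transparent to those decision makers."""
    selection = [MODULE_LIBRARY[name] for name in chosen]
    print(f"Problem: {problem}")
    for mod in selection:
        print(f"  using {mod.name} ({mod.source_corpus}): {', '.join(mod.principles)}")
    return selection

if __name__ == "__main__":
    select_modules(
        "negotiation between a nascent AGI and a sovereign state",
        ["multivac_nanny", "tng_federation"],
    )
```

The data structure is trivial by design: the claim is not that ethics reduces to a lookup table, but that whichever principles are extracted should be explicit, named, and selectable, rather than buried in a training process no decision maker can inspect.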
This argument is expounded in Carayannis, E.G., Draper, J. Optimising peace through a Universal Global Peace Treaty to constrain the risk of war from a militarised artificial superintelligence. AI & Soc (2022). https://doi.org/10.1007/s00146-021-01382-y