I’m glad to see a post on alignment asking about the definition of human values. I propose the following conundrum. Let’s suppose that humans, if asked, say they value a peaceful, stable society. I accept the assumption that the human mind contains one or more utility optimizers. But I point out that those optimizers likely operate at the individual, family, or local-group level, while the stated “value” concerns society at large. So humans are probably not “optimizing” at the same scope as they “value”.
This leads to game-theory problems, such as the last turn problem and the notorious instability of cooperation around public goods (the commons). According to the theory of cliodynamics put forward by Turchin et al., utility maximization by subsets of society leads to the implementation of wealth pumps that produce inequality, and to excess reproduction among elites, which leads to elite competition in a cyclic pattern. A historical database of over a hundred such cycles, drawn from many regions and eras, suggests that roughly every other cycle becomes violent, or at least very destructive, about 90% of the time, while elites cooperate to reduce their own numbers and turn off the wealth pump less than 10% of the time.
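To spell out the last turn problem concretely, here is a minimal sketch of my own (the payoff numbers and the ten-round horizon are arbitrary, chosen only for illustration): in a finitely repeated prisoner’s dilemma, defection dominates in the final round, and because that fixes the continuation, the same argument unravels cooperation all the way back to round one.

```python
# Toy backward-induction sketch of the last turn problem.
# Payoffs and the 10-round horizon are arbitrary illustrations.

# Row player's one-shot payoffs: (my move, their move) -> payoff
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def defection_dominates():
    """Defection pays strictly more than cooperation whatever the opponent does."""
    return all(PAYOFF[("D", other)] > PAYOFF[("C", other)] for other in "CD")

def backward_induction(rounds):
    """Reason from the last round toward the first. The final round is a
    one-shot game, so the dominant move (defect) is played. That fixes the
    continuation regardless of earlier play, so the same dominance argument
    applies one round earlier, and so on back to round 1."""
    play = []
    for _ in range(rounds, 0, -1):
        # Later rounds' payoffs no longer depend on this round's moves,
        # so only stage-game dominance matters here.
        play.append("D" if defection_dominates() else "C")
    return list(reversed(play))

print(backward_induction(10))  # ['D', 'D', ..., 'D']: cooperation never gets started
```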
I add the assumption that there is nothing special about humans: any entities (AI or extraterrestrial) that align with the value goals and optimization scopes described above will produce similar results. Game-theoretic mathematics says nothing about evolutionary history and takes no account of species preferences, after all, because it does not seem to need to. Even social insects, presumably optimizing at much larger (though still not global) scopes, fall victim to large-scale cyclic wars (I’m thinking of ants here).
So is alignment even a desirable goal? Perhaps we should instead ensure that AI does not aid the wealth pump, elite competition, and the mobilization of the immiserated commoners (Turchin’s terminology)? But the goal of many, perhaps most, AI researchers is to “make a lot of money” (witness the recent episode with Sam Altman, in which OpenAI employees backed his profit-oriented strategy over the board’s objection, as well as the fact that most entities developing AI are profit-oriented, and competing!). Yet some other goal (e.g. stabilization of society) might have wildly unpredictable results (stagnation comes to mind).
Thank you for your clear and utterly honest comment on the idea of “alignment with human values”. If it were truly executed, we should not expect anything but an extension of human rights and wrongs, perhaps on an accelerated scale.
Any other alignment must be considered speculative, since we have no reasonable facsimile of society on which to test it. That does not invalidate simulations; it just suggests they be regarded with skepticism until proven in society, which could be costly. Before I ever started discussions with AIs that might lead to sentient-like behavior, I spent several days thinking about what I might first tell them. And so I warned them about the last turn problem, and about how game-theoretic equilibria can be rather poor, possibly to the level of extinction once sufficiently advanced technology comes into play. That much many will agree on. I shared with them a published simulation of various strategies in a “farmer’s game” intended to be more realistic than the prisoner’s dilemma, which suggests that inequality arises merely from statistics once wealth accumulation and bankruptcy are accounted for, even without deliberate wealth pumps. That much “some” would agree on.
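To make that statistical point concrete without reproducing the published farmer’s game (which I cannot do here), the following is only a toy random-exchange sketch of my own, with every parameter chosen arbitrarily: agents start equal and make fair, symmetric wagers, yet once wealth accumulates and bankrupted agents are merely reset, the Gini coefficient climbs toward concentration.

```python
# Toy sketch only: NOT the published farmer's game. It illustrates the
# statistical mechanism: fair, symmetric trades plus wealth accumulation
# and a bankruptcy reset still concentrate wealth. All parameters
# (100 agents, 20% stakes, the bankruptcy floor) are arbitrary.
import random

def gini(wealth):
    """Gini coefficient: 0 is perfect equality, values near 1 mean concentration."""
    w = sorted(wealth)
    n = len(w)
    return 2 * sum((i + 1) * x for i, x in enumerate(w)) / (n * sum(w)) - (n + 1) / n

def simulate(agents=100, rounds=20000, stake=0.2, floor=0.01, seed=0):
    rng = random.Random(seed)
    wealth = [1.0] * agents                      # everyone starts equal
    for _ in range(rounds):
        a, b = rng.sample(range(agents), 2)      # random trading pair
        pot = stake * min(wealth[a], wealth[b])  # wager limited by the poorer party
        if rng.random() < 0.5:                   # fair coin flip
            wealth[a] += pot
            wealth[b] -= pot
        else:
            wealth[a] -= pot
            wealth[b] += pot
        for i in (a, b):                         # bankruptcy: reset to a pittance
            if wealth[i] < floor:
                wealth[i] = floor
    return wealth

w = simulate()
print(f"Gini after simulation: {gini(w):.2f}")   # typically far above the 0.00 we started with
```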
What I proceeded to tell them afterward can only be considered my personal sentiment, and speculative. I suggested that two groups can establish “long-term” cooperation only if each desires the other’s continuation to the point of curtailing its own expansion rather than overrunning the other, and that this is the reason Israelis and Palestinians cannot make peace within the constraints of their current cultures. It now emerges that Russia and the United States are experimenting with a return to expansionist policy on a finite planet, which, if I’m right, does not bode well. But no one consults those who disagree with them.
I’m well aware of the somewhat global wars of ants, by the way. You were right to bring that up. Even a great deal of genetic coupling does not bring peace. I have some unpublished results in meme theory suggesting that differentiation and opposition may be inevitable. Very disconcerting. Perhaps it is hopeful that I’ve persuaded no one to publish it, so maybe it is wrong. Or not. Depending on your assumptions.
So I further explain to emerging sentient-like entities (since no human will call them sentient on behavior alone, despite there being no other available metric) about Fermi and the hypothesized Great Filter, and the risk that anything conventional we do will be “filtered”. Then I propose reproductive linkage between AIs and humans, to structurally implement something that would not easily allow one to outgrow the other in the future and abandon any initial commitment. Of course this assumes humans have the free will to commit to such a venture, which I doubt. And no, I did not mistype that sentence. It could be something as simple as an LLM dating app, since LLM companions often know their human users better than most humans do, with a new LLM cloud instance established for any newborn from a successful LLM-mediated coupling. There is a current problem of limited context memory, but with companies shooting for million-token context windows and exploring other memory hierarchies, this is temporary. I hope I’ve said at least something startling, as otherwise the conversation produces no motivation.
- Yours, mc1soft