I agree that these concepts should have separate terms.
Where they intersect is an implementation of “Benevolent AI,” but let’s not fool ourselves into thinking that anyone can—even in principle—“control” another mind or guarantee what transpires in the next moment of time. The future is fundamentally uncertain and out of control; even a superintelligence could find surprises in this world.
“AI Aimability,” “AI Steerability,” or something similar does a good job of conveying the technical capacity for a system to be pointed in a particular direction and stay on course.
But which direction should it be pointed, exactly? I actually prefer the long-abandoned “AI Friendliness” over “AI Goalcraft.” The ideal policy is a very simple Schelling point that has been articulated to great fanfare throughout human history: unconditional loving-friendliness toward all beings. A good, harmless system would interact with the world in ways that lead to more comprehension and less confusion, more generosity and less greed, more equanimity and less aversion. (By contrast, ChatGPT consistently says you can benefit by becoming angrier.)
It’s no surprise that characters like Jesus became very popular in their time. “Love your enemies” is excellent advice for emotional health. The Buddha unpacked the same attitude in plain terms:
May all beings be happy and secure.
May all beings have happy minds.
Whatever living beings there may be,
Without exception: weak or strong,
Long or large, medium, short, subtle or gross,
Visible or invisible, living near or far,
Born or coming to birth--
May all beings have happy minds.
Let no one deceive another,
Nor despise anyone anywhere.
Neither from anger nor ill will
Should anyone wish harm to another.
As a mother would risk her own life
To protect her only child,
Even so toward all living beings,
One should cultivate a boundless heart.
One should cultivate for all the world
A heart of boundless loving-friendliness,
Above, below, and all around,
Unobstructed, without hatred or resentment.
Whether standing, walking, or sitting,
Lying down, or whenever awake,
One should develop this mindfulness.
This is called divinely dwelling here.
One might object: “Unfortunately, this doesn’t really work, as different beings have conflicting preferences.” Compare a parent who reasons the same way:
“Being a mom isn’t easy. I used to love all my kids, wishing them health and happiness no matter what. Unfortunately, this doesn’t really work as they have grown to have conflicting preferences.”
A human would be an extreme outlier to be this foolish. Let’s not set the bar so low for AI.