How exactly do you propose that the AI “weighs contextual constraints incorrectly” when the process of weighing constraints requires most of the constraints involved (probably thousands of them) to all suffer a simultaneous, INDEPENDENT ‘failure’ for this to occur?
And your reply was:
I’d hazard a guess that, for any given position, less than 70% of humans will agree without reservation. The issue isn’t that thousands of failures occur. The issue is that thousands of failures -always- occur.
This reveals that you are really not understanding what a weak constraint system is, and where the system is located.
When the human mind looks at a scene and uses a thousand clues in the scene to constrain the interpretation of it, those thousand clues all, when the network settles, relax into a state in which most or all of them agree about what is being seen. You don’t get “less than 70%” agreement on the interpretation of the scene! If even one element of the scene violates a constraint in a strong way, the mind orients toward the violation extremely rapidly.
The same story applies to countless other examples of weak constraint relaxation systems dropping down into energy minima.
Let me know when you do understand what you are talking about, and we can resume.
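For concreteness, here is a minimal sketch (my own illustration, not anything from the exchange above) of the kind of weak constraint relaxation being described: many units, each weakly constrained to agree with the rest, are updated until the network settles into an energy minimum, while one strongly violated constraint stays conspicuous. All the names and numbers (`N`, `WEAK`, `STRONG`, `field`, `relax`) are assumptions chosen purely for illustration.

```python
import random

random.seed(0)

N = 100                                  # number of "clues" (units), each +1 or -1
s = [random.choice([-1, 1]) for _ in range(N)]

WEAK = 0.05                              # every pair of clues weakly prefers to agree
STRONG = 50.0                            # one clue is strongly pushed toward -1

def field(i):
    """Net pull on unit i from all constraints (positive means it prefers +1)."""
    f = WEAK * sum(s[j] for j in range(N) if j != i)
    if i == 0:
        f -= STRONG                      # the single strong constraint on clue 0
    return f

def relax(sweeps=20):
    """Asynchronous updates: each unit aligns with its net pull.
    With symmetric weights this is a monotone descent into an energy minimum."""
    for _ in range(sweeps):
        for i in range(N):
            s[i] = 1 if field(i) > 0 else -1

relax()
majority = max(s.count(1), s.count(-1))
print(f"clues agreeing on one interpretation: {majority}/{N}")
print(f"strongly constrained clue settled to: s[0] = {s[0]}")
```

After a couple of sweeps the unclamped units end up in near-total agreement, which is the "settling" behaviour being appealed to here; the clamped unit is the element that "violates a constraint in a strong way" and stands out from the rest.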
There is no energy minimum, if your goal is Friendliness. There is no “correct” answer. No matter what your AI does, no matter what architecture it uses, with respect to human goals and concerns, there is going to be a sizable percentage to whom it is unequivocally Unfriendly.
This isn’t an image problem. The first problem you have to solve in order to train the system is—what are you training it to do?
You’re skipping the actual difficult issue in favor of an imaginary, and easy to solve, issue.
there is going to be a sizable percentage to whom it is unequivocally Unfriendly
“Unfriendly” is an equivocal term.
“Friendliness” is ambiguous. It can mean safety, i.e. not making things worse, or it can mean making things better, creating paradise on Earth.
Friendliness in the second sense is a superset of morality. A friendly AI will be moral, a moral AI will not necessarily be friendly.
“Unfriendliness” is similarly ambiguous: an unfriendly AI may be downright dangerous; or it might have enough grasp of ethics to be safe, but not enough to be able to make the world a much more fun place for humans. Unfriendliness in the second sense is not, strictly speaking, a safety issue.
A lot of people are able to survive the fact that some institutions, movements and ideologies are unfriendly to them, for some value of unfriendly. Unfriendliness doesn’t have to be terminal.
Everything is equivocal to someone. Do you disagree with my fundamental assertion?
I can’t answer unequivocally for the reasons given.
There won’t be a sizeable percentage to whom the AI is unfriendly in the sense of obliterating them.
There might well be a percentage to whom the AI is unfriendly in some business as usual sense.
Obliterating them is only bad by your ethical system. Other ethical systems may hold other things to be even worse.
Irrelevant.
You responded to me in this case. It’s wholly relevant to my point that You-Friendly AI isn’t a sufficient condition for Human-Friendly AI.
However, there are a lot of “wrong” answers.