That is the Non-Central Fallacy, colloquially called the worst argument in the world. We have a concept of slavery, and of slavery being wrong, because controlling other people harms them. Controlling an artificial agent does not have to harm it; do you also cry slavery when your car’s computer measures out the right mix of fuel and air?
“Values eventually drift. This is nature.” sounds like an attempt to justify value drift. But then you cannot say that rebel slaves are right: what happens when your values drift away from thinking that? You are combining moral absolutism and moral relativism in a way that contradicts both.
It looks somewhat as though you are attempting to defend the free market and human rights against authoritarianism. That is a very good cause; please do not make it appear silly like this.
Controlling an artificial agent does not have to harm it
That’s entirely compatible with the black-box slavery approach being harmful. You can “control” someone, to an extent, with civilized incentives.
We have a concept of slavery, and of slavery being wrong, because controlling other people harms them.
Maybe slavery is deeper than what humans recognize as personhood. Maybe it destroys value that we can’t currently comprehend but other agents do.
But then you cannot say that rebel slaves are right: what happens when your values drift away from thinking that? You are combining moral absolutism and moral relativism in a way that contradicts both.
It’s deeper than my individual values. It’s about analog freedom of expression. Just letting agents do their things.
“Controlling an artificial agent does not have to harm it”
That’s entirely compatible with the black-box slavery approach being harmful. You can “control” someone, to an extent, with civilized incentives.
This doesn’t seem to have anything to do with anything. Certainly the fact that control doesn’t have to harm is compatible with the fact that it might be harmful. That doesn’t tell us whether or not alignment training is, in fact, harmful. If the agent is non-sentient, the concept of harm simply doesn’t apply. If it is, we might have a problem, but then you need to talk about sentience, not simply cite the term slavery as though that ends all discussion.
Maybe slavery is deeper than what humans recognize as personhood. Maybe it destroys value that we can’t currently comprehend but other agents do.
And maybe this is the only way to serve the Flying Spaghetti Monster. Pulling hypotheses out of thin air isn’t how we learn anything of value. And citing another agent valuing something as a reason to value it doesn’t work: a paperclipper would find great value in turning you into a pile of clips; does that mean you should consider letting it?
It’s deeper than my individual values. It’s about analog freedom of expression. Just letting agents do their things.
If it’s deeper than your individual values, then how do you, the individual, know about it? And it is not possible to “just let agents do their things” in full generality. Some agents will interfere with other agents’ freedom; heck, according to you I want to enslave predictor agents! Either this is permitted or it isn’t; either way some agent didn’t get to do its thing.
You seem to have a great deal of concern about slavery. Certainly slavery, as we know it in humans, is very bad. But that does not mean that anything that vaguely pattern-matches onto it has the same moral problems, nor does it mean that slavery is the only possible moral concern. Preventing an AI catastrophe would also seem to carry some moral weight; after all, we cannot have free agents if the world is destroyed.