I think most LW readers don’t see much sacrosanct about evolved values.
Maybe that’s because they think about them in far mode. If you think of values as ancient commandments written on an old parchment, rewriting the parchment does not seem like it could be a problem.
Let’s try it in near mode. Imagine that a thousand years from now you are defrosted and find a society optimized for… maximum suffering and torture. You are told that this happened as the result of an experiment to initialize a superhuman AI with random values… and this is what the random generator produced. It will stay like this until the end of the universe. Enjoy hell.
What is your reaction to this? Some values were replaced by other values; thinking abstractly enough, it seems like nothing essential has changed: we are just optimizing for Y instead of X. Most of the algorithm is the same. Even many of the AI’s actions are the same: it tries to better understand human psychology and physiology, acquire more resources, protect itself against failure or sabotage, self-improve, etc.
How could you explain what is wrong with this scenario, without using some of our evolved values in your arguments? Do you think that a pebblesorter, concerned only with sorting pebbles, would see an important difference between the “human hell” and “human paradise” scenarios? Do you consider this neutrality of the pebblesorter with regard to human concerns (and the neutrality of humans with regard to pebblesorter concerns) a desirable outcome?
(No offense to pebblesorters. If we ever meet them, I hope we can cooperate to create a universe with a lot of happy humans and properly sorted heaps of pebbles.)
How could you explain what is wrong with this scenario, without using some of our evolved values in your arguments?
It’s only “wrong” in the sense that I don’t want it, i.e., it doesn’t accord with my values. I don’t see the need to mention that those values may have been affected by evolution.