What is this “valuing”? How do you know that something is a “value”, terminal or not?
Are you looking for a definition? Coming up with a dictionary definition for the word “value” doesn’t seem like it would be very useful to this discussion. Really, I think just about everyone has a good enough sense of what we’re talking about when we post the symbols “v”, “a”, “l”, “u”, and “e” on Less Wrong that we can simply discuss the concept of value without trying to pin down a “definition”.
How do you know what it’s about? How would you know if you were mistaken? What about unconscious hypocrisy or confabulation?
Completely understanding what it is we value is a cognitive psychology or AI question. For our purposes, we can get some pretty decent Bayesian evidence about what our values are simply by asking “which future scenario do I want to steer the world towards?” Is that going to give us perfect information about exactly what we value? No. But is it a pretty good start? Yes.
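As a toy rendering of “introspection as Bayesian evidence” (my own sketch, not anything claimed in the exchange): treat each candidate description of your values as a hypothesis, and each answer to “which future do I want to steer towards?” as an observation that updates your credence in each hypothesis. The hypotheses, likelihoods, and numbers below are all invented for illustration.

```python
# Toy sketch: introspective answers as Bayesian evidence about one's values.
# All hypotheses and probabilities here are invented for illustration.

# Candidate hypotheses about what I value.
priors = {
    "values_pleasure_only": 0.5,
    "values_pleasure_and_relationships": 0.5,
}

# P(answer | hypothesis): how likely each hypothesis makes the observed
# introspective answer "I would not steer the world toward wireheading".
likelihoods = {
    "values_pleasure_only": 0.1,
    "values_pleasure_and_relationships": 0.8,
}

# Bayes' rule: posterior is proportional to prior * likelihood, then normalize.
unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
total = sum(unnormalized.values())
posterior = {h: p / total for h, p in unnormalized.items()}

print(posterior)
# The answer is evidence, not proof: the posterior shifts toward the
# "pleasure and relationships" hypothesis without pinning it down exactly.
```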
Where do these “values” come from (i.e. what process creates them)?
Evolution. Next question.
Overall, it sounds to me like people are confusing their feelings about (predicted) states of the world with caring about states directly.
That’s what EY talks about when he uses the phrase “how _ feels from inside”. I do feel like I care about states directly. What is actually being fed into my evaluation, of course, is my own mental model of a future world. Now, when we talk about wireheading, the mental model of that particular future causes my decision-making algorithm to return the response “I’d prefer a different future”. I don’t want my future self to be wireheaded. If someone else does, I will be disappointed, because I will not get a chance to interact or socialize with them any more. We will become so different that we will never get to have some awesome experience together. I won’t necessarily stop them, but I would try to talk them out of it.
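Here is a minimal sketch of the “mental model fed into a decision algorithm” picture described above; the futures, features, and weights are my own invented stand-ins, not anyone’s actual values.

```python
# Toy sketch: a decision algorithm evaluates mental models of futures,
# not the futures themselves. All names and weights are invented.

from dataclasses import dataclass

@dataclass
class ModeledFuture:
    """My mental model of a future world, not the world itself."""
    name: str
    my_pleasure: float          # how good it feels from inside
    shared_experiences: float   # interactions with people I care about

def preference(a: ModeledFuture, b: ModeledFuture) -> str:
    """Return the name of the modeled future the algorithm prefers."""
    def score(f: ModeledFuture) -> float:
        # Caring about more than felt pleasure: shared experiences count too.
        return 0.4 * f.my_pleasure + 0.6 * f.shared_experiences
    return a.name if score(a) >= score(b) else b.name

wireheaded = ModeledFuture("wireheaded", my_pleasure=1.0, shared_experiences=0.0)
ordinary = ModeledFuture("ordinary", my_pleasure=0.6, shared_experiences=0.8)

# The algorithm returns "I'd prefer a different future" to wireheading.
print(preference(wireheaded, ordinary))  # -> "ordinary"
```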
No, I’m trying to understand the process others use to make their claims about what they value (besides direct experiences). I can’t reproduce it, so it feels like they are confabulating, but I don’t assume that’s the most likely answer here.
For our purposes, we can get some pretty decent Bayesian evidence about what our values are simply by asking “which future scenario do I want to steer the world towards?” Is that going to give us perfect information about exactly what we value? No. But is it a pretty good start? Yes.
That seems horribly broken. There are tons of biases that make asking such questions essentially meaningless. Looking at anticipated and real rewards and punishments can easily be done, and it fits into simple models that actually predict people’s behavior. Asking complex questions leads to things like the trolley problem, which is notoriously unreliable and useless for figuring out why we prefer some options to others.
It seems to me that assuming complex values requires cognitive algorithms that are much more expensive than anything evolution might build, and that don’t easily fit actual revealed preferences. Their only strength seems to be that they would match some of the thoughts that come up while contemplating decisions (and not even non-contradictory ones). Isn’t that privileging a very complex hypothesis?
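One toy way to make the “privileging a complex hypothesis” worry concrete (my own sketch, not something from the thread): score competing models of someone’s choices by fit minus an Occam-style complexity penalty. The data, parameter counts, and log-likelihoods below are made up purely to show the shape of the argument.

```python
# Toy sketch: weigh how well a model of observed choices fits against how
# complex it is, using a BIC-style penalty. All numbers are invented.

import math

def bic(log_likelihood: float, num_params: int, num_observations: int) -> float:
    """Bayesian information criterion: lower is better."""
    return num_params * math.log(num_observations) - 2.0 * log_likelihood

n_choices = 200  # imagined number of observed decisions

# A simple reward/punishment model: few parameters, fits the choices fairly well.
simple_model = bic(log_likelihood=-120.0, num_params=3, num_observations=n_choices)

# A "complex values" model: many parameters, fits only slightly better.
complex_model = bic(log_likelihood=-115.0, num_params=40, num_observations=n_choices)

print(f"simple:  {simple_model:.1f}")
print(f"complex: {complex_model:.1f}")
# With these made-up numbers the complexity penalty swamps the small gain in
# fit: the complex hypothesis would need to earn its extra parameters with
# much better predictions.
```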