The internal conflict (IC) in a value system is the negative of the sum, over all pairs of nodes, of the product of the node values and the connection weight between them. This is an energy measure that we want to minimize.
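For concreteness, here is a minimal sketch of how IC could be computed, assuming node values are kept in a list and connection weights in a symmetric matrix (the representation and the numbers are mine, purely for illustration):

```python
def internal_conflict(values, weights):
    """Internal conflict (IC): the negative of the sum, over all pairs of
    nodes, of (value_i * value_j * weight_ij). This is the standard
    Hopfield/spin-glass energy, so low IC means the node values agree
    with the signs of the connections between them."""
    n = len(values)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):  # each unordered pair counted once
            total += values[i] * values[j] * weights[i][j]
    return -total

# Two values that reinforce each other (positive connection): low conflict.
print(internal_conflict([1, 1], [[0, 1], [1, 0]]))    # -> -1.0
# Two values held together despite a negative connection: high conflict.
print(internal_conflict([1, 1], [[0, -1], [-1, 0]]))  # -> 1.0
```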
Why?
Good question. Because I prefer a value system that’s usually not self-contradictory over one that’s usually self-contradictory. I can’t convince you that this is good if you are a moral nihilist, which is a very popular position on LW and, I think, central to CEV. If all possible value systems are equally good, by all means, choose one that tells you to love or hate people based on their fingerprints, and kill your friends if they walk through a doorway backwards.
Empirically, value systems with high IC resemble conservative religious values, which take evolved human values, and then pile an arbitrary rule system on top of them which gives contradictory, hard-to-interpret results resulting in schizophrenic behavior that appears insane to observers from almost any other value system, causes great pain and stress to its practitioners, and often leads to bloody violent conflicts because of their low correlation with other value systems.
Say I’m shopping for a loaf of bread. I have two values. I prefer larger loaves over smaller loaves, and I prefer cheaper loaves over more expensive loaves.
Unfortunately, these values are negatively correlated with each other (larger loaves tend to cost more). Clearly, my values are an arbitrary rule system which gives contradictory, hard-to-interpret results resulting in schizophrenic behavior that appears insane to observers from almost any other value system.
So how should I resolve this? Should I switch to preferring smaller loaves of bread, or should I switch to preferring more expensive loaves of bread?
That depends on why you prefer larger loaves of bread.
If you’re maximizing calories or just want to feel that you’re getting a good deal, go for the highest calorie-to-dollar ratio, noting sales.
If you need more surface area for your sandwiches, choose bread that is shaped in a sandwich-optimal configuration with little hard-to-sandwich heel volume. Make thin slices so you can make more sandwiches, and get an amount of bread that will last just about exactly until you go to the store again or until you expect diminishing marginal utility from bread-eating due to staleness.
If you want large loaves to maximize the amount of time between grocery trips, buy 6 loaves of the cheapest kind and put 5 of them in the freezer, to take out as you finish room-temperature bread.
If you just think large loaves of bread are aesthetically pleasing, pick a kind of bread with lots of big air pockets that puff it up, since bread is priced by dough weight rather than volume.
etc. etc.
Figuring out why you have a value, or what the value is attached to, is usually a helpful exercise when it apparently conflicts with other things.
> Figuring out why you have a value, or what the value is attached to, is usually a helpful exercise when it apparently conflicts with other things.
I think that, though you have given good approaches to making a good tradeoff, the conflict between values in this example is real. The point is that you make the best tradeoff you can in the context, but you don’t modify your values because the internal conflict makes it hard to achieve them.
Point taken—you certainly don’t want to routinely solve problems by changing your values instead of changing your environment.
However, I think you tend to think about deep values, what I sometimes call latent values, while I often talk about surface values, of the type that show up in English sentences and in logical representations of them. People do change their surface values: they become vegetarian, quit smoking, go on a diet, realize they don’t enjoy Pokemon anymore, and so on. I think that this surface-value-changing is well-modelled by energy minimization.
Whether there is a set of “deepest values” that never change is an open question. These are the things EY is talking about when he says an agent would never want to change its goals, and that you’re talking about when you say an agent doesn’t change its utility function. The EY-FAI model assumes such a thing exists, or should exist, or could exist. This needs to be thought about more. I think my comments on “network concepts” in “Only humans can have human values” are relevant. It’s not obvious that a human’s goal structure has top-level goals; it would be a possibly unique exception among complex network systems if it did.
I see your point. I wasn’t thinking of models where you have one preference per object feature. I was thinking of more abstract examples, like trying to be a cheek-turning enemy-loving Christian and a soldier at the same time.
I don’t think of choosing an object whose feature vector has the maximum dot product with your preference vector as conflict resolution; I think of it (and related numerical constraint problems) as simplex optimization. When you want to sum a set of preferences that are continuous functions of continuous features, you can generally take all the preferences and solve directly (or numerically) to find the optimum.
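To illustrate that continuous case with a toy version of the bread example (the specific loaves, features, and preference weights below are invented):

```python
# Hypothetical loaves, each a feature vector: (size in grams, price in dollars).
loaves = {
    "small_cheap":   (400, 2.00),
    "large_cheap":   (800, 3.50),
    "large_premium": (800, 6.00),
}
# Preference vector: positive weight on size, negative weight on price.
preferences = (0.01, -1.0)

def score(features, prefs):
    """Dot product of an object's features with the preference vector."""
    return sum(f * p for f, p in zip(features, prefs))

best = max(loaves, key=lambda name: score(loaves[name], preferences))
print(best)  # -> large_cheap: the "conflict" is resolved by summing the
             #    preferences, not by discarding either of them.
```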
In the “moral values” domain, you’re more likely to have discontinuous rules (e.g., “X is always bad”, or “XN is not”), and be performing logical inference over them. This results in situations that you can’t solve directly, and it can result in circular or indeterminate chains of reasoning, and multiple possible solutions.
My claim is that more conflict is worse, not that conflicts can or should be eliminated. But I admit that aspect of my model could use more justification.
Is there a way to distinguish moral values from other kinds of values? Coming up with a theory of values that explains both the process of choosing who to vote for, and threading a needle, as value-optimization, is going to be difficult.
> In the “moral values” domain, you’re more likely to have discontinuous rules (e.g., “X is always bad”, or “XN is not”), and be performing logical inference over them. This results in situations that you can’t solve directly, and it can result in circular or indeterminate chains of reasoning, and multiple possible solutions.
This line of thinking is setting off my rationalization detectors. It sounds like you’re saying, “OK, I’ll admit that my claim seems wrong in some simple cases. But it’s still correct in all of the cases that are so complicated that nobody understands them.”
I don’t know how to distinguish moral values from other kinds of values, but it seems to me that this isn’t exactly the distinction that would be most useful for you to figure out. My suggestion would be to figure out why you think high IC is bad, and see if there’s some nice way to characterize the value systems that match that intuition.
I disagree with this.
I think a natural intuition about a moral values domain suggests that things are likely to be non-linear and discontinuous.
I don’t think it’s so much saying that the claim is wrong in simple cases but still correct in cases no one understands.
It’s more saying that the alternative claims being proposed are a long way from handling any real-world example, and I’m disinclined to believe that a sufficiently complicated system will satisfy continuity and linearity.
Also, we should distinguish between “why do I expect that existing value systems are energy-minimized” and “why should we prefer value systems that are energy-minimized”.
The former is easier to answer, and I gave a bit of an answer in “Only humans can have human values”.
The latter I could justify within EY-FAI by claiming that being energy-minimized is therefore a property of human values.
> My suggestion would be to figure out why you think high IC is bad, and see if there’s some nice way to characterize the value systems that match that intuition.
That’s a good idea. My “final reason” for thinking that high IC is bad may be because high-IC systems are a pain in the ass when you’re building intelligent agents. They have a lot of interdependencies among their behaviors, get stuck waffling between different behaviors, and are hard to debug. But we (as designers and as intelligent agents) have mechanisms to deal with these problems; e.g., producing hysteresis by using nonlinear functions to sum activation from different goals.
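A toy version of one such mechanism, using a persistence bonus for the currently active goal rather than a literal nonlinear summation of activations (the goal names and numbers are made up):

```python
def select_goal(activations, current_goal=None, persistence_bonus=0.2):
    """Winner-take-all goal selection with hysteresis: the goal already
    being pursued gets a bonus, so small fluctuations in activation do
    not make the agent waffle between behaviors."""
    def effective(goal):
        return activations[goal] + (persistence_bonus if goal == current_goal else 0.0)
    return max(activations, key=effective)

# 'eat' is slightly higher, but not enough to overcome the bonus: keep working.
print(select_goal({"work": 0.50, "eat": 0.55}, current_goal="work"))  # -> work
# A clearly dominant competing activation does switch the behavior.
print(select_goal({"work": 0.50, "eat": 0.80}, current_goal="work"))  # -> eat
```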
My other final reason is that I consciously try to energy-minimize my own values, and I think other thoughtful people who aren’t nihilists do too. Probably nihilists do too, if only for their own convenience.
My other other final reason is that energy-minimization is what dynamic network concepts do. It’s how they develop, as e.g. for spin-glasses, economies, or ecologies.
> Because I prefer a value system that’s usually not self-contradictory over one that’s usually self-contradictory.
Sometimes values you really have are in conflict: you have options to achieve one to a certain extent, or to achieve the other to a different extent, and you have to figure out which is more important to you. This does not mean that you give up the value you didn’t choose, just that in that particular situation it was more effective to pursue the other. Policy Debates Should Not Appear One-Sided.
I upvoted because the formalization is interesting and the observation of what happens when we average values is a good one. But I’m still far from convinced IC is really what we need to worry about.
> Empirically, value systems with high IC resemble conservative religious values, which take evolved human values, and then pile an arbitrary rule system on top of them which gives contradictory, hard-to-interpret results resulting in schizophrenic behavior that appears insane to observers from almost any other value system, causes great pain and stress to its practitioners, and often leads to bloody violent conflicts because of their low correlation with other value systems.
I think all of this applies to my liberal values: arbitrary rule system on top of evolved values? Check. Appears insane to observers from almost any other value system? Check. Causes great pain and stress to its practitioners? Check. Bloody violent conflicts because of their low correlation with other value systems? Double check!
And I still like my liberal values!
Good point. Maybe tribal ethics have the least internal conflict, since they may be closest to an equilibrium reached by evolution.