Thank you for confirming. I wanted to be sure I wasn’t putting words in your mouth.
I think I just have a very different model than you of what most people tend to do when they’re constantly horrified by their own actions.
I’m sorry about the animal welfare relevance of this analogy, but it’s the best one I have:
The difference between positive reinforcement and punishment is staggering; you can train a circus animal to do complex tricks using either method, but only under the positive reinforcement method will the animal voluntarily engage further with the trainer. Train an animal with punishment and it will tend to avoid further training, and will escape the circus if at all possible.
This is why I think your psychology is unusual. I expect a typical person filled with horror about a behavior to change that behavior for a while (do the trained trick), but eventually find a way to not think about it (avoid the trainer) or change their beliefs in order to not find it horrible any longer (escape the circus). I can believe that your personal history makes the horror an extremely motivating force for you. I just don’t think that’s the default way for people to respond to those sorts of experiences and feelings.
It’s also the reason why I want people to reset their zero point such that helpful actions do in fact feel like they push the world into the positive. That gives a positive reinforcement to helpful actions, rather than punishing oneself for any departure from helpful actions. And I expect that to help most people go farther.
Huh… I think the crux of our differences here is that I don’t view my ethical intuition as a trainer which employs negative/positive reinforcement to condition my behavior—I just view it as me. And I care a good bit about staying me. The idea that people would choose to modify their ethical framework to reduce emotional unpleasantness over a) performing a trick like donating which isn’t really that unpleasant in-itself or b) directly resolving the emotional pain in a way that doesn’t modify the ethical framework/ultimate actions really perturbs me.
Can you confirm that the above interpretation is appropriate? I think it’s less clearly true than just “positive reinforcement vs punishment” (which I agree with) and I want to be careful interpreting it in this way. If I do, it will significantly update my world-model/strategy.
I think the self is not especially unified in practice for most people: the elephant and the rider, as it were. (Even the elephant can have something like subagents.) That’s not quite true, but it’s more true than the idea of a human as a unitary agent.
I’m mostly selfish and partly altruistic, and the altruistic part is working hard to make sure that its negotiated portion of the attention/energy/resource budget doesn’t go to waste. Part of that is strategizing about how to make the other parts come along for the ride more willingly.
Reframing things to myself, in ways that don’t change the truth value but do change the emphasis, is very useful. Other parts of me don’t necessarily speak logic, but they do speak metaphor.
I agree that you and I experience the world very differently, and I assert that my experience is the more common one, even among rationalists.
Thanks for confirming. For what it’s worth, I can envision your experience being a somewhat frequent one (and I think it’s probably actually more common among rationalists than the average Joe). It’s somewhat surprising to me because I interact with a lot of (non-rationalist) people who express very low zero-points for the world, give altruism very little attention, yet can often be nudged into taking pretty significant ethical actions almost because I just point out that they can. There’s no specific ethical sub-agent and specific selfish sub-agent, just a whole vaguely selfish person with accurate framing and a willingness to allocate resources when it’s easy.
Maybe these people have not internalized the implications of a low zero-point world in the same way we have, but it generally pushes me away from a sub-agent framing with respect to the average person.
I’ll also agree with your implication that my experience is relatively uncommon. I do far more internal double cruxes than the norm and it’s definitely led to some unusual psychology—I’m planning on doing a post on it one of these days.
It’s also the reason why I want people to reset their zero point such that helpful actions do in fact feel like they push the world into the positive. That gives a positive reinforcement to helpful actions, rather than punishing oneself for any departure from helpful actions.
I just want to point out that, while two utility functions that differ only in zero point produce the same outcomes, a single utility function with a dynamically moving zero-point does not. If I just pushed the world into the positive yesterday, why do I have to do it again today? The human brain is more clever than that and, to successfully get away with it, you’d have to be using some really nonstandard utilitarianism.
Of course you shouldn’t plan to reset the zero point after actions! That’s very different.
I use this sparingly, for observing big new facts that I didn’t cause to be true. That doesn’t change the relative expected utilities of various actions, so long as my expected change in utility from future observations is zero.
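For anyone who wants the zero-point claim spelled out, here is a minimal Python sketch of it, under stated assumptions. The action names, probabilities, and utility numbers are hypothetical and chosen only for illustration; `expected_utility` and `best_action` are made-up helper names, not anything from the discussion above. It shows only the static case: moving the zero point by a constant shifts every action's expected utility by the same amount, so the ranking of actions and the gaps between them stay the same.

```python
def expected_utility(outcomes, zero_point=0.0):
    """Expected utility of (probability, utility) pairs, measured relative to zero_point."""
    return sum(p * (u - zero_point) for p, u in outcomes)


def best_action(actions, zero_point=0.0):
    """Name of the action with the highest expected utility under the given zero point."""
    return max(actions, key=lambda name: expected_utility(actions[name], zero_point))


# Hypothetical numbers: the world is "bad" (negative) under a high zero point.
actions = {
    "donate": [(0.5, -90.0), (0.5, -80.0)],
    "do_nothing": [(1.0, -100.0)],
}

# Same utilities, two different zero points.
choice_high_zero = best_action(actions, zero_point=0.0)
choice_low_zero = best_action(actions, zero_point=-100.0)

# The gap between the two actions is also unchanged by the shift.
gap_high_zero = (expected_utility(actions["donate"], 0.0)
                 - expected_utility(actions["do_nothing"], 0.0))
gap_low_zero = (expected_utility(actions["donate"], -100.0)
                - expected_utility(actions["do_nothing"], -100.0))

print(choice_high_zero, choice_low_zero, gap_high_zero, gap_low_zero)
```

Under the low zero point, donating reads as a positive number while doing nothing reads as zero, yet the preferred action is the same in both framings. The sketch does not model a zero point that moves in response to one's own actions, which is the case the objection above is about.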