Here is a 5-minute, spicy take on an alignment chart.
What do you disagree with?
To try and preempt some questions:
Why is rationalism neutral?
It seems pretty plausible to me that if AI is bad, then rationalism did a lot to educate and spur on AI development. Sorry, folks.
Why are e/accs and EAs in the same group?
In the quick moments I took to make this, I found both EA and e/acc pretty hard to predict and pretty uncertain in overall impact across some range of forecasts.
What? This apology makes no sense. Of course rationalism is Lawful Neutral. The laws of cognition aren’t, can’t be, on anyone’s side.
I disagree with “of course”. The laws of cognition aren’t on any side, but human rationalists presumably share (at least some) human values and intend to advance them; insofar as they are more successful than non-rationalists, this qualifies as Good.
So by my metric, Yudkowsky and Lintamande’s Dath Ilan isn’t neutral; it’s quite clearly lawful good, or attempting to be. And yet they care a lot about the laws of cognition.
So it seems to me that the laws of cognition can (should?) drive towards flourishing rather than pure knowledge increase. There might be things that we wish we didn’t know for a bit. And ways to increase our strength to heal rather than our strength to harm.
To me it seems a better rationality would be lawful good.
The laws of cognition are natural laws. Natural laws cannot possibly “drive towards flourishing” or toward anything else.
Attempting to make the laws of cognition “drive towards flourishing” inevitably breaks them.
A lot of problems arise from inaccurate beliefs rather than bad goals. E.g., suppose both the capitalists and the communists are in favor of flourishing, but they have different beliefs about how best to achieve it. If we then pick a bad policy to optimize for a noble goal, bad things will likely still follow.
Interesting. I always thought the D&D alignment chart was just a random first stab at quantizing a standard superficial Disney attitude toward ethics. This modification seems pretty sensible.
I think your good/evil axis is correct in terms of a deeper sense of the common terms. Evil people typically don’t try to harm others; they just don’t care, so their efforts to help themselves and their friends are prone to harm others. Being good means being good to everyone, not just your favorites: it’s the size of your circle of compassion. Outright malignancy, cackling about others’ suffering, is pretty eye-catching when it happens (and it does), but I’d say the vast majority of harm in the world has been done by people who are merely not much concerned with collateral damage. Thus, I think those people deserve the term “evil”, lest we focus on the wrong thing.
Predictable/unpredictable seems like a perfectly good alternate label for the chaotic/lawful axis. In some adversarial situations, it makes sense to be unpredictable.
One big question is whether you’re referring to intentions or likely outcomes in your expected value (which I assume is expected value for all sentient beings or something). A purely selfish person without much ambition may actually be a net good in the world; they work for the benefit of themselves and those close enough to be critical to their wellbeing, and they don’t risk causing a lot of harm, since that might cause blowback. The same personality put in a position of power might do great harm, ordering an invasion or an employee downsizing to benefit themselves and their family while greatly harming many.
Yeah, I find the intention-vs-outcome thing difficult.
What do you think of “average expected value across small perturbations in your life”? Like, if you accidentally hit Churchill with a car and so cause the UK to lose WW2, that feels notably less bad than deliberately trying to kill a much smaller number of people. In many nearby universes, you didn’t kill Churchill, but in many nearby universes that person did kill all those people.
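To make that proposal concrete, here is a minimal Monte Carlo sketch of the metric: score an action by the mean value of its outcome averaged over many sampled nearby worlds. All function names, probabilities, and harm figures below are made-up illustrations, not anything established in this thread.

```python
import random

# Hypothetical sketch of "average expected value across small
# perturbations": an action's score is the mean outcome value over
# many sampled nearby worlds. All numbers here are invented.

def accident_outcome() -> float:
    # In almost every nearby world the driver misses Churchill;
    # in a rare one the crash happens and the UK loses WW2.
    return -1_000_000.0 if random.random() < 0.0001 else 0.0

def deliberate_outcome() -> float:
    # A deliberate killing of far fewer people succeeds in
    # nearly every nearby world.
    return -1_000.0 if random.random() < 0.95 else 0.0

def perturbation_averaged_value(outcome, n: int = 200_000) -> float:
    # Monte Carlo average over n sampled small perturbations.
    return sum(outcome() for _ in range(n)) / n

random.seed(0)
print(f"accidental: {perturbation_averaged_value(accident_outcome):8.1f}")
print(f"deliberate: {perturbation_averaged_value(deliberate_outcome):8.1f}")
# With these made-up numbers, the accident averages roughly -100 per
# nearby world while the deliberate act averages roughly -950, matching
# the intuition that the freak accident is notably less bad under this
# metric despite its far worse realized outcome.
```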
Chaotic Good: pivotal act
Lawful Evil: “situational awareness”