Monopolies on the Use of Force
[Epistemic status & effort: exploring a question over an hour or so, and constrained to only use information I already know. This is a problem solving exercise, not a research paper. Originally written just for me; minor clarification added later.]
Is the use of force a unique industry, where a single monolithic [business] entity is the most stable state, the equilibrium point? From a business perspective, an entity selling the use of force might be thought of as in a “risk management” or “contract enforcement” industry. It might use an insurance-like business model, or function more like a contractor for large projects.
In a monopoly on the use of force, the one monopolizing entity can spend all of its time deciding what to do, and then relatively little time & energy actually exerting that force, because resistance to its force is minimal, since there are no similarly sized entities to oppose it. The cost of using force is slightly increased if the default level of resistance [i.e. how much force is available to someone who has not hired their own use-of-force business entity] is increased. Can the default level of opposition be lowered by a monolithic entity? Yes [e.g. limiting private ownership of weapons & armor].
In a diverse [non-monopoly] environment, an entity selling the use of force could easily find a similar sized entity opposing it. Opposed entities can fight immediately like hawks, or can negotiate like doves, but: the costs of conflict will be underestimated (somehow they always are); “shooting first” (per the dark forest analogy) is a powerful & possibly dominant strategy; and hawkish entities impose a cost on the entire industry.
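[Sidenote: the hawk/dove language here is the standard hawk-dove game from evolutionary game theory. A minimal sketch, assuming made-up payoff numbers V (value of the contested prize) and C (cost of an escalated fight) purely for illustration:]

```python
# Toy hawk-dove payoff model. V and C below are illustrative assumptions,
# not numbers argued for anywhere in this post.

def payoff(me: str, other: str, V: float, C: float) -> float:
    """Expected payoff to `me` when meeting `other`."""
    if me == "hawk" and other == "hawk":
        return (V - C) / 2   # both escalate: split the prize, pay the fight cost
    if me == "hawk" and other == "dove":
        return V             # dove backs down, hawk takes everything
    if me == "dove" and other == "hawk":
        return 0.0           # dove retreats: no gain, no fight cost
    return V / 2             # two doves share without fighting

def avg_payoffs(hawk_share: float, V: float, C: float) -> tuple[float, float]:
    """Average payoffs to a hawk and to a dove, given the population's hawk share."""
    p = hawk_share
    hawk = p * payoff("hawk", "hawk", V, C) + (1 - p) * payoff("hawk", "dove", V, C)
    dove = p * payoff("dove", "hawk", V, C) + (1 - p) * payoff("dove", "dove", V, C)
    return hawk, dove

V, C = 10.0, 40.0  # a fight costs far more than the prize is worth
for p in (0.10, 0.25, 0.50):
    h, d = avg_payoffs(p, V, C)
    print(f"hawk share {p:.2f}: hawk payoff {h:6.2f}, dove payoff {d:6.2f}")
# With C > V the two payoffs equalize at hawk share V/C (0.25 here), so neither
# all-hawk nor all-dove is stable: a few hawks always do well enough to persist,
# which is one reason the cost they impose on the whole industry never goes to zero.
```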
Does this incentivize dovish entities to cooperate to eliminate hawkish entities? If they are able to coordinate, and if there are enough of them to bear the cost & survive, then probably. If they do this, then they in effect form a single monolithic cooperating entity. If they cannot, then dovish entities may become meaningless, as hawkish entities tear them and each other apart until only one remains (Highlander rules?). Our original question might then depend on the following one:
“Do dovish entities in the market of force necessarily either become de facto monopolies or perish?”
How do we go about answering this question, and supporting that answer? Via historical example and / or counterexample?
What does the example look like? Entities in the same ecosystem either destroy each other, or cooperate to a degree where they cease acting like separate entities.
What does the counterexample look like? Entities in the same ecosystem remain separate, only cooperate to serve shared goals, and do come into conflict, but do not escalate that conflict into deadly force.
[Sidenote: what is this “degree of cooperation”? Shared goal: I want to accomplish X, and you also want to accomplish X, so we cooperate to make accomplishment of X more likely. “To cease acting separately”: I want to accomplish X; you do not care about X; however, you will still help me with X, and bear costs to do so, because you value our relationship, and have the expectation that I may help you accomplish an unknown Y in the future, even though I don’t care about Y.]
Possible examples to investigate (at least the ones which come quickly to mind):
Danish conquests in medieval UK.
The world wars and cold war.
Modern geopolitics.
Possible counterexamples to investigate:
Pre-nationalized medieval European towns & cities, of the kind mentioned in Seeing Like a State?
Urban gang environments, especially in areas with minimal law enforcement?
Somalia, or other areas described as “failed states”?
Modern geopolitics.
Standoffs between federal authorities and militia groups in the modern & historical USA, such as the standoff at Bundy Ranch.
This is a good topic for investigation, but you probably need to model it in more detail than you currently are. There are many dimensions and aspects to use of violence (and the threat of violence) that don’t quite fit the “monopoly” model. And many kinds of force/coercion that aren’t directly violent, even if they’re tenuously chained to violence via many causal steps.
I very much like the recognition that it’s an equilibrium—there are multiple opposing (and semi-opposing, if viewed in multiple dimensions) actors with various strength and willingness to harm or cooperate. It’s not clear whether there’s a single solution at any given time, but it is clear that it will shift over time, sometimes quickly, often slowly.
Another good exploration is “what rights exist without being enforced by violence (or the distant threat of violence)?” I’d argue almost none.
How do lies affect Bayesian Inference?
(Relative likelihood notation is easier, so we will use that)
I heard a thing. Well, I more heard a thing about another thing. Before I heard about it, I didn’t know one way or the other at all. My prior was the Bayesian null prior of 1:1. Let’s say the thing I heard is “Conspiracy thinking is bad for my epistemology”. Let’s pretend it was relevant at the time, and didn’t just come up out of nowhere. What is the chance that someone would hold this opinion, given that they are not part of any conspiracy against me? Maybe 50%? If I heard it in a Rationality influenced space, probably more like 80%? Now, what is the chance that someone would share this as their opinion, given that they are involved in a conspiracy against me? Somewhere between 95% and 100%, so let’s say 99%? Now, our prior is 1:1, and our likelihood ratio is 80:99, so our final prediction, of someone not being a conspirator vs being a conspirator, is 80:99, or 1:1.24. Therefore, my expected probability of someone not being a conspirator went from 50%, down to 45%. Huh.
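[Sidenote: a minimal sketch of the same odds-form update in Python, using the numbers above; the helper name is just mine for illustration:]

```python
# Odds-form Bayes: posterior odds = prior odds * likelihood ratio.

def update_odds(prior_odds: float, p_evidence_given_h: float,
                p_evidence_given_not_h: float) -> float:
    """Posterior odds of H : not-H after observing the evidence."""
    return prior_odds * (p_evidence_given_h / p_evidence_given_not_h)

# H = "they are not a conspirator"; evidence = hearing them say the statement.
prior = 1.0                                 # 1:1, no idea either way
posterior = update_odds(prior, 0.80, 0.99)  # 80% vs 99% as estimated above
p_not_conspirator = posterior / (1 + posterior)
print(f"posterior odds ~ {posterior:.2f}:1")              # ~0.81:1, i.e. 80:99
print(f"P(not a conspirator) ~ {p_not_conspirator:.2f}")  # ~0.45
```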
For the love of all that is good, please shoot holes in this and tell me I screwed up somewhere.
There are lots of holes. Here are a few:
If they’re actually in a conspiracy against you, it’s likely that they don’t even want you thinking about conspiracies. It’s not in their interest for you to associate them with the concept “conspiracy” in any way, since people who don’t think about conspiracies at all are unlikely to unmask them. By this reasoning, the chance of a conspirator drawing attention to thinking about conspiracies is not anywhere near 95% - maybe not even 20%.
A highly competent conspiracy member will give you no information that distinguishes the existence of the conspiracy from the non-existence of the conspiracy. If you believe that they have voluntarily given you such information, then you should rule out that the conspiracy consists of competent members. This takes a chunk out of your “this person is a conspirator” weight.
There are always more hypotheses. Splitting into just two and treating them as internally homogeneous is always a mistake.
I hope this helps! Thinking about conspiracies doesn’t have to be bad for your epistemology, but I suspect that in practice it is much more often harmful than helpful.
Yeah. I wanted to assume they were being forced to give an opinion, so that “what topics a person is or isn’t likely to bring up” wasn’t a confounding variable. Your point here suggests that a conspirator’s response might be more like “I don’t think about them”, or some kind of null opinion.
This sort of gets to the core of what I was wondering about, but am not sure how to solve: how lies will tend to pervert Bayesian inference. “Simulacra levels” may be relevant here. I would think that a highly competent conspirator would want to only give you information that would reduce your prediction of a conspiracy existing, but this seems sort of recursive, in that anything that would reduce your prediction of a conspiracy would have increased likelihood of being said by a conspirator. Would the effect of lies by bad-faith actors, who know your priors, be that certain statements just don’t update your priors, because uncertainty makes them not actually add any new information? I don’t know what limit this reduces to, and I don’t yet know what math I would need to solve it.
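[Sidenote: a toy way to see one possible limit of that recursion. If a conspirator deliberately mimics the statement frequencies of a non-conspirator, the likelihood ratio of any statement tends toward 1 and nothing updates. The probabilities below are made up purely to illustrate that limit:]

```python
# Toy model of a strategic liar. The speaker is either innocent or a conspirator;
# the conspirator can blend their "natural" statement frequencies toward the
# innocent distribution. All numbers are illustrative assumptions.

innocent = {"conspiracy thinking is bad": 0.80, "no opinion": 0.20}
natural_conspirator = {"conspiracy thinking is bad": 0.99, "no opinion": 0.01}

def conspirator_dist(mimicry: float) -> dict[str, float]:
    """Blend the conspirator's natural distribution toward the innocent one."""
    return {s: (1 - mimicry) * natural_conspirator[s] + mimicry * innocent[s]
            for s in innocent}

statement = "conspiracy thinking is bad"
for mimicry in (0.0, 0.5, 1.0):
    liar = conspirator_dist(mimicry)
    lr = innocent[statement] / liar[statement]  # innocent : conspirator
    print(f"mimicry {mimicry:.1f}: likelihood ratio = {lr:.2f}")
# At mimicry 1.0 the ratio is exactly 1.0: a liar who knows your model and copies
# the innocent distribution makes the statement carry no evidence either way.
```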
Naturally. I think “backpropagation” might be related to certain observations affecting multiple hypotheses? But I haven’t brushed up on that in a while.
Thank you, it does help! I know some people who revel in conspiracy theories, and some who believe conspiracies are so unlikely that they dismiss any possibility of a conspiracy out of hand. I get left in the middle with the feeling that some situations “don’t smell right”, without having a provable, quantifiable excuse for why I feel that way.
The event is more likely to occur if the person is a conspirator, so you hearing the statement should indeed increase your credence for conspiracy (and symmetrically decrease your credence for not-conspiracy).
The Definition of Good and Evil
Epistemic Status: I feel like I stumbled over this; it has passed a few filters for correctness; I have not rigorously explored it, and I cannot adequately defend it, but I think that is more my own failing than the failure of the idea.
I have heard it said that “Good and Evil are Social Constructs”, or “Who’s really to say?”, or “Morality is relative”. I do not like those at all, and I think they are completely wrong. Since then, I either found, developed, or came across (I don’t remember how I got this) a model of Good and Evil, which has so far seemed accurate in every situation I have applied it to. I don’t think I’ve seen this model written explicitly anywhere, but I have seen people quibble about the meaning of Good & Evil in many places, so whether this turns out to be useful, or laughably naïve, or utterly obvious to everyone but me, I’d rather not keep it to myself anymore.
The purpose of this, I guess, is, now that the map has become so smudged and smeared that some people question whether it ever corresponded to the territory at all, to figure out what part of the territory this part of the map was supposed to refer to. I will assume that we have all seen or heard examples of things which are Good, things which are Evil, things which are neither, and things which are somewhere in between. An accurate description of Good & Evil should accurately match those experiences a vast majority (all?) of the time.
It seems to me that, among the clusters of things in possibility space, the core of Good is “to help others at one’s own expense” while the core of Evil is “to harm others for one’s own benefit”.
In my limited attempts at verifying this, the Goodness or Evilness of an action or situation has so far seemed to correlate with the presence, absence, and intensity of these versions of Good & Evil. Situations where one does great harm to others for one’s own gain seem clearly evil, like executing political opposition. Situations where one helps others at a cost to oneself seem clearly good, like carrying people out of a burning building. Situations where no harm nor help is done, and no benefit is gained nor cost expended, seem neither Good nor Evil, such as a rock sitting in the sun, doing nothing. Situations where both harm is done & help is given, and where both a cost is expended and a benefit is gained, seem both Good and Evil, or somewhere in between, such as rescuing an unconscious person from a burning building, and then taking their wallet.
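[Sidenote: a toy encoding of the model above, just to make the four cases in this paragraph explicit; the input names and example values are my own illustrative choices, not part of the model itself:]

```python
# Toy classifier for the two-ingredient model: Good = help others at one's own
# expense; Evil = harm others for one's own benefit. Inputs are illustrative.

def classify(help_given: float, harm_done: float,
             cost_paid: float, benefit_gained: float) -> str:
    good = help_given > 0 and cost_paid > 0       # help others at one's own expense
    evil = harm_done > 0 and benefit_gained > 0   # harm others for one's own benefit
    if good and evil:
        return "both / somewhere in between"
    if good:
        return "Good"
    if evil:
        return "Evil"
    return "neither"

print(classify(help_given=1, harm_done=0, cost_paid=1, benefit_gained=0))  # carry someone out of a fire -> Good
print(classify(help_given=0, harm_done=1, cost_paid=0, benefit_gained=1))  # execute political opposition -> Evil
print(classify(help_given=0, harm_done=0, cost_paid=0, benefit_gained=0))  # a rock sitting in the sun -> neither
print(classify(help_given=1, harm_done=1, cost_paid=1, benefit_gained=1))  # rescue them, take the wallet -> both
```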
The correctness of this explanation depends on whether it matches others’ judgements of specific instances of Good or Evil, so I can’t really prove its correctness from my armchair. The only counterexamples I have seen so far involved significant amounts of motivated reasoning (someone who was certain that theft wasn’t wrong when they did it).
I’m sure there are many things wrong with this, but I can’t expect to become better at rationality if I’m not willing to be crappy at it first.
That makes self-defence Evil. It even makes cultivating one’s own garden Evil (by not doing all the Good that one might). And some argue that. Do you?
For self-defense, that’s still a feature, and not a bug. It’s generally seen as more evil to do more harm when defending yourself, and in law, defending yourself with lethal force is “justifiable homicide”; it’s specifically called out as something much like an “acceptable evil”. Would it be more or less evil to cause an attacker to change their ways without harming them? Would it be more or less evil to torture an attacker before killing them?
“...by not doing all the Good...” In the model, it’s actually really intentional that “a lack of Good” is not a part of the definition of Evil, because it really isn’t the same thing. There are idiosyncrasies in this model which I have not found all of yet. Thank you for pointing them out!
Your intuitions about what’s good and what’s evil are fully consistent with morality being relative, and with them being social constructs. You are deeply embedded in many overlapping cultures and groups, and your beliefs about good and evil will naturally align with what they want you to believe (which in many cases is what they actually believe, but there’s a LOT of hypocrisy on this topic, so it’s not perfectly clear).
I personally like those guidelines. Though I’d call it good to help others EVEN if you benefit in doing so, and evil to do significant net harm, but with a pretty big carveout for small harms which can be modeled as benefits on different dimensions. And my liking them doesn’t make them real or objective.
The first paragraph is equivalent to saying that “all good & evil is socially constructed because we live in a society”, and I don’t want to call someone wrong, so let me try to explain...
An accurate model of Good & Evil will hold true, valid, and meaningful among any population of agents: human, animal, artificial, or otherwise. It is not at all dependent on existing in our current, modern society. Populations that do significant amounts of Good amongst each other generally thrive & are resilient (e.g. humans, ants, rats, wolves, cells in any body, many others), even though some individuals may fail or die horribly. Populations which do significant amounts of Evil tend to be less resilient, or destroy themselves (e.g. high crime areas, cancer cells), even though certain members of those populations may be wildly successful, at least temporarily.
This isn’t even a human-centric model, so it’s not “constructed by society”. It seems to me more likely to be a model that societies have to conform to, in order to exist in a form that is recognizable as a society.
I apologize for being flippant, and thank you for replying, as having to overcome challenges to this helps me figure it out more!
“An accurate model of Good & Evil will hold true, valid, and meaningful among any population of agents: human, animal, artificial, or otherwise.”
I look forward to seeing such a model. Or even the foundation of such a model and an indication of how you know it’s truly about good and evil, rather than efficient and in-.
I think, to form an ethical system that passes basic muster, you can’t only take into account the immediate good/bad effects on people of an action. That would treat the two cases “you dump toxic waste on other people’s lawns because you find it funny” and “you enjoy peacefully reading a book by yourself, and other people hate this because they hate you and they hate it when you enjoy yourself” the same.
If you start from a utilitarian perspective, I think you quickly figure out that there need to be rules—that having rules (which people treat as ends in themselves) leads to higher utility than following naive calculations. And I think some version of property rights is the only plausible rule set that anyone has come up with, or at least is the starting point. Then actions may be considered ethically bad to the extent that they violate the rules.
Regarding Good and Evil… I think I would use those words to refer to when someone is conscious of the choice between good and bad actions, and chooses one or the other, respectively. When I think of “monstrously evil”, I think of an intelligent person who understands good people and the system they’re in, and uses their intelligence and their resources to e.g. select the best people and hurt them specifically, or to find the weakest spots and sabotage the system most thoroughly and efficiently. I can imagine a dumb evil person, but I think they still have to know that an option is morally bad and choose it; if they don’t understand what they’re doing in that respect, then they’re not evil.
“You enjoy peacefully reading a book by yourself, and other people hate this because they hate you and they hate it when you enjoy yourself.”
The problem with making hypothetical examples is when you make them so unreal as to just be moving words around. Playing music/sound/whatever loud enough to be noise pollution would be similar to the first example. Less severe, but similar. Spreading manure on your lawn so that your entire neighborhood stinks would also be less severe, but similar. But if you’re going to say “reading” and then have hypothetical people not react to reading in the way that actual people actually do, then your hypothetical example isn’t going to be meaningful.
As for requiring consciousness, that’s why I was judging actions, not the agents themselves. Agents tend to do both, to some degree.
Ok, if you want more realistic examples, consider:
driving around in a fancy car that you legitimately earned the money to buy, and your neighbors are jealous and hate seeing it (and it’s not an eyesore, nor is their complaint about wear and tear on the road or congestion)
succeeding at a career (through skill and hard work) that your neighbors failed at, which reminds them of their failure and they feel regret
marrying someone of a race or sex that causes some of your neighbors great anguish due to their beliefs
maintaining a social relationship with someone who has opinions your neighbors really hate
having resources that they really want—I mean really really want, I mean need—no matter how much you like having it, I can always work myself up into a height of emotion such that I want it more than you, and therefore aggregate utility is optimized if you give it to me
The category is “peaceful things you should be allowed to do—that I would write off any ethical system that forbade you from doing—even though they (a) benefit you, (b) harm others, and (c) might even be net-negative (at least naively, in the short term) in aggregate utility”. The point is that other people’s psyches can work in arbitrary ways that assign negative payoffs to peaceful, benign actions of yours, and if the ethical system allows them to use this to control your behavior or grab your resources, then they’re incentivized to bend their psyches in that direction—to dwell on their envy and hatred and let them grow. (Also, since mind-reading isn’t currently practical, any implementation of the ethical system relies on people’s ability to self-report their preferences, and to be convincing about it.) The winners would be those who are best able to convince others of how needy they are (possibly by becoming that needy).
Therefore, any acceptable ethical system must be resistant to this kind of utilitarian coercion. As I say, rules—generally systems of rights, generally those that begin with the right to one’s self and one’s property—are the only plausible solution I’ve encountered.
Whom/what an agent is willing to do Evil to, vs whom/what it would prefer to do Good to, sort of defines an in-group/out-group divide, in a similar way to how the decision to cooperate or defect does in the Prisoner’s Dilemma. Hmmm...
AGI Alignment, or How Do We Stop Our Algorithms From Getting Possessed by Demons?
[Epistemic Status: Absurd silliness, which may or may not contain hidden truths]
[Epistemic Effort: Idly exploring idea space, laying down some map so I stop circling back over the same territory]
From going over Zvi’s sequence on Moloch, what the “demon” Moloch really is, is a pattern (or patterns) of thought and behaviour which destroys human value in certain ways. That is an interesting way of labeling things. We know patterns of thought, and we know humans can learn them through their experiences, or by being told them, or reading them somewhere, or any other way that humans can learn things. If a human learns a pattern of thought which destroys value in some way, and that pattern gets reinforced, or in some other way comes to be the primary pattern of that human’s thought or behaviour (e.g. it becomes the most significant factor in their utility function), is that in any way functionally different from “becoming possessed by demons”?
That’s a weird paradigm. It gives the traditionally fantastical term “demon” a real definition as a real thing that really exists, and it also separates algorithms from the agents that execute them.
A few weird implications: When we’re trying to “align an AGI”, are we looking for an agent which cannot even theoretically become possessed by demons? Because that seems like it might require an agent which cannot be altered, or at least cannot alter itself. But in order to learn in a meaningful way, an intelligent agent has to be able to alter itself. (Yeah, I should prove these statements, but I’m not gonna. See: Epistemic Effort) So then instead of an immutable agent, are we looking for positive patterns of thought, which resist being possessed by demons?
[I doubt that this is in any way useful to anyone, but it was fun for me. It will disappear into the archives soon enough]
You might want to look into the chaos magick notion of “egregores”. Particularly the less woo bits based on meme theory and cybernetics. Essentially: it is reasonable to suspect that there are human mind subagents capable of replicating themselves across people by being communicated, and cooperating with their copies in other hosts to form larger, slower collective minds. To me it seems like such “egregores” include deities, spirits, corporations, nations, and all other agenty social constructs.
It is in fact exactly correct that people can and do, regularly, get possessed by spirits. Think of your favorite annoying identity politics group and how they all look and act roughly the same, and make irrational decisions that benefit the desire of their identity group to spread itself to new humans more than their own personal well-being.
Social media has enabled these entities to spread their influence far faster than ever before, and they are basically unaligned AIs running on human wetware, just itching to get themselves uploaded—a lesser-appreciated possible failure mode of AGI in my opinion.
Now that I’ve had 5 months to let this idea stew, when I read your comment again just now, I think I understand it completely? After getting comfortable using “demons” to refer to patterns of thought or behavior which proliferate in ways not completely unlike some patterns of matter, this comment now makes a lot more sense than it used to.
Lovely! I’m glad to hear it’s making sense to you. I had a leg up in perceiving this—I spent several years of my youth as a paranoid, possibly schizotypal occultist who literally believed in spirits—so it wasn’t hard for me, once I became more rational, to notice that I’d not been entirely wrong. But most people have no basis from which to start when perceiving these things!