Ontological Crisis in Humans
Imagine a robot that was designed to find and collect spare change around its owner’s house. It had a world model where macroscopic everyday objects are ontologically primitive and ruled by high-school-like physics and (for humans and their pets) rudimentary psychology and animal behavior. Its goals were expressed as a utility function over this world model, which was sufficient for its designed purpose. All went well until one day, a prankster decided to “upgrade” the robot’s world model to be based on modern particle physics. This unfortunately caused the robot’s utility function to instantly throw a domain error exception (since its inputs are no longer the expected list of macroscopic objects and associated properties like shape and color), thus crashing the controlling AI.
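A minimal sketch of that failure mode (hypothetical code, purely illustrative; the `MacroObject` and `Particle` types and the `utility` function are my own inventions, not anything from de Blanc's paper):

```python
from dataclasses import dataclass

@dataclass
class MacroObject:
    kind: str    # e.g. "coin", "couch", "cat"
    shape: str
    color: str

@dataclass
class Particle:
    position: tuple
    velocity: tuple

def utility(world_state):
    """Utility defined only over lists of macroscopic objects."""
    if not all(isinstance(obj, MacroObject) for obj in world_state):
        # The "upgraded" physics-based model hands us particles instead;
        # nothing in this function's domain covers them.
        raise ValueError("domain error: expected macroscopic objects")
    return sum(1 for obj in world_state if obj.kind == "coin")

old_world = [MacroObject("coin", "disc", "silver"),
             MacroObject("couch", "boxy", "brown")]
print(utility(old_world))  # 1

new_world = [Particle((0.0, 0.0, 0.0), (0.1, 0.0, 0.0))]
# utility(new_world) raises ValueError: the prankster's "upgrade"
# crashes the controlling AI.
```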
According to Peter de Blanc, who used the phrase “ontological crisis” to describe this kind of problem,
Human beings also confront ontological crises. We should find out what cognitive algorithms humans use to solve the same problems described in this paper. If we wish to build agents that maximize human values, this may be aided by knowing how humans re-interpret their values in new ontologies.
I recently realized that a couple of problems that I’ve been thinking over (the nature of selfishness and the nature of pain/pleasure/suffering/happiness) can be considered instances of ontological crises in humans (although I’m not so sure we necessarily have the cognitive algorithms to solve them). I started thinking in this direction after writing this comment:
This formulation or variant of TDT requires that before a decision problem is handed to it, the world is divided into the agent itself (X), other agents (Y), and “dumb matter” (G). I think this is misguided, since the world doesn’t really divide cleanly into these 3 parts.
What struck me is that even though the world doesn’t divide cleanly into these 3 parts, our models of the world actually do. In the world models that we humans use on a day to day basis, and over which our utility functions seem to be defined (to the extent that we can be said to have utility functions at all), we do take the Self, Other People, and various Dumb Matter to be ontologically primitive entities. Our world models, like the coin collecting robot’s, consist of these macroscopic objects ruled by a hodgepodge of heuristics and prediction algorithms, rather than microscopic particles governed by a coherent set of laws of physics.
For example, the amount of pain someone is experiencing doesn’t seem to exist in the real world as an XML tag attached to some “person entity”, but that’s pretty much how our models of the world work, and perhaps more importantly, that’s what our utility functions expect their inputs to look like (as opposed to, say, a list of particles and their positions and velocities). Similarly, a human can be selfish just by treating the object labeled “SELF” in its world model differently from other objects, whereas an AI with a world model consisting of microscopic particles would need to somehow inherit or learn a detailed description of itself in order to be selfish.
To fully confront the ontological crisis that we face, we would have to upgrade our world model to be based on actual physics, and simultaneously translate our utility functions so that their domain is the set of possible states of the new model. We currently have little idea how to accomplish this, and instead what we do in practice is, as far as I can tell, keep our ontologies intact and utility functions unchanged, but just add some new heuristics that in certain limited circumstances call out to new physics formulas to better update/extrapolate our models. This is actually rather clever, because it lets us make use of updated understandings of physics without ever having to, for instance, decide exactly what patterns of particle movements constitute pain or pleasure, or what patterns constitute oneself. Nevertheless, this approach hardly seems capable of being extended to work in a future where many people may have nontraditional mind architectures, or have a zillion copies of themselves running on all kinds of strange substrates, or be merged into amorphous group minds with no clear boundaries between individuals.
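One way to picture this stopgap (a hypothetical sketch under my own assumptions; the function names are illustrative, and this is not a claim about how cognition is actually implemented): the utility function keeps its old macroscopic domain, and the new physics is consulted only inside a prediction heuristic that takes and returns objects in the old vocabulary.

```python
def predict_newtonian(obj, dt):
    """Heuristic that 'calls out' to new physics (here just
    constant-velocity kinematics) but returns a macroscopic object,
    so the ontology is never replaced."""
    x, y = obj["position"]
    vx, vy = obj["velocity"]
    return {**obj, "position": (x + vx * dt, y + vy * dt)}

def utility(macro_world):
    # Unchanged: still defined over familiar macroscopic objects.
    return sum(1 for o in macro_world if o["kind"] == "coin")

world = [{"kind": "coin", "position": (0.0, 0.0), "velocity": (1.0, 0.0)}]
future = [predict_newtonian(o, dt=2.0) for o in world]
print(utility(future))  # 1: physics improved the prediction,
                        # but the ontology and utility stayed intact
```

Note that nothing here ever has to say what a coin is in terms of particles, which is exactly why the approach is both clever and fragile.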
By the way, I think nihilism often gets shortchanged around here. Given that we do not actually have at hand a solution to ontological crises in general or to the specific crisis that we face, what’s wrong with saying that the solution set may just be null? Given that evolution doesn’t constitute a particularly benevolent and farsighted designer, perhaps we may not be able to do much better than that poor spare-change collecting robot? If Eliezer is worried that actual AIs facing actual ontological crises could do worse than just crash, should we be very sanguine that for humans everything must “add up to moral normality”?
To expand a bit more on this possibility: many people have an aversion to moral arbitrariness, so we need at a minimum a utility translation scheme that’s principled enough to pass that filter. But our existing world models are a hodgepodge put together by evolution, so there may not be any such sufficiently principled scheme, which (if other approaches to solving moral philosophy also don’t pan out) would leave us with legitimate feelings of “existential angst” and nihilism. One could perhaps still argue that any current such feelings are premature, but maybe some people have stronger intuitions than others that these problems are unsolvable?
Do we have any examples of humans successfully navigating an ontological crisis? The LessWrong Wiki mentions loss of faith in God:
In the human context, a clear example of an ontological crisis is a believer’s loss of faith in God. Their motivations and goals, coming from a very specific view of life, suddenly become obsolete and maybe even nonsense in the face of this new configuration. The person will then experience a deep crisis and go through the psychological task of reconstructing their set of preferences according to the new world view.
But I don’t think loss of faith in God actually constitutes an ontological crisis, or if it does, certainly not a very severe one. An ontology consisting of Gods, Self, Other People, and Dumb Matter just isn’t very different from one consisting of Self, Other People, and Dumb Matter (the latter could just be considered a special case of the former with quantity of Gods being 0), especially when you compare either ontology to one made of microscopic particles or even less familiar entities.
But to end on a more positive note, realizing that seemingly unrelated problems are actually instances of a more general problem gives some hope that by “going meta” we can find a solution to all of these problems at once. Maybe we can solve many ethical problems simultaneously by discovering some generic algorithm that can be used by an agent to transition from any ontology to another?
(Note that I’m not saying this is the right way to understand one’s real preferences/morality, but just drawing attention to it as a possible alternative to other more “object level” or “purely philosophical” approaches. See also this previous discussion, which I recalled after writing most of the above.)
This seems true and important.
Is there a technical term for this way of dealing with different-but-related ontologies? I’ve grappled with similar problems, but never gotten as far as making them precise like that.
I’m not aware of any, but you may call it “hybrid ontologies” or “ontological interfacing”.
No. People do go out of their minds on nihilism now and then.
And I’ve already seen two LWers who have discovered such compassion for the suffering of animals that they want to exterminate them all except for a few coddled pets, one of whom isn’t sure that humans should exist at all.
Less dramatically, any number of Buddhists have persuaded themselves that they don’t exist, although I’m not sure how many just believe that they believe that.
“If nothing really exists and it’s all just emptiness and even the emptiness is empty of existence, then how could I have killed all those nonexistent people with my nonexistent hands?”
“Tell it to the nonexistent prison walls, buddy.”
I went through my nihilism crisis a little over two years ago. I was depressed and sad, and didn’t see any point in existing. After about two weeks, I realized what was going on—that I had kicked that last pillar of “the universe has meaning” out from under my model of the world. It was odd, having something that seemed so trivial have so much of an impact on my well being. Prior to that experience, I would not have expected it.
But once I realized that was the problem, once I realized that life had no point, things changed. The problem at that point simply became, “why do I exist, and why do I care?” The answer I came up with is that I exist because the universe happens to be set up this way. And I care (about any/everything) simply because my genetics, atoms, molecules, and processing architecture are set up in a way that happens to care.
This was good enough. In fact, it’s turned out to be better than what I had before. I love life, I want to experience things, I want to contribute, I want to maximize the utility function that is partially of my own making and partially not.
Getting through to true nihilism can be difficult, and I can see many people not having the ability to do so. But in my case, it has served me well, as my model of the world is now more accurate.
I came by anti-idealism through Stirner, who framed it as “Whose cause?” Maybe that’s why I never hit the “it’s all pointless” stage, and progressed directly to my cause—acting according to my values. Free to love what I love, and hate what I hate. And similar to you, I find it better than before. To own your values is more satisfying than feeling they’re dictates from a higher power.
Buddhism merely states that there’s a psychological continuum in which there is nothing unchanging. The “self” that’s precluded is just an unchanging one. (That said, in the Abhidharma there are unchanging elements from which this psychological continuum is constituted.) The Mahayana doctrine of emptiness (which isn’t common to all Buddhism, just the schools that are now found in the Himalayas and East Asia) essentially states that everything is without inherent existence; things only exist as conditioned phenomena in relation to other things, nothing can exist in or of itself because this would preclude change. It’s essentially a restatement of impermanence (everything is subject to change) with the addition of interdependence. So I’d imagine few Buddhists have convinced themselves they don’t exist.
I can’t speak for most Buddhists, but my interpretation (formed after about a week of meditation) is that they are basically saying that belief in the self as a unified whole is a delusion that causes a lot of unhappiness, not that they don’t exist.
If you pay close attention to your mind, you start to notice that it’s just a series of ever-changing recombinations of sights, sounds and feelings, always flickering on and off and moving slightly. It’s the absurdly chaotic nature of this flux of stimuli that people misinterpret as the feeling that they are a single thing.
When they do that, they feel upset about what “they” are, and what “they” could be, because it never lives up to expectations. But really, “they” are not the same thing “they” were even a few seconds ago. A slight change in the balance of electrochemical signals (such as is happening at all times) causes you to view reality from a whole new angle. In this way, the “me” that I remember from earlier today is not the “me” that is typing right now, even though we are linked by causality. In the end, the mind is made up of so many different, changing pieces, that grouping it as one thing can only cause pain.
Incidentally, trying to force yourself into one ego also grants people some of their most compelling reasons to live, so you could make the case that rejecting the concept of self is as much a rejection of life as it is death, but that’s an issue for another time.
I’m not saying that most do, but some certainly do. Here’s the first Google hit for “you do not exist”, and there are a lot more hits to the same sort of thing: people who have had the tiny ontological shock of noticing that their mind is not an atomic, unanalysable entity, and go mildly insane, if only to the point of ranting about their discovery on the internet and talking about red pills and blue pills.
One might as well say that this rock does not exist (and I think that is a straightforward reading of the Perfection of Wisdom sutras). The rock is just a set of atoms that happens to be temporarily assembled together creating the illusion of a single thing. But having read Korzybski before the Buddhist scriptures, I just shrug “oh, yes, consciousness of abstraction”, and dismiss the emptiness doctrine as a deepity. You wouldn’t catch HPMoR!Harry reciting the Heart Sutra. Instead, he uses what is true in that idea to invent Partial Transfiguration.
I probably would have agreed with you 2 weeks ago, but I think there’s a bit more to it.
There’s a pretty drastic difference between a rock and a human mind. Rocks don’t appear to change just by looking at them. Thoughts do. For most purposes, your model of a rock can get away with focusing only on its contrast to its surroundings (the concept of rockiness), as human intellects are predisposed to do.
Whether the rock is a consistent whole or not, as long as it differs from non-rocks in your perception, it’s still a rock. You, however, are inside your consciousness. You can’t really contrast anything with the workings of your mind, because it’s all you ever see. You can only contrast one aspect of your mind with another, which means that the model you have of yourself (as being somehow more “inside your mind” than your models of other things) is severely flawed.
I believe that this particular flaw is chiefly responsible for a good deal, if not all, of human suffering, as well as ambition. I don’t know if that belief can be used towards anything as categorically practical as transfiguration, but it certainly is useful for solving ontological and existential crises and improving happiness and peace of mind.
Also, really, the rock doesn’t exist, just the concept of one. Anyone who says that the image of a rock in your mind doesn’t describe the counterintuitive patterns that correlate to it accurately enough for them to feel comfortable is technically right. And it’s probably best not to look expressly for flaws in ancient wisdom so as to discount the message. It’s outdated, so it’s going to be wrong. What you should do instead is take the arguments presented, try to see the speaker’s point of view, interpret the words in the most plausible way possible, and look for serious challenges to your beliefs, so that you actually get more out of it than just feeling superior to a bunch of dead guys.
Timescale and proximity.
I don’t see the difficulty. I contrast how I feel when I first wake up in the morning (half dead) and how I feel half an hour later (alive). I contrast myself before and after a glass of beer. When I drive a car, I notice if I am making errors of judgement. While I am sure I am not perfect at seeing my own flaws, to the extent that I do, it’s a routine sort of thing, not a revelation.
So the Buddhists say, but I’ve done a fair amount of meditation and never noticed any connection between contemplating my interior life and the presence or absence of suffering. Neither has the experience thrown up so much as a speedbump, never mind a serious challenge to anything. Wow, maybe I’m naturally enlightened already! Except I wouldn’t say my life manifested any evidence of that.
Benja just posted a neat proof of why, if your preferences don’t satisfy the axiom of continuity in von Neumann-Morgenstern utility theory, your rational behavior would be almost everywhere identical to the behavior of someone whose continuity-satisfying preferences simply ignore the “lower priority” aspects of yours. E.g. if you prefer “X torture plus N dust specks” over “Y torture plus M dust specks” whenever X < Y (no matter what N and M are), and also whenever X == Y and N < M, then you might as well ignore the existence of dust specks, because in practical questions there’s always going to be some epsilon of probability between X and Y.
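A toy illustration of such a lexical preference (my own construction, not Benja's proof; the lottery representation is an assumption made for brevity):

```python
def lex_prefers(a, b):
    """Lexical preference over lotteries summarized by expected torture
    and expected dust specks: minimize expected torture first; specks
    break ties only when expected torture is *exactly* equal."""
    if a["torture"] != b["torture"]:
        return a["torture"] < b["torture"]
    return a["specks"] < b["specks"]

# An epsilon less expected torture beats any savings in dust specks:
a = {"torture": 1.0 - 1e-12, "specks": 10**9}
b = {"torture": 1.0, "specks": 0}
print(lex_prefers(a, b))  # True

# Specks only ever matter on the measure-zero set of exact ties:
c = {"torture": 1.0, "specks": 5}
print(lex_prefers(c, b))  # False: b has fewer specks at equal torture
```

Since real decision problems almost never put you on that exact-tie set, an observer could not distinguish this agent from one whose utility function mentions torture alone, which is Benja's point.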
But now what if instead of “torture” and “dust specks” we have a lexical preference ordering on “displeasure of God(s)” and “everything else bad”, and then we remove the former from the picture? Suddenly the parts of probability space that you were previously ignoring (except indirectly insofar as you tried to reflect the preferences of God(s) regarding everything else) are now the only thing you should care about!
From other frameworks the problem looks even worse: If your previous answer to the is-ought problem was to derive every ethical proposition from the single “ought” axiom “We ought to do what God wants regarding X”, and now you’re down to zero “ought” axioms, that makes a huge difference, no?
There is a somewhat trivial way out of this particular mess: replace “We ought to do what God wants” with “We ought to do what God would want, if God existed”. (It’s not any more incoherent than using any other fictional character as a role model.)
If God doesn’t exist, then there is no way to know what He would want, so the replacement has no actual moral rules.
If God doesn’t exist, loads of people are currently fooling themselves into thinking they know what He would want, and CronoDAS claims that’s enough.
I question whether anyone actually has such a lexical preference ordering (religious people seem to frequently violate what they take to be God’s commands), but if someone did wouldn’t they continue to act as if God existed, since a Bayesian can’t assign this zero probability?
Again I question whether anyone actually holds such an ethical framework, except possibly in the sense of espousing it for signaling purposes. I think when someone is espousing such an ethical framework, what they are actually doing is akin to having a utility function where God’s pleasure/displeasure is just a (non-lexical) term along with many others. So when such an ethical framework becomes untenable they can just stop espousing it and fall back to doing what they were doing all along. At the risk of stating the obvious, this doesn’t work with the kind of ontological crises described in the OP.
What ontological crisis is that? The rest of the article is written in general terms, but this phrase suggests you have a specific one in mind, without actually being specific. Is there some ontological crisis that you are facing, that moved this article?
Personally, learning that apples are made of atoms doesn’t give me any difficulty in eating them.
I explained this in the post:
If you’re asking whether it’s something more personal, the answer is no, I’ve always been interested in moral philosophy and this just part of my continuing investigations.
Here’s a hypothesis (warning for armchair evpsych)...
Define “preferences” to refer broadly to a set that includes an individual’s preferences, values, goals, and morals. During an individual’s childhood and early adulthood, their ontology and preferences co-evolve. Evolution seeks to maximize fitness, so the preference acquisition process is biased in such a way that the preferences we pick up maximize our ability to survive and have surviving offspring. For example, if hunting is considered high-status in our tribe and we display a talent for hunting, we’ll probably pick up a preference for being a hunter. Our circle of altruism gets calibrated to cover those who are considered part of our tribe, and so on. This has the ordinary caveat that the preference acquisition process should be expected to be optimized for the EEA, not the modern world.
There is an exploration/exploitation tradeoff here, and the environment in the EEA probably didn’t change that radically, so as time goes by this process slows down and major changes to our preferences become less and less likely. Because our preferences were acquired via a process aiming to maximize our fit to our particular environment, they are intimately tied together with our ontology. As our neurology shifts closer towards the exploitation phase and our preferences become less amenable to change, we become more likely to add new heuristics to our utility functions rather than to properly revise them when our ontology changes.
This is part of the reason for generational conflict, because as children are raised in a different environment and taught a different view of their world, their preferences become grounded in a new kind of ontology that’s different from the one the preferences of their parents came from. It also suggests that the preferences of any humans still alive today might to some extent be simply impossible to reconcile with those of a sufficiently advanced future—though the children (or robots) born and raised within that future will have no such problem. Which, of course, is just as it has always been.
I feel like the part about altruism doesn’t match my observations very well. First, on a theoretical level, it seems like exploration is nearly costless here. It merely consists of retaining some flexibility, and does not inhibit exploitation in any practical sense, so I’m not sure there’s any strong advantage for stopping it (although there might also not have been much of an advantage in retaining it before modern times either). More concretely, it seems like we have empirical evidence to measure this hypothesis by, as many people in the modern world switch “tribes” because of moving long distances, switching jobs, or significantly altering their social standing.
From what I’ve seen, when such switches occur, many of the people who were in the old circle of altruism are promptly forgotten (with the exception of those with whom reputation has been built up particularly highly), and a new circle forms to encompass the relevant people in the new community. There is, admittedly, a different case when a person moves to a different culture. Then, it seems that while the circle of altruism might partially shift, persons from the original culture are still favored strongly by the person (even if she did not know them before).
(The non-altruism parts seem likely enough, though. At the risk of really badly abusing evpsych, we might theorize that people sometimes moved to nearby tribes, which had similar cultures, but almost never to distant tribes, which did not.)
Yes, that sounds plausible.
This seems to be one of the biggest problems for FAI… keeping a utility function constant in a self-modifying agent is hard enough, but keeping it the same over a different domain… well, that’s real hard.
Actually, there might be three outcomes:
1. we can extrapolate so that it all adds up to normality when mapped back to the original ontology (unlikely)
2. we can extrapolate in various ways that are consistent with the original ontology and original human brain design, but not uniquely (which doesn’t seem to be a “fail” scenario… we just might need new values in addition to the old ones)
3. our current utility function turns out to be outright contradictory (I can imagine an AI, after a few turns of self-modification, looking at the instructions that turn out to be “the goal of your existence is making X blue” and “avoid making X blue at all costs” after converting them to a better ontology...)
CEV, for example, seems to assume 1 (or 2?). Do we have any indication that it’s not 3 that is the case?
Why exactly do you call 1 unlikely? The whole metaethics sequence argue in favor in 1 (If I understand what you mean by 1 correctly), so what part of that argument do you think is wrong specifically?
Although I’ve read the metaethics sequence, that was a long time ago, but I think I’ll put reading it again on my todo list then!
My intuition behind thinking 1 unlikely (yes, it’s just an intuition) comes from the fact that we are already bad at generalizing “people-ness”… (see animal rights for example: huge, unsolved debates over morality, combined with the fact that we just care more about human-looking, cute things than non-cute ones… which seems to be pretty arbitrary to me). And things will get worse when we end up with entities that consist of non-integer numbers of non-boolean peopleness (different versions or instances of uploads, for example).
Another feeling: also in math, it might be possible to generalize things, but not necessarily, and not always uniquely (integers to real numbers seems to work rather seamlessly, but if you try to generalize the n-th derivative to non-integer n… from what I’ve heard, there are a few different formulations that kind of work, but none of them is the “real one”.)
Huh. I assumed that they would give the same result, at least for sufficiently well-behaved functions. They don’t?
One shouldn’t confuse there being a huge debate over something with the problem being unsolved, far less unsolvable (look at the debate over free will or worse p-zombies). I have actually solved the problem of the moral value of animals to my satisfaction (my solution could be wrong, of course). As for the problem of dealing with people having multiple copies, this really seems like the problem of reducing “magical reality fluid”, which while hard seems like it should be possible.
Well, yes. But in general, if you’re trying to elucidate some concept in your moral reasoning, you should ask yourself the specific reason why you care about that specific concept until you reach concepts that look like they should have canonical reductions, then you reduce them. If in doing so you end up with multiple possible reductions, that probably means you didn’t go deep enough, and you should ask why you care about that specific concept some more, so that you can pinpoint the reduction you are actually interested in. If after all that you’re still left with multiple possible reductions for a concept that you appear to value terminally, and not for any other reasons, then you should still be able to judge between possible reductions using the other things you care about: elegance, tractability, etc. (though if you end up in this situation it probably means you made an error somewhere...)
I’m not sure what you’re referring to here...
Also, looking at the possibilities you enumerate again, 3 appears incoherent. Contradictions are for logical systems; if you have a component of your utility function which is monotone increasing in the quantity of blue in the universe and another component which is monotone decreasing in the quantity of blue in the universe, they partially or totally cancel one another, but that doesn’t result in a contradiction.
Which is your solution, if I may ask?
sure, good point. Nevertheless, if I’m correct, there still isn’t any Scientifically Accepted Unique Solution for the moral value of animals, even though individuals (like you) might have their own solutions (the question is whether the solution uniquely follows from your other preferences, or is somewhat arbitrary?)
(that was just some random example, it’s fractional calculus which I heard a presentation about recently. Not especially relevant here though :))
I just found a nice example for the topic of the post that doesn’t seem to be reducible to anything else: see the post “The ‘Scary Problem of Qualia’”. There is no obvious answer, we haven’t really encountered the question so far in practice (but we probably will in the future), and other than its impact on our utility functions, it seems to be the typical “tree falls in a forest” question, not really constraining anything in the real world. So the extrapolated utility function seems to be at least category 2.
There isn’t any SAUS for the problem of free will either. Nonetheless, it is a solved problem. Scientists are not in the business of solving that kind of problem, such problems generally being considered philosophical in nature.
It certainly appears to uniquely follow.
That seems easy to answer. Modulo a reduction of computation of course but computation seems like a concept which ought to be canonically reducible.
but it most likely isn’t. “X computes Y” is a model in our head that is useful to predict what e.g. computers do, which breaks down if you zoom in (qualia appear in exactly what stage of a CPU pipeline?) or don’t assume the computer is perfect (how much rounding error is allowed to make the simulation a person and not random noise?)
(nevertheless, sure, the SAUS might not always exist… but above question still doesn’t seem to have any LW Approved Unique Solution (tm) either :))
Are you saying you think qualia are ontologically fundamental, or that they aren’t real, or what?
I’m saying that although it isn’t ontologically fundamental, our utility function might still build on it (it “feels real enough”), so we might have problems if we try to extrapolate said function to full generality.
If something is not ontologically fundamental and doesn’t reduce to anything which is, then that thing isn’t real.
Once upon a time, developmental psychology claimed that human babies learned object permanence as they aged. I don’t know if that’s still the dominant opinion, but it seems at least possible to me, a way that the world could be, if not the way it is. What would that mean, for a baby to go from not having a sense of objects persisting in locations to having it?
First, let’s unpack what an object might be. If there’s a region of silver color in a baby’s visual field, and rather than the region breaking apart in different directions over time, if it stays together, if the blob is roughly invariant to translations, that’s the beginning of an object concept. A lump that stays together. Then the baby notices that the variances are also predictable, that the flowery texture bits are sometimes showing, and the curved bits are sometimes showing, but not usually together, or neither shows when the silver blob looks especially small, relative to its position. From coactivation of features, and maybe higher order statistics, the baby eventually learns a representation of spoon which predicts which features will be visible as a function of some rotation parameters. That’s what a primitive spoon object means in my mind. There are of course other things to incorporate into the spoon model, like force dynamics (its weight, malleability, permeability and sharpness, et cetera), lighting invariances, and haptic textural information.
Permanence would be something like being able to make predictions of the object’s position even when the object is occluded (having a representation of the face behind the hands, which doesn’t compress the present visual scene, but compresses visual scenes across time). Old experiments showed that babies’ object tracking gazes for occluded objects increased with age, which was supposed to be support for a theory of learned object permanence.
Now, if that’s at all how human macroscopic object perception starts out, I think it’s fair to call that “ruled by a hodgepodge of heuristics and prediction algorithms”. However, it seems psychologically implausible that babies undergo a utility function change throughout this process, in the way you seem to mean. If we think of a world model as supplying predictions (note, this is something of an abuse of terminology, since it probably includes both structured, quickly updateable predictions from “model-based” brain regions like hippocampal place cells, as well as slowly revised model-free habit-type predictions) - if we think of a world model as supplying predictions and utility functions as supplying valuations over predicted worlds, then the domain of the utility function is still some kind of predicted state, before and after learning object permanence. Intuitively, worlds without object permanence are a very different hypothesis space, and thus a very different space of appreciable hypothetical realities, than “our” models which “divide cleanly into these 3 parts”, but I think both types fall into a broader category which reward circuitry functions can take as argument. Indeed, if developmental psychology was right about learning object permanence, humans probably spend a few weeks with world models which have graded persistence.
How do you define “successfully”?
For example, all the disagreement over “free will” seems to be because many humans have a sense of morality which presumes some sense of an extraphysical free will. Confronted with evidence that we are physical systems, some people resort to claiming that we aren’t actually physical systems, others modify their conception of free will to be compatible with us being physical systems, and some declare that since we are physical systems, we have no free will and the concept was confused from the start. Those are three different outcomes of running into an ontological crisis—which one counts as being successful? (Of course, not everyone thinks that morality requires an extraphysical free will in the first place.)
Similarly, there is currently controversy over animal rights, which for some is influenced by the ontological question of whether animals suffer in a manner similar to us; historically, similar ontological considerations influenced the question of whether to treat black people or women the same as white men. The theory of evolution probably caused some sort of ontological crises, and so on. In each case, there are various ways of dealing with the issue, but it’s not clear which one of them counts as a “success”. (Aside from the fact that we’d like to call the reactions which align with our own way of thinking successes, of course. Needless to say, this comment does not endorse the ill-treatment of black people, women, or for that matter animals.) Society just generally splits into different opposing camps, they debate each other for a while, and then some of the camps just happen to win out for whatever sociological reason.
When your preferences operate over high-level things in your map, the problem is not that they don’t talk about the real world. Because there is a specific way in which the map gets determined by the world, in a certain sense they already do talk about the real world, you just didn’t originally know how to interpret them in this way. You can compose the process that takes the world and produces your map with the process that takes your map and the high-level things in it and produces a value judgement, obtaining a process that takes the world and produces a value judgment. So it’s not a problem of judgments being defined for only the map, it’s an issue arising from gaining the ability to examine new interpretations of the same implementation of preferences, that are closer to being in terms of the world than the original intuitive perception that only described them in terms of vague high level concepts.
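The composition described above can be sketched in a few lines. This is only an illustrative toy, not anything from the original comment: `perceive` and `evaluate_map` are hypothetical stand-ins for the world-to-map process and the map-to-judgment process, and the feature names and weights are invented for the example.

```python
# Toy sketch: composing (world -> map) with (map -> judgment)
# yields (world -> judgment). All names/weights are illustrative.

def perceive(world_state):
    """World -> map: extract high-level features from a (hypothetical)
    low-level world description."""
    return {"people": world_state.get("agents", 0),
            "stuff": world_state.get("particles", 0)}

def evaluate_map(high_level_map):
    """Map -> judgment: a preference defined only over high-level things."""
    return 10.0 * high_level_map["people"] + 0.5 * high_level_map["stuff"]

def evaluate_world(world_state):
    """World -> judgment, obtained by composing the two processes above."""
    return evaluate_map(perceive(world_state))

print(evaluate_world({"agents": 3, "particles": 5000}))  # prints 2530.0
```

The point of the sketch is that `evaluate_world` is defined even though the preference itself (`evaluate_map`) only ever mentions high-level concepts; the world enters only through the perception step.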
The problem is that when you examine your preference in terms of judging the world, you notice that you don’t like some properties of what it does, you want to modify it, but you are not sure how. There are more possibilities when you are working with a more detailed model. But gaining access to this new information doesn’t in any way invalidate the original judgment procedure, except to the extent that you are now able to make improvements. The ability to notice flaws and to make improvements means in particular that your preference includes judgments that refer to originally unknown details, as is normal for sufficiently complicated mathematical definitions.
What is the process that takes the world and produces my map? Note that whatever this process is, it needs to take as input an arbitrary possible state of the world. I can’t see what possible process you might be referring to...
As you learn more about the world, you are able to describe the way in which your brain-implemented judgment machinery depends on how the world works. This seems to be a reliable presentation of what has happened as a result of most of the discoveries that changed our view on fundamental physics so far. Not being able to formally describe how that happened is a problem that’s separate from deciding whether the preference defined in terms of vague brain-implemented concepts could be interpreted in terms of novel unexpected physics.
The step of re-describing status quo preferences in terms of a new understanding of the world (in this trivial sense I’m discussing) doesn’t seem to involve anything interesting related to the preferences, it’s all about physical brains and physical world. There is no point at this particular stage of updating the ontology where preferences become undefined or meaningless. The sense that they do (when they do) comes from the next stage, where you criticize the judgment procedures given the new better understanding of what they are, but at that point the task of changing the ontology is over.
Uploading also seems like it’s going to spawn a whole lot of ontological crises in humans. Suppose you value equality and want to weigh everyone’s views equally, either via an ordinary democratic process or something more exotic like CEV. So what do you do when somebody can make a huge number of copies of themselves? Does each of them get equal weight? How do you weigh the values of two minds that have merged together? Or even without uploading, if you happen to think that “alters” in dissociative identity disorder are genuine entities, should their desires be counted separately?
A lot of modern-day values seem to be premised on the assumption that human minds are indivisible, uncopyable and unmergeable—if you discard those assumptions, translating our values into something meaningful becomes really hard. To say nothing about what “death” or “personal identity” means when minds can be copied and deleted...
My answer is that this seems like a severely underdetermined problem, not an overdetermined one. If we start from the requirement that the change-gathering robot still gathers change in the ancestral environment, we’re at least starting with lots of indeterminacy, and I don’t think there’s some tipping point as we add desiderata. Unless we take the tragic step of requiring our translated morality to be the only possible translation :D
I bet a number of LW regulars have done so. I commented earlier (and rather too often) on how my ontology shifted from realism to instrumentalism after the standard definitions of the terms “reality” and “existence” proved unsatisfactory for me. Certainly being exposed to the stuff discussed on this forum ought to have had a profound impact on at least some others as well.
Hmm, shifting from realism to instrumentalism may or may not count as an ontological crisis for my purposes. The key question is, did you change the domain of your utility function? If so, from what to what, and how did you obtain your new utility function?
BTW, my standard advice for “wrote too many comments on some topic” is “write a post about it and then you can just link to the post”.
A formal look at ways of thinking about agency
Two words for the author at that link: control systems. That’s what agency is. The book, contrary to what he says, does not want to be read, the food does not want to be eaten; neither of these things is capable of wanting anything. The person who reads the book, who eats the food, does act to bring these things about (including when the latter person is a dog, or a rat, or a bacterium).
Learning to see agency where there is none is not a way of getting a more accurate picture of the world. It is a way of getting a less accurate picture of the world. You might as well imagine hurricanes to be God’s wrath. To get a more accurate picture of the world, one must learn what agency is, learn how to find it where it is present and to find its absence where it is absent.
The one that caught my eye as possibly being useful was thinking of online temptations as having agency—there isn’t a personal connection, but those temptations evolve to be better at grabbing attention.
It is the people whose job it is to tempt you whose strategies evolve. That is the ball one needs to keep one’s eye on.
Actually, dealing with a component of your ontology not being real seems like a far harder problem than the problem of such a component not being fundamental.
According to the Great Reductionist Thesis everything real can be reduced to a mix of physical reference and logical reference. In which case, if every component of your ontology is real, you can obtain a formulation of your utility function in terms of fundamental things.
The case where some components of your ontology can’t be reduced because they’re not real, and where your utility function refers explicitly to such an entity, seems considerably harder; but that is exactly the problem that someone who realizes God doesn’t actually exist is confronted with, and we do manage that kind of ontological crisis.
So are you saying that the GRT is wrong or that none of the things that we value are actually real or that we can’t program a computer to perform reduction (which seems absurd given that we have managed to perform some reductions already) or what? Because I don’t see what you’re trying to get at here.
Have we actually managed to perform any reductions? In order to reduce “apple” for example,
So what are these truth conditions precisely? If there is an apple-shaped object carved out of marble sitting on the table, does that satisfy the truth conditions for “some apples on the table”? What about a virtual reality representation of an apple on a table? Usually we can just say “it doesn’t matter whether we call that thing an apple” and get on with life, but we can’t do that if we’re talking about something that shows up (or seems to show up) as a term in our utility function, like “pain”.
There is an entire sequence dedicated to how to define concepts. The specific problem of categories as they matter for your utility function is studied in this post, where it is argued that those problems should be solved by moral arguments, and the whole metaethics sequence argues that moral arguments are meaningful.
Now if you’re asking me whether we have a complete reduction of some concept relevant to our utility function all the way down to fundamental physics, then the answer is no. That doesn’t mean that partial reductions of some concepts potentially relevant to our utility function have never been accomplished, or that complete reduction is not possible.
So are you arguing that the metaethics sequence is wrong and that moral arguments are meaningless, or are you arguing that the GRT is wrong and that reduction of the concepts which appear in your utility function is impossible despite their being real, or what? I still have no idea what it is that you’re arguing for exactly.
Something like that, except I don’t know enough to claim that moral arguments are meaningless, just that the problem is unsolved. Eliezer seems to think that moral arguments are meaningful, but their meanings are derived only from how humans happen to respond to them (or more specifically to whatever coherence humans may show in their responses to moral arguments). I’ve written before about why I find this unsatisfactory. Perhaps more relevant for the current discussion, I don’t see much reason to think this kind of “meaning” is sufficient to allow us to fully reduce concepts like “pain” and “self”.
No. What he actually says is that when we do moral reasoning we are approximating some computation, in the same way that the pebblesorters are approximating primality. What makes moral arguments valid or invalid is whether the arguments actually establish what they were trying to establish in the context of the actual concept of rightness which is being approximated, in the same way that a pebblesorter’s argument that 6 is an incorrect heap because 6 = 2 * 3 is judged valid, because it establishes that 6 is not a prime number. Of course, to determine what computation we actually are trying to approximate, or to establish that we actually are approximating something, looking at the coherence humans show in their responses to moral arguments is an excellent idea, but it is not how you define morality.
Looking at the first link you provided, I think that looking at where people’s moralities concentrate as you present moral arguments to them is totally the wrong way to look at this problem. Consider for example Goldbach’s Conjecture. If you were to take people and present random arguments about the conjecture to them, it doesn’t necessarily seem to be the case, depending on what distribution you use over possible arguments, that people’s opinions will converge to the correct conclusion concerning the validity of the conjecture. That doesn’t mean that people can’t talk meaningfully about the validity of Goldbach’s Conjecture. Should we be able to derive the computation that we approximate when we talk about morality by examining the dynamics of how people react to moral arguments? The answer is yes, but it isn’t a trivial problem.
As for the second link you provided, you argue that the way we react to moral arguments could be totally random or depend on trivial details. That doesn’t appear to be the case, and it seems to contradict the fact that we do manage to agree about a lot of things concerning morality, and that moral progress does seem to be a thing.
You seem to be taking a position that’s different from Eliezer’s, since AFAIK he has consistently defended this approach that you call “wrong” (for example in the thread following my first linked comment). If you have some idea of how to “derive the computation that we approximate when we talk about morality by examining the dynamics of how people react to moral arguments” that doesn’t involve just “looking at where people’s moralities concentrate as you present moral arguments to them”, then I’d be interested to know what it is.
ETA: Assuming “when we do moral reasoning we are approximating some computation”, what reasons do we have for thinking that the “some computation” will allow us to fully reduce “pain” to a set of truth conditions? What are some properties of this computation that you can cite as being known, that leads you to think this?
Well, Eliezer_2009 does seem to underestimate the difficulty of the extrapolation problem.
Have I solved that problem? No. But humans do seem able to infer, from the way pebblesorters reason, that they are referring to primality, and it also seems that by looking at mathematicians reasoning about Goldbach’s Conjecture we should be able to infer what they refer to when they speak about Goldbach’s Conjecture—and we don’t do that by looking at what position they end up concentrating at when we present them with all possible arguments. So the problem ought to be solvable.
Are you trying to imply that it could be the case that the computation we approximate is only defined over a fixed ontology which doesn’t correspond to the correct ontology, and simply returns a domain error when one tries to apply it to the real world? Well, that doesn’t seem to be the case, because we are able to do moral reasoning at a more fine-grained level than the naive ontology, and a lot of the concepts that seem relevant to moral reasoning, like qualia for example, seem like they ought to have canonical reductions. I detail the way in which I see us doing that kind of elucidation in this comment.
In each of those cases the pebblesorter reasoning / human mathematical reasoning is approximating some idealized reasoning system that is “well-behaved” in certain ways, for example not being sensitive to the order in which it encounters arguments. It’s also the case that there is only one such system “nearby” so that we don’t have to choose from multiple reasoning systems which one we are really approximating. (Although even in math there are some areas, such as set theory, with substantial unresolved debate as to what it is that we’re really talking about.) It’s unclear to me that either of these holds for human moral reasoning.
Can you give a detailed example of this?
Given that we are able to come to agreement about certain moral matters, and given the existence of moral progress, I do think the evidence favors the existence of a well-behaved idealized reasoning system that we are approximating when we do moral reasoning.
This for a start.
What “certain moral matters” do you have in mind? As for existence of moral progress, see Konkvistador’s draft post Against moral progress.
I’ve always found that post problematic, and finally wrote down why. Any other examples?
An ontological crisis would indicate a serious problem. I don’t see one.
There may be weird boundary cases where our base ontologies have a problem, but I don’t find applying morality a daily struggle.
Do you have concrete examples of serious problems with applying morality in the real world today?
When is a fetus or baby capable of feeling pain (that has moral disvalue)? What about (non-human) animals?
At some point between being a blob of cells and being a fully formed baby popping out of Mommy, with the disvalue on a sliding scale. Where is the crisis?
For non-human animals, I disapprove of torturing animals for kicks, but am fine with using animals for industrial purposes, including food and medical testing. No crisis here either.
In life, I don’t kill people, and I also don’t alleviate a great deal of death and suffering that I might. I eat meat, I wear leather, and support abortion rights. No crisis. And I don’t see a lot of other people in crisis over such things either.
I explained in the post why “ontological crisis” is a problem that people mostly don’t have to deal with right away, but will have to eventually, in the paragraph that starts with “To fully confront the ontological crisis that we face”. Do you have any substantive disagreements with my post, or just object to the term “crisis” as being inappropriate for something that isn’t a present emergency for most people? If it’s the latter, I chose it for historical reasons, namely because Peter de Blanc already used it to name a similar class of problems in AIs.
In the paragraph you refer to:
Maybe we have no substantive disagreement. If your point is that a million copy super intelligence will have issues in morality because of their ontologies that we don’t currently have, then I agree. Me, I think it’s kind of cheeky to be prescribing solutions for the million copy super intelligence—I think he’s smarter than I am, doesn’t need my help much, and may not ever exist anyway. I’m not here to rain on that parade, but I’m not interested in joining it either.
However, you seemed to be using present tense for the crisis, and I just don’t see one now. Real people now don’t have big complicated ontological problems lacking clear solutions. That was my point.
The abortion example was appropriate, as that is one issue where currently many people have a problem, but their problem is usually just essentialism, and there is a cure for it—just knock it off.
I find discussions of metaethics interesting, particularly in terms of the conceptual confusion involved. It seemed that you were getting at such issues, but I couldn’t locate a concrete and currently relevant issue of that type from your post. So I directly asked for the concretes applicable now. You gave a couple. I don’t find either particularly problematic.
I don’t see how this addresses the problem.
Are you suggesting that AI will avoid cognitive dissonance by using compartmentalization like humans do?
It’s easy to decide that the moral significance of a fetus changes gradually from conception to birth; it takes a bit more thought to quantify the significance. Abstractly, at what stage of development is the suffering of 100 fetuses commensurate with the suffering of a newborn? 1 month of gestation? 4? 9? More concretely, if you’re pregnant, you’ll have to decide not only whether the phenomenological point of view of your unborn child should be taken into account in your decisionmaking, but you’ll have to decide in what way and to what degree it should be taken into account.
It’s not clear whether your disapproval of animal torture is for consequentialist or virtue-ethics reasons, or whether it is a moral judgment at all; but in either case there are plenty of everyday cases of borderline animal exploitation. (Dogfighting? Inhumane flensing?) And maybe you have a very specific policy about which practices you support and which you don’t. But why that policy?
Perhaps “crisis” is too dramatic in its connotations, but you should at least give some thought to the many moral decisions you make every day, and decide whether, on reflection, you endorse the choices you’re making.
Concretely, how many people that you know have faced a situation where that calculation is relevant?
I don’t know how much you’ll have to decide on how you decide. You’ll decide what to do based on your valuations—I don’t think the valuations themselves involve a lot of deciding. I don’t decide that ice cream is yummy; I taste it and it is.
And yes, I think it’s a good policy to review your decisions and actions to see if you’re endorsing the choices you’re making. But that’s not primarily an issue of suspect ontologies, but of just paying attention to your choices.
I don’t know what you mean by this, but maybe there’s no point in further discussion because we seem to agree that one should reflect on one’s moral decisions.
When it is similar enough to me that I can feel empathy for its mental state. If the torture vs dust, babyeaters, and paperclip maximizers have taught me anything it’s that I value the utility of other agents according to their similarity to me. I value the utility of agents that are indistinguishable from me most highly and very dissimilar agents (or inanimate matter) the least.
When I think about quantifying that similarity I think of how many experiences and thoughts I can meaningfully share with the agent. The utility of an agent that can think and feel and act the way that I can carries much more weight in my utility function. An agent that actually does think, experience, and act like I do has even more weight. If I compare the set of my actions, thoughts, and experiences, ME, to the set of the agent’s actions, thoughts, and experiences, AGENT, I think U(me) + |AGENT intersect ME| / |ME| * U(agent) is a reasonable starting point. Comparing actions would probably be done by comparing the outcome of my decision theory to that of the agent. It might even be possible to directly compare the function U_me(world) to U_agent(world)[agent_self=me_self] (the utility function of the agent with the agent’s representation of itself replaced with a representation of me) and the more similar the resulting functions the more empathy I would have for the agent. I would also want to include a factor of my estimate of the agent’s future utility. For example, a fetus will likely have a utility function much closer to mine in 20 years, so I would have more empathy for the future state of a fetus than the future state of a babyeater.
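The starting-point formula above, U(me) + |AGENT ∩ ME| / |ME| * U(agent), can be sketched directly with sets. This is only an illustration of the arithmetic the comment proposes; the trait tokens, the example agents, and the unit utilities are all invented for the example, not part of any worked-out theory of empathy.

```python
# Toy sketch of similarity-weighted valuation:
#   total = U(me) + (|AGENT ∩ ME| / |ME|) * U(agent)
# where ME and AGENT are sets of experience/thought/action tokens.

def weighted_utility(me_traits, agent_traits, u_me, u_agent):
    """Weight the agent's utility by its trait overlap with me."""
    overlap = len(agent_traits & me_traits) / len(me_traits)
    return u_me + overlap * u_agent

me = {"feels_pain", "plans_ahead", "uses_language", "enjoys_music"}
dog = {"feels_pain", "plays_fetch"}
rock = set()

print(weighted_utility(me, dog, 1.0, 1.0))   # 1.25: dog shares 1 of my 4 traits
print(weighted_utility(me, rock, 1.0, 1.0))  # 1.0: no overlap, no extra weight
```

One consequence of this form, visible even in the toy version, is that a perfect copy of me (AGENT = ME) gets full weight, while dissimilar agents smoothly approach zero weight—matching the comment's intuition about the torture-vs-dust and paperclip-maximizer cases.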
I really like the robot metaphor, and I fully agree with the kind of nihilism you describe here. Let me note, though, that nihilism is a technically precise but potentially misleading name for this world view. I am an old-fashioned secular humanist when it comes to 2012 humans. I am a moral nihilist only when I have to consider the plethora of paradoxes that come with the crazy singularity stuff we like to discuss here (most significantly, substrate-independence). Carbon-based 2012 humans already face some uncomfortable edge cases (e.g. euthanasia, abortion, animal rights), but with some introspection and bargaining we can and should agree on some ground rules. I am a big fan of such ground rules, that’s why I call myself an old-fashioned humanist. On the other hand, I think our morality simply does not survive the collision with your “ontological crisis”. After the ontological crisis forces itself on us, it is a brand new world, and it becomes meaningless to ask what we ought to do in this new world. I am aware that this is an aesthetically deeply unsatisfying philosophical position, so I wouldn’t accept it if I had some more promising alternatives available.
According to Wikipedia, if there’s some way to keep morality/values while adopting mereological nihilism, Peter Unger, Cian Dorr, or Ross Cameron may have thought of it.