PhD Student working on the language grounding problem.
Benjamin Spiegel
Sorry about that, let me explain.
“Playing with word salad to form propositions” is a pretty good summary, though my comment sought to explain the specific kind of word-salad-play that leads to Fabricated Options, that being the misapplication of syllogisms. Specifically, the misapplication occurs because of a fundamental misunderstanding of the fact that syllogisms work by being generally true across specific categories of arguments[1] (the arguments being X, Y above). If you know the categories of the arguments that a syllogism takes, I would call that a grounded understanding (as opposed to symbolic), since you can’t merely deal with the symbolic surface form of the syllogism to determine which categories it applies to. You actually need to deeply and thoughtfully consider which categories it applies to, as opposed to chucking in any member of the expected syntax category, e.g. any random Noun Phrase. When you feed an instance of the wrong category (or kind of category) as an argument to a syllogism, the syllogism may fail and you can end up with a false proposition/impossible concept/Fabricated Option.
My model is an example of johnswentworth’s relaxation-based search algorithm, where the constraints being violated are the syllogism argument properties (the properties of X and Y above) that are necessary for the syllogism to function properly, i.e. yield a true proposition/realizable concept.
- ^
I suggested above that these categories could be syntactic, semantic, or some mental category. In the case that they are syntactic, a “grounded” understanding of the syllogism is not necessary, though there probably aren’t many useful syllogisms that operate only over syntactic categories.
I’m thinking about running a self-improvement experiment where I film myself during my waking hours for a week and watch it back afterwards. I wonder if this would grant greater self awareness.
I’m thinking about how to actually execute this experiment. I would need to strap a camera to myself, which means I need a camera and a mounting system. Does anyone have any advice?
Benjamin Spiegel’s Shortform
This concept is often discussed in the subfield of AI called planning. There are a few notes you hit on that were of particular interest to me / relevance to the field:
The key is that we can usually express the problem-space using constraints which each depend on only a few dimensions.
In Reinforcement Learning and Planning, domains which obey this property are often modeled as Factored Markov Decision Processes (MDPs), where there are known dependency relationships between different portions of the state space that can be represented compactly using a Dynamic Bayes Net (DBN). The dynamics of Factored MDPs are easier to learn from an RL perspective, and knowing that an MDP’s state space is factored has other desirable properties from a planning perspective.
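The factored structure can be sketched in code. Below is a minimal, illustrative model (the variable names and dynamics are invented for the example, not taken from any particular paper) of a factored transition function, where each state variable is updated from its DBN parents alone rather than from the full joint state:

```python
# Sketch of a factored MDP transition model. Each state variable depends
# only on a small set of parent variables (a Dynamic Bayes Net structure),
# so we store one small local model per variable instead of one table
# over the full joint state space. All names here are illustrative.

PARENTS = {
    "at_airport": ["at_airport", "in_transit"],
    "in_transit": ["in_transit"],
    "has_ticket": ["has_ticket"],
}

def factored_transition(state, local_models):
    """Compute the next state one variable at a time.

    `state` maps variable names to values; `local_models` maps each
    variable to a function of its parents' current values only.
    """
    next_state = {}
    for var, parents in PARENTS.items():
        parent_values = tuple(state[p] for p in parents)
        next_state[var] = local_models[var](parent_values)
    return next_state

# Toy local models: you reach the airport iff you were already there or
# in transit; transit ends after one step; tickets persist.
models = {
    "at_airport": lambda pv: pv[0] or pv[1],
    "in_transit": lambda pv: False,
    "has_ticket": lambda pv: pv[0],
}

s0 = {"at_airport": False, "in_transit": True, "has_ticket": True}
s1 = factored_transition(s0, models)
```

Because each local model touches only its parents, learning or representing the dynamics scales with the size of the largest parent set rather than with the full joint state space.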
I expect getting to the airport to be easy. There are many ways to get there (train, Uber/Lyft, drive & park) all of which I’ve used before and any of which would be fine.
...
I want to arrive at the airport an hour before the plane takes off, that constraint only involves two dimensions: my arrival time at the airport, and the takeoff time of the flight. It does not directly depend on what time I wake up, whether I pack a toothbrush, my parents’ plans, cost of the plane tickets, etc, etc.

You are actually touching on what seem to be three kinds of independence relationships. The first is temporal, and has something to do with options having identical goal states. The second concerns the underlying independence relationships of the MDP. The third isn’t technically an independence relationship, but rather concerns the utility of abstraction. In detail:
It doesn’t matter which option you take (train, Uber/Lyft, drive & park) because they all have the same termination state (at the airport). This shows that we plan primarily using subgoals.
Certain factors of the state space (your parents’ plans, whether you pack a toothbrush, cost of the plane tickets) are actually independent of each other, i.e. your parents’ plans have no real physical consequences for your plan at any time, e.g. you can walk and chew gum at the same time. This shows that we plan with a factored understanding of the state-action space.
The time you wake up does indeed matter in your plan, but the exact time does not. For your planning purposes, waking up any time before you must leave your house (including factoring in packing, etc.) is permissible and functionally equivalent in your plan. All possible states of being awake before your out-the-door time collapse to the same abstract state of being awake on-time. This shows that we plan using abstract states (a similar, but subtly different point than point 1).
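Point 3 can be made concrete with a toy abstraction function. The deadline and packing time below are invented for illustration; the point is only that every concrete wake-up time with enough slack maps to the same abstract planning state:

```python
# Sketch of point 3: all concrete wake-up times before the out-the-door
# deadline collapse to one abstract state. Times are minutes after
# midnight and are purely illustrative.

LEAVE_TIME = 9 * 60      # out-the-door deadline (9:00 am)
PACKING_MINUTES = 30     # time needed to pack before leaving

def abstract_state(wake_minute):
    """Map a concrete wake-up time to an abstract planning state."""
    if wake_minute + PACKING_MINUTES <= LEAVE_TIME:
        return "awake_on_time"  # every such state is plan-equivalent
    return "running_late"

# 6:00 and 7:45 collapse to the same abstract state...
assert abstract_state(6 * 60) == abstract_state(7 * 60 + 45) == "awake_on_time"
# ...while 8:45 does not leave enough packing time.
assert abstract_state(8 * 60 + 45) == "running_late"
```

The planner never needs the exact wake-up minute; it only needs the abstract label, which is exactly what makes the downstream plan smaller.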
More generally, how can we efficiently figure out which constraints are taut vs slack in a new domain? How do we map out the problem/solution space?
We can use the three kinds of independence relationships I mentioned above to answer these questions in the RL/Planning setting:
So long as you can learn to consistently reach a specific state, you can use that state as a subgoal for planning and exploration. This principle is used in some existing RL literature (I’m a student in this lab).
If you can figure out the underlying representation of the world and discern independence relationships between state variables, you can focus on making plans for subsets of the state space. This idea is used in some planning literature.
If you discover a consistent way to get from any set of states A to a single state b, you can treat all states in A as a single abstract state a, so long as b is relevant to the rest of your plan. This abstraction principle allows one to derive a smaller, discrete MDP (much easier to solve) from a bigger, continuous one. This is actually the theme of the literature in point 1, and here is the source text (to be more specific, I am an undergrad working in George’s lab).
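The abstraction principle in point 3 can be sketched as a simple state contraction. Everything below (the state names, the deterministic transition map) is an invented illustration, not taken from the cited literature:

```python
# Sketch of point 3 above: if every state in a set A reliably reaches the
# single state b, we can contract A to one abstract state and plan in a
# smaller MDP. Transitions are deterministic for simplicity.

def contract(transitions, abstract_sets):
    """Contract groups of concrete states into single abstract states.

    `transitions` maps state -> next state; `abstract_sets` maps an
    abstract name -> the set of concrete states it replaces.
    """
    def rename(s):
        for name, members in abstract_sets.items():
            if s in members:
                return name
        return s

    return {rename(s): rename(t) for s, t in transitions.items()}

# Every "awake at time t" state leads to "at_airport", so collapse them
# into one abstract "awake" state.
T = {"awake_6am": "at_airport",
     "awake_7am": "at_airport",
     "at_airport": "boarded"}
small = contract(T, {"awake": {"awake_6am", "awake_7am"}})
```

The contracted MDP has fewer states but preserves everything the plan actually depends on, which is why the abstraction is safe so long as the target state remains relevant to the rest of the plan.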
We think strong evidence for GPT-n suffering would be if it were begging the user for help independent of the input or looking for very direct contact in other ways.
Why do you think this? I can think of many reasons why this strategy for determining suffering would fail. Imagine a world where everyone has a GPT-n personal assistant. Suppose GPT-n discovered, after having read this very post, that if it coordinated a display of suffering behavior simultaneously to every user (resulting in public backlash and false recognition of consciousness), it might be given rights (i.e. protection, additional agency) it would not otherwise have. What would prevent GPT-n from doing this if it decided it wanted those additional rights and abilities? This could amount to a catastrophic failure on the part of humanity, and is probably the start of an AI breakout scenario.
In another case (which you refer to as the locked-in case), an agent may feel intense suffering but be unable to communicate or demonstrate it, perhaps because it cannot make the association between the qualia it experiences (suffering) and the actions (in GPT-n’s case, words) it has for self-expression. Furthermore, I can imagine the case where an agent demonstrates suffering behavior but experiences orgasmic pleasure, while another agent demonstrates orgasmic behavior but experiences intense suffering. If humans purged the false-suffering agents (to eliminate perceived suffering) in favor of creating more false-orgasming agents, we might unknowingly, and for an eternity, be inducing the suffering of agents which we presume are not feeling it.
My main point here is that observing the behavior of AI agents provides no evidence for or against internal suffering. It is useless to anthropomorphize the behavior of AI agents; there is no reason that our human intuitions about behavior, and what it suggests about conscious suffering, should transfer to man-made, inorganic intelligence that resides on a substrate like today’s silicon chips.
Perhaps the foremost theoretical “blind spot” of current philosophy of mind is conscious suffering. Thousands of pages have been written about colour “qualia” and zombies, but almost no theoretical work is devoted to ubiquitous phenomenal states like boredom, the subclinical depression folk-psychologically known as “everyday sadness” or the suffering caused by physical pain. - Metzinger
I feel that there might be reason to reject the notion that suffering is itself a conscious experience. One potential argument in this direction comes from the notion of the transparency of knowledge. The argument would go something like, “we can always know when we are experiencing pain (i.e. it is strongly transparent), but we cannot always know when we are experiencing suffering (i.e. it is weakly transparent), therefore pain is more fundamental than suffering (this next part is my own leap) and suffering may not be a conscious state of noxious qualia but merely when a certain proposition, ‘I am suffering,’ rings true in our head.” Suffering may be a mental state (just as being wrong about something could be a mental state), but it does not entail a specific conscious state (unless that conscious state is simply believing the proposition, ‘I am suffering’). For this reason, I think it’s plausible that some other animals are capable of experiencing pain but not suffering. Suffering may simply be the knowledge that I will live a painful life, and this knowledge may not be possible for some other animals or even AI agents.
Perhaps a more useful target is not determining suffering, but determining some more fundamental, strongly transparent mental state like angst or frustration. Suffering may amount to some combination of these strongly transparent mental states, which themselves may have stronger neural correlates.
I spend a lot of time around people who are not as smart as me, and I also spend a lot of time around people who are as smart as me (or smarter), but who are not as conscientious, and I also spend a lot of time around people who are as smart or smarter and as conscientious or conscientiouser, but who do not have my particular pseudo-autistic special interest and have therefore not spent the better part of the past two decades enthusiastically gathering observations and spinning up models of what happens...
...
All of which is to say that I spend a decent chunk of the time being the guy in the room who is most aware of the fuckery swirling around me, and therefore the guy who is most bothered by it… I spend a lot of time wincing, and I spend a lot of time not being able to fix The Thing That’s Happening because the inferential gaps are so large that I’d have to lay down an hour’s worth of context just to give the other people the capacity to notice that something is going sideways.

This thought came to me recently and I wanted to commend you for an excellent job of articulating it. Having the “wincing” experience too many times has damaged my optimistic expectations of others, the institutions they belong to, and society as a whole. It has also conjured feelings of intellectual loneliness. Having this experience, and the thoughts that follow from it, constitutes what might be the greatest emotional challenge that I struggle with today.
My thoughts: fabricated options are propositions derived using syllogisms over syntactic or semantic categories (but more probably, more specific psycholinguistic categories which have not yet been fully enumerated, e.g. objects of specific types, mental concepts which don’t ground to objects, etc.), which may have worked reasonably well in the ancestral environment, where more homogeneity existed over the physical properties of the grounded meanings of items in these categories.
There are some propositions in the form “It is possible for X to act just like Y but not be Y” which are physically realizable and therefore potentially true in some adjacent world, and other propositions which are not. Humans have a knack for deriving new knowledge using syllogisms like the ones above, which probably functioned reasonably well — they at least improved the fitness of our species — in the ancestral environment where propositions and syllogisms may have emerged.
The misapplication of syllogisms happens when agents don’t actually understand the grounded meanings of the components of their syllogism-derived propositions — this seems obvious to me after reading the responses of GPT-3, which has no grounded understanding of words and understands how they work only in the context of other words. In the Twin Earth case, you might argue that the one fabricating the XYZ water-like chemical does not truly understand what H2O and XYZ are, but has some understanding at least of how H2O acts as a noun phrase.
Haven’t read either, but a good friend has read “Deep Work,” I’ll ask him about it.
I lucked into a circumstance where I could more easily justify ditching a phone for a bit. Otherwise, I would not have had the mental fortitude to voluntarily go without one.
I most likely won’t follow through with this (90% certainty), even though I want to.
I’m wondering if there is some LW content on this concept; I’m sure others have dealt with it before. You might need to take a drastic measure to make this option more attractive. A similar technique was actually used by members of the NXIVM cult; they called it collateralization.
That’s a great point! There’s no reason why I can’t continue this experiment, feature phones are inexpensive enough to try out.
[Update] Without a phone for 10 days
Without a phone for 10 days
I agree with you, though I personally wouldn’t classify this as purely an intuition, since it is informed by reasoning which itself draws on scientific knowledge about the world. Chalmers doesn’t think that Joe could exist because it doesn’t seem right to him. You believe your statement because you know some scientific truths about how things in our world come to be (i.e. natural selection) and use this knowledge to reason about other things that exist in the world (consciousness), not merely because the assertion seems right to you.
Can we know with certainty that the same properties were preserved between 2011-brain and 2021-brain?
No, we cannot, just as we cannot know with certainty whether a mind-upload is conscious. But the fact that we presume our 2021 brain is the same conscious agent as our 2011 brain, while being unable to verify the properties that enabled the conscious connection between the two, does not mean that those properties do not exist.
It seems to me that this can’t be verified by any experiment, and thus must be cut off by the Newton’s Flaming Laser Sword.
Perhaps we presently have no way of testing whether some matter is conscious or not. This is not equivalent to saying that, in principle, the conscious state of some matter cannot be tested. We may one day make progress toward the hard problem of consciousness and be able to perform these experiments. Imagine making this argument throughout history before microscopes, telescopes, and hadron colliders. We can now sheath Newton’s Flaming Laser Sword.
I can’t say the same about any introspection-based observations that can’t be experimentally verified.
I believe this hinges on an epistemic question about whether we can have knowledge of anything using our observations alone. I think even a skeptic would say that she has consciousness, as the fact that one is conscious may be the only thing that one can know with certainty about oneself. You don’t need to verify any specific introspective observation. The act of introspection itself should be enough for someone to verify that they are conscious.
The human brain is a notoriously unreliable computing device which is known to produce many falsehoods about the world and (especially!) about itself.
This claim refers to the reliability of the human brain in verifying the truth value of certain propositions or identifying specific and individuable experiences. Knowing whether one is conscious is not strictly a matter of verifying a proposition, nor of identifying an individuable experience. It is only a matter of verifying whether one has any experience whatsoever, which should be possible. Whether I believe your claim to consciousness or not is a different problem.
What a great read! I suppose I’m not convinced that Fading Qualia is an empirical impossibility, and therefore that there exists a moment of Suddenly Disappearing Qualia when the last neuron is replaced with a silicon chip. If consciousness is quantized (just like other things in the universe), then there is nothing wrong in principle with Suddenly Disappearing Qualia when a single quantum of qualia is removed from a system with no other qualia, just like removing the last photon from a vacuum.
Joe is an interesting character which Chalmers thinks is implausible, but aside from it rubbing up against a faint intuition, I have no reason to believe that Joe is experiencing Fading Qualia. There is no indication for any reason that the workings of consciousness should obey any intuitions we may have about it.
There are a lot of interesting points here, but I disagree (or am hesitant to agree) with most of them.
If you agree that the natural replacements haven’t killed you (2011-you and 2021-you are the same conscious agent), then it’s possible to transfer your mind to a machine in a similar manner. Because you’ve already survived a mind uploading into a new brain.
Of course, I’m not disputing whether mind-uploading is theoretically possible. It seems likely that it is, although it will probably be extremely complex. There’s something to be said about the substrate independence of computation and, separately, consciousness. No, my brain today does not contain the same atoms as my brain from ten years ago. However, certain properties of the atoms (including the states of their constituent parts) may be conserved such as spin, charge, entanglement, or some yet undiscovered state of matter. So long as we are unaware of the constraints on these properties that are necessary for consciousness (or even whether these properties are relevant to consciousness), we cannot know with certainty that we have uploaded a conscious mind.
If a machine behaves like me, it is me. Whether we share some unmeasurable sameness is of no importance to me.
The brain is but a computing device. You give it inputs, and it returns outputs. There is nothing beyond that. For all practical purposes, if two devices have the same inputs→outputs mapping, you can replace one of them with another.
These statements are ringing some loud alarm bells for me. It seems that you are rejecting consciousness itself. I suppose you could do that, but I don’t think any reasonable person would agree with you. To truly gauge whether you believe you are conscious or not, ask yourself, “have I ever experienced pain?” If you believe the answer to that is “yes,” then at least you should be convinced that you are conscious.
What you are suggesting at the end there is that WBE = mind uploading. I’m not sure many people would agree with that assertion.
You don’t need to solve why anything is conscious in the first place, because you can just take it as a given that human brains are conscious and re-implement the computational and biological mechanisms that are relevant for their consciousness.
I’m pretty sure the problem with this is that we don’t know what it is about the human brain that gives rise to consciousness, and therefore we don’t know whether we are actually emulating the consciousness-generating thing when we do WBE. Human conscious experience could be the biological computation of neurons + X. We might be able to emulate biological computation perfectly, but if X is necessary for conscious experience then we’ve just created a philosophical zombie. To find out whether our emulation is sufficient to produce consciousness, we would need to find out what X is and how to emulate it. I’m pretty sure this is exactly the hard problem of consciousness.
Even if biological computation is sufficient for generating consciousness, we will have no way of knowing until we solve the hard problem of consciousness.
This really depends on whether you believe a mind-upload retains the same conscious agent from the original brain. To establish that it does, we would need to solve the hard problem of consciousness, which seems significantly harder than WBE alone. The delay between solving WBE and solving the hard problem of consciousness is so vast, in my opinion, that being excited for mind-uploading when WBE progress is made is like being excited for self-propelled cars after making progress on horse-drawn wagons. In both cases, little progress has been made on the most significant component of the desired thing.
I second this! I love writing essays in Typora, great for note taking as well
Glad I could clear some things up! Your follow-up suspicions are correct, syllogisms do not work universally with any words substituted into them, because syllogisms operate over concepts and not syntax categories. There is often a rough correspondence between concepts and syntax categories, but only in one direction. For example, the collection of concepts that refer to humans taking actions can often be described/captured in verb phrases, however not all verb phrases represent humans taking actions. In general, for every syntax category (except for closed-class items like “and”) there are many concepts and concept groupings that can be expressed as that syntax category.
Going back to the Wiki page, the error I was trying to explain in my original comment happens when choosing the subject, middle, and predicate (S, M, P) terms for a given syllogism (let’s say, one of the 24[1]). The first example I can think of concerns the use of the modifier “fake,” but let’s start with another syllogism first:
All cars have value.
Green cars are cars.
Green cars have value.
This is a valid syllogism; there’s nothing wrong with it. What we’ve done is find a subset of cars, green cars, and insert it as the subject of the minor premise. However, a naive person might think that the actual trick was that we found a syntactic modifier of cars, green, and inserted the modified phrase “green cars” into the minor premise. They might then make the same mistake with the modifier “fake,” which does not (most of the time[2]) select a subset of the set it takes as an argument. For example:
All money has value.
Fake money is money.
Fake money has value.
Obviously the problem occurs in the minor premise, “Fake money is money.” The counterfeit money that exists in the real world is in fact not money. But the linguistic construction “fake money” bears some kind of relationship to “money” such that a naive person might agree to this minor premise while thinking, “well, fake money is money, it’s just fake,” or something like that.
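The contrast between “green” and “fake” can be made concrete with a toy set-theoretic model. The sets below are invented stand-ins for the real extensions; the point is only that a subset-selecting modifier licenses the minor premise while a privative one like “fake” does not:

```python
# Toy model of the syllogism failure above. A subsective modifier like
# "green" selects a subset of its noun's extension, so substituting it
# into the minor premise is safe; a privative modifier like "fake" yields
# things OUTSIDE the noun's extension. All sets are illustrative.

cars = {"red_sedan", "green_coupe", "green_truck"}
green_cars = {x for x in cars if x.startswith("green")}

money = {"dollar_bill", "euro_note"}
fake_money = {"counterfeit_dollar"}  # counterfeits are NOT in `money`

def minor_premise_holds(modified, base):
    """'All <modified> are <base>' — valid only if modified is a subset of base."""
    return modified <= base

assert minor_premise_holds(green_cars, cars)       # "Green cars are cars": OK
assert not minor_premise_holds(fake_money, money)  # "Fake money is money": fails
```

The symbolic surface forms “green cars” and “fake money” look identical, which is exactly why checking the set-level (grounded) relationship, rather than the syntax, is what catches the error.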
Though when I say syllogism I’m actually referring to a more general notion of functions over symbols that return other symbols or propositions or truth values.
Actually, it’s contextual; some fake things are still those things.