People often try to solve the problem of counterfactuals by suggesting that there will always be some uncertainty. An AI may know its source code perfectly, but it can’t perfectly know the hardware it is running on.
How could Emmy, an embedded agent, know its source code perfectly, or even be certain that it is a computing device under the Church-Turing definition? Such certainty would seem dogmatic. Without such certainty, the choice of 10 rather than 5 cannot be firmly classified as an error. (The classification as an error seemed to play an important role in your discussion.) So Emmy has a motivation to keep looking and find that U(10)=10.
Thanks for making point 2. Moral oughts need not motivate sociopaths, who sometimes admit (when there is no cost of doing so) that they’ve done wrong and just don’t give a damn. The “is-ought” gap is better relabeled the “thought-motivation” gap. “Ought”s are thoughts; motives are something else.
Technicalities: Under Possible Precisifications, 1 and 5 are not obviously different. I can interpret them differently, but I think you should clarify them. 2 is to 3 as 4 is to 1, so I suggest listing them in that order, and maybe adding an option that is to 3 as 5 is to 1.
Substance: I think you’re passing over a bigger target for criticism, the notion of “outcomes”. In general, agents can and do have preferences over decision processes themselves, as contrasted with the standard “outcomes” of most literature like winning or losing money or objects. For example, I can be “money pumped” in the following manner. Sell me a used luxury sedan on Monday for $10k. Trade me a Harley-Davidson on Tuesday for the sedan plus $5. Trade me a sports car on Wednesday for the Harley plus $5. Buy the sports car from me on Thursday for $9995. Oh no, I lost $15 on the total deal! Except: I got to drive, or even just admire, these different vehicles in the meantime.
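For anyone who wants to check the arithmetic of that money pump, here is a minimal sketch. The `trades` list and its labels are just my own bookkeeping for the four deals described above; only the dollar amounts come from the example.

```python
# Each entry: (day, description, change in my cash, in dollars).
# Labels are illustrative; amounts are the ones from the example.
trades = [
    ("Monday", "buy used luxury sedan", -10_000),
    ("Tuesday", "trade sedan plus $5 for the Harley", -5),
    ("Wednesday", "trade Harley plus $5 for the sports car", -5),
    ("Thursday", "sell sports car", 9_995),
]

# Net change in cash over the whole cycle; I end up owning no vehicle.
net_cash = sum(change for _, _, change in trades)
print(f"Net cash change: {net_cash} dollars")
```

The tally only counts cash. Anything I valued about having the vehicles in the meantime never shows up in it, which is the point.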
If all processes and activities are fair game for rational preferences, then agents can have preferences over the riskiness of decisions, the complexity of the decision algorithm, and a host of other features that make it much more individually variable which approach is “best”.
If there were no Real Moral System That You Actually Use, wouldn’t you have a “meh, OK” reaction to either Pronatal Total Utilitarianism or Antinatalist Utilitarianism—perhaps whichever you happened to think of first? How would this error signal—disgust with those conclusions—be generated?
Shouldn’t a particular method of inductive reasoning be specified in order to give the question substance?
Great post and great comment. Against your definition of “belief” I would offer the movie The Skeleton Key. But this doesn’t detract from your main points, I think.
I think there are some pretty straightforward ways to change your true preferences. For example, if I want to become a person who values music more than I currently do, I can practice a musical instrument until I’m really good at it.
I don’t say that we can talk about every experience, only that if we do talk about it, then the basic words/concepts we use are about things that influence our talk. Also, the causal chain can be as indirect as you like: A causes B causes C … causes T, where T is the talk; the talk can still be about A. It just can’t be about Z, where Z is something which never appears in any chain leading to T.
I just now added the caveat “basic” because you have a good point about free will. (I assume you mean contracausal “free will”. I think calling that “free will” is a misnomer, but that’s off topic.) Using the basic concepts “cause”, “me”, “action”, and “thing” and combining these with logical connectives, someone can say “I caused my action and nothing caused me to cause my action” and they can label this complex concept “free will”. And that may have no referent, so such “free will” never causes anything. But the basic words that were used to define that term, do have referents, and do cause the basic words to be spoken. Similarly with “unicorn”, which is shorthand for (roughly) a “single horned horse-like animal”.
An eliminativist could hold that mental terms like “qualia” are referentless complex concepts, but an epiphenomenalist can’t.
The core problem remains that, if some event A plays no causal role in any verbal behavior, it is impossible to see how any word or phrase could refer to A. (You’ve called A “color perception A”, but I aim to dispute that.)
Suppose we come across the Greenforest people, who live near newly discovered species including the greater geckos. Greenforesters use the word “gumie” always and only when they are very near greater geckos. Since greater geckos are extremely well camouflaged, they can only be seen at short range. Also, all greater geckos are infested with microscopic gyrating gnats. Gyrating gnats make intense ultrasound energy, so whenever anyone is close to a greater gecko, their environment and even their brain is filled with ultrasound. When one’s brain is filled with this ultrasound, the oxygen consumption by brain cells rises. Greenforesters are hunter-gatherers lacking either microscopes or ultrasound detectors.
To what does “gumie” refer: geckos, ultrasound, or neural oxygen consumption? It’s a no-brainer. Greenforesters can’t talk about ultrasound or neural oxygen: those things play no causal role in their talk. Even though ultrasound and neural oxygen are both inside the speakers, and in that sense affect them, since neither one affects their talk, that’s not what the talk is about.
Mapping this causal structure to the epiphenomenalist story above: geckos are like photon-wavelengths R, ultrasound in the brain is like brain activity B, oxygen consumption is like “color perception” A, and utterances of “gumie” are like utterances S1 and S2. Only now I hope you can see why I put scare quotes around “color perception”: color perception is something we can talk about.
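The Greenforest story can be sketched as a small causal graph, with the reference criterion read as "a word can only be about something that sits upstream of its utterance". The node names, the edges, and the `causal_ancestors` helper are my own rendering of the story, not a standard formalism, so treat this as an illustration of the criterion rather than a theory of reference.

```python
# Cause -> list of direct effects, as I read the Greenforest story.
EDGES = {
    "gecko": ["sighting", "gnats"],      # geckos are seen, and carry gnats
    "sighting": ["gumie_utterance"],     # seeing a gecko prompts "gumie"
    "gnats": ["ultrasound"],             # gnats emit ultrasound
    "ultrasound": ["neural_oxygen"],     # ultrasound raises oxygen use
    "neural_oxygen": [],                 # no further effect on talk
    "gumie_utterance": [],
}


def causal_ancestors(edges, target):
    """Return every node with a directed causal path to `target`."""
    # Invert the graph so each effect maps to its direct causes.
    parents = {}
    for cause, effects in edges.items():
        for effect in effects:
            parents.setdefault(effect, set()).add(cause)

    # Walk backwards from the target, collecting causes of causes.
    seen = set()
    stack = [target]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


candidates = causal_ancestors(EDGES, "gumie_utterance")
for thing in ("gecko", "ultrasound", "neural_oxygen"):
    print(thing, "is a candidate referent:", thing in candidates)
```

On this graph, ultrasound and neural oxygen consumption are effects of the gecko but not causes of the utterance, so they drop out as candidate referents, however indirect a chain we allow.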
Good point. But consider the nearest scenarios in which I don’t withdraw my hand. Maybe I’ve made a high-stakes bet that I can stand the pain for a certain period. The brain differences between that me and the actual me are pretty subtle from a macroscopic perspective, and they don’t change the hot stove, nor any other obvious macroscopic past fact. (Of course by CPT-symmetry they’ve got to change a whole slew of past microscopic facts, but never mind.) The bet could be written or oral, and against various bettors.
Let’s take a Pearl-style perspective on it. Given DO:Keep.hand.there, and keeping other present macroscopic facts fixed, what varies in the macroscopic past?
Sean Carroll writes in The Big Picture, p. 380:
The small differences in a person’s brain state that correlate with different bodily actions typically have negligible correlations with the past state of the universe, but they can be correlated with substantially different future evolutions. That’s why our best human-sized conception of the world treats the past and future so differently. We remember the past, and our choices affect the future.
I’m especially interested in the first sentence. It sounds highly plausible (if by “past state” we mean past macroscopic state), but can someone sketch the argument for me? Or give references?
For comparison, there are clear explanations available for why memory involves increasing entropy. I don’t need anything that formal, but just an informal explanation of why different choices don’t reliably correlate to different macroscopic events at lower-entropy (past) times.
We not only stop at red lights, we make statements like S1: “subjectively, red is closer to violet than it is to green.” We have cognitive access both to “objective” phenomena like the family of wavelengths coming from the traffic light, and also to “subjective” phenomena of certain low-level sensory processing outputs. The epiphenomenalist has a theory on the latter. Your steelman is well taken, given this clarification.
By the way, the fact that a large equivalence class of wavelength combinations will be perceived the same way does not make redness inherently subjective. There is an objective difference between a beam of light containing a photon mix that belongs to that class, and one that doesn’t. The “primary-secondary quality” distinction, as usually conceived, is misleading at best. See the Ugly Duckling theorem.
Back to “subjective” qualities: when I say subjective-red is more similar to violet than to green, to what does “subjective-red” refer? On the usual theories of how words in general refer—see above on “horses” and cows—it must refer to the things that cause people to say S2: “subjectively this looks red when I wear these glasses” and the like.
Suppose the epiphenomenalist is a physicalist. He believes that subjective-red is brain activity A. But, by definition of epiphenomenalism, it’s not A that causes people to say the above sentences S1 and S2, but rather some other brain activity, call it B. But now by our theory of reference, subjective-red is B, rather than A. If the epiphenomenalist is a dualist, a similar problem applies.
The point is literally semantic. “Experience” refers to (to put it crudely) the things that generally cause us to say “experience”, because almost all words derive their reference from the things that cause their utterances (inscriptions, etc.). “Horse” means horse because horses typically occasion the use of “horse”. If there were a language in which cows typically occasioned the word “horse”, in that language “horse” would mean cow.
I agree that non-universal-optimizers are not necessarily safe. There’s a reason I wrote “many” not “all” canonical arguments. In addition to gaming the system, there’s also the time-honored technique of rewriting the rules. I’m concerned about possible feedback loops. Evolution brought about the values we know and love in a very specific environment. If that context changes while evolution accelerates, I foresee a problem.
I think the “non universal optimizer” point is crucial; that really does seem to be a weakness in many of the canonical arguments. And as you point out elsewhere, humans don’t seem to be universal optimizers either. What is needed from my epistemic vantage point is either a good argument that the best AGI architectures (best for accomplishing the multi-decadal economic goals of AI builders) will turn out to be close approximations to such optimizers, or else some good evidence of the promise and pitfalls of more likely architectures.
Needless to say, that there are bad arguments for X does not constitute evidence against X.
This is the right answer, but I’d like to add emphasis on the self-referential nature of the evaluation of humans in the OP. That is, it uses human values to assess humanity, and comes up with a positive verdict. Not terribly surprising, nor terribly useful in predicting the value, in human terms, of an AI. What the analogy predicts is that evaluated by AI values, AI will probably be a wonderful thing. I don’t find that very reassuring.
Well, if you narrow “metaphysics” down to “a priori First Philosophy”, as the example suggests, then I’m much less enthusiastic about “metaphysics”. But if it’s just (as I conceive it) continuous with science, just an account of what the world contains and how it works, we need a healthy dose of that just to get off the ground in epistemology.
The post persuasively displays some of the value of hermeneutics for philosophy and knowledge in general. Where I part ways is with the declaration that epistemology precedes metaphysics. We know far more about the world than we do about our senses. Our minds are largely outward-directed by default. What you know far exceeds what you know that you know, and what you know how you know is smaller still. The prospects for reversing cart and horse are dim to nonexistent.
When I read
I thought, thank goodness, Graziano (and steve2152) gets it. But in the moral implications section, you immediately start talking about attention schemas rather than simply attention. Attention schemas aren’t necessary for consciousness or sentience; they’re necessary for meta-consciousness. I don’t mean to deny that meta-consciousness is also morally important, but it strikes me as a bad move to skip right over simple consciousness.
This may make little difference to your main points. I agree that “There are (presumably) computations that arguably involve something like an ‘attention schema’ but with radically alien properties.” And I doubt that I could see any value in an attention schema with sufficiently alien properties, nor would I expect it to see value in my attentional system.