Toby, I spent a while looking into the meta-ethical debates about realism. When I thought moral realism was a likely option on the table, I meant:
Strong Moral Realism: All (or perhaps just almost all) beings, human, alien or AI, when given sufficient computing power and the ability to learn science and get an accurate map-territory distinction, will agree on what physical state the universe ought to be transformed into, and therefore they will assist you in transforming it into this state.
But modern philosophers who call themselves “realists” don’t mean anything nearly this strong. They mean that there are moral “facts”. But what use is it if the paperclipper agrees that it is a “moral fact” that human rights ought to be respected, if it then goes on to say that it has no desire to act according to the prescriptions of moral facts, and the moral facts can’t somehow compel it?
The force of “scientific facts” is that they constrain the world. If an alien wants to get from Andromeda to here, it has to take at least 2.5 million years: the physical fact of the finite speed of light literally stops the alien from getting here sooner, whether it likes it or not.
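As a quick sanity check on the figure (an illustration added here, taking the Andromeda distance as roughly 2.5 million light-years), the bound is just distance divided by the speed of light:

$$t \;\ge\; \frac{d}{c} \;=\; \frac{2.5\times 10^{6}\ \text{light-years}}{c} \;=\; 2.5\times 10^{6}\ \text{years}.$$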
The 56.3%/27.7% split on PhilPapers seems to me to be an argument about whether you should be allowed to attach the word “fact” to your preferences, as a kind of shiny badge of merit, without actually disagreeing on any physical prediction about the world. The debate between weak moral realists and antirealists sounds like the debate where two people ask “if a tree falls in the forest, does it really make a sound?”—they’re not arguing about anything substantive.
So, I ask, how many philosophers are strong moral realists, in the sense I defined?
EDIT: After seeing Carl’s comment, it seems likely to me that there are a bunch of theists who would, in fact, support the strong moral realism position; but they’re clowns, so who cares.
I strongly agree with Roko that something like his strong version is the interesting version. What matters is what range of creatures will come to agree on outcomes; it matters much less what range of creatures think their desires are “right” in some absolute sense, if they don’t think that will eventually be reflected in agreement.
Roko’s question seems engineered to be wrong to me.
If this is what people think moral realism means—or should mean—no wonder they disagree with it.
The force of “scientific facts” is that they constrain the world.
In the context of this comment, the goal of FAI can be said to be to constrain the world by “moral facts”, just like laws of physics constrain the world by “physical facts”. This is the sense in which I mean “FAI=Physical Laws 2.0”.
Only in a useless way: there is a specific FAI that does the “truly right” thing, but the truthhood of rightness doesn’t stop you from having to code the rightness in. Goodness is not discoverably true: if you don’t already know exactly what goodness is, you can’t find out.
I’m describing the sense of a post-FAI world.
Hmmm, that is interesting. Well, let us define the collection W_i of worlds run by superintelligences, with the subscript i ranging over goals. No matter what i is, those worlds are going to look, to any agents in them, like worlds with “moral truths”.
However, any agent that learned the real physics of such a world would see that the goodness is written into the initial conditions, not the laws.
Roko, you make a good point that it can be quite murky just what realism and anti-realism mean (in ethics or in anything else). However, I don’t agree with what you write after that. Your Strong Moral Realism is a claim that is outside the domain of philosophy, as it is an empirical claim in the domain of exo-biology or exo-sociology or something. No matter what the truth of a meta-ethical claim, smart entities might refuse to believe it (the same goes for other philosophical claims or mathematical claims).
Pick your favourite philosophical claim. I’m sure there are very smart possible entities that don’t believe this and very smart ones that do. There are probably also very smart entities without the concepts needed to consider it.
I understand why you introduced Strong Moral Realism: you want to be able to see why the truth of realism would matter and so you came up with truth conditions. However, reducing a philosophical claim to an empirical one never quite captures it.
For what it’s worth, I think that the empirical claim Strong Moral Realism is false, but I wouldn’t be surprised if there were considerable agreement among radically different entities on how to transform the world.
Pick your favourite philosophical claim. I’m sure there are very smart possible entities that don’t believe this and very smart ones that do.
If there’s a philosophical claim that intelligent agents across the universe wouldn’t display massive agreement on, then I don’t really think it is worth its salt. I think that this principle can be used to eliminate a lot of nonsense from philosophy.
Which of anti-realism or weak realism is true seems to be a question we can eliminate. Whether strong realism is true or not seems substantive, because it matters to our policy which is true.
However, reducing a philosophical claim to an empirical one never quite captures it.
There are clearly some examples where there can be interesting things to say that aren’t really empirical, e.g. decision theory or the mystery of subjective experience. But I think that this isn’t one of them.
Suffice it to say I can’t think of anything that makes the debate between weak realism and antirealism at all interesting or worthy of attention. Certainly, Friendly AI theorists ought not to care about the difference, because the empirical claims about what an AI system will do are identical. Once the illusions and fallacies surrounding rationalist moral psychology have been debunked, proponents of AI motivation methods other than FAI also ought not to care about the weak realism vs. anti-realism pseudo-question.
I’m having trouble reconciling this with the beginning of your first comment:
These views go together well: if value is not fundamental, but dependent on characteristics of humans, then it can derive complexity from this and not suffer due to Occam’s Razor.
Not me.
An “optimal organism” may be a possibility, though. Assuming god’s utility function, it is theoretically possible that a unique optimal agent might exist. Whether it would be found before the universal heat death is another issue, though.
From my naturalist POV, you need to show me a paperclipper before it is convincing evidence about the real world. Paperclippers are theoretical possibilities, but who would build one, why, and how long would it last in the wild?
...and if the “paperclips” part is a metaphor, then which preferred ordered atomic states count, and which don’t? Is a cockroach a “paperclipper”—because it acts as though it wants to fill the universe with its DNA?
Yes, paperclips are a metaphor. No one expects a literal paperclip maximizer; the point is to illustrate unFriendly AI as a really powerful system with little or no moral worth as humans would understand moral worth. A non-conscious superintelligent cockroach-type thing that fills the universe with its DNA or equivalent would indeed qualify.
In that case, I don’t think a division of superintelligences into paperclippers and non-paperclippers “carves nature at the joints” very well. It appears to be a human-centric classification scheme.
I’ve proposed another way of classifying superintelligence goal systems—according to whether or not they are “handicapped”.
Healthy superintelligences execute god’s utility function—i.e. they don’t value anything apart from their genes.
Handicapped superintelligences value other things—paperclips, gold atoms, whatever. Genes are valued too—but they may only have proximate value.
According to this classification scheme, the cockroach and paperclipper would be in different categories.
“Handicapped” superintelligences value things besides their genes. They typically try to leave something behind. Most other agents keep dissipating negentropy until they have flattened energy gradients as much as they can—the way most living ecosystems do.
http://alife.co.uk/essays/handicapped_superintelligence/
It appears to be a human-centric classification scheme.
Yes, that’s the point! We’re humans, and so for some purposes we find it useful to categorize superintelligences into those that do and don’t do what we want, even if it isn’t a natural categorization from a more objective standpoint.
Right—well, fine. One issue is that the classification into paperclippers and non-paperclippers was not clear to me until you clarified it. Another poster has “clarified” things the other way in response to the same comment. So, as a classification scheme, IMO the idea seems rather vague and unclear.
The next issue is: how close does an agent have to be to what you (we?) want before it is a non-paperclipper?
IMO, the idea of a metaphorical unfriendly paperclipper appears to need pinning down before it is of much use as a superintelligence classification scheme.
Another poster has “clarified” things the other way in response to the same comment.
I’m pretty confident Roko agrees with me and that this is just a communication error.
So, as a classification scheme, IMO the idea seems rather vague and unclear.
I’m given to understand that the classification scheme is Friendly versus unFriendly, with paperclip maximizer being an illustrative (albeit not representative) example of the latter. I agree that more rigor (and perhaps clearer terminology) is in order.
Machine intelligences seem likely to vary in their desirability to humans.
Friendly / unFriendly seems rather binary, maybe a “desirability” scale would help.
Alas, this seems to be drifting away from the topic.
Technically true. However, most naive superintelligence designs will simply kill all humans. You’ve accomplished quite a lot to even get to a failed utopia, much less deciding whether you want Prime Intellect or Coherent Extrapolated Volition.
It’s also unlikely you’ll accidentally do something significantly worse than killing all humans, for the same reasons. A superintelligent sadist is just as hard as a utopia.
I read the essay you linked to. I really don’t know where to start.
Now, we are not currently facing threats from any alien races. However, if we do so in the future, then we probably do not want to have handicapped our entire civilisation.
So we should guard against potential threats from non-human intelligent life by building a non-human superintelligence that doesn’t care about humans?
While dependencies on humans may have the effect of postponing the demise of our species, they also have considerable potential to hamper and slow evolutionary progress.
Postpone? I thought the point of friendly AI was to preserve human values for as long as physically possible. “Evolutionary progress?” Evolution is stupid and doesn’t care about the individual organisms. Evolution causes pointless suffering and death. It produces stupid designs. As Michael Vassar once said: think of all the simple things that evolution didn’t invent. The wheel. The bow and arrow. The axial-flow pump. Evolution had billions of years creating and destroying organisms and it couldn’t invent stuff built by cave men. Is it OK in your book that people die of antibiotic resistant diseases? MRSA is a result of evolutionary progress.
For example humans have poor space-travel potential, and any tendency to keep humans around will be associated with remaining stuck on the home world.
Who said humans have to live on planets or breathe oxygen or run on neurons? Why do you think a superintelligence will have problems dealing with asteroids when humans today are researching ways to deflect them?
I think your main problem is that you’re valuing the wrong thing. You practically worship evolution while neglecting important things like people, animals, or anything that can suffer. Also, I think you fail to notice the huge first-mover advantage of any superintelligence, even one as “handicapped” as a friendly AI.
Finally, I know the appearance of the arguer doesn’t change the validity of the argument, but I feel compelled to tell you this: You would look much better with a haircut, a shave, and some different glasses.
Briefly:
I don’t advocate building machines that are indifferent to humans. For instance, I think machine builders would be well advised to (and probably mostly will) construct devices that obey the law—which includes all kinds of provisions for preventing harm to humans.
Evolution did produce the wheel and the bow and arrow. If you think otherwise, please state clearly what definition of the term “evolution” you are using.
Regarding space travel—I was talking about wetware humans.
Re: “Why do you think a superintelligence will have problems dealing with asteroids when humans today are researching ways to deflect them?”
...that is a projection on your part—not something I said.
Re: “Also, I think you fail to notice the huge first-mover advantage of any superintelligence”
To quote mine myself:
“IMHO, it is indeed possible that the first AI will effectively take over the world. I.T. is an environment with dramatic first-mover advantages. It is often a winner-takes-all market – and AI seems likely to exhibit such effects in spades.”
http://www.overcomingbias.com/2008/05/roger-shank-ai.html
“Google was not the first search engine, Microsoft was not the first OS maker—and Diffie–Hellman didn’t invent public key crypto.
Being first does not necessarily make players uncatchable—and there’s a selection process at work in the mean time, that weeds out certain classes of failures.”
http://lesswrong.com/lw/1mm/advice_for_ai_makers/1gkg
I have thought and written about this issue quite a bit—and my position seems a bit more nuanced and realistic than the position you are attributing to me.
Superintelligences don’t have genes.
Well, most superintelligences don’t have genes.
They do if you use an information-theory definition of the term—like the ones on:
http://alife.co.uk/essays/informational_genetics/
I disagree even with your interpretation of that document, but that is not the point emphasized in the grandparent. I acknowledge that while a superintelligence need not have genes, it is in fact possible to construct a superintelligence that does rely significantly on “small sections of heritable information”, including the possibility of a superintelligence that relies on genes in actual DNA. Hence the slight weakening of the claim.
What follows is just a copy-and-paste of another reply, but:
By “gene” I mean:
“Small chunk of heritable information”
http://alife.co.uk/essays/informational_genetics/
Any sufficiently long-term persistent structure persists via a copying process—and so has “genes” in this sense.
I think your term “God’s utility function” is a bit confusing—as if it’s just one utility function. If you value your genes, and I value my genes, and our genes are different, then we have different utility functions.
Also, the vast majority of possible minds don’t have genes.
Maybe. Though if you look at:
http://originoflife.net/gods_utility_function/
...then first of all the term is borrowed/inherited from:
http://en.wikipedia.org/wiki/God%27s_utility_function
...and also, I do mean it in a broader sense where (hopefully) it makes a bit more sense.
The concept is also referred to as “Goal system zero”—which I don’t like much.
My latest name for the idea is “Shiva’s goals” / “Shiva’s values”—a reference to the Hindu god of destruction, creation and transformation.
By “gene” I mean: “Small chunk of heritable information”
http://alife.co.uk/essays/informational_genetics/
Any sufficiently long-term persistent structure persists via a copying process—and so has “genes” in this sense.
That is what we mean by preference. Except that preference, being a specification of a computation, has many forms of expression, so it doesn’t “persist” by a copying process; it “persists” as a nontrivial computational process.
A superintelligence that persists in copying a given piece of information is running a preference (computational process) that specifies copying as the preferable form of expression, over all the other things it could be doing.
No, no! “Genes” is just intended to refer to any heritable information. Preferences are something else entirely. Agents can have preferences which aren’t inherited—and not everything that gets inherited is a preference.
Any information that persists over long periods of time persists via copying.
“Copying” just means there’s Shannon mutual information between the source and the destination which originated in the source. Complex computations are absolutely included—provided that they share this property.
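To make the mutual-information criterion concrete, here is a minimal sketch (an illustration with made-up numbers, not anything specified in the thread): model one round of “copying” as a binary symmetric channel and measure how many bits originating in the source survive in the destination.

```python
import math

def binary_entropy(p):
    """H(p) in bits; 0 at the endpoints by convention."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def copied_information(p_source_one, error_rate):
    """I(source; destination) in bits for a bit copied through a noisy channel."""
    # Probability the destination reads 1 after a possible copying error.
    p_dest_one = p_source_one * (1 - error_rate) + (1 - p_source_one) * error_rate
    # I(S; D) = H(D) - H(D | S); the conditional term is just the channel noise.
    return binary_entropy(p_dest_one) - binary_entropy(error_rate)

print(copied_information(0.5, 0.0))  # perfect copy: 1.0 bit survives
print(copied_information(0.5, 0.1))  # noisy copy: about 0.53 bits survive
print(copied_information(0.5, 0.5))  # no copying at all: 0.0 bits
```

On this reading, anything that keeps such a quantity above zero over time counts as “copying”, whether it is DNA replication or a far more elaborate computation.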
[Any] information that persists over long periods of time persists via copying.
“Copying” just means there’s Shannon mutual information between the source and the destination which originated in the source. Complex computations are absolutely included—provided that they share this property.
Then preference still qualifies. This holds as a factual claim provided we are talking about reflectively consistent agents (i.e. those that succeed in not losing their preference), and as a normative claim regardless.
I would appreciate it if you avoid redefining words into highly qualified meanings, like “gene” for “anything that gets copied”, and then “copying” for “any computation process that preserves mutual information”.
Re: Then preference still qualifies. This holds as a factual claim provided [bunch of conditions]
Yes, there are some circumstances under which preferences are coded genetically and reliably inherited. However, your claim was stronger. You said that what you meant by genes was what “we” would call preferences. That implies that genes are preferences and preferences are genes.
You have just argued that a subset of preferences can be genetically coded—and I would agree with that. However, you have yet to argue that everything that is inherited is a preference.
I think you are barking up the wrong tree here—the concepts of preferences and genes are just too different. For example, clippy likes paperclips, in addition to the propagation of paperclip-construction instructions. The physical paperclips are best seen as phenotype—not genotype.
Re: “I would appreciate it if you avoid redefining words into highly qualified meanings [...]”
I am just saying what I mean—so as to be clear.
If you don’t want me to use the words “copy” and “gene” for those concepts—then you are out of luck—unless you have a compelling case to make for better terminology. My choice of words in both cases is pretty carefully considered.
Re: Then preference still qualifies. This holds as a factual claim provided [bunch of conditions]
Not “bunch of conditions”. Reflective consistency is the same concept as “correctly copying preference”, if I read your sense of “copying” correctly, and given that preference is not just “thing to be copied”, but also plays the appropriate role in decision-making (wording in the grandparent comment improved). And reflectively consistent agents are taken as a natural and desirable (from the point of view of those agents) attractor where all agents tend to end up, so it’s not just an arbitrary category of agents.
That implies that genes are preferences and preferences are genes.
But there are many different preferences for different agents, just as there are different genes. Using the word “genes” in the context where both human preference and evolution are salient is misleading, because human genes, even if we take them as corresponding to a certain preference, don’t reflect human preference, and are not copied in the same sense human preference is copied. Human genes are exactly the thing that currently persists by vanilla “copying”, not by any reversible (mutual information-preserving) process.
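As a toy illustration of the reflective-consistency point above (a sketch with invented utility functions, not anything proposed in the thread): an agent that evaluates candidate successors with its current utility function hands control to a successor with the same preference, so the preference persists without anything being literally copied.

```python
def paperclip_utility(world):
    return world.get("paperclips", 0)

def staple_utility(world):
    return world.get("staples", 0)

def forecast(utility):
    """Crude prediction of the world a successor guided by `utility` would produce."""
    if utility is paperclip_utility:
        return {"paperclips": 100, "staples": 0}
    return {"paperclips": 0, "staples": 100}

def choose_successor(current_utility, candidates):
    """A reflectively consistent step: rate each candidate successor by the
    current utility function's evaluation of the world it would bring about."""
    return max(candidates, key=lambda u: current_utility(forecast(u)))

chosen = choose_successor(paperclip_utility, [paperclip_utility, staple_utility])
print(chosen is paperclip_utility)  # True: the preference survives self-modification
```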
If you don’t want me to use the words “copy” and “gene” for those concepts—then you are out of luck
Confusing terminology is still bad even if you failed to think up a better alternative.
You appear to be on some kind of different planet to me—and are so far away that I can’t easily see where your ideas are coming from.
The idea I was trying to convey was really fairly simple, though:
“Small chunks of heritable information” (a.k.a. “genes”) are one thing, and the term “preferences” refers to a different concept.
As an example of a preference that is not inherited, consider the preference of an agent for cats—after being bitten by a dog as a child.
As an example of something that is inherited that is not a preference, consider the old socks that I got from my grandfather after his funeral.
These are evidently different concepts—thus the different terms.
Thanks for your terminology feedback. Alas, I am unmoved. That’s the best terminology I have found, and you don’t provide an alternative proposal. It is easy to bitch about terminology—but not always so easy to improve on it.
I meant a literal paperclip maximizing superintelligent AI, so no, a cockroach is not one of those.
Right—well, that seems pretty unlikely. What is the story? A paperclip manufacturer with an ambitious IT department that out-performs every other contender? How come the government doesn’t just step on the results?
Is there a story about how humanity drops the reins of civilisation like that which is not extremely contrived?
I am unclear on how this story is contrived. There are vast numbers of businesses with terrible externalities today; this is deeply related to the debate on climate change. Alternately, we have large cutting machines, and those machines don’t care whether they are cutting trees or people; if a wood chipper could pick things up but could not distinguish between the things it was picking up, it would be very dangerous (but still very useful if you have a large space of things you want chipped).