I think that the most interesting thing about the comments here is that no one actually proposed a predicate that could be used to distinguish between something that might be a person and something that definitely isn’t a person (to rephrase Eliezer’s terms).
It is, to be fair, a viciously hard problem. I’ve thought through 10 or 20 possible predicates or approaches to finding predicates, and exactly one of them is of any value at all; even then it would restrict an AI’s ability to model other intelligences to a degree that is probably unacceptable unless we can find other, complementary predicates. It may be a trivial predicate of the sort that Eliezer has already considered and dismissed. But enough with the attempts to signal my lack of certitude.
The problem as presented in this post is, first of all, a little unclear. We are concerned with the creation of simulations that are people, but to prevent this run-away-screaming tragedy we should probably have some way of distinguishing between a simulation and a code module that is part of the AI itself. If a sentient AI were to delete some portion of its own code to make way for an improved version, that would not seem problematic, and I will assume that this behavior is not what we are screening for here.
To cast as wide a net as possible, I would define a simulation as some piece of an AI that cannot access all of the information available to that AI; that is, there are some addresses in memory for which the simulation lacks read permissions, or of whose existence it is unaware. Because data and code are functionally identical, non-simulation modules would then by definition be able to access every function comprising the AI; I don’t think we could call such a module a separate consciousness. (The precise definition would necessarily depend on the AI’s implementation; a Bayesian AI, for example, might be more concerned with statistical evidence than with memory pointers.)
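To make that definition concrete, here is a minimal sketch of how such a check might look, assuming (purely hypothetically) that the AI’s readable state and a module’s readable state can each be enumerated as sets of addresses; all of the names below are mine, not anything proposed in the post.

```python
# Minimal sketch of the definition above: a module counts as a
# "simulation" if the set of addresses it can read is a strict subset
# of what the AI as a whole can read. All names are hypothetical.

def is_simulation(module_readable: set[int], ai_readable: set[int]) -> bool:
    """Return True if the module lacks read access to some of the AI's state."""
    # A non-simulation module must be able to see every address the AI can see.
    return not ai_readable.issubset(module_readable)


# Usage: a module that can see only half of the AI's address space.
ai_memory = set(range(1000))
sandboxed = set(range(500))
print(is_simulation(sandboxed, ai_memory))   # True  -> screened as a simulation
print(is_simulation(ai_memory, ai_memory))   # False -> just part of the AI itself
```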
Even relying on this definition, a predicate that entirely rules out the simulation of a person is no picnic. The best I’ve been able to come up with is:
A simulation can be guaranteed to not be a person if it is not Turing complete.
I don’t know enough language theory to say whether linear bounded automata should also be excluded (or even to link somewhere more helpful than Wikipedia). It might be necessary to restrict simulations to pushdown automata, which are much less expressive.
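For readers unfamiliar with the automaton hierarchy, the toy sketch below shows what a strictly sub-Turing simulation could look like: a pushdown automaton with a single stack can recognize nested structure (which a finite automaton cannot), yet it cannot emulate a Turing machine. This is purely illustrative, not a proposal for how the predicate would actually be enforced.

```python
# Toy example of a sub-Turing-complete model: a pushdown automaton
# recognizing balanced parentheses. The machine is given only a single
# stack, so it sits strictly below Turing completeness (emulating a
# Turing machine would require two stacks or a random-access tape).

def balanced_parens(s: str) -> bool:
    """Accept strings of balanced '(' and ')' using one stack."""
    stack: list[str] = []
    for ch in s:
        if ch == "(":
            stack.append(ch)
        elif ch == ")":
            if not stack:
                return False      # a closing paren with nothing to match
            stack.pop()
        else:
            return False          # reject symbols outside the alphabet
    return not stack              # accept only if every paren was matched


print(balanced_parens("(()())"))  # True
print(balanced_parens("(()"))     # False
```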
ETA: Another possible predicate would be
A simulation can be guaranteed to not be a person if it includes no functions that act as an expected utility function under the definition offered by Cumulative Prospect Theory.
If there’s a theoretical definition of an expected utility function that is superior to CPT’s, then please imagine that I proposed that instead.
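As a rough illustration of the sort of function that predicate would have to screen for, here is a gains-only CPT valuation using the standard Tversky–Kahneman (1992) value and probability-weighting functions. The function names and the restriction to non-negative outcomes are my simplifications for the sketch, not part of CPT itself.

```python
# Illustrative sketch of the kind of function the predicate above would
# screen for: a Cumulative Prospect Theory (CPT) valuation of a gamble.
# Parameters follow Tversky & Kahneman (1992); only gains-only prospects
# are handled here, to keep the example short.

def weight(p: float, gamma: float = 0.61) -> float:
    """CPT probability-weighting function for gains."""
    return p**gamma / (p**gamma + (1 - p)**gamma) ** (1 / gamma)


def value(x: float, alpha: float = 0.88) -> float:
    """CPT value function for non-negative outcomes."""
    return x**alpha


def cpt_value(outcomes: list[float], probs: list[float]) -> float:
    """Value a gains-only gamble using cumulative decision weights."""
    ranked = sorted(zip(outcomes, probs), key=lambda op: op[0])  # ascending
    total = 0.0
    tail = 0.0  # probability of receiving an outcome at least this good
    for x, p in reversed(ranked):                # best outcome first
        pi = weight(tail + p) - weight(tail)     # cumulative decision weight
        total += pi * value(x)
        tail += p
    return total


# Usage: a 50/50 gamble between 0 and 100.
print(cpt_value([0.0, 100.0], [0.5, 0.5]))
```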
A simulation can be guaranteed to not be a person if it is not Turing complete.
What does that mean? A single run of an algorithm can’t be said to be Turing complete or incomplete; Turing completeness is a property of a computational system considered over all of its possible inputs, not of any one execution.