It’s the nature of hypotheticals to accomplish the things they are defined as being able to accomplish. A Friendly AI is such a creature: one able and willing to accomplish the associated feats. If it’s not going to do so, it’s not a Friendly AI. If it’s not possible to make it so, Friendly AI is impossible. But even if provably impossible, it still has the property of being able to do these things, as a hypothetical.
A hypothetical is typically something you define as an aid to reasoning about something else. It is very tricky to set up FAI as a hypothetical construct when the possibility of FAI is the very thing you want to talk about.
Here’s my problem. I want the underlying problems in the notion of what an FAI is to be resolved. Most of these problems are hidden by the definitions used. People need to think about how to implement the concept they’ve defined in order to see the problems with the definition.
Assuming the goal already accomplished is a typical move in problems about constructing a mathematical structure, for example in school compass-and-straightedge construction problems. First, you assume that you’ve done what you needed to do and figure out the properties this implies (requires); then you actually construct the structure and prove that it has those required properties. It’s also standard in decision theory to assume that you’ve taken a certain action and then look at what would happen if you did, all in order to determine which action will actually be chosen (even though the actions you won’t choose will never actually happen).
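To make the decision-theoretic version of that move concrete, here is a minimal sketch; the action names and payoff numbers are invented purely for illustration:

```python
# A minimal sketch of the "assume the action, then evaluate" move.
# The actions and payoffs below are invented for illustration only.

def payoff(action: str) -> float:
    """What would happen if this action were (hypothetically) taken."""
    return {"cooperate": 3.0, "defect": 1.0, "abstain": 0.0}[action]

def choose(actions: list[str]) -> str:
    # Hypothetically assume each action in turn and look at the outcome;
    # only the best one is ever actually taken, so the counterfactuals
    # for the other actions never happen.
    return max(actions, key=payoff)

print(choose(["cooperate", "defect", "abstain"]))  # -> cooperate
```

The point is that evaluating `payoff("defect")` is perfectly meaningful even though, for the agent above, defecting never actually occurs.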
The most frequent problems with definitions are irrelevance and emptiness (which feeds into irrelevance); in pathological cases, a tendency to mislead. (There are many possible problems.) You might propose a better (more relevant, that is, more useful) definition, or prove that the defined concept is empty.
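As a toy illustration of a crisply defined concept that is provably empty, here is a sketch in Lean 4 (the names `GreatestNat` and `no_greatest` are mine, invented for this example): a “greatest natural number” has definite properties as a hypothetical, yet no such number exists.

```lean
-- Sketch: a concept can be well-defined and still be provably empty.
-- "GreatestNat n" says n is at least as large as every natural number.
def GreatestNat (n : Nat) : Prop := ∀ m : Nat, m ≤ n

-- As a hypothetical, such an n has definite properties we can reason
-- with (e.g. n + 1 ≤ n would follow); that very property refutes it.
theorem no_greatest : ¬ ∃ n, GreatestNat n :=
  fun ⟨n, h⟩ => Nat.lt_irrefl n (h (n + 1))
```

This is the sense in which even a provably impossible hypothetical still “has” its defined properties: the definition stays coherent while its extension is empty.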