The problem I see is that our values are defined in a stable way only inside the distribution, i.e. for situations similar to those we have already experienced.
Outside of it there may be many radically different extrapolations, each consistent with itself and with our values inside the distribution. And this is a problem not with AI, but with the values themselves.
For example, there is no correct answer to what a human is, i.e. how much we can “improve” a human before they stop being human. We can choose different answers, and all of them will be consistent with our pre-singularity concept of a human and will not contradict already established values.
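A toy curve-fitting analogy of the same point (purely illustrative, not anything proposed in the thread): models that agree on every in-distribution sample can still diverge arbitrarily once you leave the sampled range. A minimal sketch:

```python
# Toy illustration: hypotheses that agree on in-distribution data can still
# extrapolate in radically different, mutually inconsistent ways.
import numpy as np

rng = np.random.default_rng(0)

# "In-distribution": x in [0, 1], with a simple underlying relationship.
x_train = np.linspace(0.0, 1.0, 20)
y_train = x_train ** 2 + 0.01 * rng.standard_normal(x_train.shape)

# Two different hypotheses fit to exactly the same observations.
simple = np.polynomial.Polynomial.fit(x_train, y_train, deg=2)
flexible = np.polynomial.Polynomial.fit(x_train, y_train, deg=12)

# Inside the sampled range the two hypotheses are nearly indistinguishable...
x_in = np.linspace(0.0, 1.0, 200)
print("max disagreement on [0, 1]:", np.max(np.abs(simple(x_in) - flexible(x_in))))

# ...while outside it they diverge enormously, even though both remain
# consistent with everything observed so far.
x_out = np.linspace(1.0, 3.0, 200)
print("max disagreement on (1, 3]:", np.max(np.abs(simple(x_out) - flexible(x_out))))
```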
Yeah. Or rather, we do have one possible answer—let the person themselves figure out by what process they want to be extrapolated, as steven0461 explained in this old thread—but that answer isn’t very good, as it’s probably very sensitive to initial conditions, like which brand of coffee you happened to drink before you started self-extrapolating.
“Making decisions oneself” will also become a very vague concept when superconvincing AIs are running around.
This is actually a problem, but I do not believe there is a single answer to that question; indeed, I suspect there is an infinite number of valid ways to answer it (once we consider multiverses).
And I think this sensitivity to initial conditions and assumptions is exactly what morality and values have. That is, one can freely change one's assumptions, arriving at moralities that are each complete but mutually inconsistent.
The point is that your starting assumptions and conditions matter for where you eventually want to end up.