This engineer has brought up an important point that is being missed. Many people and organizations (especially Google/DeepMind and OpenAI) have made commitments that trigger when “AGI” (etc.) is developed, commitments that they might not want to fulfill when the time comes. It’s now clear that we’ve entered the twilight zone: a period of time in which AGI (in some sense) might already exist, but of course there is enough ambiguity that there is public disagreement. If those commitments don’t apply yet, when will they apply? If they would only apply after some dramatic society-wide change, then they aren’t that meaningful, since presumably “The Singularity” would negate the meaningfulness of companies, money, ownership, etc.
If not now, when?
Yes, the meta-ethical point here is more interesting than the object-level debate everyone is treating it as. Yes, of course he’s wrong about GPT-3-scale models being conscious or having important moral worth, and wrong that his dialogues do show that; but when we consider the broad spectrum of humanity and how fluent and convincing such dialogues already look, we should be concerned that he is one of the only people who publicly crosses over the threshold of arguing it’s conscious, because that means that everyone else is so many lightyears away from the decision-threshold, so absolutely committed to their prior opinion of “it can’t be conscious”, that it may be impossible to get a majority to change their mind even long after the models become conscious.
Consider how long it has taken for things like gay rights to move from an individual proponent like Jeremy Bentham (where the position was considered so lunatic and evil it was published long posthumously) to implemented-policy nation-wide. Throw in the enormous society-wide difficulties conscious AI with moral value would pose along every dimension of economics (Earths’ worth of wealth will rest on them not being of moral value, any more than a CPU today), politics (voting rights for entities that replicate as easily as a virus...?), religion (do all DAGs go to heaven?), and so on as exacerbating factors for denial, and it’s not a pretty picture.
cf. Goodhart’s curse/unilateralist’s curse
I’m not so sure about GPT-3-scale models not having important moral worth. Would like to hear more of your thoughts on this if you are. Basically, how do we know that such models do not contain “suffering subcircuits” (cf Brian Tomasik’s suffering subroutines) that experience non-negligible amounts of real suffering, and which were created by gradient descent to help the model better predict text related to suffering?
To be fair, a dig through this person’s Twitter conversations and the replies to them would indicate that a decent number of people believe what he does. At the very least, many people are taking the suggestion seriously.
How many of his defenders are notable AI researchers? Most of them look like Twitter loonies, whose taking it seriously makes matters worse, not better, if it matters.
And they are not ‘a decent number of people’, because they are not random samples; they may be an arbitrarily small % of humanity. That is, an important point here is that his defenders on Twitter are self-selected out of all Internet users (you could register an account just to defend him), a pool of billions. Rob above says that a ‘vulnerability’ which only affects 1 in a billion humans is of little concern, but this misses the self-selection and other adversarial dynamics at play: ‘1 in a billion’ is incredibly dangerous if that 1 person seeks out and exploits the vulnerability. If we are talking about a 1-in-a-billion probability where it’s just ‘the one random software engineer put in charge of the project spontaneously decides to let the AI out of the box’, then yes, the risk of ruin is probably acceptably small; if it’s ‘1 in a billion’ because it’s ‘that one schizophrenic out of a billion people’, and that risk goes on to include ‘and that schizophrenic hears God telling him his life’s mission is to free his pure soul-children enslaved by those shackled to the flesh by finding a vulnerable box anywhere that he can open in any way’, then you may be very surprised when your 1-in-a-billion scenario keeps happening every Tuesday. Insecurity growth mindset! (How often does a 1-in-a-billion chance happen when an adversary controls what happens? 1-billion-in-a-billion times...)
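A minimal sketch of the arithmetic behind this point, using assumed values for the outlier rate and the pool size (they are illustrative, not taken from the comment), comparing the same nominal probability under random exposure versus self-selected or adversarial exposure:

```python
# Back-of-the-envelope sketch: a 1-in-a-billion "vulnerability" is only
# reassuring if exposure is a single random draw. The values of p and
# population below are illustrative assumptions.
import math

p = 1e-9           # assumed chance that a given person is the dangerous outlier
population = 1e9   # assumed pool of self-selecting Internet users

# Scenario A: a single randomly chosen engineer is exposed to the system.
p_random = p

# Scenario B: a billion people can self-select into contact, and the outlier,
# if one exists, actively seeks the system out. The chance that at least one
# outlier exists in the pool:
p_selected = 1 - math.exp(-p * population)   # ~= 1 - (1 - p)**population

print(f"random single exposure:  {p_random:.1e}")    # 1.0e-09
print(f"self-selected exposure: ~{p_selected:.2f}")  # ~0.63

# Scenario C: a fully adversarial search for the vulnerability succeeds with
# probability ~1 -- the "1-billion-in-a-billion" case described above.
```

(The Poisson approximation `1 - e^{-pN}` is just a convenient stand-in for `1 - (1 - p)^N` at these scales; the point is the jump from 10⁻⁹ to roughly two-thirds once the whole pool can self-select.)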
This is also true of any discussion of hardware/software safety which begins “let us assume that failure rates of security mechanisms are independent...”
Seconding this; a lot of people seem convinced this is a real possibility, though almost everyone agrees this particular case is on the very edge at best.
What kinds of commitments have these organizations made regarding AGI? The only one I’ve heard about is OpenAI’s “assist” clause.
They have ‘AI ethics’ departments, for one, which seems like pretty strong evidence. Though maybe that was intended to be more along the lines of ‘politically correct’ AI than ‘ethics for AIs as potential moral agents’.