Personally, I think that overall it’s good on the margin for staff at companies risking human extinction to be sharing their perspectives on criticisms and moving towards having dialogue at all, so I think (what I read as) your implicit demand for Evan Hubinger to do more work here is marginally unhelpful; I weakly think quick takes like this are marginally good.
I will add: It’s odd to me, Stephen, that this is your line for (what I read as) disgust at Anthropic staff espousing extremely convenient positions while doing things that seem to you to be causing massive harm. To my knowledge, Anthropic’s leadership has ~never engaged in public dialogue with worthy critics like Hinton, Bengio, Russell, Yudkowsky, etc. about why they’re getting rich building potentially omnicidal minds, so I wouldn’t expect them or their employees to have high standards for public defenses of far less risky behavior like working with the US military.[1]
As an example of the low standards of Anthropic’s public discourse, notice how a recent essay by Sam Bowman (a senior safety researcher at Anthropic) about what’s required for Anthropic to succeed at AI safety flatly states “Our ability to do our safety work depends in large part on our access to frontier technology… staying close to the frontier is perhaps our top priority in Chapter 1”, with ~no defense of this claim, no engagement with the sorts of reasons why I consider adding a marginal competitor to the suicide race to be an atrocity, and no acknowledgement that this makes him personally very wealthy (i.e. he and most other engineers at Anthropic will make millions of dollars due to Anthropic acting on this claim).
I think that overall it’s good on the margin for staff at companies risking human extinction to be sharing their perspectives on criticisms and moving towards having dialogue at all
No disagreement.
your implicit demand for Evan Hubinger to do more work here is marginally unhelpful
The community seems to be quite receptive to the opinion, so it doesn’t seem unreasonable to voice an objection. If you’re saying it is primarily the way I’ve written it that makes it unhelpful, that seems fair.
I originally felt that either question I asked would be reasonably easy to answer, if time was given to evaluating the potential for harm.
However, given that Hubinger might have to run any reply by Anthropic staff, I understand that it might be negative to demand further work. This is pretty obvious, but didn’t occur to me earlier.
I will add: It’s odd to me, Stephen, that this is your line for (what I read as) disgust at Anthropic staff espousing extremely convenient positions while doing things that seem to you to be causing massive harm.
Ultimately, the original quicktake was only justifying one facet of Anthropic’s work, so that’s all I’ve engaged with. It would seem less helpful to bring up my wider objections.
I wouldn’t expect them or their employees to have high standards for public defenses of far less risky behavior
I don’t expect them to have a high standard for defending Anthropic’s behavior, but I do expect the LessWrong community to have a high standard for arguments.
Thanks for the responses, I have a better sense of how you’re thinking about these things.
I don’t feel much desire to dive into this further, except that I want to clarify one thing on the question of any demands in your comment.
I originally felt that either question I asked would be reasonably easy to answer, if time was given to evaluating the potential for harm.
However, given that Hubinger might have to run any reply by Anthropic staff, I understand that it might be negative to demand further work. This is pretty obvious, but didn’t occur to me earlier.
That actually wasn’t primarily the part that felt like a demand to me. This was the part:
How much time have you spent analyzing the positive or negative impact of US intelligence efforts prior to concluding that merely using Claude for intelligence “seemed fine”?
I’m not quite sure what the relevance of the time was, if not to suggest that it needed to be high. I felt that this line implied something like “If your answer is around ‘20 hours’, then I want to say that the correct standard should be ‘200 hours’”. It felt like a demand that Hubinger spend 10x as much time thinking about this question before he met your standards for being allowed to express his opinion on it.
But perhaps you just meant you wanted him to include an epistemic status, like “Epistemic status: <Here’s how much time I’ve spent thinking about this question>”.