Re: the two claims, that’s different from what I thought you meant by the distinction. I would describe both dot points as being normative claims buttressed by empirical claims. To the extent that I see a difference, it’s that the first dot point is perhaps addressing low-probability risks, while the second is addressing medium-to-high-probability risks. I think that pushing for mechanistic transparency would address medium-to-high-probability risks, but I don’t argue for that here, since I think the arguments for medium-to-high-probability risk from AI are better made elsewhere.
Hmm, I was more pointing at the distinction where the first claim doesn’t need to argue for the subclaim “we will be able to get people to use mechanistic transparency” (it’s assumed away by “if I were in charge of the world”), while the second claim does have to argue for it.
I am mostly interested in allowing the developers of AI systems to determine whether their system has the cognitive ability to cause human extinction, and whether their system might try to cause human extinction.
The way I read this, if the research community enables the developers to determine these things at prohibitive cost, then we mostly haven’t “allowed” them to do it, but if the cost is manageable then we have. So I’d say my desiderata here (and also in my head) include the cost being manageable. If the cost of any such approach were necessarily prohibitive, I wouldn’t be very excited about it.