I appreciate your point about compelling experimental evidence, and I think it’s important that we’re currently at a point with very little of that evidence. I still feel a lot of uncertainty here, and I expect the evidence to basically always be super murky and for interpretations to be varied/controversial, but I do feel more optimistic than before reading your comment.
You could find a way of proving to the world that your AI is aligned, which other labs can’t replicate, giving you economic advantage.
I don’t expect this to be a very large effect. It feels similar to an argument like “company A will be better on ESG dimensions and therefore more customers will switch to using it”. Doing a quick review of the literature on that, it seems like there’s a small but notable change in consumer behavior for ESG-labeled products. In the AI space, it doesn’t seem to me like any customers care about OpenAI’s safety team disappearing (except a few folks in the AI safety world). In this particular case, I expect the technical argument needed to demonstrate that some family of AI systems are aligned while others are not is a really complicated argument; I expect fewer than 500 people would be able to actually verify such an argument (or the initial “scalable alignment solution”), maybe zero people. I realize this is a bit of a nit because you were just gesturing toward one of many ways it could be good to have an alignment solution.
I endorse arguing for alternative perspectives and appreciate you doing it. And I disagree with your synthesis here.
You could find a way of proving to the world that your AI is aligned, which other labs can’t replicate, giving you economic advantage.
I don’t expect this to be a very large effect. It feels similar to an argument like “company A will be better on ESG dimensions and therefore more customers will switch to using it”. Doing a quick review of the literature on that, it seems like there’s a small but notable change in consumer behavior for ESG-labeled products.
It seems quite different to the ESG case. Customers don’t personally benefit from using a company with good ESG. They will benefit from using an aligned AI over a misaligned one.
In the AI space, it doesn’t seem to me like any customers care about OpenAI’s safety team disappearing (except a few folks in the AI safety world).
Again though, customers currently have no selfish reason to care.
In this particular case, I expect the technical argument needed to demonstrate that some family of AI systems are aligned while others are not is a really complicated argument; I expect fewer than 500 people would be able to actually verify such an argument (or the initial “scalable alignment solution”), maybe zero people.
It’s quite common for only a very small number of people to have the individual ability to verify a safety case, but many more to defer to their judgement. People may defer to an AISI, or a regulatory agency.
Thanks for your continued engagement.