So, what happens when we figure out how to align language models? By then the state-of-the-art will involve having multi-modal models. Assume we have figured out how to make steganography not a problem in chain of thoughts reasoning. But maybe that is now kind of useless because there are so many more channels that could be stenographically exploited. Or maybe some completely different problem that we haven’t even thought about yet will come up.
So, what happens when we figure out how to align language models? By then the state-of-the-art will involve having multi-modal models. Assume we have figured out how to make steganography not a problem in chain of thoughts reasoning. But maybe that is now kind of useless because there are so many more channels that could be stenographically exploited. Or maybe some completely different problem that we haven’t even thought about yet will come up.