Maybe Microsoft should publish the random seed used for each conversation, in order to make conversations reproducible?
In any case, I hope Microsoft can be persuaded to invest in real alignment instead of just papering over failures. It would be poor programming practice to fix a bug by just adding an “if” condition that special-cases the known buggy inputs while leaving the underlying cause in place. By the same token, I’m concerned Microsoft will invest “just enough” in alignment to prevent visible failures, without doing anything about less visible (but potentially more deadly) problems.