The Curse of Reversal seems to match the lack of bidirectionality of ROME edits mentioned here: https://www.alignmentforum.org/posts/QL7J9wmS6W2fWpofd/but-is-it-really-in-rome-an-investigation-of-the-rome-model
We think there’s a connection between the Reversal Curse and some results in the model editing literature. I’m not sure if this applies to the specific ROME results in that post. We’ll have the Reversal Curse paper out soon, which will explain more.