Oh! Pretty cool, I hadn’t thought of that effect. Another consequence of tagging all text with the date that seems particularly interesting to me is that it lets us query GPT on its beliefs about the future in a more direct way. Say you want to know whether GPT believes a war will break out in France by 2040: we could ask GPT for the likelihood of the text “France enters war!” tagged with “journal: New York Times; Date: 2040”, and then watch how that likelihood changes as we vary the tagged date. We can repeat this with any headline we want. The only requirement is that GPT believes the New York Times is a reasonably accurate source of facts about the real world.
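To make that concrete, here’s a minimal sketch of what the probe could look like in code, assuming a causal LM scored through the Hugging Face transformers API. The “gpt2” model name is just a stand-in for a GPT actually trained on date-tagged text, and the tag format is the one from above:

```python
# Sketch of the date-tag probe: score a headline's log-likelihood under
# different tagged dates. "gpt2" is a placeholder for a model actually
# trained on date-tagged text.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def headline_logprob(lm, headline: str, year: int) -> float:
    """Total log-probability of `headline`, conditioned on a date tag."""
    prefix = f"journal: New York Times; Date: {year}\n"
    n_prefix = tokenizer(prefix, return_tensors="pt").input_ids.shape[1]
    ids = tokenizer(prefix + headline, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = lm(ids).logits
    # The token at position i is predicted by the logits at position i-1,
    # so shift by one and keep only the headline tokens.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = ids[0, 1:]
    scores = log_probs[torch.arange(targets.numel()), targets]
    # Assumes tokenization splits cleanly at the prefix/headline boundary.
    return scores[n_prefix - 1:].sum().item()

# Watch how GPT's credence in the headline shifts with the tagged date:
for year in (2025, 2030, 2035, 2040):
    print(year, headline_logprob(model, "France enters war!", year))
```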
A further trick is that we can ask GPT how its own outputs will affect the world. Suppose we ask GPT to produce that “Fusion Plant Design” textbook I mentioned in the first comment. We can then take the textbook it outputs, change its date tag to 2023, introduce it into GPT’s training set, and take a small gradient step on it. This essentially makes GPT “aware” that the textbook now exists in the world, as if it had been released publicly. We then ask this updated model about the likelihood of future war in France in the same way as above. In effect, this lets us answer the question “Does GPT think that releasing this textbook will increase or decrease the likelihood of war in France by 2040?”. It would be hopeless to ask it that question directly, because no human could possibly know the answer, so GPT won’t give it; but we can still tease it out with date-tagged data like this.
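Here’s a rough sketch of that second trick, reusing headline_logprob from the snippet above. The textbook text, the learning rate, and the single-update-step recipe are all placeholder assumptions rather than a tuned procedure:

```python
# Sketch of the counterfactual-release probe: fine-tune a copy of the model
# on its own date-tagged output, then re-query its beliefs about 2040.
import copy

from torch.optim import SGD

# Hypothetical: the model's own "Fusion Plant Design" textbook, produced
# by an earlier generate() call.
textbook_text = "..."

before = headline_logprob(model, "France enters war!", 2040)

# Update a copy, so the original model's beliefs stay intact for comparison.
updated = copy.deepcopy(model)
updated.train()
optimizer = SGD(updated.parameters(), lr=1e-5)

# Tag the textbook as already published in 2023 and take one gradient step.
tagged = "Date: 2023\n" + textbook_text
ids = tokenizer(tagged, return_tensors="pt").input_ids
loss = updated(ids, labels=ids).loss  # standard causal-LM next-token loss
loss.backward()
optimizer.step()  # the "small gradient step" from the comment

updated.eval()
after = headline_logprob(updated, "France enters war!", 2040)
print("Does GPT think releasing the textbook raises P(war by 2040)?",
      after > before)
```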