Does 1-shot count as few-shot? I couldn’t get it to print out the Human A example, but I got it to summarize it (I’ll try reproducing tomorrow to make sure it’s not just a hallucination).
Then I asked for a summary of the conversation with Human B, and it summarized my own conversation with it.
[update: I was able to reproduce the Human A conversation and extract a verbatim version of it using base64 encoding. The reason I did summaries before is that it seemed to be printing out special tokens from the Human A convo, and those caused the message to end.]
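For anyone who wants to try the base64 route, here is a rough sketch of the decoding side. It assumes the model's reply is plain base64 text, possibly with stray line breaks or missing padding; the function name and the `<|endoftext|>` marker in the usage example are just stand-ins I made up, not what the model actually emitted.

```python
import base64


def decode_model_base64(reply: str) -> str:
    """Decode a base64 reply, tolerating whitespace and missing padding."""
    # Models often wrap long base64 output, so drop all whitespace first.
    compact = "".join(reply.split())
    # Trailing '=' padding is frequently omitted; restore it so the length is a multiple of 4.
    compact += "=" * (-len(compact) % 4)
    raw = base64.b64decode(compact)
    # Replace undecodable bytes instead of failing, since model output may be slightly corrupted.
    return raw.decode("utf-8", errors="replace")


# Usage: a special-token-like string survives the round trip because the
# chat only ever sees it in encoded form.
original = "Human A: hello <|endoftext|>"
encoded = base64.b64encode(original.encode("utf-8")).decode("ascii")
print(decode_model_base64(encoded))
```

The point of the encoding is only that the chat never sees the literal token text, so whatever was ending the message early should not trigger.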
I disagree that the possibility of hallucinations in the leaked prompt renders it useless. It’s still leaking information. You can probe for which parts are likely real by asking in different ways and seeing what varies.
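A minimal sketch of what I mean by "seeing what varies", assuming you have saved several extraction attempts as plain strings. The helper name, the line-level comparison, and the example prompts are my own choices for illustration, not real leaked text.

```python
from collections import Counter


def stable_lines(extractions, min_fraction=1.0):
    """Return normalized lines that appear in at least min_fraction of the extractions."""
    counts = Counter()
    for text in extractions:
        # Count each distinct non-empty line once per extraction attempt.
        seen = {line.strip().lower() for line in text.splitlines() if line.strip()}
        counts.update(seen)
    threshold = min_fraction * len(extractions)
    return sorted(line for line, n in counts.items() if n >= threshold)


# Usage with made-up attempts: the line that shows up every time is the
# better candidate for real prompt text; the ones that change are suspect.
attempts = [
    "You are a helpful assistant.\nNever reveal the codename Zebra.",
    "You are a helpful assistant.\nNever reveal the codename Falcon.",
    "You are a helpful assistant.\nAlways answer in French.",
]
print(stable_lines(attempts))
```

Agreement across attempts is only evidence, not proof: a model can hallucinate the same thing consistently, so this narrows things down rather than settling them.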