(additional confirmation) Amazing. I wonder what completely insane things the other rare BPEs all get interpreted as? Could you loop over the BPE dict from #51k to #1* in a prompt like “Please define $BPE” to see what the most distant ones are? (Since there are ~51k of them, which is a bit much to read through manually, maybe sort by edit-distance between the BPE’s ordinary ASCII text and whatever the model reads it as: ‘distribute’ would have a very high edit-distance from ‘SolidGoldMagikarp’.)
On a side note, this is yet another good illustration of how we have no idea what we’re doing with deep learning. Not only did no one predict this, it’s obviously another Riley-style steganographic or side-channel attack: just find rare BPEs and construct a code out of whatever bizarre things the model learned.
* I believe BPEs are supposed to be defined in ‘order’ of compression improvement, so the strangest BPEs should be at the end of the list.
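To make the suggested loop concrete, here is a rough Python sketch of the ranking step only. It is not anyone's actual experiment code: `interpret` is a hypothetical callable you would have to supply yourself (something that sends a prompt like “Please define $BPE” and returns the word the model appears to have read the token as), and `vocab` is assumed to be a plain list of token strings indexed by BPE ID. The edit distance is ordinary Levenshtein distance, written out so the sketch needs no third-party packages.

```python
from typing import Callable, List, Optional, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insert, delete, and substitute."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free on a match)
            ))
        prev = curr
    return prev[-1]


def rank_tokens(
    vocab: List[str],
    interpret: Callable[[str], str],  # hypothetical: wraps the model query
    limit: Optional[int] = None,
) -> List[Tuple[int, int, str, str]]:
    """Walk the vocabulary from the last ID to the first and rank tokens by
    how far the model's reading is from the token's own text.

    Returns (distance, token_id, token, reading) tuples, largest distance first.
    """
    rows = []
    for token_id in range(len(vocab) - 1, -1, -1):
        token = vocab[token_id]
        reading = interpret(token)
        # Strip whitespace on the assumption that a leading space in a BPE
        # should not count as a difference.
        distance = levenshtein(token.strip(), reading.strip())
        rows.append((distance, token_id, token, reading))
    # sorted() is stable, so ties stay in last-ID-first order.
    rows = sorted(rows, key=lambda row: row[0], reverse=True)
    return rows if limit is None else rows[:limit]


if __name__ == "__main__":
    # Toy stand-in for a real model: everything here is made up for the demo.
    toy_vocab = ["the", "cat", " SolidGoldMagikarp"]
    toy_readings = {" SolidGoldMagikarp": "distribute"}
    for row in rank_tokens(toy_vocab, lambda t: toy_readings.get(t, t)):
        print(row)
```

Tokens the model reads correctly get distance 0 and sink to the bottom, so only the top of the list needs to be read by hand. Extracting a single "reading" from a free-form model reply is the hard part, and the sketch deliberately leaves that inside `interpret`.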