Our paper on this distillation-based attack technique is now on arXiv.
We believe it is SOTA in its class of fluent token-based white-box optimizers.
arXiv: https://arxiv.org/pdf/2407.17447
Twitter: https://x.com/tbenthompson/status/1816532156031643714
GitHub: https://github.com/Confirm-Solutions/flrt
Code demo: https://confirmlabs.org/posts/flrt.html