Diffusion Guided NLP: better steering, mostly a good thing

Nathan Helm-Burger, 10 Aug 2024 19:49 UTC

13 points

AI, Machine Learning (ML), AI Control, AI Capabilities

I think this is a very promising method for improving the steering of LLMs, which is great for reducing risk from model-originating harms like deception.

The flip side is that it also increases misuse potential.

This is yet another way the safety gap could widen between closed-weight models with locked-down controls and open-weight models.
