Dan H comments on An Introduction to Representation Engineering—an activation-based paradigm for controlling LLMs

Dan H Jul 18, 2024, 5:49 AM
It’s worth noting that activations are one thing you can modify, but many of the most performant methods (e.g., LoRRA) modify the weights. (Representations = {weights, activations}, hence “representation” engineering.)
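
To make the distinction concrete, here is a minimal NumPy sketch on a toy linear layer. It is not LoRRA's training procedure or any library's API; every name in it (`W`, `steer`, `A`, `B`, `layer`) is hypothetical, and the random values stand in for quantities a real method would learn or extract. It only shows where each kind of edit lands: an activation edit leaves the weights alone and shifts the layer's output, while a weight edit (written here as a low-rank update, in the spirit of low-rank adapters) changes the layer itself.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank = 8, 8, 2

# Toy stand-ins: one linear layer and one input (assumptions, not a real model).
W = rng.normal(size=(d_out, d_in))
x = rng.normal(size=d_in)


def layer(weights, inputs):
    """Apply a bias-free linear layer."""
    return weights @ inputs


# (1) Activation edit: weights are untouched; a fixed vector is added
#     to the layer's output at inference time.
steer = rng.normal(size=d_out)  # hypothetical steering vector
h_activation_edit = layer(W, x) + steer

# (2) Weight edit: the layer itself changes via a low-rank update W + B @ A.
A = rng.normal(size=(rank, d_in))   # hypothetical low-rank factors
B = rng.normal(size=(d_out, rank))
W_edited = W + B @ A
h_weight_edit = layer(W_edited, x)

# The shift caused by the weight edit is B @ A @ x, so it varies with the input,
# whereas the activation edit adds the same vector for every input.
weight_shift = h_weight_edit - layer(W, x)
```

In this toy setup the activation edit adds the same offset regardless of `x`, while the weight edit produces an input-dependent change and persists without any inference-time intervention.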