Jacob Pfau comments on Agents vs. Predictors: Concrete differentiating factors

Jacob Pfau 8 Mar 2023 16:24 UTC
11 points
4
Generalizing this point, a broader differentiating factor between agents and predictors is: You can, in-context, limit and direct the kinds of optimization used by a predictor. For example, consider the case where you know myopically/locally-informed edits to a code-base can safely improve runtime of the code, but globally-informed edits aimed at efficiency may break some safety properties. You can constrain a predictor via instructions, and demonstrations of myopic edits; an agent fine-tuned on efficiency gain will be hard to constrain in this way.

It’s harder to prevent an agent from specification gaming / doing arbitrary optimization whereas a predictor has a disincentive against specification gaming insofar as the in-context demonstration provides evidence against it. I think of this distinction as the key differentiating factor between agents and simulated agents; also to some extent imitative amplification and arbitrary amplification

Nitpick on the history of the example in your comment; I am fairly confident that I originally proposed it to both you and Ethan c.f. bottom of your NYU experiments Google doc.
- evhub 17 Mar 2023 20:39 UTC
  2 points
  0
  Parent
  
  Nitpick on the history of the example in your comment; I am fairly confident that I originally proposed it to both you and Ethan c.f. bottom of your NYU experiments Google doc.
  
  Edited!