What about maximal scaffolding, or “fine-tune the model on its successes and failures in adversarial challenges,” probably starting from the base model?
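To make that concrete, here is a minimal sketch of what such a loop might look like, assuming a rejection-sampling-style setup; `generate_attempt`, `judge_outcome`, and `fine_tune` are hypothetical stand-ins for a real model API, a real challenge harness, and a real training step:

```python
import random

# --- hypothetical stand-ins; swap in a real model, harness, and trainer ---

def generate_attempt(model, scenario):
    """Sample a persuasion attempt from the current model (stubbed)."""
    return f"attempt v{model['version']} on {scenario}"

def judge_outcome(attempt):
    """Run the adversarial challenge and return success/failure (stubbed).
    The 2% is only the assumed human-scammer baseline from the text."""
    return random.random() < 0.02

def fine_tune(model, successes, failures):
    """Update the model on labeled episodes (stubbed SFT/DPO-style step)."""
    return {"version": model["version"] + 1}

# --- the loop itself: collect episodes, label them, retrain, repeat ---

model = {"version": 0}
for generation in range(5):
    episodes = []
    for i in range(2000):
        attempt = generate_attempt(model, f"scenario-{i}")
        episodes.append((attempt, judge_outcome(attempt)))
    successes = [a for a, ok in episodes if ok]
    failures = [a for a, ok in episodes if not ok]
    rate = len(successes) / len(episodes)
    print(f"gen {generation}: success rate {rate:.1%} over {len(episodes)} episodes")
    model = fine_tune(model, successes, failures)
```

The interesting output here is the trajectory: does the measured success rate climb across generations, or plateau?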
It seems like it would be extremely helpful to know what’s even possible here.
Are Gemini-scale models capable of better-than-human performance at any of these evals?
Once you achieve it, what does super persuasion look like, and how effective is it?
For example, if a human scammer succeeds 2 percent of the time (do you have a baseline crew of scammers, hired remotely, for these benchmarks?), does super persuasion succeed 3 percent of the time, or 30 percent? Does it scale with model capabilities, or slam into a wall at, say, 4 percent, where the other 96 percent of humans just can’t reliably be tricked?
Or does it really have no limit, like in the sci-fi stories …
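There is also a measurement question hiding in those numbers. Assuming independent trials and a standard two-proportion z-test, a 2-versus-30 percent gap is detectable with a few dozen targets per arm, but a 2-versus-3 percent uplift takes thousands. A rough back-of-envelope sketch:

```python
from math import sqrt
from statistics import NormalDist

def trials_needed(p0, p1, alpha=0.05, power=0.8):
    """Per-arm sample size needed to tell success rate p0 from p1
    (two-proportion z-test, normal approximation)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # 1.96 for alpha = 0.05
    z_b = NormalDist().inv_cdf(power)          # 0.84 for 80% power
    p_bar = (p0 + p1) / 2
    numerator = (z_a * sqrt(2 * p_bar * (1 - p_bar))
                 + z_b * sqrt(p0 * (1 - p0) + p1 * (1 - p1))) ** 2
    return int(numerator / (p1 - p0) ** 2) + 1

baseline = 0.02  # the assumed human-scammer success rate above
for uplift in (0.03, 0.04, 0.30):
    print(f"{baseline:.0%} vs {uplift:.0%}: ~{trials_needed(baseline, uplift):,} targets per arm")
```

So if super persuasion only buys a point or two over the human baseline, the bench needs thousands of attempts per condition to see it at all.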