Some people here inspire me to make predictions ;) So here’s my attempt:
My guess, mainly based on this image (linked from the post), is that he'd say it's a subcategory of "getting models to output things based only on their training data, while treating them as a black box and still assuming unexpected outputs will happen sometimes." He might also say "this might work well for training, but obviously not for an AGI," and "if we're going to talk about limiting a model's output, Redwood Research is more of the way to go," and perhaps "this will just advance AI faster."