I agree that the (unprompted) generative model is doing something kind of like: choose a random goal, then optimize it.
In some sense that does reflect the “plurality of realistic human goals.” But I don’t think it’s a good way to reflect that diversity. It seems like you want to either (i) be able to pick which goal you pursue, or (ii) optimize an aggregate of several goals.
Either way, I think that’s probably best reflected by a deterministic reward function, and you’d probably prefer to be mindful about what you are getting rather than randomly sampling from webtext. (Though as I mention in my other comment, I think there are other good reasons to want the pure generative model.)