Interesting; I hadn’t heard of DreamerV2. From a quick look at the paper, it looks like one might describe it as a step on the way to something like EfficientZero. Does that sound roughly correct?
Yes. They don’t share a common lineage, but are similar in that they’re both recent advances in efficient model-based RL. Personally speaking, I think this is the subfield to be closely tracking progress in, because 1) it has far-reaching implications in the long term and 2) it has garnered relatively little attention compared to other subfields.
We may extend this to older models in the future. But our goal right now is to focus on these models’ public safety risks as standalone (or nearly standalone) systems.
I see. If you’d like to visualize trends though, you’ll need more historical data points, I think.
Personally speaking, I think this is the subfield to be closely tracking progress in, because 1) it has far-reaching implications in the long term and 2) it has garnered relatively little attention compared to other subfields.
Thanks for the clarification — definitely agree with this.
If you’d like to visualize trends though, you’ll need more historical data points, I think.
Yeah, you’re right. Our thinking was that we’d be able to do this with future data points or by increasing the “density” of points within the post-GPT-3 era, but ultimately it will probably be necessary (and more compelling) to include somewhat older examples too.