For instance, the latter response obtains if the “pointing” is done by naive training.
Oh, sorry. I’m “uncertain” assuming Model-Based RL with the least-doomed plan that I feel like I more-or-less know how to implement right now. If we’re talking about “naïve training”, then I’m probably very pessimistic, depending on the details.
Also, as a reminder, my high credence in doom doesn’t come from high confidence in any one claim like this. You can maybe get one nine of confidence here; I doubt you can get three. My high credence in doom comes from the disjunctive nature of the failure modes.
That’s helpful, thanks!
UPDATE: The “least-doomed plan” I mentioned is now described in a more simple & self-contained post, for readers’ convenience. :)