I think that might be a generally good critique, but I don’t think it applies to this post (it may apply better to post #3 in the series).
I used “metal with knobs” and “beefy arm” as human-parsable examples, but the main point is detecting when something is out-of-distribution, which relies on the image being different in AI-detectable ways, not on the specifics of the categories I mentioned.
I don’t think this is necessarily a critique—after all, it’s inevitable that AI-you is going to inherit some anthropomorphic powers. The trick is figuring out what they are and seeing if it seems like a profitable research avenue to try and replicate them :)
In this case, I think this is an already-known problem, because detecting out-of-distribution images in a way that matches human requirements requires the AI’s distribution to be similar to the human distribution (and conversely, mismatches between the distributions allow for adversarial examples). But maybe there’s something different in part 4, where I think there’s some kind of “break down actions in obvious ways” power that might not be as well analyzed elsewhere (though it’s probably related to self-supervised learning of hierarchical planning problems).
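For concreteness, here’s a minimal sketch of the flavor of detection I have in mind, using the standard max-softmax-probability baseline; the classifier, logits, and threshold are all hypothetical stand-ins, and a real system would tune the threshold on held-out data:

```python
import numpy as np

def softmax(logits):
    # Shift by the max for numerical stability before exponentiating.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def is_out_of_distribution(logits, threshold=0.5):
    # Flag inputs whose top softmax probability falls below `threshold`
    # (the max-softmax-probability OOD baseline). The threshold here is
    # an illustrative stand-in, not a recommended value.
    confidence = softmax(logits).max(axis=-1)
    return confidence < threshold

# Hypothetical logits from some image classifier: the first row is
# confidently classified (in-distribution), the second is diffuse
# (possibly out-of-distribution).
logits = np.array([[8.0, 0.5, 0.2],
                   [1.1, 1.0, 0.9]])
print(is_out_of_distribution(logits))  # [False  True]
```

The point being: this kind of detector only fires when the image differs in ways the model’s learned features register, which is exactly where the mismatch between the AI’s distribution and the human one bites.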
I don’t think critiques are necessarily bad ^_^