I haven’t specified anything about the algorithms, but they will maybe somehow have to be different. The point is that the format of the human feedback is different. Really this post is about the format in which humans provide feedback rather than about the structure of the AI systems (i.e. a difference in method of generating the training signal rather than a difference in learning algorithm).
I haven’t specified anything about the algorithms, but they will maybe somehow have to be different. The point is that the format of the human feedback is different. Really this post is about the format in which humans provide feedback rather than about the structure of the AI systems (i.e. a difference in method of generating the training signal rather than a difference in learning algorithm).