This seems non-impossible. On the other hand, humans have categories not just because of simplicity, but also because of usefulness. In order to have the best shot at regenerating human categorizations, our world-modeler would have to have the resources humans have and use its concepts for the activities humans do.
And of course, even if you manage to make a bunch of categories, many of which correspond to human categories, you still have to pick out specific categories in order to communicate or set up a goal system. And you have to pick them out without knowing what the categories are beforehand, or else you’d just write them down directly.
This seems non-impossible. On the other hand, humans have categories not just because of simplicity, but also because of usefulness.
Good point, but it seems like some categories (like person) are useful even for paperclip maximizers. I really don’t see how you could completely understand media and documents from human society yet be confused by a categorization between people and non-people.
And of course, even if you manage to make a bunch of categories, many of which correspond to human categories, you still have to pick out specific categories in order to communicate or set up a goal system.
Right, you can “index” a category by providing some positive and negative examples. If I gave you some pictures of oranges and some pictures of non-oranges, you could figure out the true categorization because you consider the categorization of oranges/non-oranges to be simple. There’s probably a more robust way of doing this.
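As a rough illustration of the indexing idea, here is a minimal Python sketch. It assumes the world-modeler has already produced a finite list of candidate categories, each with some complexity score. The `Category` class, the `index_category` function, the toy items, and the complexity numbers are all made up for the example; none of this is a claim about how a real world-modeler would represent its categories.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Category:
    name: str
    complexity: int  # hypothetical description length; lower means simpler
    contains: Callable[[dict], bool]  # membership test for one item


def index_category(
    candidates: Sequence[Category],
    positives: Sequence[dict],
    negatives: Sequence[dict],
) -> Optional[Category]:
    """Return the simplest candidate that fits every labeled example."""
    consistent = [
        c
        for c in candidates
        if all(c.contains(x) for x in positives)
        and not any(c.contains(x) for x in negatives)
    ]
    # Ties go to the earliest candidate; None if nothing fits the examples.
    return min(consistent, key=lambda c: c.complexity, default=None)


# Toy items, described by hand-picked features (an assumption of the sketch).
orange = {"kind": "orange", "color": "orange", "round": True}
basketball = {"kind": "ball", "color": "orange", "round": True}
carrot = {"kind": "carrot", "color": "orange", "round": False}
lemon = {"kind": "lemon", "color": "yellow", "round": False}

CANDIDATES = [
    Category("orange-colored", 1, lambda x: x["color"] == "orange"),
    Category("round", 1, lambda x: x["round"]),
    Category("orange (fruit)", 2, lambda x: x["kind"] == "orange"),
]

too_few = index_category(CANDIDATES, [orange], [lemon])
enough = index_category(CANDIDATES, [orange], [lemon, basketball, carrot])
print(too_few.name)
print(enough.name)
```

With only the lemon as a negative example, the sketch settles on “orange-colored”, which is simpler than the intended category and still fits the data. It takes the basketball and the carrot as extra negatives to single out the fruit, which is one reason a more robust way of doing this would be nice.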