Mostly agree. A sophisticated user might have some great feedback on how a website ranks its products, but shouldn’t and doesn’t want to have access to the internals of the algorithm. So giving some users a slightly better surface area for interaction aside from “being part of the training data” seems like an important problem to solve. Could be useful all the way toward AGI as we would need a story of how a particular person still has some capacity for future influence.
Yeah, good point. Some balance needs to be struck in such a scenario where people are given the power to do some sort of customization, but not so much that they can warp the intended purpose of the model into being directly harmful or helping them be harmful.
Yeah defining “harm”, or more formally, a “non-beneficial modification to a human being” is a hard task and is in many ways the core problem I am pointing at. Allowing people to take part in defining what is “harmful” to themselves is both potentially helpful as it brings local information and tricky because people may have already been ensnared by a hostile narrow AI to misunderstand “harm.”
Mostly agree. A sophisticated user might have some great feedback on how a website ranks its products, but shouldn’t and doesn’t want to have access to the internals of the algorithm. So giving some users a slightly better surface area for interaction aside from “being part of the training data” seems like an important problem to solve. Could be useful all the way toward AGI as we would need a story of how a particular person still has some capacity for future influence.
Yeah, good point. Some balance needs to be struck in such a scenario where people are given the power to do some sort of customization, but not so much that they can warp the intended purpose of the model into being directly harmful or helping them be harmful.
Yeah defining “harm”, or more formally, a “non-beneficial modification to a human being” is a hard task and is in many ways the core problem I am pointing at. Allowing people to take part in defining what is “harmful” to themselves is both potentially helpful as it brings local information and tricky because people may have already been ensnared by a hostile narrow AI to misunderstand “harm.”