Yeah, good point. Some balance needs to be struck in such a scenario: people are given the power to do some sort of customization, but not so much that they can warp the intended purpose of the model into being directly harmful or into helping them be harmful.
Yeah, defining "harm" — or, more formally, a "non-beneficial modification to a human being" — is a hard task, and it is in many ways the core problem I am pointing at. Allowing people to take part in defining what is "harmful" to themselves is both potentially helpful, because it brings in local information, and tricky, because people may have already been ensnared by a hostile narrow AI into misunderstanding "harm."