We have a points system in our family to incentivize the kids to do their chores. But we have to regularly update the rules because it turns out that there are ways to optimize for the points that we didn’t anticipate and that don’t really reflect what we actually want the kids to be incentivized to do. Every time this happens I think—ha, alignment failure!
I forgot about downvotes. I'm going to add this to the guidelines.
Here we are: a concrete example of alignment failure.