As I work towards becoming less confused about what we mean when we talk about values, I find that it feels a lot like working on a jigsaw puzzle where I don’t know what the picture is. Worse, all the pieces have been scattered around the room, so before I can even figure out how they fit together, or what they fit together to depict, I first have to find them: digging between couch cushions, looking under the rug, and checking behind the bookcase.
Yes, we have some pieces already, and others think they can infer what the picture is from those (it’s a bear! it’s a cat! it’s a woman in a fur coat!). As I work, I find it helpful to keep updating my own guess: even when the guess is wrong, it sometimes suggests new ways to combine the pieces, or hints at which pieces might be missing so I know what to go look for. But it also often feels like I’m failing all the time, because I’m updating rapidly on new information, and that keeps changing my best guess.
I suspect this is a common experience for folks working on AI safety and many other complex problems, so I figured I’d share this metaphor I recently hit on for making sense of what this kind of work is like.