I’d definitely be interested in your thoughts about preferences when you get them into a shareable shape.
In some sense, what humans “really” have is just atoms moving around; all talk of mental states and so on is some level of convenient approximation. So when you say you want to talk about a different sort of approximation from Stuart's, the thing I’m immediately curious about is “how can you make your way of talking about humans convenient for getting an AI to behave well?”
You can probably get some clues to my thinking already. I used to take an approach much like Stuart’s, but I now think that’s the wrong abstraction. The thing I’ve written recently that most points toward my current thinking is “Let Values Drift”, which I wrote mostly because it was the first topic that really started to catalyze my thinking about human values.