Gordon Seidoh Worley comments on Research Agenda in reverse: what would a solution look like?

Gordon Seidoh Worley 8 Jul 2019 17:34 UTC
LW: 4 AF: 2
I endorse what you propose in the first paragraph. I do think a theory of human preferences is necessary and that at least someone should work on it (if I didn’t think this, I probably wouldn’t be doing it myself). That said, I don’t necessarily think anyone should switch to it, all else equal, and I wouldn’t say we should encourage folks to work on it more than other problems as a general policy. There’s a lot to be done, and I remain uncertain about prioritization, so I can’t make a strong recommendation beyond “let’s make sure we don’t fail to work on as much of what seems relevant as possible”.
So it sounds like we only disagree on the necessity aspect, and that seems to be the result of an inferential gap I’m not sure how to bridge yet. That is, my reasons for believing it to be necessary hinge in part on deeper beliefs that we may not share and that I haven’t figured out how to make explicit. That’s good to know, because it points towards something worth thinking about and addressing, so that both people already doing AI safety work and new entrants may more readily accept it as important and useful work.
