I think most people have expectations regarding e.g. how explicitly systems will represent their preferences, to what extent they will have preferences at all, how those will relate to the optimization objectives used in ML training, how well they will be understood by humans, etc.
Then there are a bunch of different things you might want: articulations of particular views on some of those questions, stories that (in virtue of being concrete) exhibit a whole set of guesses and show how they can lead to a good or bad outcome, etc. My bullet points were mostly about the exercise of fleshing out a particular story (which is therefore most likely to be wrong), rather than e.g. thinking through particular questions about the future.