Perhaps I’m missing something (I don’t work in AI research), but isn’t the obvious first stop Amodei et al.’s Concrete Problems in AI Safety? Apologies if you already know about this paper and meant something else.