I wrote up a bunch of my high-level views on the MIRI dialogues in this review, so let me say some things that are more specific to this post.
Since the dialogues were written, I keep coming back to the question of the degree to which consequentialism is a natural abstraction that will show up in AI systems we train, and while this dialogue had some frustrating parts where communication didn't go perfectly, I still think it has some of the best intuition pumps for how to think about consequentialism in AI systems.
The other part I liked the most was actually the points about epistemology. In particular, the point that "look, most correct theories will not make amazing novel predictions, instead they will just unify existing sets of predictions that we've made for shallower reasons" is one where the explanation clicked for me better than previous explanations that covered similar topics.