Curated. This post gives me a lot of context on your prior writing (unaligned benchmark, strategy stealing assumption, iterated amplification, imitative generalization), it helps me understand your key intuitions behind the plausibility of alignment, and it helps me understand where your research is headed.
When I read Embedded Agency, I felt like I then knew how to think productively about the main problems MIRI is working on by myself. This post leaves me feeling similarly about the problems you’ve been working on for the past 6+ years.
So thanks for that.
I’d like to read a version of this post where each example is 10x the length and analyzed more thoroughly… though I could just read all of your previous posts on each subject, which are fairly technical. (Perhaps Mark Xu will write such a post; he did a nice job previously on Solomonoff Induction.)
I’d also be pretty interested in people writing more posts that present arguments for/against plausible stories of how Imitative Generalization could fail, or that flesh out the details of such a story so we can see more clearly whether it is indeed plausible. Basically, making contributions in the ways you outline.
Aside: Since the post was initially published, some of the heading formatting was lost in an edit, so I fixed that before curating it.
Edit: Removed the line “After reading it I have a substantially higher probability of us solving the alignment problem.” Understanding Paul’s research is a big positive, but I’m not actually sure I stand by it leading to a straightforward change in my probability.