Much ink has been spilled on the difficulty of trying to solve a problem ahead of time and without any feedback loop; I won’t rehash those arguments at length.
Can you point me to some readings, especially alignment-related stuff? (No need to rehash anything.) I’ve been reading LW on and off since ~2013 and have somehow missed every post related to this, which is kind of embarrassing.
Unfortunately, this feels like a subject that’s often discussed in asides. I have a feeling it came up more than once during the 2021 MIRI Conversations, but I could be misremembering.
Here are some of mine:
https://www.alignmentforum.org/posts/72scWeZRta2ApsKja/epistemological-vigilance-for-alignment
https://www.alignmentforum.org/posts/FQqcejhNWGG8vHDch/on-solving-problems-before-they-appear-the-weird
Thanks!