Usually I bounce off of Reframing Superintelligence and your recent work. I don’t know what’s different about this article, but I managed to understand what you were getting at without having to loop over each paragraph five times. It’s like I’m reading Drexler circa 1992, and I love it.
RE simulators: Isn’t the whole problem with simulators that we’re not sure how to extract the information we want from the system? If we knew how to get one to simulate alignment researchers working on the problem for as long as it takes to solve it, that would already be a major alignment advance. Similarly, with my hazy conception of narrow AI systems, I have the impression that getting weak AI systems to do substantial alignment work would itself constitute a major alignment advance. I guess I want to know how much work you estimate is needed before we could design a more detailed version of what you’ve outlined and expect it to work.