Just to return for a moment to what I wrote: I don't mean to be making an assessment here of what counts as "dangerous", but rather to provide this service for things people themselves think are dangerous. Where to draw the line on which capabilities research is too dangerous to publish is something I have only very weak opinions on. For example, if you figured out how to make recursive self-improvement work in a way that doesn't immediately result in wild divergence and can stably produce better results over many iterations, I'd say that's dangerous; short of that, I'm not sure where you'd draw the line.