Even if humanity doesn’t have a huge mood shift, I still expect the next 10 years to have a lot more people working on stuff that actually helps than the previous 10 did.
What kinds of things are you imagining here? I’m worried that, on the current margin, people coming into safety will predominantly go into interpretability/evals/etc., because that’s the professional/legible work we have on offer, even though by my lights the rate of progress and the methods/aims of these fields are nowhere near enough to get us to alignment in ~10 years (in worlds where alignment is not trivially easy, which is the world I suspect we’re in). My own hope for another ten years is more like “that gives us some space and time to develop a proper science here,” which at the current stage doesn’t feel very bottlenecked by the number of people. But I’m curious what your thoughts are on the “adding more people pushes us closer to alignment” question.