I’m annoyed by EY’s (and maybe MIRI’s?) dismissal of all other alignment work, and by how seriously that dismissal seems to be taken here, given their track record of choosing research agendas with very indirect impact on alignment.
For what it’s worth, my sense is that EY’s track record is best in 1) identifying problems and 2) understanding the structure of the alignment problem.
And, like, I think it is possible that you end up in situations where the people who understand the situation best end up the most pessimistic about it. If you’re trying to build a bridge to the moon, in fact it’s not going to work, and any determination applied there is going to get wasted. I think I see how a “try to understand things and cut to the heart of them” approach notices when it’s in situations like that, and I don’t see how a “move the ball forward from where it is now” approach notices when it’s in situations like that.
Agreed on the track record, which is part of why it’s so frustrating that he doesn’t give more details and feedback on why all these approaches are doomed in his view.
That being said, I disagree with the second part, probably because we don’t mean the same thing by “moving the ball”?
In your bridge example, “moving the ball” looks to me like trying to see what problems the current proposal could have, how you could check them, what would be your unknown unknowns. And I definitely expect such an approach to find the problems you mention.
Maybe you could give me a better model of what you mean by “moving the ball”?
Oh, I was imagining something like “well, our current metals aren’t strong enough, what if we developed stronger ones?”, and then focusing on metallurgy. And this is making forward progress—you can build a taller tower out of steel than out of iron—but it’s missing more fundamental issues like “you’re not going to be able to drive on a bridge that’s perpendicular to gravity, and the direction of gravity will change over the course of the trip” or “the moon moves relative to the earth, such that your bridge won’t be able to be one object”, which will sink the project even if you can find a supremely strong metal.
For example, let’s consider Anca Dragan’s research direction that I’m going to summarize as “getting present-day robots to understand what humans around them want and are doing so that they can achieve their goals / cooperate more effectively.” (In mildly adversarial situations like driving, you don’t want to make a cooperatebot, but rather something that follows established norms / prevents ‘cutting’ and so on, but when you have a human-robot team you do care mostly about effective cooperation.)
My guess is this 1) will make the world a better place in the short run under ‘normal’ conditions (most obviously through speeding up adoption of autonomous vehicles and making them more effective) and 2) does not represent meaningful progress towards aligning transformative AI systems. [My model of Eliezer notes that actually he’s making a weaker claim, which is something more like “he’s not surprised by the results of her papers”, which still allows for them to be “progress in the published literature”.]
When I imagine “how do I move the ball forward now?” I find myself drawn towards projects like those, and less to projects like “stare at the nature of cognition until I see a way through the constraints”, which feels like the sort of thing that I would need to do to actually shift my sense of doom.