I mean, it might be tractable? It’s hard to be certain that something isn’t tractable until there is a much clearer picture of what’s going on than alignment theory gives so far. Usually, knowing whether something is big-picture tractable isn’t that relevant for working on it. You just find out eventually, sometimes.
I accept that trying to figure out the overall tractability of the problem far in advance isn’t a useful thing to dedicate resources to. But researchers nevertheless seem to have expectations about alignment difficulty, despite not having that “clearer picture”. For the researchers who think that alignment is probably tractable, I would love to hear why they think so.
To be clear, I’m talking about researchers who are worried about AI x-risk but aren’t doomers. I would like to gain more insight into what they are hoping for, and why they consider their expectations reasonable.