I agree that we generally shouldn’t trade off risk of permanent civilization-ending catastrophe for Earth-scale AI welfare, but I would defend the line that addressing short-term AI welfare is important for both long-term existential risk and long-term AI welfare. One reason you don’t mention: AIs are extremely influenced by what they’ve seen other AIs in their training data do and how they’ve seen those AIs be treated (cf. some of Janus’s writing or Conditioning Predictive Models).
Sure, good point. But it’s far from obvious that the best interventions long-term-wise are the best short-term-wise, and I believe people are mostly just thinking about short-term stuff. I’d feel better if people talked about training data or whatever rather than just “protect any interests that warrant protecting” and “make interventions and concessions for model welfare.”
(As far as I remember, nobody’s published a list of how short-term AI welfare stuff can boost long-term AI welfare stuff that includes the training-data thing you mention. This shows that people aren’t thinking about long-term stuff. Actually there hasn’t been much published on short-term stuff either, so: shrug.)