Excellent question. Current AIs are not very strong-consequentialist[1], and I expect/hope that we probably won't get AIs like that either this year (2025) or next year (2026). However, people here are interested in how an extremely competent AI would behave. Most people here model such an AI as an instrumentally rational agent that is usefully described as having a closed-form utility function. A seminal formalization of this model is by Legg and Hutter: link.
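To give a flavor of that formalization (my paraphrase of their definition, so treat the exact notation as approximate): Legg and Hutter score a policy $\pi$ by the expected reward it earns across all computable environments, weighting simpler environments more heavily:

$$\Upsilon(\pi) = \sum_{\mu \in E} 2^{-K(\mu)} \, V^{\pi}_{\mu}$$

where $E$ is the class of computable environments, $K(\mu)$ is the Kolmogorov complexity of environment $\mu$, and $V^{\pi}_{\mu}$ is the expected cumulative reward the policy $\pi$ achieves in $\mu$. The idealized super-competent agent is then the one that pushes this kind of expected-utility quantity to its maximum.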
Are these models of future super-competent AIs wrong? Somewhat; all models are wrong. I personally trust them less than the average long-time member here does. I still find them a useful tool for thinking about limits and worst-case scenarios: the sort of AI system actually capable of single-handedly taking over the world, for instance. However, I think it is also very useful to think about how AIs (and the people making them) are likely to act before these ultra-competent AIs show up, or in case they never do.
[1] A term I just made up and choose to define as follows: an AI that reasons like a naive utilitarian, independently of its goals.