The fact that AI alignment research is 99% about control, and 1% (maybe less?) about metaethics (in the sense of how we would even aggregate the utility functions of all humanity) hints at what is really going on, and that's enough said.
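[Editor's note: a toy sketch of why the aggregation question is nontrivial; the outcomes and numbers below are made up for illustration and come from nothing in this thread. Even two standard aggregation rules can disagree about which outcome is "best".]

```python
# Hypothetical utilities of two outcomes for three people.
outcomes = {
    "A": [10, 10, 10],   # everyone does moderately well
    "B": [30, 30, -5],   # two do great, one is harmed
}

def total_utility(us):
    """Classical utilitarian aggregation: sum of utilities."""
    return sum(us)

def maximin(us):
    """Rawlsian aggregation: judge by the worst-off person."""
    return min(us)

best_by_sum = max(outcomes, key=lambda o: total_utility(outcomes[o]))
best_by_maximin = max(outcomes, key=lambda o: maximin(outcomes[o]))

print(best_by_sum)      # "B": 55 > 30
print(best_by_maximin)  # "A": 10 > -5
```

The two rules pick different outcomes from the same data, so "align the AI with humanity's utility function" is underspecified until the aggregation question is settled.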
Have you heard about CEV and Fun Theory? In an earlier, more optimistic time, this was indeed a major focus. What changed is that we became more pessimistic and decided to put first things first: if you can't control the AI at all, it doesn't matter what metaethics research you've done. Also, the longtermist EA community still thinks a lot about metaethics relative to literally every other community I know of, on par with and perhaps slightly more than my philosophy grad student friends. (That's my take, at any rate; I haven't been around that long.)
CEV was written in 2004, and the Fun Theory sequence about 13 years ago. I couldn't find any recent MIRI paper about metaethics (granted, I haven't gone through all of them). For any utilitarian, the metaethics question matters just as much as the control question: what good is controlling an AI if it ends up aligned with some really bad values? An AI controlled by a sadistic sociopath is infinitely worse than a paper-clip maximizer. Yet all the research is focused on control, and it's very hard not to be cynical about that. If some people believe they are creating a god, it's selfishly prudent to make sure you're the one holding the reins to that god. I don't get why anyone would place blind trust in the benevolence of Peter Thiel (who finances this) or of other people who will suddenly have godly powers; trusting them to care for all humanity seems naive given all we know about how power corrupts and how competitive and selfish people are. Most people are not utilitarians, so as a quasi-utilitarian I'm pretty terrified of what kind of world would be created by an AI controlled by the typical non-utilitarian person.
My claim was not that MIRI is doing lots of work on metaethics. As far as I know they are focused on the control/alignment problem. This is not because they think it’s the only problem that needs solving; it’s just the most dire, the biggest bottleneck, in their opinion.
You may be interested to know that I share your concerns about what happens after (if) we succeed at solving alignment. So do many other people in the community, I assure you. (Though I agree that, on the margin, more quiet awareness-raising about this would plausibly be good.)
http://www.metaethical.ai is the state of the art as far as I’m concerned…