Thanks for sharing this; it’s a really helpful window into the world of AI ethics. I most of all liked this comment you made early on, however: ”...making modern-day systems behave ethically involves a bunch of bespoke solutions only suitable to the domain of operation of that system, not allowing for cross-comparison in any useful way.”
What this conjures in my mind is the hypothetical alternative of a transformer-like model that could perform zero-shot evaluation of ethical quandaries, and return answers that we humans would consider “ethical”, across a wide range of settings and scenarios. But I’m sure that someone has tried this before, e.g. training a BERT-type text classifier to distinguish between ethical and unethical completions of moral dilemma setups based on human-labeled data, and I guess I want to know why that doesn’t work (as I’m sure it doesn’t, or else we would have heard about it).
The problem is that it doesn’t interface well with decision-making systems in e.g. cars or hospitals. Those specialized systems have no English-language interface, and at the same time they’re making decisions in complicated, highly-specific situations that might be difficult for a generalist language model to parse.
Thanks for sharing this; it’s a really helpful window into the world of AI ethics. I most of all liked this comment you made early on, however: ”...making modern-day systems behave ethically involves a bunch of bespoke solutions only suitable to the domain of operation of that system, not allowing for cross-comparison in any useful way.”
What this conjures in my mind is the hypothetical alternative of a transformer-like model that could perform zero-shot evaluation of ethical quandaries, and return answers that we humans would consider “ethical”, across a wide range of settings and scenarios. But I’m sure that someone has tried this before, e.g. training a BERT-type text classifier to distinguish between ethical and unethical completions of moral dilemma setups based on human-labeled data, and I guess I want to know why that doesn’t work (as I’m sure it doesn’t, or else we would have heard about it).
https://delphi.allenai.org/
Definitely been done :D
The problem is that it doesn’t interface well with decision-making systems in e.g. cars or hospitals. Those specialized systems have no English-language interface, and at the same time they’re making decisions in complicated, highly-specific situations that might be difficult for a generalist language model to parse.