Humanity itself is so diverse that the question of what it would mean to be aligned with humanity is a tough one. You can’t hope for unanimous support of any proposal; even for something that would help a bunch of people and cost nothing at all, you could probably find some people who would say “the current system is bad and needs to collapse, and this would delay the collapse and be net bad”.
Additionally, I think the majority of humanity has not studied or thought through some important basics, for example in economics, which leads them to support plenty of things (e.g. price-gouging laws) that I consider mind-bogglingly stupid. I could enumerate a long list of policies that I think are (a) probably correct and (b) opposed by >90% of the humans who have an opinion on them.
So my views are not “aligned”. My actions are another matter. Of course, in this context we’re somewhat interested in what happens if I get the power to put my prescriptions into practice.
If I had magical powers… I’ve been thinking that the “aligned” thing to do would be to help humans grow, via unambiguously good changes, until they’re smart and educated enough to engage with my views on their merits (and, I expect, end up agreeing with a decent number of them). Changes like: making brains simply work better (I’m sure there would be tradeoffs eventually, but I suspect there are a lot of mutations that just make brains slightly worse with no benefits, and that eliminating those would be worth a lot of IQ points), making lifespans and healthspans longer (doubling them would be a nice start), ameliorating or eliminating lots of problems that interfere with people growing (e.g. sleep issues, psychological issues)… I’m sure if I looked into it, I could find a lot more.
Once the majority of people are emotionally healthy geniuses, things should get a lot better, and then I could reevaluate the situation and negotiate with the new people. As long as it hasn’t resulted in some dystopia or rule-by-dictator or in the geniuses blowing up the world, I don’t think I’d be tempted into any forceful interventions.
I think that’s the most important part of the intersection between “aligned with humanity as carefully as you’re going to get” and “transforming society in an enormously positive way”. If that were an option, I think I’d take it. In that respect, I could call myself “aligned”. (Though if I were only part god and the tools available to me had severe downsides people wouldn’t agree to… that might be a problem.)
I think that’s what an ideally “aligned” god would do… along with possibly taking forceful actions to prevent the world from getting destroyed in the meantime, such as by other nascent gods—which is unfortunately hard to distinguish from unaligned “take over the world” behavior. It would be nice if the world were such that, once one god was created and people proved its abilities, other people would voluntarily stop trying to create their own gods in exchange for getting some value from the first god. It seems like prior agreements to do that would help.
I think the majority of humanity has not studied or thought through some important basics, for example in economics, which leads them to support plenty of things (e.g. price-gouging laws) that I consider mind-bogglingly stupid.
Interestingly, I just discussed a similar issue with a friend and we came up with a solution. Obviously, an aligned AI cares about people’s subjective opinions, but that doesn’t mean it’s not allowed to talk to them or persuade them. Imagine a list of TED-style videos tailored specifically for you on each pressing issue that requires you to change your mind.
On the one hand, this presumes that people trust the AI enough to be persuaded, but keep in mind that we’re dealing with a smart, tireless agent. The only thing it asks is that you keep talking to it.
The last resort would be to press people on “if you think that’s a bad idea, are you ready to bet that implementing it will make the world worse?” and create a virtual prediction market between supporters and opponents (a rough sketch of what that might look like is below).
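To make that last suggestion a bit more concrete, here is a minimal sketch of what such a market might look like, in Python. Everything here (the Market class, the pro-rata payout rule, the numbers) is my own illustrative assumption rather than anything specified above; a real system would also need an agreed-upon metric and time horizon for resolving the bet.

```python
# Hypothetical sketch: supporters and opponents of a proposal stake (virtual)
# points on whether implementing it makes things worse by some agreed metric.
# When the outcome is resolved, the losing side's stakes are split among the
# winners in proportion to what each winner staked.

from dataclasses import dataclass, field

@dataclass
class Market:
    question: str                                    # e.g. "Will policy X make metric M worse?"
    stakes_yes: dict = field(default_factory=dict)   # bettor -> points staked on "yes, worse"
    stakes_no: dict = field(default_factory=dict)    # bettor -> points staked on "no, not worse"

    def bet(self, bettor: str, worse: bool, points: float) -> None:
        side = self.stakes_yes if worse else self.stakes_no
        side[bettor] = side.get(bettor, 0.0) + points

    def implied_probability_worse(self) -> float:
        # The market's aggregate credence that the policy makes things worse.
        yes, no = sum(self.stakes_yes.values()), sum(self.stakes_no.values())
        return yes / (yes + no) if (yes + no) > 0 else 0.5

    def resolve(self, actually_worse: bool) -> dict:
        winners = self.stakes_yes if actually_worse else self.stakes_no
        losers = self.stakes_no if actually_worse else self.stakes_yes
        pot = sum(losers.values())
        total_winning = sum(winners.values())
        # Each winner gets their stake back plus a pro-rata share of the losers' pot.
        return {name: stake + pot * stake / total_winning
                for name, stake in winners.items()} if total_winning else {}

# Example: one opponent bets the policy makes things worse, one supporter bets it doesn't.
m = Market("Will implementing policy X make metric M worse within 5 years?")
m.bet("opponent_1", worse=True, points=10)
m.bet("supporter_1", worse=False, points=30)
print(m.implied_probability_worse())    # 0.25
print(m.resolve(actually_worse=False))  # {'supporter_1': 40.0}
```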
P.S. This all assumes that the AI is a non-violent communicator. There are many ways to pull people’s strings to persuade them; I presume that we know how to distinguish between manipulative and informative persuasion.
A hint on how to do that: the AI should care about people making INFORMED decisions about THEIR subjective future, not about getting their opinions “objectively” right.
Several people have suggested that a sufficiently smart AI, with the ability to talk to a human as much as it wanted, could persuade the human to “let it out of the box” and give it access to the things it needs to take over the world. This seems plausible to me, say at least 10% probability, which is high enough that it’s worth trying to avoid. And it seems to me that, if you know how to make an AI that’s smart enough to be very useful but will voluntarily restrain itself from persuading humans to hand over the keys to the kingdom, then you must have already solved some of the most difficult parts of alignment. Which means this isn’t a useful intermediate state that can help us reach alignment.
Separately, I’ll mention my opinion that the name of the term “non-violent communication” is either subtle trolling or rank hypocrisy. Because a big chunk of the idea seems to be that you should stick to raw observations and avoid making accusations that would tend to put someone on the defensive… and implying that someone else is committing violence (by communicating in a different style) is one of the most accusatory and putting-them-on-the-defensive things you can do. I’m curious, how many adherents of NVC are aware of this angle on it?
I don’t think NVC tries to put down an opponent; it’s mostly about how you present your ideas. I think it models an opponent as “he tries to win the debate without thinking about my goals; let me think about both my goals and theirs, so I’m one step ahead.” Which is a bit presumptuous and condescending, but not exactly accusatory.