I think this post may be what you’re referring to. I really like this comment in that post:
The Ring is stronger and smarter than you, and doesn’t want what you want. If you think you can use it, you’re wrong. It’s using you, whether you can see how or not.
Providing for material needs is less than 0.0000001% of the range of powers and possibilities that an AGI/ASI offers.
Consider the trans debate. Disclaimer: I’m not trying to take any side in this debate, and am using it for illustrative purposes only. A hundred years ago, someone saying “I feel like I’m in the wrong body and feel suicidal” could only be met with one compassionate response: to seek psychological or spiritual help. Now scientific progress has advanced enough that it can be hard to determine what the compassionate response is. Do we have enough evidence to determine whether puberty blockers are safe? Are hospitals acting in the best interests of their patients, or trying to maximize profit from expensive surgeries? If a person is prevented from getting the surgery and kills themselves, should the person who kept them from getting the surgery be held liable? If a person does get the surgery but later regrets it, should the doctors who encouraged them be held liable? Should doctors who argue against trans surgery lose their medical licenses?
ASI will open up a billion possibilities on such a scale that, if the difficulty of determining whether eating human babies is moral is a 1.0 and the difficulty of determining whether encouraging trans surgeries is moral is a 2.0, each of those possibilities will be in the millions. Our sense of morality will just not apply, and we won’t be able to reason ourselves into a right or wrong course of action. That which makes us human will drown in the seas of black infinity.
I’m sorry, I just don’t have time to engage on these points right now. You’re talking about the alignment problem. It’s the biggest topic on LessWrong. You’re assuming it won’t be solved, but that’s hotly debated among people like me who spend tons of time on the details of the debate.
My recommended starting point is my Cruxes of disagreement on alignment difficulty post. It explains why some people think it’s nearly impossible, some think it’s outright easy, and people like me who think it’s possible but not easy are working like mad to solve it before people actually build AGI.
Providing for material needs is less than 0.0000001% of the range of powers and possibilities that an AGI/ASI offers.
Imagine a scenario where we are driving from Austin to Fort Worth. The range of outcomes where we arrive at our destination is perhaps less than 0.0000001% of the total range of outcomes. There are countless potential interruptions that might prevent us from arriving at Fort Worth: traffic accidents, vehicle breakdowns, family emergencies, sudden illness, extreme weather, or even highly improbable events like alien encounters. The universe of possible outcomes is vast, and arriving safely in Fort Worth represents just one specific path through this possibility space.
Yet despite this, we reasonably expect to complete such journeys under normal circumstances. We don’t let the theoretical multitude of failure modes prevent us from making the trip. Drives like this typically succeed when basic conditions are met—a functioning vehicle, decent weather, and an alert driver.
So, as Seth Herd correctly points out, it all ends up depending on whether we can manage to align AGI (and deal with other issues such as governance or economy). And that’s a very large topic with a very wide range of opinions.