This comment is just a few related thoughts I’ve had on the subject. Hopefully it’s better than nothing (the karma count of my previous comments makes me doubtful).
I’m having a harder time coming up with counter-examples than examples of paperclip maximizers.

Weeds in my garden multiplying, cancer tumors growing, internet memes spreading. Pretty much any system has a ‘direction’ in which it evolves, so everything from simple mathematical laws to complex human situations seems to show behaviour and risks similar to those of a paperclip maximizer.

A maximizing agent is aligned with itself and with every dependency of its goal, so in a sense every dependency is “protected” by it. You could consider everything in this dependency-chain (e.g. Fnargl’s goal depends on humans, so he doesn’t kill them) to be part of a single system, and every independent factor to be “outside” of it. An “us vs them” mentality is not as harmful to life if all intelligent life is included in the “us”.
Sadly, I don’t think corporations are a good example, since lobbying and corruption can help them bend the law in their favour.

We will likely see a similar problem with humanity soon: instead of aligning society with human values, and without this being necessary, we will start engineering humans to be aligned with the system. Healthy behaviour which is harmful to society is not allowed, and if anyone doesn’t fit into modern society (which isn’t very well aligned with human needs), we give them medication which increases their tolerance of modern society, rather than attempting to create a society in which a wider variety of people can thrive.

I don’t think that any complex system is aligned with human *values*, though, just with human well-being in some form, so any effective means will suffice, including deception. Also, what matters to us seems to exist only in the micro-states, and the macro-states, through regression to the mean, effectively delete our individual desires by reducing them to the average. Macro-states can still be said to be a result of average human behaviour, but if we start handing control over to algorithms, then even that human influence disappears. It’s not only our imperfections that get optimized away but our morality as well.
As a side note, we can’t always count on being as lucky as we would be in a Fnargl-ocracy, which plans for the future. I’d argue that most of us plan on shorter time-spans, leading to worse outcomes than longer time-spans would. If we worked on much longer time-spans, I think most problems would be solved, including pressing ones like global warming. This may be sufficient, depending on Moloch.
I mainly agree with the post, to the point that I don’t see why the intelligent people on here are so preoccupied with super-intelligence. In my view, humans are doomed even if we don’t succeed in creating super-intelligence. We seem to be handing control over to something inorganic, while creating things with much more fitness than ourselves, so that we ultimately become superfluous, and I’d argue that society is misaligned with humanity already, which explains most of the unnecessary suffering in the world.

Also, I believe that human agency is inversely proportional to technological level (hence the ever-increasing regulations).
Finally, I just had a crazy idea: make the utility function of the first AGI the extent to which it can solve the alignment problem. (We’re already making AIs to solve our other problems, and those are essentially just smaller alignment problems which hurt everything outside their scope, as with the Fnargl example, whose scope is almost wide enough to preserve the whole.)