Several points that might counter balance some of your claims, and I hope make you think about the issue from new perspectives.
“We know what’s going on there at the micro level. We know how the systems learn.”
We don’t only know how those systems learn but what exactly they are learning. Lets say you take a photograph, you don’t only know how each pixel is formed, you also know what exactly is that you are taking a picture of. You can’t always predict how this or that specific pixel will end up, as you have lots of noise, but this doesn’t mean you don’t know what the picture represents. Asking a network designer—oh you didn’t know exactly how the network reacts to this specific question, is like coming to a photographer and asking him the exact RGB of a very specific pixel. Such small details are impossible to know.
Networks are basically approximators of functions based on the dataset provided. In case of RL the networks are generalizing a reward function. All those cases are showing that we are trying to “capture a picture” of generalizing the provided data. You can always miss a spot here or there, and the networks might ignore some of the data because they are small for example. But in general, we know how resources are allocated inside the network to represent concepts in order to predict the data or rewards. We can “steer” the network to focus more on this or that aspect of its outputs by providing more data of the sort that we want and respond to the network weaknesses.
“if you look at the evolution of mankind from the perspective of a chimpanzee or a mammoth...If a system is much more intelligent than I am, it can naturally develop ways to limit or threaten me that I can’t even imagine.”
I would suggest trying and avoid anthropomorphism (or properties of biological systems as a whole). Instead of trying to make parallels, I would suggest looking at some math—and see what we can promise about those systems. Let’s take a chess engine—just to keep it neutral for a moment, although the chess moves that it provides are superhuman, that means we don’t know how it came up with those moves, and we can’t be explained why this particular chess move is good, and even though sometimes those networks will do subhuman moves too, generally speaking, the network is promised to be trained to provide the best chess moves. A way smarter than human chess engine, it will still do a task that it was trained on. At the moment the network becomes superhuman, it doesn’t start to want to make some strange chess moves, that will be more fun to play, or seen more desirable by the network. It will just make the best chess moves. Why? Because we trained it on a reward function that was generalizing its winning chances and nothing else. The reward function of LLMs is to provide a response most desired by some group of humans using RLHF method. Even superhuman networks, are promised to be trained and converge to provide such responses. Humans while evolving, weren’t promised by mathematical theorems to provide best actions to benefit chimpanzees or mammoths (or some group of them). So although you can be oblivious to a threat made by superhuman networks, you can be sure that with correct training procedure, the networks will give outputs to maximize a reward function, which in our case would be in alignment with some human collective (a pretty large collective, as small groups have limited resources to train the best networks). So although a general superhuman AGI, can’t be promised to act in our interest, those LLMs as long as they are trained in a certain procedure with a certain data, can be promised to maximize a well being of some human group. I would say it’s much more than chimpanzees had, when humans were evolving.
“we’ll get to a state where we don’t understand what’s going on with the world, and we won’t be able to influence it. Or we’ll get a sense of some unrealistic picture of the world in which we’re happy and we won’t complain, but we won’t decide anything.”
Just like in case with chess, I would prefer that a chess engine will make the decisions, because the engine is doing it much better than myself, regarding human prosperity and happiness, if I am promised by the creator, using math theorems and testing experience he gained during development, that those systems are made to optimize humanity well-being, and it will do it much better than any human—I will gladly give up my control to a system that understands much more than any human how to do that. If for those systems, providing ideas to a policy maker, is the equivalence of providing chess players with best chess moves, I see no reason to stick to human decisions, they will make way more mistakes.
In case of humans there is a small possibility, that the networks will decide the value system, based on their own perception of human well-being, as they were trained by a small group of people, and ignore the wider range of well-being that is more nuanced to different people. But I don’t think the current social structures are so nuanced too, so if some system has a chance to be more aligned with each individual, is not the current political system, but a superhuman network.
“We can find ourselves in the role of a herd of cows, whose fate is being determined by the farmer.”
Once again—the farmer is a biological entity and is not promised by any math theorem to act in the benefit of a herd. But if a herd could train an AI, that would be promised by math theorems, to act on their behalf in their favor, they will be in a better situation than without such an AI.
I would argue that the amount of control can be settled, just like people settle the amount of control for their life with politicians and governments. I would also claim that the current political system is already such a herd situation, and we can do very little about it, while the current political decision making is more subhuman than even could be provided by the best and brightest of humans. So personally, I will feel much less of a herd, if the decisions would be made not by politicians but by some system, based on mathematical theorems and superhuman analysis of data, rather than elected officials.
-----
Generally speaking, I see some amount of anthropomorphism in your claims, and you are somehow ignoring the mathematically established theorems, that promise convergence to a state of the networks that will be aligned with some value system provided to them, and those mathematical theorems holds for superhuman networks as well.
I can sympathize with the fear of losing control, and once the systems would be that advanced that we don’t understand their decisions at all, although most of them will work in our favor, I would be engaged in a discussion of making such decisions or not. For now, we have a great tool in our hands, that promises to solve a lot of our current problems as humanity, I would not throw this tool now, just because in the future we might lose control. As I said previously, I am willing to lose a lot of control to computers, for example when I need to make a complex calculation, I would prefer not to make the computation by hand, but to use a calculator. The exact amount of lost control to feel comfortable can be debated, I think I will belong to the camp that we should not let humans make almost any decisions, and let those systems, as long as we can ensure their alignment make most of the decisions for us. The amount of understanding we should have to allow this or that decision, should be an open question for a relatively far future. For now, we still have people dying from hunger, working in factories, air pollution and other climate change issues, people dying on roads in car accidents, and a lot of deceases that kill us, and most of us (80% worldwide) work in a meaningless jobs just for survival. As long as those problems are not solved, I see no reason to give up our chance to way smarter systems that can provide a set of decisions that will be able to solve all those problems, then we can discuss how much more control we want to give them or take some of it back at some point. And yes, I would agree we could lose control without noticing, and it could be a problematic issue in a long run. I would claim in our current situation, until pretty far advanced systems like say GPT10, we should not fear of losing control to those systems, instead we should be afraid of control we already lost to the current political system, and the control some decision makers have, and what they do with it, and generally the current problems the world has, over losing control to aligned superhuman networks, that give us paradise but we don’t make decisions at all—which maybe even a good thing.
One of the best replies I’ve seen and calmed much of my fears about AI.
My pushback is this. The things you list below as reasons to justify advancing AGI are either already solvable with narrow AI or not solution problems but implementation and alignment problems.
“dying from hunger, working in factories, air pollution and other climate change issues, people dying on roads in car accidents, and a lot of deceases that kill us, and most of us (80% worldwide) work in a meaningless jobs just for survival. “
Developing an intelligence that has 2-5x general human intelligence would need to have a much larger justification. Something like asteroids, unstoppable virus or sudden corrosion of atmosphere would justify the use of bringing out an equally existential technology like superhuman AGI.
What I can’t seem to wrap my head around is why a majority has not emerged that sees the imminent dangers in creating programs that are 5x generally smarter than us at everything. If you don’t fear this I would suggest anthropomorphizing more not less.
Regardless I think politics and regulations crush the AGI pursuits before GPT even launches its next iteration.
AGI enthusiasts have revealed their hand and so far the public has made it loud and clear that no one actually wants this, no one wants displacement or loss of their jobs no matter how sucky they might be. These black boxes scare the crap out of people and people don’t like what they don’t know.
Bad news and fear spreads rapidly in todays era, the world is run by Instagram moms more than anyone else. If it’s the will of the people, Google and Microsoft will find out just how much they are at the mercy of the “essential worker” class.
Several points that might counter balance some of your claims, and I hope make you think about the issue from new perspectives.
“We know what’s going on there at the micro level. We know how the systems learn.”
We don’t only know how those systems learn but what exactly they are learning. Lets say you take a photograph, you don’t only know how each pixel is formed, you also know what exactly is that you are taking a picture of. You can’t always predict how this or that specific pixel will end up, as you have lots of noise, but this doesn’t mean you don’t know what the picture represents. Asking a network designer—oh you didn’t know exactly how the network reacts to this specific question, is like coming to a photographer and asking him the exact RGB of a very specific pixel. Such small details are impossible to know.
Networks are basically approximators of functions based on the dataset provided. In case of RL the networks are generalizing a reward function. All those cases are showing that we are trying to “capture a picture” of generalizing the provided data. You can always miss a spot here or there, and the networks might ignore some of the data because they are small for example. But in general, we know how resources are allocated inside the network to represent concepts in order to predict the data or rewards. We can “steer” the network to focus more on this or that aspect of its outputs by providing more data of the sort that we want and respond to the network weaknesses.
“if you look at the evolution of mankind from the perspective of a chimpanzee or a mammoth...If a system is much more intelligent than I am, it can naturally develop ways to limit or threaten me that I can’t even imagine.”
I would suggest trying and avoid anthropomorphism (or properties of biological systems as a whole). Instead of trying to make parallels, I would suggest looking at some math—and see what we can promise about those systems. Let’s take a chess engine—just to keep it neutral for a moment, although the chess moves that it provides are superhuman, that means we don’t know how it came up with those moves, and we can’t be explained why this particular chess move is good, and even though sometimes those networks will do subhuman moves too, generally speaking, the network is promised to be trained to provide the best chess moves. A way smarter than human chess engine, it will still do a task that it was trained on. At the moment the network becomes superhuman, it doesn’t start to want to make some strange chess moves, that will be more fun to play, or seen more desirable by the network. It will just make the best chess moves. Why? Because we trained it on a reward function that was generalizing its winning chances and nothing else. The reward function of LLMs is to provide a response most desired by some group of humans using RLHF method. Even superhuman networks, are promised to be trained and converge to provide such responses. Humans while evolving, weren’t promised by mathematical theorems to provide best actions to benefit chimpanzees or mammoths (or some group of them). So although you can be oblivious to a threat made by superhuman networks, you can be sure that with correct training procedure, the networks will give outputs to maximize a reward function, which in our case would be in alignment with some human collective (a pretty large collective, as small groups have limited resources to train the best networks). So although a general superhuman AGI, can’t be promised to act in our interest, those LLMs as long as they are trained in a certain procedure with a certain data, can be promised to maximize a well being of some human group. I would say it’s much more than chimpanzees had, when humans were evolving.
“we’ll get to a state where we don’t understand what’s going on with the world, and we won’t be able to influence it. Or we’ll get a sense of some unrealistic picture of the world in which we’re happy and we won’t complain, but we won’t decide anything.”
Just like in case with chess, I would prefer that a chess engine will make the decisions, because the engine is doing it much better than myself, regarding human prosperity and happiness, if I am promised by the creator, using math theorems and testing experience he gained during development, that those systems are made to optimize humanity well-being, and it will do it much better than any human—I will gladly give up my control to a system that understands much more than any human how to do that. If for those systems, providing ideas to a policy maker, is the equivalence of providing chess players with best chess moves, I see no reason to stick to human decisions, they will make way more mistakes.
In case of humans there is a small possibility, that the networks will decide the value system, based on their own perception of human well-being, as they were trained by a small group of people, and ignore the wider range of well-being that is more nuanced to different people. But I don’t think the current social structures are so nuanced too, so if some system has a chance to be more aligned with each individual, is not the current political system, but a superhuman network.
“We can find ourselves in the role of a herd of cows, whose fate is being determined by the farmer.”
Once again—the farmer is a biological entity and is not promised by any math theorem to act in the benefit of a herd. But if a herd could train an AI, that would be promised by math theorems, to act on their behalf in their favor, they will be in a better situation than without such an AI.
I would argue that the amount of control can be settled, just like people settle the amount of control for their life with politicians and governments. I would also claim that the current political system is already such a herd situation, and we can do very little about it, while the current political decision making is more subhuman than even could be provided by the best and brightest of humans. So personally, I will feel much less of a herd, if the decisions would be made not by politicians but by some system, based on mathematical theorems and superhuman analysis of data, rather than elected officials.
-----
Generally speaking, I see some amount of anthropomorphism in your claims, and you are somehow ignoring the mathematically established theorems, that promise convergence to a state of the networks that will be aligned with some value system provided to them, and those mathematical theorems holds for superhuman networks as well.
I can sympathize with the fear of losing control, and once the systems would be that advanced that we don’t understand their decisions at all, although most of them will work in our favor, I would be engaged in a discussion of making such decisions or not. For now, we have a great tool in our hands, that promises to solve a lot of our current problems as humanity, I would not throw this tool now, just because in the future we might lose control. As I said previously, I am willing to lose a lot of control to computers, for example when I need to make a complex calculation, I would prefer not to make the computation by hand, but to use a calculator. The exact amount of lost control to feel comfortable can be debated, I think I will belong to the camp that we should not let humans make almost any decisions, and let those systems, as long as we can ensure their alignment make most of the decisions for us. The amount of understanding we should have to allow this or that decision, should be an open question for a relatively far future. For now, we still have people dying from hunger, working in factories, air pollution and other climate change issues, people dying on roads in car accidents, and a lot of deceases that kill us, and most of us (80% worldwide) work in a meaningless jobs just for survival. As long as those problems are not solved, I see no reason to give up our chance to way smarter systems that can provide a set of decisions that will be able to solve all those problems, then we can discuss how much more control we want to give them or take some of it back at some point. And yes, I would agree we could lose control without noticing, and it could be a problematic issue in a long run. I would claim in our current situation, until pretty far advanced systems like say GPT10, we should not fear of losing control to those systems, instead we should be afraid of control we already lost to the current political system, and the control some decision makers have, and what they do with it, and generally the current problems the world has, over losing control to aligned superhuman networks, that give us paradise but we don’t make decisions at all—which maybe even a good thing.
One of the best replies I’ve seen and calmed much of my fears about AI. My pushback is this. The things you list below as reasons to justify advancing AGI are either already solvable with narrow AI or not solution problems but implementation and alignment problems.
“dying from hunger, working in factories, air pollution and other climate change issues, people dying on roads in car accidents, and a lot of deceases that kill us, and most of us (80% worldwide) work in a meaningless jobs just for survival. “
Developing an intelligence that has 2-5x general human intelligence would need to have a much larger justification. Something like asteroids, unstoppable virus or sudden corrosion of atmosphere would justify the use of bringing out an equally existential technology like superhuman AGI.
What I can’t seem to wrap my head around is why a majority has not emerged that sees the imminent dangers in creating programs that are 5x generally smarter than us at everything. If you don’t fear this I would suggest anthropomorphizing more not less.
Regardless I think politics and regulations crush the AGI pursuits before GPT even launches its next iteration.
AGI enthusiasts have revealed their hand and so far the public has made it loud and clear that no one actually wants this, no one wants displacement or loss of their jobs no matter how sucky they might be. These black boxes scare the crap out of people and people don’t like what they don’t know.
Bad news and fear spreads rapidly in todays era, the world is run by Instagram moms more than anyone else. If it’s the will of the people, Google and Microsoft will find out just how much they are at the mercy of the “essential worker” class.