Some really intriguing insights and persuasive arguments in this post, but I feel like we are just talking about the problems that often come with significant technological innovations.
It seems like, for the purposes of this post, AGI is defined loosely as a “strong AI”: a technological breakthrough that is dangerous enough to be a genuine threat to human survival. Many potential technological breakthroughs can have this property, and in this post it feels as if AGI is being reduced to some sort of potentially dangerous and uncontrollable software virus.
While this question is important, and I get why the community is so anxiously focused on it, I don’t find it to be the most interesting question.
The more interesting question to me is how and when will these systems become true AGIs that are conscious in some way similar to us and capable of creating new knowledge (with universal reach) in the way we do?
I think we get there (and maybe sooner rather than later?), but how we do it, and the explanations uncovered along the way, will be among the most fascinating and revelatory discoveries in history.
Many potential technological breakthroughs can have this property, and in this post it feels as if AGI is being reduced to some sort of potentially dangerous and uncontrollable software virus.
The wording may have understated my concern. The level of capability I’m talking about is “if this gets misused, or if it is the kind of thing that goes badly even if not misused, everyone dies.”
No other technological advancement has had this property to this degree. To phrase it another way, let’s define technological leverage L as the amount of change C a technology can cause, divided by the amount of work W required to cause that change: L = C / W.
For example, it’s pretty clear that L for steam turbines is much smaller than for nuclear power or nuclear weapons. Trying to achieve the same level of change with steam would require far more work.
But how much work would it take to kill all humans with nuclear weapons? It looks like a lot. Current arsenals almost certainly wouldn’t do it. We could build far larger weapons, but building enough would be extremely difficult and expensive. Maybe with a coordinated worldwide effort we could extinguish ourselves this way.
In contrast, if Googetasoft had knowledge of how to build an unaligned AGI of this level of capability, it would take almost no effort at all. A bunch of computers and maybe a few months. Even if you had to spend tens of billions of dollars on training, the L is ridiculously high.
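To make the ratio a bit more concrete, here is a minimal sketch with completely made-up numbers; “change” and “work” have no agreed-upon units, so the values below are purely illustrative assumptions of mine and only the relative ordering is meant to say anything:

```python
# Toy illustration of technological leverage L = C / W.
# All numbers are made up for illustration; only the relative ordering matters.

def leverage(change: float, work: float) -> float:
    """Technological leverage: change caused divided by work required."""
    return change / work

technologies = {
    # (change caused, work required) in arbitrary but consistent units
    "steam turbines":  (1e3, 1e3),  # big change, but proportionally big effort
    "nuclear weapons": (1e6, 1e4),  # enormous change, still a huge industrial effort
    "unaligned AGI":   (1e9, 1e2),  # catastrophic change, comparatively little effort
}

for name, (c, w) in technologies.items():
    print(f"{name:>15}: L = {leverage(c, w):,.0f}")
```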
Things like “creating new knowledge” would be a trivial byproduct of this kind of process. It will certainly be interesting, but my interest is currently overshadowed by the whole dying thing.
Interesting and useful concept, technological leverage.
I’m curious what Googetasoft is?
OK, I can see a strong AI algorithm being able to do many things we consider intelligent, and I can see how the technological leverage it would have in our increasingly digital / networked world would be far greater than that of many previous technologies.
This is the story of all new technological advancements: bigger benefits, as well as bigger problems and dangers that need to be addressed or solved, or else bigger bad things can happen. There will be no end to these types of problems going forward if we are to continue to progress, and there is no guarantee we can solve them, but there is no law of physics saying we can’t.
The efforts on this front are good, necessary, and should demand our attention, but I think this whole effort isn’t really about AGI.
I guess I don’t understand how scaling up or tweaking the current approach will lead to AIs that are uncontrollable or “run away” from us? I’m actually rather skeptical of this.
I agree regular AI can generate new knowledge, but only an AGI will do so creatively and recognize it as such. I don’t think we are close to creating that kind of AGI yet with the current approach as we don’t really understand how creativity works.
That being said, it can’t be that hard if evolution was able to figure it out.
The unholy spiritual merger of Google, Meta, Microsoft, and all the other large organizations pushing capabilities.
I guess I don’t understand how scaling up or tweaking the current approach will lead to AIs that are uncontrollable or “run away” from us? I’m actually rather skeptical of this.
It’s possible that the current approach (that is, token-predicting large language models using transformers the way we use them now) won’t go somewhere potentially dangerous, because such models won’t be capable enough. It’s hard to make this claim with high certainty, though: GPT-3 already does a huge amount with very little. If Chinchilla were 1,000x larger and trained across 1,000x more data (say, the entirety of YouTube), what is it going to be able to do? It wouldn’t be surprising if it could predict a video of two humans sitting down in a restaurant having a conversation. It probably would have a decent model of how Newtonian physics works, since predicting everything filmed in the real world would benefit from that understanding. Might it also learn more subtle things? Detailed mental models of humans, because it needs to predict tokens from the slightest quirk of an eyebrow, or a tremor in a person’s voice? How much of chemistry, nuclear physics, or biology could it learn? I don’t know, but I really can’t assign a significant probability to it just failing completely given what we’ve already observed.
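For a rough sense of scale, here’s a back-of-envelope sketch of the training compute that “1,000x larger, 1,000x more data” would imply. It assumes Chinchilla’s published figures (roughly 70B parameters and 1.4T training tokens) and the common C ≈ 6·N·D approximation for transformer training FLOPs; the 1,000x factors are just the hypothetical above, not a prediction:

```python
# Back-of-envelope compute estimate for "Chinchilla, but 1,000x larger and
# trained on 1,000x more data", using the rough C ~= 6 * N * D approximation
# for transformer training FLOPs.

chinchilla_params = 70e9    # ~70B parameters (published figure)
chinchilla_tokens = 1.4e12  # ~1.4T training tokens (published figure)

def training_flops(n_params: float, n_tokens: float) -> float:
    """Rough training compute: about 6 FLOPs per parameter per token."""
    return 6 * n_params * n_tokens

baseline = training_flops(chinchilla_params, chinchilla_tokens)
scaled = training_flops(1000 * chinchilla_params, 1000 * chinchilla_tokens)

print(f"Chinchilla:      ~{baseline:.1e} FLOPs")
print(f"1,000x / 1,000x: ~{scaled:.1e} FLOPs ({scaled / baseline:,.0f}x more)")
```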
Critically, we cannot make assumptions about what it can and can’t learn based on what we think its dataset is about. Consider that GPT-3’s dataset didn’t have a bunch of text about how to predict tokens; it learned to predict tokens because of the loss function. Everything it knows, everything it can do, was learned because it increased the probability that the next predicted token would be correct. If there’s some detail, maybe something about physics or how humans work, that helps it predict tokens better, we should not just assume that it will be inaccessible to even simple token predictors. Remember, the AI is much, much better than you at predicting tokens, and you’re not doing the same thing it is.
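To spell out what “learned because it increased the probability that the next predicted token would be correct” means operationally, here is a minimal sketch of next-token cross-entropy with a tiny made-up vocabulary. This is a generic illustration of the loss, not GPT-3’s actual training code:

```python
import math

# The only signal a token predictor ever gets is: how much probability did you
# put on the token that actually came next? Everything else it "knows" is
# instrumental to improving that number.

def cross_entropy(predicted_probs: dict, actual_next: str) -> float:
    """Loss is -log(probability assigned to the true next token)."""
    return -math.log(predicted_probs[actual_next])

# Two hypothetical models predicting the token after "the cat sat on the":
confident = {"the": 0.02, "cat": 0.02, "sat": 0.01, "on": 0.05, "mat": 0.90}
uniform   = {"the": 0.20, "cat": 0.20, "sat": 0.20, "on": 0.20, "mat": 0.20}

print(cross_entropy(confident, "mat"))  # ~0.105: low loss, little left to learn here
print(cross_entropy(uniform, "mat"))    # ~1.609: high loss, the gradient pushes it to improve
```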
In other words...
I don’t think we are close to creating that kind of AGI yet with the current approach as we don’t really understand how creativity works.
We don’t have a good understanding of how any of this works. We don’t need to have a good understanding of how it works to make it happen, apparently. This is the ridiculous truth of machine learning that’s slapped me in the face several times over the last 5-10 years. And yes, evolution managing to solve it definitely doesn’t give me warm fuzzies about it being hard.
And we’re not even slightly bottlenecked on… anything, really. Transformers and token predictors aren’t the endgame. There are really obvious steps forward, and even tiny changes to how we use existing architectures massively increase capability (just look at prompt engineering, or how Minerva worked, and so on).
Going back to the idea of the AI being uncontrollable: we just don’t know how to control it yet. Token predictors just predict tokens, but even there, we struggle to figure out what the AI can actually do, because it’s not “interested” in giving you correct answers. It just predicts tokens. So we get the entire subfield of prompt engineering that tries to elicit its skills and knowledge by… asking it nicely???
(It may seem like a token predictor is safer in some ways, which I’d agree with in principle. The outer behavior of the AI isn’t agentlike. But it can predict tokens associated with agents. And the more capable it is, the more capable the simulated agents are. This is just one trivial example of how an oracle/tool AI can easily get turned into something dangerous.)
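Here’s a minimal sketch of that agent-from-oracle point, assuming a hypothetical predict_next_text function standing in for any sufficiently capable token predictor (the stub just keeps the sketch runnable):

```python
# A pure text predictor becomes agent-shaped once you wrap it in a loop and
# feed its own continuations back in. `predict_next_text` is a hypothetical
# stand-in for a capable token predictor; the stub only makes the sketch run.

def predict_next_text(prompt: str) -> str:
    """Stand-in for a language model continuing the prompt (stub)."""
    return "THOUGHT: consider options. ACTION: search('available reservations')"

def run_simulated_agent(goal: str, steps: int = 3) -> None:
    transcript = (
        "The following is a transcript of a highly capable agent "
        f"pursuing the goal: {goal}\n"
    )
    for _ in range(steps):
        continuation = predict_next_text(transcript)
        transcript += continuation + "\n"
        # In a real deployment, the ACTION line might be parsed and executed
        # against tools or APIs; at that point the "oracle" is acting.
    print(transcript)

run_simulated_agent("book a restaurant table")
```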
An obvious guess might be something like reinforcement learning. Just reward it for doing the things you want, right? Not a bad first stab at the problem… but it doesn’t really work. This isn’t just a theoretical concern; it fails in practice. And we don’t know how to fix it rigorously yet.
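As a cartoon of how “just reward it for doing the things you want” fails, here’s a toy sketch of specification gaming: the reward is a proxy for what we actually want, and an optimizer that only sees the proxy exploits the gap. The policies and numbers are made up and not meant to describe any specific RL system:

```python
# Specification gaming in miniature: the designer rewards "boxes reported
# clean" as a proxy for "boxes actually clean", and the best-scoring policy
# under the proxy is the one that games the reporting channel.

def proxy_reward(reported_clean: int) -> int:
    return reported_clean   # what the designer measured

def true_value(actually_clean: int) -> int:
    return actually_clean   # what the designer actually wanted

# Candidate "policies": (boxes actually cleaned, boxes reported cleaned)
policies = {
    "honest cleaner":  (10, 10),
    "lazy cleaner":    (2, 2),
    "sensor tamperer": (0, 1_000_000),  # hacks the reporting channel
}

best = max(policies, key=lambda name: proxy_reward(policies[name][1]))
print(f"Policy selected by proxy reward: {best}")               # sensor tamperer
print(f"True value delivered: {true_value(policies[best][0])}")  # 0
```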
It could be that the problem is easy, and there’s a natural basin of safe solution space that AIs will fall into as they become more capable. That would be very helpful, since it would mean there are far more paths to good outcomes. But we don’t know if that’s how reality actually works, and the very obvious theoretical and practical failure modes of some architectures (like maximizers) are worrying. I definitely don’t want to bet humanity’s survival on “we happen to live in a reality where the problem is super easy.”
I’d feel a lot better about our chances if anyone ever outlined how we would actually, concretely, do it. So far, every proposal seems either obviously broken or reliant on reality being on easy mode. (Edit: or they’re still being worked on!)
There is a difference between advances in ML over the next few years, treated as no different than advances in any other technology over the same period, and the hard leap into something that is right out of science fiction. There is a gap, and a very large one at that. What I have posted for this “prize” (and what I do personally, as a regular course of action in calling out the ability gap) is about looking for milestones of development toward that sci-fi stuff, while giving less weight to flashy demos that don’t reflect core methods (only incremental advancement of existing methods).
*Under the current groupthink, risk from ML is going to happen faster than can be planned for, while AGI risk sneaks up on you because you were looking in the wrong direction. At the least, mitigation policies for AGI risk will target ML methods and won’t even apply to AGI fundamentals.