From a totally amateur point of view, I’m starting to feel (based on following the news and reading the occasional paper) that the biggest limitation on AI development is hardware computing power. If so, this is good news for safety, since it implies a relative lack of exploitable “overhang”. Agree/disagree?
Where could you have possibly gotten that idea? Seriously, can you point out some references for context?
Pretty much universally within the AGI community it is agreed that the roadblock to AGI is software, not hardware. Even on the whole-brain emulation route, the most powerful supercomputer built today is sufficient to do WBE of a human. The most powerful hardware actually in use by a real AGI or WBE research programme is orders of magnitude less powerful, of course. But if that were the only holdup then it’d be very easily fixable.
Why do you think this? We can’t even simulate protein interactions accurately on an atomic level. Simulating a whole brain seems very far off.
Not necessarily. For all we know, we might not need to simulate a human brain on an atomic level to get accurate results. Simulating a brain on a neuron level might be sufficient.
Even if you approximate each neuron as a neural-network node (which is probably not good enough for a WBE), we still don’t have enough processing power to do a WBE in anything close to real time. Not even close. We’re many orders of magnitude off even with the fastest supercomputers. And each biological neuron is much more complex than a neural-network node in function, not just in structure.
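To make “many orders of magnitude” concrete, here is a rough back-of-envelope sketch. Every constant in it (neuron count, synapses per neuron, update rate, FLOPs per synaptic event, supercomputer throughput) is an order-of-magnitude assumption for illustration, not a measurement:

```python
# Very rough back-of-envelope estimate of compute for a neuron-level,
# real-time whole-brain emulation. All constants are assumptions.

neurons = 8.6e10            # ~86 billion neurons (common textbook figure)
synapses_per_neuron = 1e4   # assume ~10,000 synapses per neuron
updates_per_second = 1e3    # assume each synapse updated at ~1 kHz
flops_per_update = 10       # assume ~10 floating-point ops per synaptic update

required = neurons * synapses_per_neuron * updates_per_second * flops_per_update
available = 5e16            # order of magnitude for a top supercomputer today

print(f"required : {required:.1e} FLOPS")   # ~8.6e18
print(f"available: {available:.1e} FLOPS")
print(f"shortfall: ~{required / available:.0f}x")
```

Depending on which constants you plug in, the shortfall swings across several orders of magnitude, which is arguably the whole disagreement in this subthread.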
And creating the abstraction is a software problem. :/
Hmm, mostly just articles where they get better results with more NN layers/more examples, which are both limited by hardware capacity and have seen large gains from things like using GPUs. Current algos still have far fewer “neurons” than the actual brain AFAIK. Plus, in general, faster hardware allows for faster/cheaper experimentation with different algorithms.
I’ve seen some AI researchers (e.g. Yann LeCun on Facebook) emphasizing that fundamental techniques haven’t changed that much in decades, yet results continue to improve with more computation.
Current algos still have far fewer “neurons” than the actual brain AFAIK.
This is not primarily because of limitations in computing power. The relevant limitation is the complexity of the model you can train without overfitting, relative to the volume of data you have (a larger data set permits a more complex model).
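A minimal sketch of that tradeoff, assuming scikit-learn is available; the dataset sizes and model capacity are arbitrary, chosen only to make the effect visible:

```python
# Sketch: the same high-capacity model overfits a small dataset
# (train score >> test score) but generalizes on a larger one.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

for n_samples in (200, 20000):
    X, y = make_classification(n_samples=n_samples, n_features=40,
                               n_informative=10, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    model = MLPClassifier(hidden_layer_sizes=(256, 256), max_iter=500,
                          random_state=0).fit(X_tr, y_tr)
    print(n_samples,
          "train:", round(model.score(X_tr, y_tr), 3),
          "test:", round(model.score(X_te, y_te), 3))
```

The gap between the train and test scores is the overfitting; adding data shrinks it far more reliably than adding hardware does.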
Besides what fezziwig said, which is correct, the other issue is the fundamental capabilities of the domain you are looking at. I figured something like this was the source of the error, which is why I asked for context.
Neural networks, deep or otherwise, are basically just classifiers. The reason we’ve seen large advancements recently in machine learning is chiefly the immense volumes of data available to these classifier-learning programs. Machine learning is particularly good at taking heaps of structured or unstructured data and finding clusters, then coming up with ways to classify new data into one of those identified clusters. The more data you have, the more detail that can be identified, and the better your classifiers become. Certainly you need a lot of hardware to process the mind-boggling amounts of data being pushed through these machine learning tools, but hardware is not the limiter; available data is. Giant companies like Google and Facebook are building better and better classifiers not because they have more hardware available, but because they have more data available (chiefly because we are choosing to escrow our personal lives to these companies’ servers, but that’s an aside).
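As a toy illustration of that “find clusters, then classify new data into them” loop (again a sketch assuming scikit-learn; the sizes are arbitrary):

```python
# Sketch: discover clusters in a pile of unlabeled data, then assign
# new, unseen points to one of the discovered clusters.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

# Stand-in for "heaps of unstructured data" with some latent grouping.
X, _ = make_blobs(n_samples=5000, centers=4, n_features=8, random_state=0)

km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)

# New data gets classified into one of the identified clusters.
X_new, _ = make_blobs(n_samples=5, centers=4, n_features=8, random_state=1)
print(km.predict(X_new))  # cluster index assigned to each new point
```

More data mainly buys you finer-grained clusters and sharper boundaries, which is the “more detail” described above.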
Inasmuch as machine learning tends to dominate current approaches to narrow AI, you could be excused for saying “the biggest limitation on AI development is the availability of data.” But you mentioned safety, and AI safety around here is a codeword for general AI, and general AI is truly a software problem that has very little to do with neural networks, data availability, or hardware speeds. “But human brains are networks of neurons!” you reply. True. But the field of computer algorithms called neural networks is a total misnomer. A “neural network” is an algorithm inspired by an oversimplification of a misconception of how brains work, dating back to the 1950s/1960s.
Developing algorithms that are actually capable of performing general intelligence tasks, either bio-inspired or de novo, is the field of artificial general intelligence. And that field is currently software limited. We suspect we have the computational capability to run a human-level AGI today, if only we had the know-how to write one.
I already know all this (from a combination of an intro-to-ML course and reading similar arguments from Yann LeCun and Andrew Ng), and I’m still leaning towards hardware being the limiting factor (i.e. I currently don’t think your last sentence is true).
I think you have the right idea, but it’s a mistake to conflate “needs a big corpus of data” and “needs lots of hardware”. Hardware helps: the faster the training goes, the more experiments you can do. But a lot of the time the gating factor is the corpus itself.
For example, if you’re trying to train a neural net to solve the “does this photo contain a bird?” problem, you need a bunch of photos which vary at random on the bird/not-bird axis, and you need human raters to go through and tag each photo as bird/not-bird. There are many ways to lose here. For example, your variable of interest might be correlated to something boring (maybe all the bird photos were taken in the morning, and all the not-bird photos were taken in the afternoon), or your raters have to spend a lot of time with each photo (imagine you want to do beak detection, instead of just bird/not-bird: then your raters have to attach a bunch of metadata to each training image, describing the beak position in each bird photo).
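One cheap sanity check for that “correlated with something boring” failure mode is to ask whether a boring metadata field by itself already predicts the label. A hypothetical sketch (the field names and numbers are made up to mimic the morning/afternoon flaw above):

```python
# Sketch: if capture time alone predicts bird/not-bird almost perfectly,
# the corpus is flawed and the net may learn "morning", not "bird".
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 1000
is_bird = rng.integers(0, 2, n)
# Flawed collection: bird photos cluster in the morning,
# not-bird photos in the afternoon.
hour_taken = np.where(is_bird == 1, rng.normal(9, 1, n), rng.normal(15, 1, n))

score = cross_val_score(LogisticRegression(),
                        hour_taken.reshape(-1, 1), is_bird).mean()
print(f"label predictable from capture hour alone: {score:.2f}")  # ~1.0 = trouble
```

If that number is near 1.0, a classifier can score well by learning the boring variable instead of anything about birds.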
The difference between hardware fast enough to fit many iterations into a time span suitable for writing a paper and hardware slow enough that feedback is infrequent seems fairly relevant to how fast the software side can progress.
New insights depend crucially on feedback gotten from trying out the old insights.
the most powerful supercomputer built today is sufficient to do WBE of a human.
I assume you mean at a minuscule fraction of real time, and assuming that you can extract all the (unknown) relevant properties of every piece of every neuron?
A minuscule fraction of real time, but a meaningful speed for research purposes.
Can you expand on your reasoning to conclude this? This isn’t obvious to me.
A little off-topic—what’s the point of whole-brain emulation?
As with almost any such question, meaning is not inherent in the thing itself, but is given by various people, with no guarantee that anyone will agree.
In other words, it depends on who you ask. :)
For at least some people, who subscribe to the information-pattern theory of identity, a whole brain emulation based on their own brains is at least as good a continuation of their own selves as their original brain would have been, and there are certain advantages to existing in the form of software, such as being able to have multiple off-site backups. Others, who may be focused on the risks of Unfriendly AI, may deem WBEs to be the closest that we’ll be able to get to a Friendly AI before an Unfriendly one starts making paperclips. Others may just want to have the technology available for solving certain scientific mysteries. There are plenty more such reasons.
You’d have to ask someone else; I consider it a waste of time. De novo AGI will arrive far, far before we come anywhere close to achieving real-time whole-brain emulation.
And I don’t subscribe to the information-pattern theory of identity, for what seem to me obvious experimental reasons, so I don’t see that as a viable route to personal longevity.
De novo AGI will arrive far, far before we come anywhere close to achieving real-time whole-brain emulation.
What’s the best current knowledge for estimating the effort needed for de novo AGI? Given that we still don’t seem to have a real idea of how everything is supposed to fit together, the unknown unknowns make me wary of blanket statements like this. We do have a roadmap for whole-brain emulation, but I haven’t seen anything like that for de novo AGI.
And that’s the problem I have. WBE looks like a thing that’ll probably take decades, but we know that the specific solution exists and from neuroscience we have a lot of information about its general properties.
With de novo AGI, beyond knowing that the WBE solution exists, what do we know about solutions we could come up with on our own? It seems to me like this could be solved in 10 years or in 100 years, and you can’t really make an informed judgment that the 10-year timeframe is much more probable.
But if you want to discount the WBE approach as not worth the time, you’d pretty much need a reason to believe that a 10-20 year timeframe for de novo AGI is exceedingly probable. Beyond that, you’re up against a 50-year project of focused study on WBE with present-day and future computing power, and that does look like something you should assign a significant probability of producing results.
The thing is, artificial general intelligence is a fairly dead field, even by the standards of AI. There has been a lack of progress, but that is due perhaps more to lack of activity than to any inherent difficulty of the problem (although it is a difficult problem). So estimating the effort needed for de novo AGI under a presumption of adequate funding cannot be done by fitting curves to past performance. The outside view fails us here, and we need to take the inside view and look at the details.
De novo AGI is not as tightly constrained a problem as whole-brain emulation. For whole-brain emulation, the only seriously considered approach is to scan the brain in sufficient detail and then perform a sufficiently accurate simulation. There’s a lot of room to quibble about what “sufficient” means in those contexts, destructive vs. non-destructive scanning, and other details, but there is a certain amount of unity around the overall idea. You can define the end-state goal in the form of a roadmap and measure your progress towards it, because the entire field is aligned around that roadmap.
Such a roadmap does not and really cannot exist for AGI (although there have been attempts to create one). The problem is the nature of “de novo AGI”: “de novo” means new, without reference to existing intelligences, and if you open up your problem space like that, there is an indefinite number of possible solutions with various tradeoffs, and people value those tradeoffs differently. So the field is fractured and it’s really hard to get everybody to agree on a single roadmap.
Pat Langley thinks that good old-fashioned AI has the solution, and we just need to learn how to constrain inference. Pei Wang thinks that new probabilistic reasoning systems are what is required. Paul Rosenbloom thinks that representation is what matters, and the core of AGI is a framework for reasoning about graphical models. Jeff Hawkins thinks that a hierarchical network of deep learning agents is all that’s required, and that it’s mostly a scaling and data-structuring problem. Ray Kurzweil has similar biologically inspired ideas. Ben Goertzel thinks they’re all correct, and that the key is a common shared framework in which moderately intelligent implementations of all of these ideas can collaborate, with human-level intelligence achieved from the union.
Goertzel has an approachable collection of essays out on the subject, based on a talk he gave, sadly, almost 10 years ago, titled “10 years to the singularity if we really, really try” (spoiler: over the last 10 years we didn’t really try). It is available as a free PDF here. He also has an actual technical roadmap to achieving AGI, which was published as a two-volume book, linked to on LW here. I admit to being much more partial to Goertzel’s approach. And while 10 years seems optimistic for anything short of Apollo Program / Manhattan Project funding assumptions, it could be doable under that model. And there are shortcut paths for the less safety-inclined.
Without a common roadmap for AGI, it is difficult to get an outsider to agree that AGI could be achieved in a particular timeframe with a particular resource allocation. And it seems particularly impossible to get the entire AGI community to agree on a single roadmap, given the diversity of opinions over which approaches we should take and the lack of centralized funding. But the best I can fall back on is this: if you ask any competent person in this space how quickly a sufficiently advanced AGI could be obtained if sufficient resources were instantly allocated to their favored approach, the answer you’d get would be in the range of 5 to 15 years. “10 years to the singularity if we really, really try” is not a bad summary. We may disagree greatly on the details, and that disunity is holding us back, but the outcome seems reasonable if coordination and funding problems were solved.
And yes, ~10 years is far less time than the WBE roadmap predicts. So there’s no question as to where I hang my hat in that debate. AGI is a leapfrog technology that has the potential to bring about a singularity event much earlier than any emulative route. My day job is currently unrelated (bitcoin), though, so in all honesty I can’t profess to be part of the solution yet.
Can you recommend an article that argues that our current paradigms are suitable for AI? By paradigms I mean things like: software and hardware being different things; software being algorithms executed from top to bottom unless control structures say otherwise; software being a bunch of text written in human-friendly pseudo-English by beating a keyboard, a process not essentially different from writing math-poetry on a typewriter 150 years ago, which then gets compiled, bytecode-compiled, interpreted, or bytecode-compiled before immediate interpretation; and similar paradigms. Doesn’t computing need to be much more imaginative before this happens?
I haven’t seen anyone claim that explicitly, but I think you are also misunderstanding/misrepresenting how modern AI techniques actually work. The bulk of the information in the resulting program is not “hard coded” by humans in the way that you are implying. Generally there are relatively short typed-in programs which then use millions of examples to automatically learn the actual information in a relatively “organic” way. And even the human brain has a sort of short ‘digital’ source code in DNA.
Interesting. My professional bias is showing: part of my job is programming, I respect elite programmers who can deal with algorithmic complexity, and I assumed that if AI is the hardest programming problem, it would just be more of that.