I had thoughts of doing something very like this a few years ago, back when I still thought we had around 20 years until AGI. Now I think we have <5 years until AGI, and I suspect you don’t have time for this. Do you also have a plan in mind for delaying the deployment of dangerous AGI to give humanity more time for working on alignment?
I do not ask this question rhetorically. I have thoughts along this line and would like to discuss them with you.
Can you elaborate on your comment?
It seems intriguing to me, and I would love to learn more about why it’s a bad strategy if our AGI timeline is 5 years or less.
Thanks for your interest, Igor. Let me try to explain my position better. Basically, I agree that ‘brain-like AGI’ or CogEms are the best and fastest path towards a safe-enough AGI to at least help us make faster progress towards a more complete alignment solution. I am worried that this project will take about 10-15 years, and that mainstream ML is going to become catastrophically dangerous within about 5 years.
So, to bridge this gap I think we need to manufacture a delay. We need to stretch the time we have between inventing dangerously capable AGI systems and when that invention leads to catastrophe. We also need to be pursuing alignment (in many ways, including via developing brain-like AGI), or the delay will be squandered. My frustration with Conjecture’s post here is that they talk about pursuing brain-like AGI without at least mentioning that time might be too short for that and that in order for them to be successful they need someone to be working on buying them time.
My focus over the past few months has been on how we might manufacture this delay. My current best answer is that we will have to make do with a combination of measures: social and governmental forces; better monitoring tools (compute governance); better safety evaluations (e.g. along the lines of ARC’s safety evals, but even more diverse and thorough); and the use of narrow AI tools to monitor and police the internet, backed by cyberweapons and perhaps official state police or military force (in the case of international dispute), to stomp out rogue AGI before it can recursively self-improve to catastrophically strong intelligence. This is a tricky subject, potentially quite politically charged and frightening, and an unpleasant scenario to talk about. Nevertheless, I think this is where we are, and we must face that reality.
I believe there will be a period of several years in which we do not yet have any kind of alignment solution but do have the ability to build rapidly recursively self-improving AGI which cannot be controlled. Our main concern during this period will be that many humans will not believe that their AGI cannot be controlled, and will see a path to great personal power by building and launching their own AGI. Perhaps terrorists and state actors will also deliberately attempt to manufacture it as a weapon. How do we address this strategic landscape?
Thanks for your elaborate response!
But why do you think that this project will take so much time? Why can’t it be implemented faster?
Well, because a lot of scientists have been working on this for quite a while, and the brain is quite complex. On the plus side, there’s a lot of existing work. On the negative side, there’s not a lot of overlap between the group of people who know enough about programming, machine learning, and large-scale computing and the group of people who know a lot about neuroscience, the existing partial emulations of the brain, and the existing detailed explanations of the circuits of the brain.
I mean, it does seem like the sort of project which could be tackled if a large, well-funded, determined set of experts with clear metrics worked on it in parallel. What I despair of more is the idea of organizing such an effort successfully without it drowning in bureaucracy and being dragged down by the heel-dragging culture of current academia.
Basically, I estimate that Conjecture has a handful of smart, determined people and maybe a few million dollars to work with, and I estimate that accomplishing this project in a reasonable timeframe (like 2-3 years) would cost hundreds of millions or billions of dollars and involve hundreds or thousands of people. Maybe my estimates are too pessimistic. I’m a lot less confident about my estimates of the cost of this project than I am in my estimates of how much time we have available to work with before strong AGI capable of recursive self-improvement gets built. I’m also less confident about how long we will have between when dangerous AGI is built and when it actually gets out of control and causes a catastrophe. Another 2 years? 4? 5? I dunno. I doubt very much that it’ll be 10 years. Before then, some kind of action to reduce the threat needs to be taken. Plans which don’t take this into account seem to me to be unhelpfully missing the point.
Eh. It’s sad if this problem is really so complex.
Thank you. At this point, I feel like I have to stick to some way of aligning AGI, even if it does not have that big a chance of succeeding, because it looks like there are not that many options.
Well, there is the possibility that some wealthy entities (individuals, governments, corporations) will become convinced that they are truly at risk as AGI enters the Overton window. In that case, they might be willing to drop a billion dollars of funding on the project, just in case. The lure of developing uploading as a path to immortality and superpowers may help convince some billionaires. Also, as AGI becomes more believable and the risk becomes clearer, top neuroscientists and programmers may be willing to drop their current projects and switch to working on uploading. If both those things happen, I think there’s a good chance it would work out. If not, I am doubtful.