But I didn’t see you as presenting the prevention of fully general, self-improving AGI as a delaying tactic. I saw you as presenting it as a solution.
Actually, my point in this post is that we don’t NEED AGI for a great future, because often people equate Not AGI = Not amazing future (or even a terrible one) and I think this is wrong. The point of this post is not to argue that preventing AGI is easy.
However, it’s actually very simple: If we build a misaligned AGI, we’re dead. So there are only two options: A) solve alignment, B) not build AGI. If not A), then there’s only B), however “impossible” that may be.
Yet lots of people DID (and do) take hydroxychloroquine and ivermectin for COVID, a nontrivial number of people do in fact eat random mushrooms, and the other examples aren’t unheard of either.
Yes. My hope is not that 100% of mankind will be smart enough to not build an AGI, but that maybe 90+% will be good enough, because we can prevent the rest from getting there, at least for a while. Currently, you need a lot of compute to train a sub-AGI LLM. Maybe we can put a lid on who gets how much compute, at least for a time. And maybe the top guys at the big labs are among the 90% non-insane people. Doesn’t look very hopeful, I admit.
Anyway, I haven’t seen you offer an alternative. Once again, I’m not saying not developing AGI is an easy task. But saying it’s impossible (while not having solved alignment) is saying “we’ll all die anyway”. If that’s the case, then we might as well try the “impossible” things and at least die with dignity.
Actually, my point in this post is that we don’t NEED AGI for a great future, because often people equate Not AGI = Not amazing future (or even a terrible one) and I think this is wrong.
I don’t have so much of a problem with that part.
It would prevent my personal favorite application for fully general, strongly superhuman AGI… which is to have it take over the world and keep humans from screwing things up more. I’m not sure I’d want humans to have access to some of the stuff non-AGI could do… but I don’t think there’s any way to prevent that.
If we build a misaligned AGI, we’re dead. So there are only two options: A) solve alignment, B) not build AGI. If not A), then there’s only B), however “impossible” that may be.
C) Give up.
Anyway, I haven’t seen you offer an alternative.
You’re not going to like it...
Personally, if made king of the world, I would try to discourage at least large-scale efforts to develop either generalized agents or “narrow AI”, especially out of opaque technology like ML. That’s because narrow AI could easily become parts or tools for a generalized agent, because many kinds of narrow AI are too dangerous in human hands, and because the tools and expertise for narrow AI are too close to those for generalized AGI. It would be extremely difficult to suppress one in practice without suppressing the other.
I’d probably start by making it as unprofitable as I could by banning likely applications. That’s relatively easy to enforce because many applications are visible. A lot of the current narrow AI applications need bannin’ anyhow. Then I’d start working on a list of straight-up prohibitions.
Then I’d dump a bunch of resources into research on assuring behavior in general and on more transparent architectures. I would not actually expect it to work, but it has enough of a chance to be worth a try. That work would be a lot more public than most people on Less Wrong would be comfortable with, because I’m afraid of nasty knock-on effects from trying to make it secret. And I’d be a little looser about capability work in service of that goal than in service of any other.
I would think very hard about banning large aggregations of vector compute hardware, and putting various controls on smaller ones, and would almost certainly end up doing it for some size thresholds. I’m not sure what the thresholds would be, nor exactly what the controls would be. This part would be very hard to enforce regardless.
I would not do anything that relied on perfect enforcement for its effectiveness, and I would not try to ramp up enforcement to the point where it was absolutely impossible to break my rules, because I would fail and make people miserable. I would titrate enforcement and stick with measures that seemed to be working without causing horrible pain.
I’d hope to get a few years out of that, and maybe a breakthrough on safety if I were tremendously lucky. Given perfect confidence in a real breakthrough, I would try to abdicate in favor of the AGI.
If made king of only part of the world, I would try to convince the other parts to collaborate with me in imposing roughly the same regime. How I reacted if they didn’t do that would depend on how much leverage I had and what they seemed to be doing. I would try really, really hard not to start any wars over it. Regardless of what they said they were doing, I would assume that they were engaging in AGI research under the table. Not quite sure what I’d do with that assumption, though.
But I am not king of the world, and I do not think it’s feasible for me to become king of the world.
I also doubt that the actual worldwide political system, or even the political systems of most large countries, can actually be made to take any very effective measures within any useful amount of time. There are too many people out there with too many different opinions, too many power centers with contrary interests, too much mutual distrust, and too many other people with too much skill at deflecting any kind of policy initiative down ways that sort of look like they serve the original purpose, but mostly don’t. The devil is often in the details.
If it is possible to get the system to do that, I know that I am not capable of doing so. I mean, I’ll vote for it, maybe write some letters, but I know from experience that I have nearly no ability to persuade the sorts of people who’d need to be persuaded.
I am also not capable of solving the technical problem myself and doing some “pivotal act”. In fact I’m pretty sure I have no technical ideas for things to try that aren’t obvious to most specialists. And I don’t much buy any of the ideas I’ve heard from other people.
My only real hopes are things that neither I nor anybody else can influence, especially not in any predictable direction, like limitations on intelligence and uncertainty about doom.
So my personal solution is to read random stuff, study random things, putter around in my workshop, spend time with my kid, and generally have a good time.
We’re not as far apart as you probably think. I’d agree with most of your decisions. I’d even vote for you to become king! :) Like I wrote, I think we must be cautious with narrow AI as well, and I agree with your points about opaqueness and the potential of narrow AI turning into AGI. Again, the purpose of my post was not to argue how we could make AI safe, but to point out that we could have a great future without AGI. And I still see a lot of beneficial potential in narrow AI, IF we’re cautious enough.
Independent of potential for growing into AGI and {S,X}-risk resulting from that?
With the understanding that these are very rough descriptions that need much more clarity and nuance, that one or two of them might be flat-out wrong, that some of them might turn out to be impossible to codify usefully in practice, that there might be specific exceptions for some of them, and that the list isn’t necessarily complete--
Recommendation systems that optimize for “engagement” (or proxy measures thereof).
Anything that identifies or tracks people, or proxies like vehicles, in spaces open to the public. Also collection of data that would be useful for this.
Anything that mass-classifies private communications, including closed group communications, for any use by anybody not involved in the communication.
Anything specifically designed to produce media showing real people in false situations or to show them saying or doing things they have not actually done.
Anything that adaptively tries to persuade anybody to buy anything or give anybody money, or to hold or not hold any opinion of any person or organization.
Anything that tries to make people anthropomorphize it or develop affection for it.
Anything that tries to classify humans into risk groups based on, well, anything.
Anything that purports to read minds or act as a lie detector, live or on recorded or written material.
Good list. Another one that caught my attention in the EU AI Act was AIs specialised in subliminal messaging. People’s choices can be somewhat conditioned for or against things by feeding them sensory data even when it’s not consciously perceptible, and it can also affect their emotional states more broadly.
I don’t know how effective this stuff is in real life, but I know that the effect itself is real.
Anything that tries to classify humans into risk groups based on, well, anything.
A particular example of that one is systems of social scoring, which are surely gonna be used by authoritarian regimes. You can screw people up in so many ways when social control is centralised with AI systems. It’s great for punishing people for not being chauvinists, for example.
Fine. I’ll take it.
We’re not as far apart as you probably think. I’d agree with most of your decisions. I’d even vote for you to become king! :) Like I wrote, I think we must be cautious with narrow AI as well, and I agree with your points about opaqueness and the potential of narrow AI turning into AGI. Again, the purpose of my post was not to argue how we could make AI safe, but to point out that we could have a great future without AGI. And I still see a lot of beneficial potential in narrow AI, IF we’re cautious enough.
Which? I wonder.
A particular example of that one is systems of social scoring, which are surely gonna be used by authoritarian regimes. You can screw people up in so many ways when social control is centralised with AI systems. It’s great for punishing people for not being chauvinists, for example.
This is already beginning in China.