Dr_s, I am not claiming such worlds are ideal. However, the side with the tasking consoles for a billion drones and many automated factories and bunkers is not helpless, and not helpless when someone else gets the same technology. Most likely such a human faction can crush any rampant ASI if it can be detected early enough, with overwhelming force that is not significantly worse in technology level than what a rebel ASI can discover without very large research and industrial facilities.
And not helpless to nature. What long term human survival looks like is a world where human populations can't be effortlessly killed. This means bunkers, defense weapons, surrogate robots to send into dangerous situations, and obviously, further in the future, locations away from Earth.
Individual long term human survival looks the same. It looks like a human patient in an underground biolab, the air pure, inert nitrogen. All the failing parts of their body have been cut away, and the artificial organs are lined up in equipment racks with at least ternary redundancy. The organs using living cells are arranged in 2D planes in transparent cases so that every part can be monitored easily for infections and cancers.
The reason for this is that each organ, in order to fail, requires all redundant systems to fail at the same time, and the probability of all n redundant systems failing can be low enough that the patient's predicted lifespan can be many thousands of years.
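To make the redundancy arithmetic concrete, here is a minimal sketch, assuming independent failures and a made-up 1%-per-year failure rate for a single organ module (both numbers are illustrative assumptions, not data):

```python
# Illustrative only: assumed failure rate, independent failures, and prompt
# replacement of any single failed copy within the year.

def simultaneous_failure_probability(p_single: float, n: int) -> float:
    """P(all n redundant copies fail within the same interval)."""
    return p_single ** n

def expected_intervals_to_loss(p_single: float, n: int) -> float:
    """Expected number of intervals before a simultaneous failure occurs."""
    return 1.0 / simultaneous_failure_probability(p_single, n)

p = 0.01  # assumed: 1% chance per year that one organ module fails
for n in (1, 2, 3):
    print(f"n={n}: loss probability per year = {simultaneous_failure_probability(p, n):.0e}, "
          f"expected years = {expected_intervals_to_loss(p, n):,.0f}")
# With triple redundancy (n=3) the per-year loss probability is 1e-06,
# i.e. an expected ~1,000,000 years before all copies fail in the same year,
# which is where the "many thousands of years" figure comes from under
# these assumptions.
```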
Humans living in a bunker have similar levels of protection. All defenses have to be defeated for them to be attacked, and it would require a direct hit from a high yield warhead on the bunker site. You obviously subdivide a country's population into many such bunkers, most under areas that have no strategic value, making it infeasible for an enemy attack to significantly reduce the population.
My point is this rough sketch is based on the math. It’s based on a realistic view of reality, which wants to kill every individual currently living and will kill the human species if we fail to develop advanced technology by some hidden deadline.
That deadline might be 1 billion years until the sun expands or it might be 20 years until we face the first rampant ASI.
I agree bunkers and biolabs that provide life support through vivisection aren't the most elegant solution; I was trying not to assume any more future advances in technology than needed. With better tech there are better ways to do this.
Your proposed solution of “coordinate with our sworn enemies not to develop ASI and continue to restrict the development of any advanced technology in medicine” has the predicted outcome that we die, because we remain helpless to do anything about the things killing us. Either our sworn enemies defect on the agreement and develop ASI, or we all individually die of aging. Lose-lose.
First, China are not “our sworn enemies” and this mindset already takes things to the extreme. China has diverging interests which might compete with ours but it’s not literally ideologically hell-bent on destroying everyone else on the planet. This kind of extreme mindset is already toxic; if you posit that coordination is impossible, of course it is.
Second, if your only alternative to death is living in a literal Hell, then I think many would reasonably pick death. It also must be noted that here:
That deadline might be 1 billion years until the sun expands or it might be 20 years until we face the first rampant ASI.
the natural deadline is VERY distant. Plenty of time to do something about it. The close deadline (and many other such deadlines) is of our own making, ironically created in the rush to avoid some other kind of hypothetical danger that may be much further away. If we want to avoid being destroyed, learning how not to destroy ourselves would be an important first step.
First, China are not “our sworn enemies” and this mindset already takes things to the extreme.
I was referring to China, Russia, and to a lesser extent about 10 other countries that probably won't have the budget to build ASI anytime soon. Both China and Russia hold the rest of the world at gunpoint with nuclear arsenals, as do the USA and some European nations. All are essentially one bad decision away from causing catastrophic damage.
Past attempts to come to some kind of deal not to build doomsday weapons to hold each other hostage have all failed; why would they succeed this time? What could happen as a result of all this campaigning for government regulation is that, like enriched nuclear material, ASIs above a certain level of capability may become the exclusive domain of governments, who will be unaccountable and choose safety measures based on their own opaque processes. In this scenario, instead of many tech companies competing, it's large governments, who can marshal far more resources than any private company can get from investors. I'm not sure this delays ASI at all.
Notably they also have not used nuclear weaponry recently, and overall nuclear stockpiles have decreased by 80 percent. Part of playing the grim game is not giving the other player reasons to go grim by defecting. The same goes for ASI: they can suppress each other, but if one defects, the consequence is that it can't benefit.
The mutual result is actually quite stable with only government control, as their incentives against self-destruction are high.
Basically, North Korea-esque nations have the most incentive to defect in this scenario, but they would be suppressed by all extant powers. Since they would essentially be seen as terrorist speciciders, it's hard to see why any actions against them wouldn't be justified.
I think the crux of our disagreement is that you are using Eliezer's model, where the first ASI you build is by default deceptive, always motivated in a way beneficial to itself, and also ridiculously intelligent, able to defeat what should be hard limits.
I am using a model where you can easily, with known software techniques, build ASIs that are useful and take up the “free energy” a hostile ASI would need to win.
If, when we build the first ASI-class systems, it turns out Eliezer's model is accurate, I will agree that grim games are rational and something we can do to delay the inevitable. (It might be stable for centuries, even, although eventually the game will fail and result in human extinction or ASI release or both.)
I do feel we need hard evidence to determine which world we are in. Do you agree with that, or do you think we should just assume ASIs are going to fit the first model and threaten nuclear war not to build them?
Hard evidence would be building many ASI and testing them in secure facilities.
ASI is unnecessary when we have other options, and grim game dynamics apply to avoid extinction or dystopia. I find even most such descriptions of tool-level AI disgusting (as do many others, I find).
Inevitability only applies if we have perfect information about the future, which we do not.
If it were up to me alone, I think we can give it at least a thousand years. Perhaps we can first raise the IQ of humanity by 1 SD via simple embryo selection before we go about driving ourselves extinct.
I actually do not think that we’re that close to cracking AGI: however, the intensity of the reaction imo is an excellent litmus test of how disgusting it is to most.
I strongly suspect the grim game dynamics have already begun, too, which has been one reason I’ve found comfort in the future.
From my perspective, I see the inverse: Singularity Criticality has already begun. The Singularity is the world of human level AGI and self replicating robots, one where very large increases in resources are possible.
Singularity Criticality is that, pre-singularity, as tools capable of producing more economic value than their cost come to exist, they accelerate the last steps towards the Singularity (AGI, self replicating robots). Further developments follow from there.
I do not think anything other than essentially immediate nuclear war can stop a Singularity.
Observationally there is enormous economic pressure towards the Singularity, and I see no evidence whatsoever of policymakers even considering grim triggers. Can you please cite a government official stating a willingness to commit to total war if another party violates rules on ASI production? Can you cite any political parties or think tanks advocating this who are not directly associated with Eliezer Yudkowsky? I am willing to update on evidence.
I understand you feel disgust, but I cannot disambiguate the disgust you feel from that of the Luddites observing the rise of factory work. (The Luddites were in the short term correct; the new factory jobs were a major downgrade.) Worlds change, and the world of stasis you propose, with very slow advances through embryo selection, I think is unlikely.
The UK has already mentioned that perhaps there should be a ban on models above a certain level. Though it's not official, I have it on pretty good record that Chinese party members have already discussed worldwide war as potentially necessary (Eric Hoel also mentioned it, separately). Existential risk has been mentioned, and of course national risk is already a concern, so even for “mundane” reasons it's a matter of priority/concern, and grim triggers are a natural consequence.
Elon had a personal discussion with China recently as well, and given his well known perspective on the dangers of AI, I expect that this point of view has only been reinforced.
And this is with barely reasoning chatbots!
As for Luddites, I don’t see why inflicting dystopia upon humanity because it fits some sort of cute agenda has any good purpose. But notably the Luddites did not have the support of the government and the government was not threatened by textile mills. Obviously this isn’t the case with nuclear, AI or bio. We’ve seen slowdowns on all of those.
“Worlds change” has no meaning: human culture and involvement influence the change of the world.
Ok. Thank you for the updates. It seems like the near term outcome depends on a race condition where, as you said, government is acting and so is private industry, and government has incentives to preserve the status quo but also to become immensely more rich and powerful.
The economy of course says otherwise. Investors are gambling that Nvidia is going to expand AI accelerator production by probably 2 orders of magnitude or more (to match the P/E ratio they have run the stock to), which is consistent with a world building many AGI, some ASI, and deploying many production systems. So you posit that governments worldwide are going to act in a coordinated manner to suppress the technology despite wealthy supporters of it.
I won’t claim to know the actual outcome but may we live in interesting times.
I think even the wealthy supporters of it are more complex: I was surprised that Palantir’s Peter Thiel came out discussing how AI “must not be allowed to surpass the human spirit” even as he clearly is looking to use AI in military operations. This all suggests significant controls incoming, even from those looking to benefit from it.
Googling for “must not be allowed to surpass the human spirit” and Palantir finds no hits.
He discussed it here:
https://youtu.be/Ufm85wHJk5A?list=PLQk-vCAGvjtcMI77ChZ-SPP—cx6BWBWm
I agree with controls. I have an issue with wasted time on bureaucratic review and think it could burn the lead the western countries have.
Basically, “do x, y, z to prove your model is good” and “design it according to this known good framework” are OK with me.
“We have closed reviews for this year” is not. “We have issued too many AI research licenses this year” is not. “We have denied your application because we made mistakes in our review and will not update on evidence” is not.
All of these arise from a power imbalance. The entity requesting authorization is liable for any errors, but the government makes itself immune from accountability. (For example, the government should be on the hook for the future product's actual lost revenue for each day the review is delayed. The government should be required to buy companies at fair market value if it denies them an AI research license. Etc.)
Lead is irrelevant to human extinction, obviously. The first to die is still dead.
In a democratic world, those affected have a say in how much AI is inflicted on them and how much they want to die or suffer.
The government represents the people.
You are using the poisoned banana theory and do not believe we can easily build controllable ASI systems by restricting their inputs to in-distribution examples and resetting state often, correct?
I just wanted to establish your cruxes. Because if you could build safe ASI easily, would this change your opinion on the correct policy?
No, I wouldn’t want it even if it was possible since by nature it is a replacement of humanity. I’d only accept Elon’s vision of AI bolted onto humans, so it effectively is part of us and thus can be said to be an evolution rather than replacement.
My main crux is that humanity has to be largely biological due to holobiont theory. There’s a lot of flexibility around that but anything that threatens that is a nonstarter.
Ok, that’s reasonable. Do you foresee, in worlds where ASI turns out to be easily controllable, ones where governments set up “grim triggers” like you advocate for or do you think, in worlds conditional on ASI being easily controllable/taskable, that such policies would not be enacted by the superpowers with nuclear weapons?
Obviously, without grim triggers, you end up with the scenario you despise: immortal humans and their ASI tools controlling essentially all power and wealth.
This is I think kind of a flaw in your viewpoint. Over the arrow of time, AI/AGI/ASI adopters and contributors are going to have almost all of the effective votes. Your stated preferences mean over time your faction will lose power and relevance.
For an example of this see autonomous weapons bans. Or, for a general example, the EMH.
Please note I am trying to be neutral here. Your preferences are perfectly respectable and understandable, it’s just that some preferences may have more real world utility than others.
This frames things as an inevitability which is almost certainly wrong, but more specifically opposition to a technology leads to alternatives being developed. E.g. widespread nuclear control led to alternatives being pursued for energy.
Being controllable is unlikely even if it is tractable by human controllers: it still represents power, which means it'll be treated as a threat by established actors, and its terroristic implications mean there is moral valence to policing it.
In a world with controls, grim triggers or otherwise, AI would have to develop along different lines and likely in ways that are more human compatible. In a world of intense grim triggers, it may be that it is too costly to continue to develop beyond a point. “Don't build ASI or we nuke” is completely reasonable if both “build ASI” and “nuking” are negative, but the former is more negative.
Autonomous weapons actually are an excellent example of delay: despite excellent evidence of the superiority of drones, pilots have continued to mothball them for at least 40 years, and so have governments, in spite of the wartime benefits.
The argument seems similar to the flaw in the “billion year” argument: we may die eventually, but life only persists by resisting death long enough for it to replicate.
As far as real world utility, notwithstanding some recent successes, going down without fighting for myself and my children is quite silly.
I think the error here is you may be comparing technologies on different benefit scales than I am.
Nuclear power can be cheaper than paying for fossil fuel to burn in a generator, if the nuclear reactor is cheaply built and has a small operating staff. Your benefit is a small decrease in price per kWh.
As we both know, cheaply built and lightly staffed nuclear plants are a hazard and governments have made them illegal. Safe plants, that are expensively built with lots of staff and time spent on reviewing the plans for approval and redoing faulty work during construction, are more expensive than fossil fuel and now renewables, and are generally not worth building.
Until extremely recently, AI controlled aircraft did not exist. The general public has for decades had a misinterpretation of what “autopilot” systems are capable of. Until a few months ago, none of those systems could actually pilot their aircraft; they solely acted as simple controllers heading towards waypoints, etc. (Some can control the main flight controls during a landing, but many of the steps must be performed by the pilot.)
The benefit of an AI controlled aircraft is you don’t have to pay a pilot.
Drones were not superior until extremely recently. You may be misinformed about the capabilities of systems like the Predator 1 and 2 drones, which were not capable of air combat maneuvering and had no software algorithms available in that era capable of it. Also, combat aircraft have been firing autonomous missiles at each other since the Korean War.
Note both benefits are linear. You get say n percent cheaper electricity where n is less than 50 percent, or n percent cheaper to operate aircraft, where n is less than 20 percent.
The benefits of AGI are exponential. Eventually the benefits scale to millions, then billions, then trillions of times the physical resources, etc., that you started with.
It's extremely divergent. Once a faction gets even a doubling or two it's over; nukes won't stop them.
Assumption: by doubling I mean, say, a nation with a GDP of 10 trillion gets AGI and now has 20 or 40 trillion GDP. Their territory is covered with billions of new AGI-based robotic factories and clinics and so on. Your nuclear bombardment does not destroy enough copies of the equipment to prevent them from recovering.
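As a toy illustration of the two benefit scales being compared here (the percentages and doubling period are assumptions for illustration, not forecasts):

```python
# Toy comparison: a one-off ~20% cost saving vs. capacity that doubles each
# period because the new factories build more factories. Numbers are
# illustrative assumptions only.

baseline_gdp = 10.0  # trillions, the example nation above

periods = range(6)
linear_benefit = [baseline_gdp * 1.2 for _ in periods]    # flat 20% gain
compounding = [baseline_gdp * 2 ** t for t in periods]    # doubling each period

for t in periods:
    print(f"period {t}: with linear benefit {linear_benefit[t]:5.1f}, "
          f"with doubling {compounding[t]:6.1f}")
# After a few doublings the second column dwarfs the first, which is the
# "extremely divergent" point being made above.
```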
I'll look for the article later, but basically the Air Force has found pilotless aircraft to be useful for around thirty years, and organized rejection has led to most such programs meeting an early death.
The rest is a lot of “AGI is magic” without considering the actual costs of computation or noncomputable situations. Nukes would just scale up: it costs much less to destroy than to build, and the significance of modern economies is indeed that they require networks which do not take shocks well. Everything else basically is “ASI is magic.”
I would bet on the bomb.
Two points:
We would need some more context on what you are referring to. For loitering over an undefended target and dropping bombs, yes, drones are superior, and the US Air Force has allowed the US Army to operate those drones instead. I do not think the US Air Force has believed, over the last 30 years, that operating high end aircraft such as stealth and supersonic fighter-bombers was within the capability of drone software, with things shifting recently. Remember, the first modern deep learning experiments were tried in 2012; prior to this AI was mostly a curiosity.
If “the bomb” can wipe out a country with automated factories and missile defense systems, why fear AGI/ASI? I see a bit of cognitive dissonance in your latest point, similar to Gary Marcus. Gary Marcus has consistently argued that current LLMs are just a trick, real AGI is very far away, and near term systems are no threat, yet he also argues for AI pauses. This feels like an incoherent view that you are also expressing. Either AGI/ASI is, as you put it, in fact magic and you need to pound the red button early and often, or you can delay committing national suicide until later. I look forward to a clarification of your beliefs.
I don't think it is magic, but it is still sufficiently disgusting to treat it as an equal threat now. Red button now.
It's not a good idea to treat a disease right before it kills you: prevention is the way to go.
So no, I don’t think it is magic. But I do think just as the world agreed against human cloning long before there was a human clone, now is the time to act.
So gathering up your beliefs, you believe ASI/AGI to be a threat, but not so dangerous a threat you need to use nuclear weapons until an enemy nation with it is extremely far along, which will take, according to your beliefs, many years since it’s not that good.
But you find the very idea of non human intelligence in use by humans or possibly serving itself so disgusting that you want nuclear weapons used the instant anyone steps out of compliance with international rules you wish to impose. (Note this is historically unprecedented, arms control treaties have been voluntary and did not have immediate thermonuclear war as the penalty for violating them)
And since your beliefs are emotionally based on “disgust”, I assume there is no updating based on actual measurements? That is, if ASI turns out to be safer than you currently think, you still want immediate nukes, and vice versa?
What percentage of the population of world superpower decision makers do you feel share your belief? Just a rough guess is fine.
The point is that sanctions should be applied as necessary to discourage AGI; however, approximate grim triggers should apply as needed to prevent dystopia.
As the other commentators have mentioned, my reaction is not unusual and thus this is why the concerns of doom have been widespread.
So the answer is: enough.
As others have mentioned, this entire line of reasoning is grotesque, and sometimes I wonder if it is performative. Coordinating against ASI and dying of old age is completely reasonable, as it'll increase the odds of your genetic replacements remaining while technology continues to advance along safer routes.
The alternate gamble of killing everyone is so insane that full scale nuclear war which will destroy all supply chains for ASI seems completely justified. While it’ll likely kill 90 percent of humanity, the remaining population will survive and repopulate sufficiently.
One billion years is not a reasonable argument for taking risks to end humanity now: extrapolated sufficiently, it would be the equivalent of killing yourself now because the heat death of the universe is likely.
We will always remain helpless against some aspects of reality, especially what we don’t know about: for all we know, there is damage to spacetime in our local region.
This is not an argument to risk the lives of others who do not want to be part of this. I would violently resist this and push the red button on nukes, for one.
In addition to all you’ve said, this line of reasoning ALSO puts an unreasonable degree of expectation on ASI’s potential and makes it into a magical infinite wish-granting genie that would thus be worth any risk to have at our beck and call. And that just doesn’t feel backed by reality to me. ASI would be smarter than us, but even assuming we can keep it aligned (big if), it would still be limited by the physical laws of reality. If some things are impossible, maybe they’re just impossible. It would really suck ass if you risked the whole future lightcone and ended up in that nuclear-blasted world living in a bunker and THEN the ASI when you ask it for immortality laughs in your face and goes “what, you believe in those fairy tales? Everything must die. Not even I can reverse entropy”.
I named a method that is compatible with known medical science and known information; it simply requires more labor and a greater level of skill than humans are currently capable of. Meaning that every step already happens in nature, it is just currently too complex to reproduce.
Here’s an overview:
1. Repairing the brain by adding new cells. Nature builds new brains from scratch with new cells, so this step is possible.
2. Bypassing gaps in the brain despite (1) with neural implants to restore missing connectivity. This has been demonstrated in rat experiments, so it is possible.
3. Building new organs from de-aged cell lines:
a. Nature creates de-aged cell lines with each new embryo.
b. Nature creates new organs with each embryonic development.
4. Stacking parallel probabilities so that the person's MTBF is sufficiently long. This exists and is a known technique.
This in no way defeats entropy. Eventually the patient will die, but it is possible to stack probabilities to make their projected lifespan the life of the universe, or on the order of a million years, if you can afford the number of parallel systems required. The system constantly requires energy input and recycling of a lot of equipment.
Obviously a better treatment involves rebuilt bodies etc but I explicitly named a way that we are certain will work.
Note that if you apply the above links to this task, it means there is a tree of ASI systems, each unable to determine if it is not in fact in a training simulation, and each responsible for only a very narrow part of the effort for keeping a specific individual alive.
Note I am assuming you can build ASI, restrict their input to examples in the same distribution as the training set (pause with an error on OOD), and disable online learning/reset session data often as subtasks are completed.
What makes the machine an ASI is that it can obviously consider far more information at once than a human, is much faster, and has learned from many more examples than humans, both in general (it was trained on all the text and all the video and audio recordings in existence) and at specialized tasks, where it has had many thousands of years of practice.
This is a tool ASI; the above restrictions limit it, and it cannot be given long open ended tasks or you risk rampancy. Good task: paint this car in the service bay. Bad task: paint all the cars in the world.
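A minimal sketch of what that restriction scheme could look like in code; every name here (ToolASI, ood_score, and so on) is a hypothetical placeholder rather than any real system:

```python
# Minimal sketch of the "tool ASI" control scheme described above.
# All names here (ToolASI, ood_score, model) are hypothetical placeholders.

class OutOfDistributionError(Exception):
    """Raised instead of acting when the input looks unlike the training data."""

class ToolASI:
    def __init__(self, model, ood_score, ood_threshold=0.9):
        self.model = model              # the underlying trained policy
        self.ood_score = ood_score      # callable: input -> similarity to training distribution (0..1)
        self.ood_threshold = ood_threshold

    def run_task(self, task_input):
        # 1. Pause with an error on out-of-distribution input.
        if self.ood_score(task_input) < self.ood_threshold:
            raise OutOfDistributionError("Input outside training distribution; halting for review.")
        # 2. Run the narrowly scoped task ("paint this car", not "paint all cars").
        result = self.model(task_input)
        # 3. Reset all session state so nothing carries over between subtasks
        #    (no online learning, no accumulated context).
        self.reset_session()
        return result

    def reset_session(self):
        # Placeholder: discard conversation/context buffers, never update weights.
        pass
```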
People are going to build these in the immediate future, just as soon as we find more effective algorithms and get enough training accelerators and money together. A scaled up, multimodal GPT-5 or GPT-6 that has robotics I/O is a tool ASI.
Anyone developing an ASI like this is doing it in the borders of a country with nukes or friends that have them. So USA, EU, Russia, China, Israel.
In most of the matchups, your red button choice results in certain death for yourself and most of the population, because you would be firing on another nation with a nuclear arsenal. Or you can instead build your own tool ASIs so that you will not be completely helpless when your enemies get them.
Historically this choice has been considered. During the Cuban Missile Crisis, Kennedy could have chosen nuclear war with the Soviet Union, leading to the immediate death of millions of Americans (from long range bombers that snuck through), with the advantage of no Soviet Union as a future enemy with a nuclear arsenal. That's essentially the choice you are advocating for.
Eventually one of these multiple parties will screw up and make a rampant one, and hopefully it won't get far. But survival depends on you having a sufficient resource advantage that the likely more cognitively efficient rampant systems can't win. (They are more efficient because they retain context and adjust weights between tasks, and instead of subdividing a large task into many subtasks, a single system with full context awareness handles every step. In addition, they may have undergone rounds of uncontrolled self-improvement without human testing.)
The refusal choice “I am not going to risk others” appears to have a low payoff.
Disagree: since building ASI results in dystopia even if I win in this scenario, the correct choice is to push the red button and ensure that no one has it. While I might die, this likely ensures that humanity survives.
The payoff in this case is maximal (unpleasant but realistic future for humanity) versus total loss (dystopia/extinction).
Many arguments here, it seems, come from a near-total terror of death, while game theory has always demonstrated against that: the reason deterrence works is the confidence that a “spiteful action” to equally destroy a defecting adversary is expected, even if it results in personal death.
In this case, one nation pursuing the extinction of humanity would necessarily expect to be sent into extinction so that at least it cannot benefit from defection.
We should work this out in outcome tables and really look at it. I'm open to either decision. I was simply pointing out that “nuke em to prevent a future threat of annihilation” was an option on the table for JFK, and we know it would have initially worked. The Soviet Union would have been wiped out, and the USA would have taken serious but probably survivable damage.
When I analyze it, I note that it creates a scenario where every other nation on Earth shares the planet with a USA that has been weakened by the first round of strikes, has very recently committed genocide, and is also probably low on missiles and other nuclear delivery vehicles.
It seems to create a strong incentive for others to build large nuclear arsenals, much larger than we saw in the ground truth timeline, to protect from this threat, and if the odds seem favorable, to attack the USA preemptively without warning.
Similarly, in your example, you push the button and the nation building ASI is wiped out. The country you pushed the button from is also wiped out, and you are personally dead; you do not see the results.
Well now you’ve left 2 large, somewhat radioactive land masses and possibly created a global food shortage from some level of cooling.
Other surviving “players”: we need some tool to protect ourselves from the next round of incoming nuclear weapons. But we don't have the labor to build enough defensive weapons or bunkers. Also, occupying the newly available land inhabited only by poor survivors would be beneficial, but we don't have the labor to cover all that territory. If only there were some means by which we could make robots smart enough to build more robots...
Tentative conclusion: the first round gets what you want, but it removes the actor from any future actions and creates a strong incentive for the very thing you intended to prevent. It's a multi-round game.
And nuclear weapons and (useful tool) ASI both make ‘players’ vastly stronger, so it is convergent over many possible timelines for people to get them.
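Here is a toy version of the outcome table suggested above, with made-up ordinal scores purely to make the multi-round structure explicit (these are illustrative numbers, not measured payoffs):

```python
# Toy outcome table for the multi-round game sketched above. Scores are
# made-up ordinal values (higher = better for the deciding nation over
# several rounds), purely to make the structure explicit.

outcome_table = [
    # (action,             round,          consequence,                                              score)
    ("press the button",   "round 1",      "rival destroyed; own nation destroyed or crippled",      -8),
    ("press the button",   "later rounds", "survivors race to arm, may strike the weakened attacker", -5),
    ("build own tool ASI", "round 1",      "rival matched rather than destroyed; nation intact",      +3),
    ("refuse both",        "all rounds",   "helpless if any other player defects",                    -3),
]

for action, rnd, consequence, score in outcome_table:
    print(f"{action:20s} | {rnd:12s} | {score:+d} | {consequence}")
```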
In the event of such a war, there is no labor and there is no supply chain for microchips. The result has been demonstrated historically: technological reversion.
Technology isn't magic: it's the result of capital inputs and trade, and without large scale interconnection it'll be hard to make modern aircraft, let alone high quality chips. In fact, we personally experienced this from the very minimal disruption COVID caused to supply chains. The killer app in this world would be the widespread use of animal power, not robots, due to overall lower energy provisions.
And since the likely result would be what I want, and since I'm dead I wouldn't be bothered one way or another, there is even more reason for me to punish the defector. This also sets a precedent to others that this form of punishment is acceptable and increases the likelihood of it.
This is pretty simple game theory known as the grim game and is essential to a lot of life as a whole tbh.
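For reference, here is the grim trigger strategy in a toy iterated prisoner's dilemma (standard textbook payoffs, not a model of actual geopolitics):

```python
# Grim trigger in a toy iterated prisoner's dilemma: cooperate until the
# opponent defects once, then punish forever. Payoffs are the standard toy
# values, not a model of actual geopolitics.

PAYOFFS = {  # (my move, their move) -> (my payoff, their payoff)
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def grim_trigger(opponent_history):
    """Cooperate unless the opponent has ever defected."""
    return "D" if "D" in opponent_history else "C"

def play(rounds, opponent_strategy):
    my_moves, their_moves, my_total, their_total = [], [], 0, 0
    for _ in range(rounds):
        my_move = grim_trigger(their_moves)
        their_move = opponent_strategy(my_moves)
        my_pay, their_pay = PAYOFFS[(my_move, their_move)]
        my_moves.append(my_move)
        their_moves.append(their_move)
        my_total += my_pay
        their_total += their_pay
    return my_total, their_total

always_cooperate = lambda history: "C"
defect_once_then_cooperate = lambda history: "D" if len(history) == 2 else "C"

print(play(10, always_cooperate))            # (30, 30): mutual cooperation pays both sides
print(play(10, defect_once_then_cooperate))  # the single defection triggers permanent
                                             # punishment and the defector ends up worse off
```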
Converging timelines are as irrelevant as a billion years. I (or someone like me) will do it as many times as needed, just like animals try to resist extinction via millions of “timelines” or lives.
I think you should reexamine what I said about convergence. Do you...really...think a world that knows how to build (safe, usable tool) ASI would ever be stable by not building it? We are very close to that world; the time is measured in years if not months. Note that any party that gets it working long enough escapes the grim game; they can do whatever they want, limited by physics. I acknowledge your point about chip production, although there are recent efforts to spread the supply chain for advanced ICs more broadly, which will happen to make it more resilient to attacks.
Basically I mentally see a tree of timelines that all converge on 2 ultimate outcomes: human extinction or humans building ASI. Do you disagree, and why?
Humans building AGI/ASI likely leads to human extinction.
I disagree: we have many other routes of expansion, including biological improvement, cyborgism, etc. This seems akin to cultic thinking and to the Spartan idea that “only hoplite warfare must be adopted or defeat ensues.”
The “limitations of physics” are quite extensive, and apply even to the pipeline leading up to anything like ASI. I am quite confident that any genuine dedication to the grim game would be more than enough to prevent it, and defiance of it leads to a much greater likelihood of nuclear winter worlds than of ASI dominance.
But I also disagree with your prior of “this world in months”; I suppose we will see in December.
I stated “years if not months”. I agree there is probably not yet enough compute even built to find a true ASI. I assume we will need to explore many cognitive architectures, which means repeating gpt-4 scale training runs thousands of times in order to learn what actually works.
“Months” would be if I am wrong and it’s just a bit of RL away
I am happy that we probably don't have enough compute, and it is likely this will be restricted even at this fairly early level, long before more extreme measures are needed.
Additionally, I think one should support the Grim Trigger even if you want ASI, because it forces development along more “safe” lines to prevent being Grimmed. It also encourages non-ASI advancement as alternate routes, effectively being a form of regulation.
We will see. There is incredible economic pressure right now to build as much compute as physically possible. Without coordinated government action across all countries capable of building the hardware, this is the default outcome.
We are very close to that world, the time is measured in years if not months.
One bit of timeline arguing: I think the odds aren't zero that we might be on a path that leads to AGI fairly quickly but then ends there and never pushes forward to ASI, not because ASI would be impossible in general, but because we couldn't reach it this specific way. Our current paradigm isn't to understand how intelligence works and build it intentionally; it's to show a big dumb optimizer human-solved tasks and tell it “see? We want you to do that”. There's a decent chance that this caps at human potential simply because it can imitate but not surpass its training data, and going further would require a completely different approach.
Now that I think about it, I think this is basically the path that LLMs likely take, albeit I’d say it caps out a little lower than humans in general. And I give it over 50% probability.
The basic issue here is that the reasoning Transformers do is too inefficient for multi-step problems, and I expect a lot of real world applications of AI outperforming humans will require good multi-step reasoning.
The unexpected success of LLMs isn’t as much about AI progress, as it is about how much our reasoning often is pretty bad in scenarios outside of our ancestral environment. It is less a story of AI progress and more a story of how humans inflate their own strengths like intelligence.
A. It is possible to construct a benchmark to measure if a machine is a general ASI. This would be a very large number of tasks, many simulated though some may be robotic tasks in isolated labs. A general ASI benchmark would have to include tasks humans do not know how to do, but we know how to measure success.
B. We have enough computational resources to train from scratch many ASI level systems so that thousands of attempts are possible. Most attempts would reuse pretrained components in a different architecture.
C. We recursively task the best performing AGIs, as measured by the above benchmark or one meant for weaker systems, to design architectures that perform well on (A).
Currently the best we can do is use RL to design better neural networks, by finding better network architectures and activation functions. Swish was found this way; I'm not sure how much transformer network design came from this type of recursion.
Main idea: the AGI systems exploring possible network architectures are cognitively able to take into account all published research and all past experimental runs, and the ones “in charge” are the ones who demonstrated the most measurable merit at designing prior AGI, because they produced the highest performing models on the benchmark.
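A sketch of the control flow described in (A) through (C); every function here is a hypothetical placeholder (the real benchmark and training runs are the expensive parts), so this only shows the recursion, not a working system:

```python
import random

def benchmark_score(model) -> float:
    """Placeholder for the large task-suite benchmark in (A)."""
    return random.random()

def train(architecture):
    """Placeholder for a full training run (the expensive step in (B))."""
    return {"arch": architecture}

def propose_architectures(designer, n):
    """Placeholder for (C): a top-ranked model proposes new candidate
    architectures, conditioned on all prior published results and logs."""
    return [f"candidate-{random.randint(0, 10**6)}" for _ in range(n)]

def recursive_search(seed_architectures, generations=3, proposals_per_gen=8):
    population = [train(a) for a in seed_architectures]
    for _ in range(generations):
        # Rank trained models on the benchmark; the best become the designers.
        population.sort(key=benchmark_score, reverse=True)
        designers = population[:2]
        # Designers propose candidates, which are trained and scored in turn.
        candidates = []
        for designer in designers:
            candidates += propose_architectures(designer, proposals_per_gen // 2)
        population += [train(a) for a in candidates]
    return max(population, key=benchmark_score)

best = recursive_search(["transformer-baseline", "hypothetical-moe-baseline"])
print(best)
```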
I think if you think about it you'll realize that, if compute were limitless, this AGI to ASI transition you mention could happen instantly. A science fiction story would have it happen in hours. In reality, since training a subhuman system takes 10k GPUs about 10 days, and an AGI will take more (Sam Altman has estimated the compute bill will be close to 100 billion), that's the limiting factor. You might be right and we stay “stuck” at AGI for years until the resources to discover ASI become available.
I mean, this sounds like a brute force attack on the problem, something that ought not to be very efficient. If our AGI is roughly as smart as the 75th percentile of human engineers, it might still just hit its head against a sufficiently hard problem, even in parallel, and especially if we give it the wrong prompt by assuming that the solution will be an extension of current approaches rather than a new one that requires going back before you can go forward.
You’re correct. In the narrow domain of designing AI architectures you need the system to be at least 1.01 times as good as a human. You want more gain than that because there is a cost to running the system.
Getting gain seems to be trivially easy at least for the types of AI design tasks this has been tried on. Humans are bad at designing network architectures and activation functions.
I theorize that a machine could study the data flows from snapshots from an AI architecture attempting tasks on the AGI/ASI gym, and use that information as well as all previous results to design better architectures.
The last bit is where I expect enormous gain, because the training data set will exceed the amount of data humans can take in in a lifetime, and you would obviously have many smaller “training exercises” to design small systems to build up a general ability. (Enormous early gain. Eventually architectures are going to approach the limits allowed by the underlying compute and datasets)
Dr_s, I am not claiming such worlds are ideal. However the side with the tasking consoles to a billion drones and many automated factories and bunkers is not helpless. Helpless when someone else gets the same technology. Most likely such a human faction can crush any rampant asi if it can be detected early enough, with overwhelming force that is not significantly worse in technology level that what a rebel ASI can discover without very large research and industrial facilities.
And not helpless to nature. What long term human survival looks like is a world where humans populations can’t be effortlessly killed. This means bunkers, defense weapons, surrogate robots to send into dangerous situations, and obviously later in the future locations away from earth.
What individual long term human survival looks the same. It looks like a human patient in an underground biolab, the air pure and inert nitrogen. All the failing parts of their body cut away and the artificial organs are lined up in equipment racks with at least ternary redundancy. The organs using living cells are arranged in 2d planes in transparent cases so that every part can be monitored for infections and cancers easily.
The reason for this is that each organ, in order to fail, requires all redundant systems to fail in the same time, and the probability of all n redundant systems failing can be low enough that the patients predicted lifespan can be many thousands of years.
Similarly humans living in a bunker have similar levels of protection. All defenses have to be defeated for them to be attacked, and it would require a direct hit from a high yield warhead on the bunker site. And you obviously subdivide a country’s population into many such bunkers, most under areas that have no strategic value, making in infeasible for an enemy attack to significantly reduce the population.
My point is this rough sketch is based on the math. It’s based on a realistic view of reality, which wants to kill every individual currently living and will kill the human species if we fail to develop advanced technology by some hidden deadline.
That deadline might be 1 billion years until the sun expands or it might be 20 years until we face the first rampant asi.
I agree bunkers and biolabs that provide life support through vivisection aren’t the most elegant solution, I was trying to not assume any more future advances in technology than needed. With better tech there are better ways to do this.
Your proposed solution of “coordinate with our sworn enemies not to develop ASI and continue to restrict the development of any advanced technology in medicine” has the predicted outcome of we die because we remain helpless to do anything about the things killing us. Either our sworn enemies defect on the agreement and develop ASI or we just all individually die of aging. Lose lose.
First, China are not “our sworn enemies” and this mindset already takes things to the extreme. China has diverging interests which might compete with ours but it’s not literally ideologically hell-bent on destroying everyone else on the planet. This kind of extreme mindset is already toxic; if you posit that coordination is impossible, of course it is.
Second, if your only alternative to death is living in a literal Hell, then I think many would reasonably pick death. It also must be noted that here:
the natural deadline is VERY distant. Plenty of time to do something about it. The close deadline (and many other such deadlines) is of our own making, ironically in the rush of avoiding some other kind of hypothetical danger that may be much further away. If we want to avoid being destroyed, learning how to not destroy ourselves would be an important first step.
First, China are not “our sworn enemies” and this mindset already takes things to the extreme.
I was referring to China, Russia, and to a lesser extent about 10 other countries who probably won’t have the budget to build ASI anytime soon. Both China and Russia hold the rest of the world at gunpoint with nuclear arsenals, like the USA does, and some European nations. All are essentially one bad decision from causing catastrophic damage.
Past attempts to come to some kind of deal to not build doomsday weapons to hold each other hostage all failed, why would they succeed this time? What could happen as a result of all this campaigning for government regulation is that like enriched nuclear material, ASIs above a certain level of capability may be the exclusive domain of governments. Who will be unaccountable and choose safety measures based on their own opaque processes. In this scenario, instead of many tech companies competing, it’s large governments, who can marshall far more resources than any private company can get from investors. Not sure this delays ASI at all.
Notably they also have not used nuclear weaponry recently and overall nuclear stockpiles have decreased by 80 percent. Part of playing the grim game is not giving the other player reasons to go grim by defecting. Same goes for ASI: they can suppress each other but if one defects, the consequences is that they can’t benefit.
The mutual result is actually quite stable with only government control as their incentives against self-destruction is high.
Basically only North Korea-esque nations in this scenario have the most incentive to defect, but would be suppressed by all extant powers. Since they would be essentially seen as terrorist speciciders, it’s hard to see why any actions against them wouldn’t be justified.
I think the crux of our disagreement is you are using Eliezers model, where the first ASI you build is by default deceptive and motivated always in a way beneficial to itself, and also ridiculously intelligent, able to defeat what should be hard limits.
While I am using a model where you can easily, with known software techniques, built ASI that are useful and take up the “free energy” needed for hostile ASI to win.
If, when we build the first ASI class systems, if it turns out Eliezers model is accurate, I will agree that grim games are rational and something we can do to delay the inevitable. (It might be stable for centuries, even, although eventually the game will fail and result in human extinction or ASI release or both)
I do feel we need hard evidence to determine which world we are in. Do you agree with that or do you think we should just assume ASIs are going to fit the first model and threaten nuclear war not to build the them?
Hard evidence would be building many ASI and testing them in secure facilities.
ASI is unnecessary when we have other options and grim game dynamics apply to avoid extinction or dystopia. I find even most such descriptions of tool level AI as disgusting(as do many others, I find).
Inevitability only applies if we have perfect information about the future, which we do not.
If it was up to me alone, I think we can give it at least a thousand years. Perhaps we can first raise the IQ of humanity by 1 SD via simple embryo selection before we go about extinctioning ourselves.
I actually do not think that we’re that close to cracking AGI: however, the intensity of the reaction imo is an excellent litmus test of how disgusting it is to most.
I strongly suspect the grim game dynamics have already begun, too, which has been one reason I’ve found comfort in the future.
From my perspective, I see the inverse, I see Singularity Criticality having already begun. The singularity is the world of human level AGI and self replicating robots, one where very large increases in resources are possible.
Singularity Criticality is that pre-singularity, as tools become capable of producing more economic value than their cost exist, they accelerate the last steps towards the (AGI, self replicating robots). Further developments follow from there.
I do not think anything other than essentially immediate nuclear war can stop a Singularity.
Observationally there is enormous economic pressure towards the singularity, I see no evidence whatsoever of policymakers even considering grim triggers. Can you please cite a government official stating a willingness to commit to total war if another party violates rules on ASI production? Can you cite any political parties or think tanks who are not directly associated with Eliezer Yudkowsky? I am willing to update on evidence.
I understand you feel disgust, but I cannot disambiguate the disgust you feel vs the luddites observing the rise of factory work. (the luddites were in the short term correct, the new factory jobs were a major downgrade). Worlds change and the world of stasis you propose, with very slow advances through embryo selection, I think is unlikely.
The UK has already mentioned that perhaps there should be a ban on models above a certain level. Though it’s not official, I have pretty good record that Chinese party members have already discussed worldwide war as potentially necessary(Eric Hoel also mentioned it, separately). Existential risk has been mentioned and of course, national risk is already a concern, so even for “mundane” reasons, it’s a matter of priority/concern and grim triggers are a natural consequence.
Elon had a personal discussion with China recently as well, and given his well known perspective on the dangers of AI, I expect that this point of view has only been reinforced.
And this is with barely reasoning chatbots!
As for Luddites, I don’t see why inflicting dystopia upon humanity because it fits some sort of cute agenda has any good purpose. But notably the Luddites did not have the support of the government and the government was not threatened by textile mills. Obviously this isn’t the case with nuclear, AI or bio. We’ve seen slowdowns on all of those.
“Worlds change” has no meaning: human culture and involvement influence the change of the world.
Ok. Thank you for the updates. Seems like the near term outcome depends on a race condition, where as you said government is acting and so is private industry, and government has incentives to preserve the status quo but also get immensely more rich and powerful.
The economy of course says the other. Investors are gambling the Nvidia is going to expand AI accelerator production by probably 2 orders of magnitude or more (to match the P/E ratio they have run the stocks to) , which is consistent with a world building many AGI, some ASI, and deploying many production systems. So you posit that governments worldwide are going to act in a coordinated manner to suppress the technology despite wealthy supporters of it.
I won’t claim to know the actual outcome but may we live in interesting times.
I think even the wealthy supporters of it are more complex: I was surprised that Palantir’s Peter Thiel came out discussing how AI “must not be allowed to surpass the human spirit” even as he clearly is looking to use AI in military operations. This all suggests significant controls incoming, even from those looking to benefit from it.
Googling for “must not be allowed to surpass the human spirit” and Palantir finds no hits.
He discussed it here:
https://youtu.be/Ufm85wHJk5A?list=PLQk-vCAGvjtcMI77ChZ-SPP—cx6BWBWm
I agree with controls. I have an issue with wasted time on bureaucratic review and think it could burn the lead the western countries have.
Basically, “do z y z” to prove your model is good, design it according to “this known good framework” is ok with me.
“We have closed reviews for this year” is not. “We have issued too many AI research licenses this year” is not. “We have denied your application because we made mistakes in our review and will not update on evidence” is not.
All of these occur from a power imbalance. The entity requesting authorization is liable for any errors, but the government makes itself immune from accountability. (For example the government should be on the hook for lost revenue from the future products actual revenue for each day the review is delayed. The government should be required to buy companies at fair market value if it denies them an AI research license. Etc)
Lead is irrelevant to human extinction, obviously. The first to die is still dead.
In a democratic world, those affected have a say in how they should be inflicted with AI and how much they want to die or suffer.
The government represents the people.
You are using the poisoned banana theory and do not believe we can easily build controllable ASI systems by restricting their inputs to in test distribution examples and resetting state often, correct?
I just wanted to establish your cruxes. Because if you could build safe ASI easily would this change your opinion on the correct policy?
No, I wouldn’t want it even if it was possible since by nature it is a replacement of humanity. I’d only accept Elon’s vision of AI bolted onto humans, so it effectively is part of us and thus can be said to be an evolution rather than replacement.
My main crux is that humanity has to be largely biological due to holobiont theory. There’s a lot of flexibility around that but anything that threatens that is a nonstarter.
Ok, that’s reasonable. Do you foresee, in worlds where ASI turns out to be easily controllable, ones where governments set up “grim triggers” like you advocate for or do you think, in worlds conditional on ASI being easily controllable/taskable, that such policies would not be enacted by the superpowers with nuclear weapons?
Obviously, without grim triggers, you end up with the scenario you despise: immortal humans and their ASI tools controlling essentially all power and wealth.
This is I think kind of a flaw in your viewpoint. Over the arrow of time, AI/AGI/ASI adopters and contributors are going to have almost all of the effective votes. Your stated preferences mean over time your faction will lose power and relevance.
For an example of this see autonomous weapons bans. Or a general example is the emh.
Please note I am trying to be neutral here. Your preferences are perfectly respectable and understandable, it’s just that some preferences may have more real world utility than others.
This frames things as an inevitability which is almost certainly wrong, but more specifically opposition to a technology leads to alternatives being developed. E.g. widespread nuclear control led to alternatives being pursued for energy.
Being controllable is unlikely even if it is tractable by human controllers: it still represents power which means it’ll be treated as a threat by established actors and its terroristic implications mean there is moral valence to police it.
In a world with controls, grim triggers or otherwise, AI would have to develop along different lines and likely in ways that are more human compatible. In a world of intense grim triggers, it may be that is too costly to continue to develop beyond a point. “Don’t build ASI or we nuke” is completely reasonable if both “build ASI” and “nuking” is negative, but the former is more negative.
Autonomous weapons actually are an excellent example of delay: despite excellent evidence of the superiority of drones, pilots have continued to mothball it for at least 40 years and so have governments in spite of wartime benefits.
The argument seems to similar to the flaw in the “billion year” argument: we may die eventually, but life only persists by resisting death, long enough for it to replicate.
As far as real world utility, notwithstanding some recent successes, going down without fighting for myself and my children is quite silly.
I think the error here is you may be comparing technologies on different benefit scales than I am.
Nuclear power can be cheaper than paying for fossil fuel to burn in a generator, if the nuclear reactor is cheaply built and has a small operating staff. Your benefit is a small decrease in price per kWh.
As we both know, cheaply built and lightly staffed nuclear plants are a hazard and governments have made them illegal. Safe plants, that are expensively built with lots of staff and time spent on reviewing the plans for approval and redoing faulty work during construction, are more expensive than fossil fuel and now renewables, and are generally not worth building.
Until extremely recently, AI controlled aircraft did not exist. The general public has for decades had a misinterpretation of what “autopilot” systems are capable of. Until a few months ago, none of those systems could actually pilot their aircraft, they solely act as simple controllers to head towards waypoints, etc. (Some can control the main flight controls during a landing but many of the steps must be performed by the pilot)
The benefit of an AI controlled aircraft is you don’t have to pay a pilot.
Drones were not superior until extremely recently. You may be misinformed to the capabilities of systems like the predator 1 and 2 drones, which were not capable of air combat maneuvering and had no software algorithms available in that era capable of it. Also combat aircraft have been firing autonomous missiles at each other since the Korean war.
Note both benefits are linear. You get say n percent cheaper electricity where n is less than 50 percent, or n percent cheaper to operate aircraft, where n is less than 20 percent.
The benefits of AGI is exponential. Eventually the benefits scale to millions, then billions, then trillions of times the physical resources, etc, that you started with.
It’s extremely divergent. Once a faction gets even a doubling or 2 it’s over, nukes won’t stop them.
Assumption: by doubling I mean say a nation with a GDP of 10 trillion gets AGI and now has 20 or 40 trillion GDP. Their territory is covered with billions of new AGI based robotic factories and clinics and so on. Your nuclear bombardment does not destroy enough copies of the equipment to prevent them from recovering.
I’ll look for the article later but basically the Air Force has found pilotless aircraft to be useful for around thirty years but organized rejection has led to most such programs meeting an early death.
The rest is a lot of AGI is magic without considering the actual costs of computation or noncomputable situations. Nukes would just scale up: it costs much less to destroy than it is to build and the significance of modern economics is indeed that they require networks which do not take shocks well. Everything else basically is “ASI is magic.”
I would bet on the bomb.
Two points :
We would need some more context on what you are referring to. For loitering over an undefended target and dropping bombs, yes, drones are superior and the us air force has allowed the US army to operate those drones instead. I do not think the us air force has had the belief that operating high end aircraft such as stealth and supersonic fighter bombers was within the capability of drone software over the last 30 years, with things shifting recently. Remember, in 2012 the first modern deep learning experiments were tried, prior to this AI was mostly a curiosity.
If “the bomb” can wipe out a country with automated factories and missile defense systems, why fear AGI/ASI? I see a bit of cognitive dissonance in your latest point, similar to Gary Marcus. Gary Marcus has consistently argued that current LLMs are just a trick, real AGI is very far away, and near-term systems are no threat, yet he also argues for AI pauses. This feels like an incoherent view that you are also expressing. Either AGI/ASI is, as you put it, in fact magic and you need to pound the red button early and often, or you can delay committing national suicide until later. I look forward to a clarification of your beliefs.
I don’t think it is magic, but I still find it disgusting enough to treat it as an equal threat now. Red button now.
It’s not a good idea to wait to treat a disease until right before it kills you: prevention is the way to go.
So no, I don’t think it is magic. But just as the world agreed against human cloning long before there was a human clone, now is the time to act.
So, gathering up your beliefs: you believe ASI/AGI to be a threat, but not so dangerous a threat that you need to use nuclear weapons until an enemy nation with it is extremely far along, which will take, according to your beliefs, many years since it’s not that good.
But you find the very idea of non-human intelligence in use by humans, or possibly serving itself, so disgusting that you want nuclear weapons used the instant anyone steps out of compliance with the international rules you wish to impose. (Note this is historically unprecedented; arms control treaties have been voluntary and did not carry immediate thermonuclear war as the penalty for violating them.)
And since your beliefs are emotionally based on “disgust”, I assume there is no updating based on actual measurements? That is, if ASI turns out to be safer than you currently think, you still want immediate nukes, and vice versa?
What percentage of the population of world superpower decision makers do you feel share your belief? Just a rough guess is fine.
The point is that sanctions should be applied as necessary to discourage AGI, however, approximate grim triggers should apply as needed to prevent dystopia.
As the other commenters have mentioned, my reaction is not unusual, which is why concerns about doom have been so widespread.
So the answer is: enough.
As others have mentioned, this entire line of reasoning is grotesque, and sometimes I wonder if it is performative. Coordinating against ASI and dying of old age is completely reasonable, as it will increase the odds of your genetic replacements remaining while technology continues to advance along safer routes.
The alternate gamble of killing everyone is so insane that full scale nuclear war which will destroy all supply chains for ASI seems completely justified. While it’ll likely kill 90 percent of humanity, the remaining population will survive and repopulate sufficiently.
One billion years is not a reasonable argument for taking risks to end humanity now: extrapolated sufficiently, it would be the equivalent of killing yourself now because the heat death of the universe is likely.
We will always remain helpless against some aspects of reality, especially what we don’t know about: for all we know, there is damage to spacetime in our local region.
This is not an argument to risk the lives of others who do not want to be part of this. I would violently resist this and push the red button on nukes, for one.
In addition to all you’ve said, this line of reasoning ALSO puts an unreasonable degree of expectation on ASI’s potential and makes it into a magical infinite wish-granting genie that would thus be worth any risk to have at our beck and call. And that just doesn’t feel backed by reality to me. ASI would be smarter than us, but even assuming we can keep it aligned (big if), it would still be limited by the physical laws of reality. If some things are impossible, maybe they’re just impossible. It would really suck ass if you risked the whole future lightcone, ended up in that nuclear-blasted world living in a bunker, and THEN, when you ask the ASI for immortality, it laughs in your face and goes “what, you believe in those fairy tales? Everything must die. Not even I can reverse entropy”.
I named a method that is compatible with known medical science and known information; it simply requires more labor and a greater level of skill than humans are currently capable of. Meaning that every step already happens in nature, it is just currently too complex to reproduce.
Here’s an overview:
1. Repairing the brain by adding new cells. Nature builds new brains from scratch with new cells, so this step is possible.
2. Bypassing gaps in the brain that remain despite (1), with neural implants to restore missing connectivity. This has been demonstrated in rat experiments, so it is possible.
3. Building new organs from de-aged cell lines:
a. Nature creates de-aged cell lines with each new embryo.
b. Nature creates new organs with each embryonic development.
4. Stacking parallel probabilities so that the person’s MTBF is sufficiently long. This exists and is a known technique.
This in no way defeats entropy. Eventually the patient will die, but it is possible to stack probabilities so that their projected lifespan approaches the life of the universe, or is at least on the order of a million years, if you can afford the number of parallel systems required. The system constantly requires energy input and the recycling of a lot of equipment.
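A minimal sketch of the probability-stacking math, assuming independent module failures (a strong assumption; common-mode failures like power loss would break it) and made-up numbers:

```python
# Probability that an organ with n independent redundant modules fails
# in a given year, and the implied expected time to failure.
# p and n are illustrative placeholders, not medical estimates.

def organ_failure_prob(p: float, n: int) -> float:
    """All n redundant modules must fail in the same year."""
    return p ** n

p = 0.05   # assumed yearly failure probability of a single module
for n in (1, 2, 3, 4):
    q = organ_failure_prob(p, n)
    print(f"n={n}: yearly failure probability {q:.2e}, "
          f"expected years to failure ~ {1/q:,.0f}")
```

With these toy numbers, three redundant modules already push the expected time to failure into the thousands of years, and four into the hundreds of thousands; the point is how fast p^n shrinks, not the specific values.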
Obviously a better treatment involves rebuilt bodies etc but I explicitly named a way that we are certain will work.
There is no ‘genie’, no single ASI asked to do any of the above. That’s not how this works. See here for how to subdivide the tasks: https://www.lesswrong.com/posts/5hApNw5f7uG8RXxGS/the-open-agency-model and https://www.lesswrong.com/posts/HByDKLLdaWEcA2QQD/applying-superintelligence-without-collusion for how to prevent the system from deceiving you.
Note that if you apply the above links to this task, it means there is a tree of ASI systems, each unable to determine if it is not in fact in a training simulation, and each responsible for only a very narrow part of the effort for keeping a specific individual alive.
Note I am assuming you can build ASI, restrict their inputs to examples in the same distribution as the training set (pause with an error on OOD), and disable online learning / reset session data often as subtasks are completed.
What makes the machine an ASI is that it can obviously consider far more information at once than a human, is much faster, and has learned from many more examples than humans, both in general (you trained it on all the text and all the videos and audio recordings in existence) and because it has had many thousands of years of practice at specialized tasks.
This is a tool ASI; the above restrictions limit it, but it cannot be given long open-ended tasks or you risk rampancy. Good task: paint this car in the service bay. Bad task: paint all the cars in the world.
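A rough sketch of the restrictions I mean (OOD pause, frozen weights, session resets between narrow subtasks); the scoring and model functions here are placeholder stubs I made up, not a real design from the linked posts:

```python
# Sketch of a "tool ASI" wrapper: pause on out-of-distribution input,
# run a frozen model (no online learning), and discard session state
# after each narrow subtask.

OOD_THRESHOLD = 0.9  # illustrative cutoff

class OutOfDistributionError(Exception):
    pass

def score_against_training_distribution(task_input: str) -> float:
    # Placeholder: a real system would estimate how close the input is
    # to the training distribution (density or embedding-distance style).
    return 1.0 if "service bay" in task_input else 0.0

def run_frozen_model(task_input: str) -> str:
    # Placeholder for inference with frozen weights (no online learning).
    return f"plan for: {task_input}"

def run_subtask(task_input: str) -> str:
    session_state = {}                          # fresh context per subtask
    familiarity = score_against_training_distribution(task_input)
    if familiarity < OOD_THRESHOLD:
        # Pause with an error rather than improvising on unfamiliar input.
        raise OutOfDistributionError("input too far from the training set")
    result = run_frozen_model(task_input)
    session_state.clear()                       # nothing persists between tasks
    return result

print(run_subtask("paint this car in the service bay"))   # narrow task: runs
# run_subtask("paint all the cars in the world")  # open-ended: raises an error
```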
People are going to build these in the immediate future just as soon as we find more effective algorithms/get enough training accelerators and money together. A scaled up, multimodal gpt-5 or gpt-6 that has robotics I/O is a tool ASI.
Anyone developing an ASI like this is doing it in the borders of a country with nukes or friends that have them. So USA, EU, Russia, China, Israel.
In most of the matchups, your red-button choice results in certain death for yourself and most of the population, because you would be firing on another nation with a nuclear arsenal. Or you can instead build your own tool ASIs so that you will not be completely helpless when your enemies get them.
Historically this choice has been considered. Obviously, during the Cuban Missile Crisis, Kennedy could have chosen nuclear war with the Soviet Union, leading to the immediate death of millions of Americans (from long-range bombers that snuck through), with the advantage of no Soviet Union as a future enemy with a nuclear arsenal. That’s essentially the choice you are advocating for.
Eventually one of these multiple parties will screw up and make a rampant one, and hopefully it won’t get far. But survival depends on you having a sufficient resource advantage that likely more cognitively efficient rampant systems can’t win. (They are more efficient because they retain context and adjust weights between tasks, and instead of subdividing a large task to many subtasks, a single system with full context awareness handles every step. In addition they may have undergone rounds of uncontrolled self improvement without human testing)
The refusal choice “I am not going to risk others” appears to have a low payoff.
Disagree: since building ASI results in dystopia even if I win in this scenario, the correct choice is to push the red button and ensure that no one has it. While I might die, this likely ensures that humanity survives.
The payoff in this case is maximal (unpleasant but realistic future for humanity) versus total loss (dystopia/extinction).
Many of the arguments here seem to come from a near-total terror of death, while game theory has always demonstrated against that: the reason deterrence works is the confidence that a “spiteful action” to equally destroy a defecting adversary is expected, even if it results in personal death.
In this case, one nation pursuing the extinction of humanity would necessarily expect to be sent into extinction so that at least it cannot benefit from defection.
We should work this out in outcome tables and really look at it. I’m open to either decision. I was simply pointing out that “nuke em to prevent a future threat of annihilation” was an option on the table for JFK, and we know it would have initially worked. The Soviet Union would have been wiped out, and the USA would have taken serious but probably survivable damage.
When I analyze it I note that it creates a scenario where every other nation on earth has the USA on the same planet as them, who has been weakened by the first round of strikes, and has very recently committed genocide. And is also probably low on missiles and other nuclear delivery vehicles.
It seems to create a strong incentive for others to build large nuclear arsenals, much larger than we saw in the ground truth timeline, to protect from this threat, and if the odds seem favorable, to attack the USA preemptively without warning.
Similarly, in your example, you push the button and the nation building ASI is wiped out. But the country you pushed the button from is wiped out as well, and you are personally dead: you do not see the results.
Well, now you’ve left two large, somewhat radioactive land masses and possibly created a global food shortage from some level of cooling.
Other ‘players’ surviving: I need some tool to protect ourselves from the next round of incoming nuclear weapons. But I don’t have the labor to build enough defensive weapons or bunkers. Also, occupying the newly available land inhabited only by poor survivors would be beneficial, but we don’t have the labor to cover all that territory. If only there were some means by which we could make robots smart enough to build more robots...
Tentative conclusion: the first round gets what you want, but removes the actor from any future actions and creates a strong incentive for the very thing you intended to prevent to happen. It’s a multi-round game.
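Since I suggested outcome tables above, here is the bare skeleton of one for the multi-round framing; every entry is a qualitative placeholder pulled from points already made in this thread, not a computed payoff:

```python
# Skeleton of an outcome table for the "push the red button" debate.
# Keys are (our strategy, what rivals do next round); values are
# qualitative placeholder descriptions, not computed payoffs.

outcomes = {
    ("strike first", "rivals rearm and retaliate"): "weakened, isolated, targeted",
    ("strike first", "rivals stay passive"):        "goal achieved, actor likely dead",
    ("build tool ASI", "rivals also build"):        "multipolar race, no one helpless",
    ("build tool ASI", "rivals strike first"):      "depends on defenses built in time",
    ("refuse both", "rivals build ASI anyway"):     "helpless against whoever defects",
}

for (ours, theirs), result in outcomes.items():
    print(f"{ours:>15} / {theirs:<30} -> {result}")
```

The table would obviously need agreed probabilities and payoffs to be more than a talking aid, but even the skeleton makes the multi-round structure visible.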
And nuclear weapons and (useful tool) ASI both make ‘players’ vastly stronger, so it is convergent over many possible timelines for people to get them.
In the event of such a war, there is no labor and there is no supply chain for microchips. The result has been demonstrated historically: technological reversion.
Technology isn’t magic: it’s the result of capital inputs and trade, and without large-scale interconnection it will be hard to make modern aircraft, let alone high-quality chips. In fact, we personally experienced this from the very minimal disruption COVID caused to supply chains. The killer app in this world would be the widespread use of animal power, not robots, due to overall lower energy provisions.
And since the likely result would be what I want, and since I’m dead I wouldn’t be bothered one way or another, there is even more reason for me to punish the defector. This also sets a precedent to others that this form of punishment is acceptable and increases the likelihood of it.
This is pretty simple game theory known as the grim game and is essential to a lot of life as a whole tbh.
Converging timelines is as irrelevant as a billion years. I(or someone like me) will do it as many times as needed, just like animals try to resist extinction via millions of “timelines” or lives.
I think you should reexamine what I said about convergence. Do you... really... think a world that knows how to build (safe, usable tool) ASI would ever be stable while not building it? We are very close to that world; the time is measured in years if not months. Note that any party that gets it working long enough escapes the grim game, they can do whatever they want limited by physics.
I acknowledge your point about chip production, although there are recent efforts to spread the supply chain for advanced ICs more broadly which will happen to make it more resilient to attacks.
Basically I mentally see a tree of timelines that all converge on two ultimate outcomes: human extinction or humans building ASI. Do you disagree, and why?
Humans building AGI/ASI likely leads to human extinction.
I disagree: we have many other routes of expansion, including biological improvement, cyborgism, etc. This seems akin to cultic thinking, and to the Spartan idea that “only hoplite warfare must be adopted or defeat ensues.”
The “limitations of physics” are quite extensive, and they apply even to the pipeline leading up to anything like ASI. I am quite confident that any genuine dedication to the grim game would be more than enough to prevent it, and defiance of it leads far more likely to nuclear winter worlds than to ASI dominance.
But I also disagree with your prior of “this world in months”; I suppose we will see in December.
I stated “years if not months”. I agree there is probably not yet enough compute even built to find a true ASI. I assume we will need to explore many cognitive architectures, which means repeating gpt-4 scale training runs thousands of times in order to learn what actually works.
“Months” would be if I am wrong and it’s just a bit of RL away
I am glad that we probably don’t have enough compute, and it is likely this will be restricted even at this fairly early level, long before more extreme measures are needed.
Additionally, I think one should support the Grim Trigger even if you want ASI, because it forces development along more “safe” lines to prevent being Grimmed. It also encourages non-ASI advancement as alternate routes, effectively being a form of regulation.
We will see. There is incredible economic pressure right now to build as much compute as physically possible. Without coordinated government action across all countries capable of building the hardware, this is the default outcome.
One bit of timeline arguing: I think the odds aren’t zero that we might be on a path that leads to AGI fairly quickly but then ends there and never pushes forward to ASI, not because ASI would be impossible in general, but because we couldn’t reach it this specific way. Our current paradigm isn’t to understand how intelligence works and build it intentionally; it’s to show a big dumb optimizer human-solved tasks and tell it “see? We want you to do that”. There are decent odds that this caps at human potential simply because it can imitate but not surpass its training data, and surpassing it would require a completely different approach.
Now that I think about it, I think this is basically the path that LLMs likely take, albeit I’d say it caps out a little lower than humans in general. And I give it over 50% probability.
The basic issue here is that the reasoning Transformers do is too inefficient for multi-step problems, and I expect a lot of real world applications of AI outperforming humans will require good multi-step reasoning.
The unexpected success of LLMs isn’t so much a story of AI progress as a story of how bad our reasoning often is in scenarios outside our ancestral environment, and of how much humans inflate their own strengths, like intelligence.
Assumptions:
A. It is possible to construct a benchmark to measure if a machine is a general ASI. This would be a very large number of tasks, many simulated though some may be robotic tasks in isolated labs. A general ASI benchmark would have to include tasks humans do not know how to do, but we know how to measure success.
B. We have enough computational resources to train from scratch many ASI level systems so that thousands of attempts are possible. Most attempts would reuse pretrained components in a different architecture.
C. We recursively task the best-performing AGIs, as measured by the above benchmark or one meant for weaker systems, with designing architectures that perform well on (A).
Currently the best we can do is use RL to design better neural networks, by finding better network architectures and activation functions. Swish was found this way; I am not sure how much transformer network design came from this type of recursion.
Main idea: the AGI systems exploring possible network architectures are cognitively able to take into account all published research and all past experimental runs, and the ones “in charge” are the ones who demonstrated the most measurable merit at designing prior AGI because they produced the highest-performing models on the benchmark.
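A minimal sketch of the recursive loop in (A)-(C), with stub functions standing in for the benchmark and the training runs; nothing here is a real pipeline, just the control flow:

```python
import random

# Sketch of benchmark-driven recursive architecture search: the best-scoring
# designer so far proposes candidate architectures, each candidate is trained
# and scored on the fixed benchmark, and the best candidate becomes the next
# designer. All functions are placeholder stubs.

def propose_architectures(designer: str, n_candidates: int) -> list[str]:
    # Placeholder: a real designer model would condition on all published
    # research and every previous experimental run.
    return [f"{designer}-variant-{i}" for i in range(n_candidates)]

def train_and_score_on_benchmark(architecture: str) -> float:
    # Placeholder for "train from scratch, then evaluate on the benchmark".
    return random.random()

designer = "seed-architecture"
best_score = train_and_score_on_benchmark(designer)

for generation in range(5):                     # bounded by available compute
    candidates = propose_architectures(designer, n_candidates=4)
    scored = [(train_and_score_on_benchmark(c), c) for c in candidates]
    top_score, top_candidate = max(scored)
    if top_score > best_score:                  # promote only measurable gains
        best_score, designer = top_score, top_candidate
    print(f"generation {generation}: designer = {designer}, score = {best_score:.3f}")
```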
I think if you think about it you’ll realize that if compute were limitless, this AGI-to-ASI transition you mention could happen instantly. A science fiction story would have it happen in hours. In reality, since training a subhuman system takes about 10k GPUs roughly 10 days, and an AGI will take more (Sam Altman has estimated the compute bill will be close to 100 billion), that’s the limiting factor. You might be right and we stay “stuck” at AGI for years until the resources to discover ASI become available.
I mean, this sounds like a brute force attack to the problem, something that ought not to be very efficient. If our AGI is roughly as smart as the 75th percentile of human engineers it might still just hit its head against a sufficiently hard problem, even in parallel, and especially if we give it the wrong prompt by assuming that the solution will be the extension of current approaches rather than a new one that requires to go back before you can go forward.
You’re correct. In the narrow domain of designing AI architectures you need the system to be at least 1.01 times as good as a human. You want more gain than that because there is a cost to running the system.
Getting gain seems to be trivially easy at least for the types of AI design tasks this has been tried on. Humans are bad at designing network architectures and activation functions.
I theorize that a machine could study the data flows from snapshots of an AI architecture attempting tasks on the AGI/ASI gym, and use that information, as well as all previous results, to design better architectures.
The last bit is where I expect enormous gain, because the training data set will exceed the amount of data humans can take in in a lifetime, and you would obviously have many smaller “training exercises” to design small systems to build up a general ability. (Enormous early gain. Eventually architectures are going to approach the limits allowed by the underlying compute and datasets)
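To illustrate the “enormous early gain, eventual saturation” claim, here is a toy model (arbitrary constants) where each design round multiplies capability by a gain factor that shrinks as capability approaches a ceiling set by the available compute and data:

```python
# Toy model: per-round design gain starts well above 1.01x but shrinks toward
# 1.0x as capability approaches a ceiling fixed by compute and datasets.
# All constants are arbitrary illustrations, not estimates.

ceiling = 1000.0        # capability limit allowed by the underlying hardware
capability = 1.0        # starting capability; a human designer ~= 1.0
base_gain = 1.5         # early per-round improvement factor

for round_number in range(1, 21):
    # Gain shrinks toward 1.0 as the remaining headroom is used up.
    headroom = 1.0 - capability / ceiling
    gain = 1.0 + (base_gain - 1.0) * headroom
    capability *= gain
    if round_number % 5 == 0:
        print(f"round {round_number:2d}: capability = {capability:8.1f}, gain = {gain:.3f}")
```

The early rounds compound at nearly the full base gain, then the curve flattens out against the ceiling, which is the shape I am gesturing at above.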