If I understand it correctly, Vitalik’s main argument for accelerating (in any form) is that human defense has always exceeded expectations. But this ignores the whole “with ASI, we (might) have only one try” argument. All the examples he names, like solving smog, acid rain or the ozone layer, were reactions to problems that had already existed for years. He even states it pretty directly: “version N of our civilization’s technology causes a problem, and version N+1 fixes it.” What if the problem caused by version N is already big enough to wipe out humanity fast?
The vitalik.ca page is down btw. Here is the link to the decentralized version.
https://vitalik.eth.limo/general/2023/11/27/techno_optimism.html
So you have made an assumption here.
AGI version N: produces m utility in the real world when faced with all the real-world noise and obstacles.
Weak ASI version N+1: produces f(s)*m utility in the real world. Here s is a term that represents scale times algorithmic gain.
Maximum-runtime ASI version N+1: produces f(s)*m utility in the real world, same formula but with a much larger s.
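To pin that framing down, here is a minimal sketch of the utility model as I read it; the function name, the placeholder scaling law and all the numbers are my own assumptions, not anything from the post:

    # Toy encoding of the framing above (purely illustrative, my own naming).
    # m : real-world utility of the baseline "tool" AGI (version N).
    # s : effective scale = compute multiplier * algorithmic gain.
    # f : the unknown scaling law; the whole disagreement is about its shape.
    def utility(m, compute_multiplier, algorithmic_gain, f):
        s = compute_multiplier * algorithmic_gain
        return f(s) * m

    # Same formula for the weak and the maximum-scale ASI, only s differs.
    # The square-root law here is an arbitrary placeholder.
    weak_asi = utility(1.0, 10, 2, lambda s: s ** 0.5)      # ~4.5
    max_asi  = utility(1.0, 1000, 2, lambda s: s ** 0.5)    # ~44.7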
The doom concern is the thought that giving a machine the maximum amount of compute humans are able to practically supply (note that any given architecture saturates on interconnect bandwidth; you cannot simply rack current-gen GPUs without bound) will result in an ASI with so much real-world utility that it’s unstoppable.
And that the ASI can optimize itself so it fits on far more computers than just the multi-billion-dollar cluster it was developed on.
If scaling is logarithmic, that means f(s) = log(s). In that case, other human actors with their weaker but stable “tool” AGIs would be able to fight back effectively in a world with some amount of escaped or hostile superintelligence. Assuming the human actors (mostly militaries) have a large resource advantage, they would win the campaign.
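Rough numbers for the logarithmic case; the 100x scale gap and every other figure here are invented just to show the shape of the argument:

    import math

    m = 1.0                        # baseline AGI utility (arbitrary units)
    s_escaped  = 1000 * 2          # escaped ASI: huge compute * algorithmic gain
    s_defender = 10 * 2            # defenders' stable "tool" AGI: far less of both

    u_escaped  = math.log(s_escaped)  * m    # ~7.6
    u_defender = math.log(s_defender) * m    # ~3.0

    # With f(s) = log(s), a 100x scale advantage buys only a ~2.5x utility
    # edge, which a large resource advantage could plausibly outweigh.
    print(u_escaped / u_defender)            # ~2.54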
I think doomers like Yudkowsky assume it’s not logarithmic, while Geohot, Vitalik and others assume some kind of sharply diminishing returns.
Diminishing returns means you just revert to the last stable version and use that, or patch your ASI’s container and use it to fight the one that just escaped. Your “last stable version” or your “containerized” ASI is weaker in utility than the one that escaped. But assuming you control most of the compute and most of the weapons, you can compensate for a utility gap. This would be an example of version N±1 of a technology saving you from the bad one.
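A crude way to picture that “resource advantage compensates for the utility gap” claim; every number here is invented for illustration only:

    # Crude toy: effective power = per-unit utility * resources controlled.
    escaped_asi_utility   = 2.5    # smarter per unit of resources
    escaped_asi_resources = 1.0    # little compute/weaponry under its control

    defender_utility   = 1.0       # last stable / containerized ASI, weaker
    defender_resources = 10.0      # most of the compute and most of the weapons

    escaped_power  = escaped_asi_utility * escaped_asi_resources    # 2.5
    defender_power = defender_utility * defender_resources          # 10.0

    print(defender_power > escaped_power)    # True: the gap is compensated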
As far as I know, the current empirical data shows diminishing returns for current algorithms. This doesn’t prove another algorithm isn’t possible, and obviously for specific sub-problems like context length there are a dozen papers offering methods that scale better than quadratically.
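For the context-length example, what those papers are chasing is just the gap between quadratic and roughly linear (or n*log n) cost in sequence length; a back-of-the-envelope comparison, with constant factors and the hidden dimension ignored:

    import math

    # Vanilla self-attention grows ~n^2 in sequence length; the "efficient
    # attention" papers aim for roughly ~n or ~n*log(n) instead.
    for n in (4_000, 32_000, 1_000_000):
        ratio = (n * n) / (n * math.log2(n))
        print(f"n={n:>9,}: quadratic costs ~{ratio:,.0f}x the n*log(n) version")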