I am not an expert in either blockchain technology or AI, so I am putting this question to the community, as I imagine it has already been explored at length.
My understanding is that AI misalignment is considered an existential risk because of the possibility of an intelligence explosion and related risks such as paperclip maximization (among others). This appears to cause a substantial amount of fear in some people. However, my initial reaction is more sanguine, as it seems to me that we may have tools to avoid the underlying danger, namely the concentration of power in something far more capable than us.
Specifically, two features of blockchain technologies that would appear to reduce the power of a hostile AI are a) transaction costs and b) decentralization. Say the world interfaced using blockchain technology. Then the transaction costs of taking a series of adversarial actions would either bankrupt the hostile AI or stall it long enough to allow people to fork and abandon it. But the greater point may be decentralization. If the incentives are set up to minimize collusion, and constraints on concentration are imposed, wouldn’t that weaken the power that any given AI could accumulate? That is, if we set up the system such that no one party can dominate, then even a superintelligent AI could not dominate.
That is not the cleanest way to express my intuition, so maybe this is a better way: my understanding is that crypto is secured not by trust, guns, or rules, but by fundamental computational limits (e.g., the difficulty of factoring large numbers or of inverting a cryptographic hash). Those fundamental limits imposed by nature would apply to an AI as well. By connecting money and power to the fundamental limits imposed by nature / math, the ability of any one actor (including an AI) to gain arbitrary power without the consent of everyone else would be limited, as even arbitrary intelligence can’t do the physically impossible. That is, by imposing a constraint on the ability to accumulate power in general, that constraint is also imposed on AI.
Where does this logic fail me? I know the logic I presented here is not at all airtight: for example, I could imagine some AI exploiting some incentive structure, some weakness in some chain, or using collusion to make my point moot. These topics are deep, and I know I’m missing a lot of the nuance here. Still, it seems that many people have thought deeply about this subject, so I imagine there is an answer (likely a negative one, if my understanding of others’ sentiment is correct), and I was hoping this community could help me learn. Nonetheless, I am hopeful that my intuition is related to some sort of optimistic vision of the future, and I would welcome any readings linking these two subjects.
One thing you’re missing: cryptography is secured by, say, the difficulty of factoring large numbers, but cryptocurrencies are secured by no one being able to take over a majority (the so-called 51%) of the network’s processing power (in proof-of-work currencies, like Bitcoin). They’re kind of the same technology, but cryptocurrencies are relying on a second-order effect.
Put another way: this “hardness” isn’t a fundamental computational limit, but more of an enforced limitation because other parties want to verify your computation.
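To make the "enforced limitation" point concrete, here is a minimal toy sketch of proof-of-work in Python. It is not Bitcoin's actual block format or difficulty rule; the function names, the 8-byte nonce encoding, and the `difficulty_bits` parameter are all assumptions made for illustration. The thing to notice is that the cost of finding a nonce is set by a parameter the network picks, while checking the answer takes a single hash.

```python
import hashlib


def verify(data: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Check a claimed solution with one SHA-256 hash.

    The digest, read as a 256-bit integer, must fall below a target that
    shrinks as difficulty_bits grows (hypothetical toy rule).
    """
    digest = hashlib.sha256(data + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))


def mine(data: bytes, difficulty_bits: int) -> tuple[int, int]:
    """Brute-force the smallest valid nonce.

    Returns (nonce, number of hashes tried). On average this takes about
    2 ** difficulty_bits attempts, so the work is tunable, not a law of nature.
    """
    nonce = 0
    while not verify(data, nonce, difficulty_bits):
        nonce += 1
    return nonce, nonce + 1


if __name__ == "__main__":
    for bits in (8, 12, 16):
        nonce, attempts = mine(b"toy block", bits)
        print(f"{bits} difficulty bits: nonce={nonce}, hashes tried={attempts}")
```

Anyone with more hashing hardware simply finds nonces faster, which is why the security argument ends up resting on no single party controlling most of the hash power rather than on the hash function alone.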
Also, one hole in your idea: the actual machine you compute on wouldn’t matter to an AI, so it could easily jump from, say, the Ethereum VM to an AWS instance that runs the same code (several orders of magnitude faster). The AI only needs to hit the point where it can buy that AWS instance, so I think you’re only providing a temporary barrier.
I agree that’s a risk for proof-of-work chains like Bitcoin, but say Ethereum completes its move to proof-of-stake. Then the AI would need to own 51% of Ethereum, and if the stakers don’t want to sell, it seems like the AI is stuck. Sure, a 51% threshold is not as good as requiring 100%, and a single veto point is not enough, but it is half as good and would seem to buy you time (as I would imagine people would become suspicious of any entity that accumulated too much Ethereum).
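As a rough illustration of the "stuck if stakers don't sell" intuition, here is a small Python sketch. It assumes a fixed total stake with no newly issued or newly staked coins, and it uses the simple 51% threshold from this comment; real proof-of-stake protocols may use different thresholds. The holder names and amounts are made up.

```python
def max_attainable_share(holdings: dict[str, float], unwilling: set[str]) -> float:
    """Largest fraction of total stake a buyer could end up with.

    Assumes (hypothetically) that the buyer can purchase everything held by
    willing sellers and nothing held by the holders listed in `unwilling`.
    """
    total = sum(holdings.values())
    if total <= 0:
        raise ValueError("total stake must be positive")
    available = sum(
        amount for holder, amount in holdings.items() if holder not in unwilling
    )
    return available / total


if __name__ == "__main__":
    # Made-up stake distribution, in arbitrary units.
    holdings = {"alice": 30.0, "bob": 25.0, "carol": 25.0, "dave": 20.0}
    share = max_attainable_share(holdings, unwilling={"alice", "bob"})
    print(f"attainable share: {share:.2f}, majority reached: {share > 0.5}")
```

Under these assumptions the barrier holds only as long as holders of more than half the stake refuse to sell, which is the part of the argument the replies push on.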
While there are hard physical limits on computation (or at least there seem to be, based on our current knowledge of physics), cryptographic systems are not generally based on those limits, and are not known to be difficult to break. It’s just that we haven’t discovered an easy way to break them yet—except for all the cryptosystems where we have discovered a way, and so we don’t use those systems anymore. This should not inspire too much confidence in the currently used systems, especially against a superhuman adversary.
As long as the AI has something of value to offer, people will have an incentive to trade with it. Even if the increments are small, it could gain control of lots of resources over time. By analogy, it’s not hard to find people who disapprove of how Jeff Bezos spends his money, but who still shop on Amazon.
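A back-of-the-envelope way to see how small increments add up is the following Python sketch. It assumes a constant net growth rate per period (trade income minus any transaction fees), which is a simplification, and the numbers are purely illustrative.

```python
def periods_to_reach(start: float, target: float, net_growth: float) -> int:
    """Count periods until `start` grows to at least `target`.

    `net_growth` is a hypothetical constant rate per period, e.g. 0.01 for
    a 1% gain after fees. It must be positive or the target is never reached.
    """
    if start <= 0 or net_growth <= 0:
        raise ValueError("start and net_growth must be positive")
    balance = start
    periods = 0
    while balance < target:
        balance *= 1 + net_growth
        periods += 1
    return periods


if __name__ == "__main__":
    # Growing resources a millionfold at a 1% net gain per period.
    print(periods_to_reach(start=1.0, target=1_000_000.0, net_growth=0.01))
```

With a 1% net gain per period, a millionfold increase takes on the order of 1,400 periods, so fees slow accumulation but do not cap it as long as the net rate stays positive.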
One typical concern around building friendly AI that is slower or less effective than unfriendly AI is that a smarter unfriendly AI will be able to win against any non-smart friendly AI. Your proposal doesn’t seem to address this problem.
It’s also not clear how your proposal would prevent an AI running on a blockchain from using its (limited due to blockchain stuff?) power to create a copy of itself running on more traditional computing hardware.
My thought is that even if an AI could create a copy of itself on more traditional hardware, it would nonetheless have little ability to do anything, because everything else would be on the blockchain. If I had an idle computer disconnected from the internet today, even if it were hyperintelligent, what could it do?
It could convince you to connect it to the Internet.
Though this is already a false dichotomy. The negation of “on the blockchain” is not “disconnected from the internet”. Almost all traditional hardware is connected to the internet.
> The negation of “on the blockchain” is not “disconnected from the internet”. Almost all traditional hardware is connected to the internet.
Of course that’s the case today! I’m speaking of a hypothetical future where the entire internet interfaces using blockchain technology.
Do you mean some sort of layer inversion where the only way to send any sort of data packet to some other machine is to … use a blockchain, which relies on the ability to send packets to other machines? I don’t get how this works.