a position of no power and moderate intelligence (where it is now)
Most people are quite happy to give current AIs relatively unrestricted access to sensitive data, APIs, and other powerful levers for effecting far-reaching change in the world. So far, this has actually worked out totally fine! But that’s mostly because the AIs aren’t (yet) smart enough to make effective use of those levers (for good or ill), let alone be deceptive about it.
To the degree that people don’t trust AIs with access to even more powerful levers, it’s usually because they fear the AI getting tricked by adversarial humans into misusing those levers (e.g. through prompt injection), not because they fear that the AI itself will be deliberately tricky.
But we’re not going to deliberately allow such a position unless we can trust it.
One can hope, sure. But what I actually expect is that people will generally give AIs more power and trust as they get more capable, not less.