wedrifid comments on Reply to Holden on ‘Tool AI’

wedrifid 5 Jul 2012 18:13 UTC
3 points
0

If we were smart enough to understand its policy, then it would not be smart enough to be dangerous.

That doesn’t seem true. Simple policies can be dangerous and more powerful than I am.
- Nebu 17 Feb 2016 11:24 UTC
  0 points
  0
  Parent
  To steelman the parent argument a bit, a simple policy can be dangerous, but if an agent proposed a simple and dangerous policy to us, we probably would not implement it (since we could see that it was dangerous), and thus the agent itself would not be dangerous to us.
  
  If the agent were to propose a policy that, as far as we could tell, appears safe, but was in fact dangerous, then simultaneously:
  1. We didn’t understand the policy.
  2. The agent was dangerous to us.