Stuart Armstrong argued that Hanson’s critique doesn’t work because PAL assumes contract enforceability, and with advanced AI, institutions might not be up to the task. Indeed, contract enforceability is assumed in most of PAL, so it’s an important consideration regarding its applicability to AI scenarios more broadly.
This seems kind of off to me. When I think about using the analysis of contracts between humans and AIs, I’m not imagining legal contracts: I’m using it as a metaphor for the human getting to directly set what ‘the AI’* is motivated to do. As such, the contract really is strictly enforced, because the ‘contract’ is the motivational system of ‘the AI’, which there’s reason to think ‘the AI’ is motivated to preserve and capable of preserving, a la Omohundro’s basic AI drives.
Now, I think there are two issues with this:
Agents are sometimes incentivised to change their preferences as a result of bargaining (a toy payoff sketch after the second point illustrates this). Think: “In order for us to work together on this project, which will deliver you bountiful rewards, I need you to stop being motivated to steal my lightly-guarded belongings, because I’m just not good enough at security to disincentivise you from stealing them myself.”
More generally, we can think of the ‘contract’ as the program that constitutes ‘the AI’, which might be so complicated that humans don’t understand it, and that might include a planning routine. In this case, ‘the AI’ might be motivated to modify the ‘contract’ to make ‘itself’ smarter.
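To make the first point a bit more concrete, here’s a toy payoff sketch (my own made-up numbers, nothing formal): if the agent keeps its preference for stealing, the would-be partner walks away and the agent only gets the loot; if it can credibly rewrite that preference, cooperation happens and it does better.

```python
# Toy illustration (hypothetical numbers) of an agent gaining from changing
# its own preferences before a deal: with 'steal' still in its motivational
# system, the partner refuses to cooperate and the agent ends up worse off.

STEAL_GAIN = 3        # what the agent grabs if it keeps its taste for stealing
PROJECT_SHARE = 10    # the agent's share if the joint project goes ahead

def agent_payoff(keeps_stealing_preference: bool) -> int:
    if keeps_stealing_preference:
        # Partner anticipates theft and walks away; agent only gets the loot.
        return STEAL_GAIN
    # Preference credibly removed (the 'contract' rewritten), so the partner
    # cooperates and the agent collects its project share.
    return PROJECT_SHARE

print(agent_payoff(True))   # 3  -> keeping the preference is self-defeating
print(agent_payoff(False))  # 10 -> modifying its own motivation pays off
```

The point is just that the modification is in the agent’s own interest ex ante, which is exactly the bargaining pressure described in the first point.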
But at any rate, I think the contract enforceability problem isn’t a knock-down against the PAL being relevant.
[*] scare quotes to be a little more accurate and to placate my simulated Eric Drexler
I think it’s worth distinguishing between a legal contract and setting the AI’s motivational system, even though the latter is a contract in some sense. My reading of Stuart’s post was that it was intended literally, not as a metaphor. Regardless, both are relevant; in PAL, you’d model the motivational system via the agent’s utility function, and contract enforceability as a background assumption of the model.
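To illustrate what I mean, here’s a minimal toy sketch of a PAL-style model (all numbers and names are my own illustrative choices, not anything from the post or the literature): the agent’s utility function stands in for its motivational system, and the `enforceable` flag marks where the background assumption about enforceability enters.

```python
# Toy principal-agent sketch (illustrative numbers, not the post's model).
# The agent's 'motivational system' is its utility: wage - cost(effort).
# Contract enforceability is the background assumption toggled below.

EFFORTS = [0.0, 0.5, 1.0]       # effort levels the principal can demand
WAGES = [0.0, 0.25, 0.5, 1.0]   # wages the principal can promise
OUTSIDE_OPTION = 0.0            # agent's reservation utility

def cost(effort):
    return effort ** 2          # agent's cost of supplying effort

def output(effort):
    return 2.0 * effort         # value the effort produces for the principal

def agent_effort(e_req, wage, enforceable):
    """Agent's best response to a contract (required effort, wage)."""
    if not enforceable:
        return 0.0              # wage arrives regardless, so the agent shirks
    # Under enforcement the wage is only paid if the agent complies.
    return e_req if wage - cost(e_req) >= 0.0 else 0.0

def principal_best_contract(enforceable=True):
    best = None
    for e_req in EFFORTS:
        for wage in WAGES:
            e = agent_effort(e_req, wage, enforceable)
            paid = wage if (not enforceable or e >= e_req) else 0.0
            if paid - cost(e) < OUTSIDE_OPTION:
                continue        # participation constraint violated
            profit = output(e) - paid
            if best is None or profit > best[0]:
                best = (profit, e_req, wage)
    return best

print(principal_best_contract(True))   # -> (1.0, 1.0, 1.0): demand full effort, pay for it
print(principal_best_contract(False))  # -> (0.0, 0.0, 0.0): without enforcement, effort and pay collapse
```

The toggle is only there to show where enforceability enters: drop it and the optimal contract collapses to no effort and no payment, whatever utility function you write down for the agent.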
But I agree that contract enforceability isn’t a knock-down, and indeed won’t be an issue by default. I think we should have framed this more clearly in the post. Here’s the most important part of what we said:
But it is plausible for when AIs are similarly smart to humans, and in scenarios where powerful AIs are used to enforce contracts. Furthermore, if we cannot enforce contracts with AIs then people will promptly realise and stop using AIs; so we should expect contracts to be enforceable conditional upon AIs being used.
I think it’s worth distinguishing between a legal contract and setting the AI’s motivational system, even though the latter is a contract in some sense.
To restate/clarify my above comment, I agree, but think that we are likely to delegate tasks to AIs by setting their motivational system and not by drafting literal legal contracts with them. So the PAL is relevant to the extent that it works as a metaphor for setting an AI’s motivational system and source code; in this context contract enforceability isn’t an issue, and Stuart is mistaken to be thinking about literal legal contracts (assuming that he is doing so).
Thanks for clarifying. That’s interesting and seems right if you think we won’t draft legal contracts with AI. Could you elaborate on why you think that?
Well, because I think they wouldn’t be enforceable in the really bad cases the contracts would be trying to prevent :) Also, by default people currently delegate tasks to computers by writing software, which I expect to continue in the future (although I guess smart contracts are an interesting edge case here).