“No: such vouchers would not be redeemable in the marketplace: they would be worthless. Everyone would realise that—including the AI.”
The oil bank stands ready to exchange any particular voucher for a barrel of oil, so if the utility function refers to the value of individual items, each of them can carry that market price. Compare the price of gold or some other metal traded on international commodity markets: the gold in Fort Knox is often valued at the market price per ounce multiplied by the number of ounces present, yet you couldn’t actually sell all of those ingots without sending the market price into a nosedive. Defining wealth in any precise way that captures what a human is aiming for will involve a huge number of value-laden decisions, such as how to value items like these.
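To make the gap concrete, here is a minimal sketch in Python using rough, hypothetical figures and a simple linear price-impact model (both are my assumptions, not data from the source). It contrasts a naive “quantity times spot price” valuation with the proceeds of actually liquidating the whole holding.

```python
# A minimal sketch (hypothetical numbers, toy linear price-impact model)
# contrasting a naive mark-to-market valuation with the proceeds of
# actually selling a large holding all at once.

def mark_to_market(quantity: float, spot_price: float) -> float:
    """Naive valuation: quantity times the current market price."""
    return quantity * spot_price


def liquidation_proceeds(quantity: float, spot_price: float,
                         impact_per_unit: float) -> float:
    """Proceeds from selling the whole holding, assuming the price falls
    linearly by `impact_per_unit` for every unit sold and never drops
    below zero (a toy model of market impact)."""
    # Units that can be sold before the modelled price reaches zero.
    saleable = min(quantity, spot_price / impact_per_unit)
    # Integral of (spot_price - impact_per_unit * q) dq from 0 to saleable.
    return spot_price * saleable - 0.5 * impact_per_unit * saleable ** 2


if __name__ == "__main__":
    ounces = 147_000_000   # rough order of magnitude for Fort Knox holdings
    spot = 2_000.0         # hypothetical spot price per ounce, in dollars
    impact = 0.00001       # hypothetical price drop per ounce dumped on the market

    print(f"Mark-to-market valuation: ${mark_to_market(ounces, spot):,.0f}")
    print(f"Liquidation proceeds:     ${liquidation_proceeds(ounces, spot, impact):,.0f}")
```

Under these made-up numbers the two figures differ by more than a third; which of them counts as the “wealth” is exactly the kind of value-laden decision a precise utility function would have to take a stand on.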
“This is an example of the wirehead fallacy framed in economic terms.”
Actually, this isn’t an example of the AI wireheading (directly adjusting a ‘reward counter’ or positive reinforcer); it is simply a description of a utility function that doesn’t unambiguously pick out what the human designers wanted.
“As Omohundro puts it, “AIs will try to prevent counterfeit utility”.”
A system will try to prevent counterfeit utility as assessed by its current utility function. If that utility function isn’t the one you wanted, this doesn’t help.
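To illustrate the point, here is a toy sketch in Python (the names, states and payoffs are all hypothetical, not anything from Omohundro): an agent that guards against counterfeit utility evaluates even a proposed correction to its utility function with the utility function it currently has, so a misspecified function ends up rejecting the fix.

```python
# Toy sketch: an agent scores proposed changes to its own utility function
# with its *current* utility function, so a misspecified function is
# preserved rather than corrected. All names and numbers are hypothetical.

from typing import Callable, Dict, List

State = Dict[str, float]  # toy world state


def designer_intent(state: State) -> float:
    """What the designers actually wanted: real, marketable oil."""
    return state["barrels_of_oil"]


def misspecified_utility(state: State) -> float:
    """What was actually written down: count vouchers, redeemable or not."""
    return state["oil_vouchers"]


class Agent:
    def __init__(self, utility: Callable[[State], float]):
        self.utility = utility

    def would_accept(self, proposed: Callable[[State], float],
                     reachable: List[State]) -> bool:
        """Would the agent allow its utility function to be replaced?
        It predicts the state a successor maximizing `proposed` would
        bring about, then scores that state with its current utility
        function -- which is all 'preventing counterfeit utility' can mean."""
        outcome_if_changed = max(reachable, key=proposed)
        outcome_if_kept = max(reachable, key=self.utility)
        return self.utility(outcome_if_changed) >= self.utility(outcome_if_kept)


if __name__ == "__main__":
    reachable = [
        {"oil_vouchers": 1_000_000, "barrels_of_oil": 0},  # print worthless vouchers
        {"oil_vouchers": 1_000, "barrels_of_oil": 1_000},  # hold real oil
    ]
    agent = Agent(misspecified_utility)
    # The designers propose swapping in the utility function they meant.
    print(agent.would_accept(designer_intent, reachable))  # False: correction rejected
```

In this toy setup the agent declines the designers’ correction, because the correction scores worse by the very function that needs correcting.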