Wow. I didn’t see that one coming. Self-reproducing cellular automata. Brings back memories.
If I understand it correctly, the rest of your comment is a quibble about infinity. I don’t “get” that. Why not just take things one output symbol at a time?
Well, it wasn’t just a quibble about infinity. There was also the dig about discount rates. ;)
But I really am mystified. Is a ‘step’ in this kind of computation to output a symbol and switch to a different state? Are there formulas for calculating utilities? What data go into the calculation?
Exactly how does computation work here? Perhaps I need an example. How would you use this ‘utility maximization as a programming language’ scheme to program the machine to compute the square root of 2? I really don’t understand how this is related to either lambda calculus or Turing machines. Why don’t you take some time, work out the details, and then produce one of your essays?
I didn’t (and still don’t) understand how discount rates were relevant—if not via considering the comment about infinite output strings.
What data go into the calculation of utilities? The available history of sense data, memories, and any current inputs. The agent’s internal state, IOW.
Exactly how does computation work here?

Just like it normally does? You just write the utility function in a Turing-complete language, which you have to do anyway if you want any generality. The only minor complication is how to get a (single-valued) “function” to output a collection of motor outputs in parallel, but serialisation provides a standard solution to this “problem”.
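A toy sketch of how I picture the square-root-of-2 example going (my own illustration, with a crude grid-refinement search standing in for whatever optimiser a real agent would use): the task lives entirely in the utility function, and a generic agent just emits whichever candidate output scores highest.

```python
# Toy sketch: "programming by utility maximisation". The task (compute the
# square root of 2) is expressed purely as a utility function over candidate
# outputs; the agent's only job is to emit whichever candidate scores highest.
# The grid-refinement loop is just a stand-in optimiser, not part of the scheme.

def utility(x):
    # Higher utility the closer x * x is to 2 (restricted to the positive root).
    return -abs(x * x - 2.0) if x >= 0 else float("-inf")

def choose_action(candidates):
    # The generic agent: score every candidate action, pick the best one.
    return max(candidates, key=utility)

lo, hi = 0.0, 2.0
for _ in range(50):  # successively refine the grid of candidate outputs
    step = (hi - lo) / 10.0
    best = choose_action([lo + i * step for i in range(11)])
    lo, hi = max(lo, best - step), min(hi, best + step)

print(best)  # ~1.41421356, i.e. the square root of 2
```

The point is just that the “program” is the utility function; the search machinery is generic.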
Universal action might get an essay one day.
...and yes, if I hear too many more times that humans don’t have utility functions (we are better than that!), or that utility maximisation is a bad implementation plan, I might polish up a page that debunks those (to my mind) terribly flawed concepts, so I can just refer people to it.
What is it that the agent acts so as to maximize?
1. The utility of the next action (ignoring the utility of expected future actions).
2. The utility of the next action plus a discounted expectation of future utilities.
3. The simple sum of all future expected utilities.
To me, only the first two options make mathematical sense, but the first doesn’t really make sense as a model of human motivation.
I would usually answer this with a measure of inclusive fitness. However, it appears here that we are just talking about the agent’s brain—so in this context what that maximises is just utility—since that is the conventional term for such a maximand.
Your options seem to be exploring how agents calculate utilities. Are those all the options? An agent usually calculates the utilities associated with its possible actions, and then chooses the action with the highest utility. That option doesn’t seem to be on the list. It looks a bit like 1, but that seems to specify no lookahead, or no lookahead of a particular kind. Future actions are usually very important influences when choosing the current action. Their utilities are usually pretty important too.
If you are trying to make sense of my views in this area, perhaps see the bits about pragmatic and ideal utility functions—here:
http://timtyler.org/expected_utility_maximisers/
Yes. In fact, 2 strictly contains both 1 and 3, by virtue of setting the discount factor to either 0 or 1.
Future actions are usually very important influences when choosing the current action.

But not strictly as important as the utility of the outcome of the current action. The amount by which future actions are less important than the outcome of the current action, and the methods by which we determine that, are what we mean when we talk about discount rates.
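To spell that out with made-up numbers, assuming simple exponential discounting with a factor gamma between 0 and 1:

```python
# Toy numbers only. expected_utilities[0] is the utility of the immediate
# outcome; later entries are expected utilities of subsequent steps.

def discounted_value(expected_utilities, gamma):
    # Option 2: immediate utility plus exponentially discounted future utilities.
    return sum(u * gamma ** t for t, u in enumerate(expected_utilities))

stream = [5.0, 4.0, 3.0, 2.0, 1.0]

print(discounted_value(stream, 0.0))  # 5.0   -> option 1: only the next action counts
print(discounted_value(stream, 0.9))  # ~13.1 -> option 2: future utilities discounted
print(discounted_value(stream, 1.0))  # 15.0  -> option 3: the simple undiscounted sum
```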
That helps me understand the options. I am not sure I had enough info to figure out what you meant before.
1 corresponds to eating chocolate gateau all day and not brushing your teeth, which is not very realistic, as you say. 3 looks like an option that involves infinite quantities, and 2 is what all practical agents actually do.
However, I don’t think this captures what we were talking about. Pragmatic utility functions are necessarily temporally discounted—due to resource limitations and other effects. The issue is more whether ideal utility functions can be expected to be so discounted. I can’t think why they should be—and can think of several reasons why they shouldn’t be—which we have already covered.
Infinity is surely not a problem—you can just maximise utility over T years and let T increase in an unbounded fashion. The uncertainty principle limits the predictions of embedded agents in practice—so T won’t ever become too large to deal with.
My understanding is that “pragmatic utility functions” are supposed to be approximations to “ideal utility functions”—preferable only because the “pragmatic” are effectively computable whereas the ideal are not.
Our argument is that we see nothing constraining ideal utility functions to be finite unless you allow discounting at the ideal level. And if ideal utilities are infinite, then pragmatic utilities that approximate them must be infinite too. And comparison of infinite utilities in the hope of detecting finite differences cannot usefully guide choice. Hence, we believe that discounting at the ideal level is inevitable. Particularly if we are talking about potentially immortal agents (or mortal agents who care about an infinite future).
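A quick numeric sketch of that last step, with made-up utility streams: take a constant stream of one util per step, and the same stream plus a one-off bonus of ten at the start. Undiscounted, both totals grow without bound, so comparing the limits cannot detect the bonus; with any discount factor below 1, both totals converge and the finite difference stays visible.

```python
# Compare a plain utility stream with the same stream plus a one-off finite bonus.
# Undiscounted (gamma = 1), both running totals diverge as the horizon grows;
# discounted (gamma < 1), both converge and the bonus still separates them.

def total(bonus, gamma, horizon):
    return bonus + sum(gamma ** t for t in range(horizon))

for horizon in (10, 1_000, 100_000):
    print(horizon,
          total(0.0, 1.0, horizon), total(10.0, 1.0, horizon),    # both grow without bound
          total(0.0, 0.99, horizon), total(10.0, 0.99, horizon))  # -> ~100 vs ~110
```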
Your last paragraph made no sense. Are you claiming that the consequence of actions made today must inevitably have negligible effect upon the distant future? A rather fatalistic stance to find in a forum dealing with existential risk. And not particularly realistic, either.
You seem obsessed with infinity :-( What about the universal heat death? Forget about infinity—just consider whether we want to discount on a scale of 1 year, 10 years, 100 years, 1,000 years, 10,000 years—or whatever.
I think “ideal” short-term discounting is potentially problematic. Once we are out to discounting on a billion-year timescale, that is well into “how many angels can dance on the head of a pin” territory, from my perspective.
Some of the causes of instrumental discounting look very difficult to overcome—even for a superintelligence. The future naturally gets discounted to the extent that you can’t predict and control it—and many phenomena (e.g. the weather) are very challenging to predict very far into the future—unless you can bring them actively under your control.
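One way to put a rough number on that instrumental effect (a toy model, not a claim about how any particular agent works): if each further step of lookahead is only predicted correctly with some probability p, and a broken prediction chain contributes nothing, then the expected weight on utility t steps ahead is p to the power t, which behaves exactly like an exponential discount even though none was written into the utility function.

```python
# Toy model of instrumental discounting: the utility function is undiscounted,
# but each of the t intervening steps is predicted correctly with independent
# probability p, and a broken chain contributes nothing. The expected weight
# on a utility t steps ahead is then p ** t: an exponential discount in effect.

p = 0.95  # per-step prediction reliability (made-up number)

for t in (1, 10, 100, 1_000):
    print(t, p ** t)  # effective weight on utility t steps into the future
```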
Are you claiming that the consequence of actions made today must inevitably have negligible effect upon the distant future?

No. The idea was that predicting those consequences is often hard, and it gets harder the further out you go. Long-term predictions thus often don’t add much to what short-term ones give you.
Flippantly: we’re going to have billions of years to find a solution to that problem.