This does look like a fruitful place to investigate, but one of the main problems with demonstrating superiority is that the two systems can emulate each other pretty well. Claims of superiority typically take the form of “X seems more intuitive” or “I can encode X in less space using this structure” rather than “X comes to a different, better conclusion.” For example:
If you have a bounded utility function, you may consider a universe with (say) 10^18 tortured people to be just as bad as a universe with any higher number of tortured people.
You can have a utility function that asymptotically approaches its bound, which “mostly” solves this problem, or at least solves it about as well as a controller would.
For example, suppose my utility as a function of the number of people alive is the logistic function (with x0 set to, say, 1,000 or 1,000,000). Then I prefer a world where X1 people are alive to a world where X2 people are alive iff X1 > X2, but the utility is bounded above by 1 and has nice global properties.
Basically, it smooths together the “I would like more people to be alive” desire and the “I would like humanity to continue” desire in a continuous fashion, such that a 50-50 flip that doubles the human population (and wealth and so on) on heads and eliminates humanity on tails looks like a terrible idea (despite being neutral if your utility function is linear in the number of humans alive). I’m not sure that the logistic has the local behavior that we would want at any particular population size, but something like it probably does.
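A minimal sketch of that comparison (the steepness parameter k, the x0 of 1,000, and the current-population figure are illustrative choices of mine, not anything specified above):

```python
import math

def logistic_utility(n_alive, x0=1_000, k=0.001):
    """Bounded utility in (0, 1), strictly increasing in the number of people alive."""
    # Numerically stable evaluation of 1 / (1 + exp(-k * (n_alive - x0))).
    z = k * (n_alive - x0)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    return math.exp(z) / (1.0 + math.exp(z))

def linear_utility(n_alive):
    return float(n_alive)

current = 10_000  # made-up current population

for utility in (logistic_utility, linear_utility):
    keep = utility(current)                               # refuse the flip
    flip = 0.5 * utility(2 * current) + 0.5 * utility(0)  # 50-50 double-or-extinction
    print(f"{utility.__name__}: keep={keep:.4f}, flip={flip:.4f}")

# With the logistic utility, refusing the flip (~0.9999) beats taking it (~0.63);
# with the linear utility both options come out to 10000.0, i.e. the flip is neutral.
```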
The solution a controller typically applies to this goes by “upper bound on control effort.” That is, the error can be arbitrarily large, but at some point you simply don’t have any more ability to adjust the system, and so having 1e18 more people tortured than you want is “just as bad” as having 1e6 more people tortured than you want, because both situations are bad enough that you employ your maximal effort trying to reduce the number. One thing about this approach is that the bound is determined by your ability to affect the world rather than your capacity to care, but it’s not clear to me whether that actually makes much of a difference, either mathematically or physically.
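A toy sketch of that saturation idea, assuming a simple proportional controller; the gain and effort limit are made-up numbers:

```python
def control_effort(error, gain=1.0, max_effort=100.0):
    """Proportional control with a hard cap on actuator output."""
    raw = gain * error
    return max(-max_effort, min(max_effort, raw))

# Once the error is large enough to saturate the actuator, "worse" errors all get
# the same maximal response: the bound comes from the ability to act, not from a
# bound on how much the error is cared about.
print(control_effort(1e6))   # 100.0
print(control_effort(1e18))  # 100.0
```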
Thanks, that makes sense.
On the topic of comparing controllers to utility functions: how does a controller decide what kinds of probabilistic tradeoffs are worth making? For instance, if you have a utility function, it’s straightforward to determine whether you prefer, say, a choice that creates X1 new lives with probability P_x1 and kills Y1 people with probability P_y1, versus a choice that creates X2 new lives with probability P_x2 and kills Y2 people with probability P_y2. How does one model that choice in a control theory framework?
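For concreteness, the utility-side comparison being called straightforward here is just an expected-value calculation. A minimal sketch, with made-up probabilities, counts, and a made-up linear valuation of lives:

```python
def expected_utility(p_create, n_create, p_kill, n_kill, value_per_life=1.0):
    # Made-up linear trade-off: each life created counts +value_per_life,
    # each death counts -value_per_life; any other scalar valuation would work.
    return p_create * n_create * value_per_life - p_kill * n_kill * value_per_life

choice_1 = expected_utility(p_create=0.9, n_create=100, p_kill=0.1, n_kill=50)   # 85.0
choice_2 = expected_utility(p_create=0.5, n_create=300, p_kill=0.4, n_kill=200)  # 70.0
print("prefer choice 1" if choice_1 > choice_2 else "prefer choice 2")
```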
I see two main challenges. First, we need to somehow encode distributions, and second, we need to look ahead. Both of those are doable, but it’s worth mentioning explicitly that the bread and butter of utility maximization (considering probabilistic gambles, and looking ahead to the future) are things that need to be built into the control theory framework, and they can be built in a number of different ways. (If we do have a scenario where it’s easy to enumerate the choice set, or at least the rules that generate the choice set, and it’s also easy to express the preference function, then utility is the right approach to take.)
The approach closest to the utility framework is probably to treat probability distributions over outcomes as the states, so that the ‘error’ is basically a measure of how much one distribution differs from the distribution we’re shooting for. Possible actions are probably fed into a simulator circuit that spits out the expected distribution. It looks like we could basically express this problem as “minimize opportunity cost while pursuing many options”: if we ever simulate a plan and think it’s better than our current best plan, we replace the current best plan; if we simulate a plan and it’s not better than our current best plan, we look for a new plan to simulate. (You’d also likely bake in some stopping criterion.)
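A sketch of that loop, assuming a hypothetical simulate(plan) that returns a predicted pmf over outcomes, a reference pmf we’re shooting for, and total-variation distance as the error measure (any other divergence would do):

```python
def tv_distance(p, q):
    """Total-variation distance between two pmfs given as {outcome: probability} dicts."""
    outcomes = set(p) | set(q)
    return 0.5 * sum(abs(p.get(o, 0.0) - q.get(o, 0.0)) for o in outcomes)

def choose_plan(candidate_plans, simulate, reference_pmf, patience=10):
    """Keep a current best plan; switch whenever a simulated plan has lower error.

    `patience` is a crude stopping criterion: give up searching after this many
    consecutive candidates that fail to beat the current best.
    """
    best_plan, best_error = None, float("inf")
    misses = 0
    for plan in candidate_plans:
        error = tv_distance(simulate(plan), reference_pmf)
        if error < best_error:
            best_plan, best_error = plan, error
            misses = 0
        else:
            misses += 1
            if misses >= patience:
                break
    return best_plan

# Toy usage: outcomes are just labels, and simulate() is a lookup table.
reference = {"doubled": 1.0}
predictions = {
    "do nothing":    {"status quo": 1.0},
    "risky flip":    {"extinct": 0.5, "doubled": 0.5},
    "steady growth": {"status quo": 0.2, "doubled": 0.8},
}
print(choose_plan(predictions, predictions.get, reference))  # steady growth
```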
So it would probably look at choice 1, encode its discrete pmf as the reference state, then look at choice 2 and decide whether the error is positive (in which case it switches to choice 2) or negative (in which case it keeps acting on choice 1). But in order to compare pmfs and get a sense of positive or negative, I need some mathematical function, and that function is what would be the utility function in the utility framework.
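To make that concrete, here is a sketch of the pairwise comparison, reusing the made-up lottery numbers from the earlier sketch; the scalar score() needed to turn two pmfs into a signed error is exactly what the utility framework would call the utility function:

```python
def score(pmf):
    """Scalar valuation of an outcome pmf -- this is the utility function in disguise.

    Outcomes are (lives_created, lives_lost) pairs; the valuation is a made-up
    linear one, matching the earlier sketch.
    """
    return sum(p * (created - lost) for (created, lost), p in pmf.items())

def signed_error(current_choice, candidate):
    """Positive: switch to the candidate.  Negative: keep acting on the current choice."""
    return score(candidate) - score(current_choice)

# The lotteries from the question, as pmfs over (lives created, lives lost):
choice_1 = {(100, 0): 0.9, (0, 50): 0.1}
choice_2 = {(300, 0): 0.5, (0, 200): 0.4, (0, 0): 0.1}
print(signed_error(choice_1, choice_2))  # -15.0 -> negative, so stay with choice 1
```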
We also might notice that this makes it easy for endowment-effect problems to creep in: if none of the options is obviously better than the others, it defaults to whichever one came first. On the flip side, it makes it easy to start working with the first mediocre plan we come across, and then abandon that plan if a better one shows up. That is, this approach is better suited to operating in continuous time than a “plan, then act” utility maximization framework.