> Other algorithms are not really “choosing a fixed point somehow”. They’re typically failing to guarantee a fixed point. The mathematician hinted at this by describing how algorithms would not necessarily converge to a self-fulfilling prophecy; they could just as easily go in circles or wander around randomly forever.
This is something that John_Maxwell made me think about when discussing the Dualist Predict-O-Matic:

> At this point, the Predict-O-Matic has stepped into a hall of mirrors. To predict the next prediction made by the Predict-O-Matic in its world model, the Predict-O-Matic needs to run an internal simulation of that Predict-O-Matic. But as it runs that simulation, it finds that simulation kicking off another Predict-O-Matic simulation in the simulated Predict-O-Matic’s world model! Etc, etc.
It made me realize a few things:
1. When I imagined a strategy that identified fixed points/self-fulfilling prophecies, I was explicitly imagining the kind of strategy that first roughly enumerated a number of possibilities (without zooming in on the impacts it may itself cause) and then explored those possibilities by re-simulating them while conditioning on giving them as a prediction, noticing which ones have fixed-point-like properties. However, from a practical perspective, this doesn’t seem very feasible considering the massive number of degrees of freedom that the Predict-O-Matic has.
Why did I think along these lines specifically? Well, I read this:
> The engineer thinks about it for a minute. “I’m not sure. Predict-O-Matic keeps an internal model which has probabilities of events. The answer to a question isn’t really separate from the expected observation. So ‘probability of observation depending on that prediction’ would translate to ‘probability of an event given that event’, which just has to be one.”
and interpreted it to mean that, since observations will depend on the prediction to at least some extent, the algorithm really should sanity-check its prediction to make sure it still happens when it is given as an output. Hence, I imagined a gradual bumbling into fixed points as the algorithm sanity-checks its prediction, modifies it slightly to address the changes, and then repeats, falling into a basin of attraction (to be clear, I also interpreted the Predict-O-Matic to be myopic). A rough sketch of this loop appears after this list.
2. A prediction doesn’t have to be completely self-fulfilling or completely independent of the observation. The iterative process I described above (sanity-checking a prediction, adjusting it, and repeating) is unlikely to produce a prediction that guarantees an outcome, but it will probably raise the chances of that outcome happening to some extent. In other words, self-fulfilling predictions themselves might be rare, but predictions that slightly raise the chances of themselves happening when given as an output could be relatively normal.
3. As you note here, the actual behavior of the Predict-O-Matic is heavily determined by the details of how the optimizer tunes the strategies it considers over time. Whether it ultimately interacts with fixed points depends on the kind of strategies the optimizer gets to look at. Furthermore, things like selecting for maximally surprising self-fulfilling prophecies, as I described, could be strategically avoided by using different algorithms. Using a quantilizer (i.e., building a bunch of strategies and randomly picking one of the ones within the top 5% of historical performance; also sketched after this list) could yield multiple distinct strategies that each fall into distinct basins of attraction. This would cause any single strategy to suffer if it leans too heavily on a surprising self-fulfilling prediction, because it won’t be able to predict whether that prediction will actually be the one that gets made. To carry the economics metaphors, you might try to shift an “assassination market”-like thing into a “Keynesian Beauty Contest”-like thing.
4. One of the other relevant things in the Parable was that accuracy in the context of the Predict-O-Matic is underspecified in a lot of ways that could be important. When your machine can try to predict any detail of the world, the form of the prediction error can vary greatly. If you ask about the stock market, it might be just one number. If you ask about a painting, it might be a lot more things. Since the Predict-O-Matic as described has natural language processing (presumably with the goal of providing a good interface for humans), its prediction accuracy might even have to be calculated by weighting the inaccuracy of each specific aspect of the prediction by the anticipated importance of that aspect to the human interacting with it. Even if we try to keep things simple and ignore that aspect, there are a lot of ways that the Predict-O-Matic might phrase a prediction of the same events. Thus, a given strategy might strategically phrase its predictions in the ways most optimal for ensuring their accuracy (or strategically try to minimize prediction error by making the prediction as low-impact or generic as possible).
5. A lot of this is obviously getting into the weeds of what, specifically, the Predict-O-Matic is trying to do, and I’m not sure how helpful it is in general.
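To make the loop in points 1 and 2 concrete, here is a minimal sketch under toy assumptions of my own: `world_model` stands in for the Predict-O-Matic’s simulation of what would actually happen once a given prediction is announced, and I simply have it pull the outcome partway toward whatever gets announced.

```python
# Toy model of the "sanity-check and adjust" loop (my own construction, not from
# the post). Predictions and outcomes are single numbers for simplicity.

def world_model(prediction):
    # Hypothetical assumption: announcing a prediction pulls the actual outcome
    # 30% of the way from a baseline of 0.2 toward the announced value.
    baseline = 0.2
    return 0.7 * baseline + 0.3 * prediction

def sanity_check(initial_prediction, steps=100, tol=1e-6):
    """Repeatedly simulate the outcome conditional on announcing the current
    prediction, then adjust the prediction toward that simulated outcome."""
    p = initial_prediction
    for _ in range(steps):
        outcome = world_model(p)
        if abs(outcome - p) < tol:
            return p          # an (approximately) self-fulfilling prediction
        p = outcome           # adjust and re-check
    return p                  # no guarantee of a fixed point after `steps` passes

print(sanity_check(0.9))      # bumbles into the fixed point near 0.2
```

Whether this lands anywhere depends entirely on the response function: this toy one contracts toward a fixed point, but with a different `world_model` the same loop can cycle or wander forever, which is exactly the caveat in the first quoted passage.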
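And a minimal sketch of the quantilizer from point 3; `strategies` and `historical_score` are placeholders I made up for whatever population of strategies and performance record the Predict-O-Matic actually keeps.

```python
import random

def quantilize(strategies, historical_score, top_fraction=0.05):
    """Rank strategies by historical performance and pick uniformly at random
    from the top `top_fraction`, instead of always taking the single best."""
    ranked = sorted(strategies, key=historical_score, reverse=True)
    cutoff = max(1, int(len(ranked) * top_fraction))
    return random.choice(ranked[:cutoff])
```

Because the announced prediction is randomized over several near-top strategies, no individual strategy can count on its own prediction being the one that actually gets made, which is what undercuts the assassination-market dynamic.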
---------
Anyway, moving on....
> But many (not all) unsupervised algorithms still have the critical features we’re interested in! Predicting x without any context information y to help still involves (1) getting feedback on what we “should have” expected, and (2) updating to a configuration which would have more expected that. We simply can’t expect the predictions to be as focused, given the absence of contextual information to help. But that just means it’s a prediction task on which we tend to expect lower accuracy.
This is an important but non-obvious thing. Consider kernel density estimation with cross-validation. What you’re actually doing there is taking the distribution of x you’ve observed and then testing how a bunch of simple distributional models built on subsets of that observed data generalize to the rest of the set. While you’re learning about a distribution in an unsupervised manner, you can also interpret it as supervised learning:
p(y = unobserved x | x = observed x)
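For instance (a sketch assuming scikit-learn; the toy data and bandwidth grid are arbitrary choices of mine), leave-one-out cross-validation for a kernel density estimate literally scores each candidate density on how well it predicts the held-out point:

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, LeaveOneOut
from sklearn.neighbors import KernelDensity

x = np.random.randn(100, 1)   # toy one-dimensional "observed distribution"

# Each fold fits a density to the observed subset and scores the held-out point,
# i.e. the bandwidth is judged on p(unobserved x | observed x).
search = GridSearchCV(
    KernelDensity(kernel="gaussian"),
    {"bandwidth": np.linspace(0.1, 2.0, 20)},
    cv=LeaveOneOut(),
)
search.fit(x)
print(search.best_params_)    # bandwidth chosen by held-out predictive accuracy
```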
---------
> If you have an outer optimizer which is trying to maximize f(x,y) through x while being indifferent about y, it seems sensible to suppose that inner optimizers will want to change y to throw things off, particularly if they can get credit for then correcting x to be optimal for the new y. If so, then inner optimizers will generally be seeking to find y-values which make the current x a comparatively bad choice. So this argument does not establish an incentive to choose y which makes all choices of x poor.
This is a somewhat confusing paragraph for me because the idea of the outer optimizer trying to optimize a function of something that it doesn’t care about seems strange. Based on previous discussions of partial agency, I’m assuming this means something like:
- The outer optimizer rewards the inner optimizer whenever it changes x in a way that increases f.
- The outer optimizer does not care if f decreases because the inner optimizer has changed y while holding x constant.
In the case of the Predict-O-Matic, I just didn’t imagine this happening because I figured the outer optimizer was just trying to myopically minimize the prediction error without adjusting for which of the inputs were modified. However, when this does happen, the results can be chaotic. After all, you can simply find an x that optimizes f when y=y1 and another x that optimizes f when y=y2. Then move y back and forth without punishment and collect a reward every time you “correct” x to the relevant optimum. This lets you milk the outer optimizer for infinite reward. For example:
f(x,y) = (1-2^(-yx))
If you start at y0=1, x0=10, you’ll get diminishing returns if you increase x. But, if you change y from y0 to y=-1, f drops dramatically and you can collect all the reward you’d like by decreasing x to −10. Then simply switch y from −1 back to 1 and reverse the shift you just made. Infinite reward cycle. I think Yudkowsky said something about how utility functions are good because they order outcomes well and, if you don’t have ordering, someone can exploit you by repeatedly cycling through a process like the one above.
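A quick sketch of that cycle (my own toy bookkeeping, not a claim about how any real outer optimizer assigns credit): the inner optimizer is credited only for improvements in f achieved through x, and the damage done by moving y is never charged to it.

```python
# Toy credit assignment for the cycle above (illustrative assumption only).

def f(x, y):
    return 1 - 2 ** (-y * x)

x, y = 10.0, 1.0
total_credit = 0.0
for cycle in range(3):
    y = -y                          # flip y: f crashes, but this isn't penalized
    before = f(x, y)
    x = -x                          # "correct" x for the new y, restoring f to ~0.999
    after = f(x, y)
    total_credit += after - before  # ~1024 of credit per cycle, repeatable forever
    print(f"cycle {cycle}: credit {after - before:.1f}, total {total_credit:.1f}")
```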
Since the reward you can get out of a process like this is arbitrarily large and, in practical scenarios, the inner optimizer will only have a model of how its decisions will change the rewards it gets, the actual path traced out above can be highly dependent on initial conditions. One algorithm might hold y constant and blindly try to increase x since it increases f, converging on a boring finite reward. Another might discover the loophole and start cycling the outer optimizer for reward but be scared to risk leaving the cycle (since who knows what the function looks like outside the two optima it found, or whether it will ever be able to get back?). Another might explore and realize it can shift y even more dramatically and get even more dramatic rewards faster (if it’s trying to maximize total reward and realizes it only has a finite amount of time).
This kind of maximizing surprise/entropy is very different from the kind of maximization I envisioned. When I imagined an algorithm that tried to throw off the algorithms it was competing against, it was creating surprising outcomes, but this surprise was fundamentally relative to the predictor. To the algorithm causing the surprise, nothing was surprising. To the algorithms it was competing against, it was all surprising. To the people outside the Predict-O-Matic, who knows if it was surprising? That really depends on whether the strategies inside the Predict-O-Matic tend to make predictions similar to the ones people would make. In short, you’re considering objective increases in entropy relative to the target function; I’m considering subjective increases in entropy relative to people/strategies other than the one chosen by the Predict-O-Matic.
--------------
> The idea is that incentivizing agents to lower the error of your predictions (as in a prediction market) looks exactly like incentivizing them to “create” information (find ways of making the world more chaotic), and this is no coincidence. So perhaps there’s a more general principle behind it, where trying to incentivize minimization of f(x,y) only through channel x (eg, only by improving predictions) results in an incentive to maximize f through y, under some additional assumptions. Maybe there is a connection to optimization duality in there.
No idea if it’s useful, but there’s gotta be something here! Per Wikipedia:
> According to George Dantzig, the duality theorem for linear optimization was conjectured by John von Neumann immediately after Dantzig presented the linear programming problem. Von Neumann noted that he was using information from his game theory, and conjectured that two person zero sum matrix game was equivalent to linear programming. Rigorous proofs were first published in 1948 by Albert W. Tucker and his group. (Dantzig’s foreword to Nering and Tucker, 1993)
It’s been a while since I took my linear optimization class, but the primal and dual problems in linear programming involve switching between a maximization and a minimization problem and interchanging the constraints of the linear program with the coefficients of the objective function and vice versa (the standard form is sketched below). For a given strategy in the Predict-O-Matic, you might try to imagine a strategy that tries to minimize prediction error while viewing the performances of its competitors as constraints. When you switch to the dual problem, you switch to optimizing over what used to be the constraints while constraining on what used to be the objective. This is hand-wavey of course, and I’m not sure it applies in the context of entropy caused by partial agency as you described it.
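For reference, the textbook primal/dual pair I’m gesturing at (standard linear programming, nothing specific to the Predict-O-Matic setup):

```latex
\begin{aligned}
\text{Primal:}\quad & \max_{x}\ c^{\top}x \quad \text{s.t.}\ Ax \le b,\ x \ge 0\\
\text{Dual:}\quad   & \min_{y}\ b^{\top}y \quad \text{s.t.}\ A^{\top}y \ge c,\ y \ge 0
\end{aligned}
```

The objective coefficients c and the constraint bounds b swap roles, and maximization flips to minimization, which is the “interchanging” I mean above.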