This review is more broadly of the first several posts of the sequence, and discusses the entire sequence.
Epistemic Status: The thesis of this review feels highly unoriginal, but I can’t find where anyone else discusses it. I’m also very worried about proving too much. At minimum, I think this is an interesting exploration of some abstract ideas. Considering posting as a top-level post. I DO NOT ENDORSE THE POSITION IMPLIED BY THIS REVIEW (that leaving immoral mazes is bad), AND AM FAIRLY SURE I’M INCORRECT.
The rough thesis of “Meditations on Moloch” is that unregulated perfect competition will inevitably maximize for success-survival, eventually destroying all value in service of this greater goal. Zvi (correctly) points out that this does not happen in the real world, suggesting that something is at least partially incorrect about the above model, and/or the applicability thereof. Zvi then suggests that a two-pronged reason can explain this: 1. most competition is imperfect, and 2. most of the actual cases in which we see an excess of Moloch occur when there are strong social or signaling pressures to give up slack.
In this essay, I posit an alternative explanation as to how an environment with high levels of perfect competition can prevent the destruction of all value, and further, why the immoral mazes discussed later on in this sequence are an example of highly imperfect competition that causes the Molochian nature thereof.
First, a brief digression on perfect competition: perfect competition assumes perfectly rational agents. Because all strategies discussed are continuous-time, the decisions made in any individual moment are relatively unimportant assuming that strategies do not change wildly from moment to moment, meaning that the majority of these situations can be modeled as perfect-information situations.
Second, the majority of value-destroying optimization issues in a perfect-competition environment can be presented as prisoner’s dilemmas: every agent gets less value if all agents defect, but defection is always preferable to not-defection regardless of the strategy pursued by other agents.
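As a minimal sketch of the structure described above, the two defining properties of a prisoner’s dilemma can be checked directly. The payoff numbers and the names `PAYOFF`, `defection_dominates`, and `mutual_defection_is_worse` are illustrative assumptions, not values taken from the sequence.

```python
# Assumed, illustrative payoffs: (row player's payoff, column player's payoff).
# "C" = cooperate, "D" = defect.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def defection_dominates(payoff):
    """True if defecting pays the row player strictly more than cooperating,
    whatever the other player does."""
    return all(
        payoff[("D", other)][0] > payoff[("C", other)][0] for other in ("C", "D")
    )


def mutual_defection_is_worse(payoff):
    """True if both players earn less under defect-defect than under
    cooperate-cooperate."""
    return all(d < c for d, c in zip(payoff[("D", "D")], payoff[("C", "C")]))


print(defection_dominates(PAYOFF), mutual_defection_is_worse(PAYOFF))
```

Both checks holding at once is what makes the game a dilemma: each agent’s individually preferred move leads to the outcome that is worse for everyone.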
Now, let’s imagine our “rational” agents operate under a simplified and informal timeless decision theory (TDT): they take their opponent’s 100% predictable strategy into account, and update their models of games based on these strategies (i.e. Prisoner’s Dilemma defect/cooperate with two of our Econs has a payout of −1+0*n,5+0*n).
(The following two paragraphs are not novel; they are a summary of the thought experiment that provides a motive for TDT.) Econs, then, can pursue a new class of strategies: by behaving “rationally,” and having near-perfect information on opposing strategies because other agents are also behaving “rationally,” a second Nash equilibrium arises: cooperate-cooperate. The most abstract example is the two perpetually betrayed libertarian jailbirds: in this case, from the outset of the “market,” both know the other’s strategy. This sustains the second Nash equilibrium: any change in P1’s strategy will be punished with a change in P2’s strategy next round with extremely high certainty. P1 and P2 then have a strong incentive not to defect, because defection results in many rounds of lost profit. (Note that because this IPD doesn’t have a known end point, CDT does not mandate constant defection.) In a meta-IPD game, then, competitive pressures push out defector agents who get stuck in defect-defect with our Econs.
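The punishment dynamic in the iterated game can be sketched with a small simulation. This is a toy under stated assumptions: the payoffs are the same illustrative numbers as a standard prisoner’s dilemma, and the “punish any defection forever” rule is modeled as a grim trigger, which is one possible reading of the strategy described above rather than the only one. The names `grim_trigger`, `always_defect`, and `play` are hypothetical.

```python
# Assumed, illustrative payoffs: (first player's payoff, second player's payoff).
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def grim_trigger(my_history, their_history):
    """Cooperate until the opponent has defected once, then defect forever."""
    return "D" if "D" in their_history else "C"


def always_defect(my_history, their_history):
    """The 'traditional' strategy: defect every round."""
    return "D"


def play(strategy_a, strategy_b, rounds):
    """Play a repeated game and return the total score of each player."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


print(play(grim_trigger, grim_trigger, 100))   # two cooperating Econs
print(play(always_defect, grim_trigger, 100))  # a defector meets an Econ
```

Under these assumed numbers, two grim-trigger agents earn 300 each over 100 rounds, while the defector collects one large payoff and then sits in defect-defect for a total of 104. That gap is the competitive pressure that pushes defectors out in the meta-game.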
Games with a fixed number of players are somewhat more complex, but fundamentally share the same scenario: any defection is punished with system-wide defection, so defecting in a highly competitive scenario with perfectly rational agents is a significant net negative to the potential defector. The filters stay on in a perfect market operating under TDT.
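One way to sketch the many-player case is a repeated public goods game in which every player defects permanently once anyone has defected. The public goods framing, the multiplier of 2, and the names `public_goods_round` and `simulate` are all assumptions chosen for illustration; the sequence does not specify a payoff structure.

```python
def public_goods_round(moves, multiplier=2.0):
    """Net payoff per player for one round.

    A "C" move contributes 1 unit to a shared pot, a "D" move contributes 0.
    The pot is multiplied and split equally among all players.
    """
    pot = multiplier * sum(1 for m in moves if m == "C")
    share = pot / len(moves)
    return [share - (1 if m == "C" else 0) for m in moves]


def simulate(n_players, rounds, defector_index=None):
    """Total payoffs when everyone cooperates until any defection is seen.

    If `defector_index` is given, that player defects from the first round;
    from the following round onward, every player defects.
    """
    totals = [0.0] * n_players
    triggered = False
    for _ in range(rounds):
        moves = [
            "D" if (triggered or i == defector_index) else "C"
            for i in range(n_players)
        ]
        for i, payoff in enumerate(public_goods_round(moves)):
            totals[i] += payoff
        if "D" in moves:
            triggered = True
    return totals


print(simulate(4, 10))                    # everyone cooperates throughout
print(simulate(4, 10, defector_index=0))  # player 0 defects in round one
```

With four players and ten rounds, full cooperation yields 10.0 each, while the lone defector ends with 1.5 and everyone else with 0.5. The one-round gain is real but is swamped by the loss of all later cooperation.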
Then, an informal form of TDT (ITDT; to be clear, I’m distinguishing between TDT and ITDT only to avoid claiming that people actually abide by a formal DT) can explain why all value is not destroyed in the majority of systems, even assuming a perfectly rational set of agents. Individually, this is interesting but not novel or particularly broad: the vast majority of the real-world examples discussed in this sequence are markets, so it’s hard to evaluate the truth of this claim without discussing markets.
Market-based games are significantly more complex: because free entry and exit are elements of perfect competition, theoretically, an undercutter agent could exploit the vulnerabilities in this system by pursuing the traditional strategy, which may appear to require value collapses as agents shore up against this external threat by moving to equilibrium pricing. Let’s look at the example of the extremely rational coffee sellers, who have found a way to reduce their costs (and thus juice their profits and allow them to lower prices, increasing market share) by poisoning their coffee. At the present moment, CoffeeA and CoffeeB each control 50% of the coffee industry, and are entirely homogeneous. Under the above simple model, assuming rational ITDT agents, neither agent will defect by poisoning coffee, because there’s no incentive to destroy their own profit if the other agent will merely start poisoning as well. However, an exploiter agent could begin coffee-poisoning, and (again assuming perfect competition) surpass both CoffeeA and CoffeeB, driving prices to equilibrium. Even so, there’s actually no incentive, again assuming a near-continuous time game, for CoffeeA and CoffeeB to defect before this actually happens. In truly perfect competition, this is irrelevant, because an agent arises “instantly” to do so, but in the real world, this is relaxed. Moreover, it’s actually still not necessary to defect even with infinite agents: if the defection is to a 0-producer-surplus price, the presence of additional agents is irrelevant because market share holds no value, so defection before additional agents arrive is still marginally negative. If the defection is to a price that preserves producer surplus, pre-defecting from the initial equilibrium price E1 to E2 only pushes the stable equilibrium to a lower price, as the new agent is forced to enter at a sub-E2 price, meaning the final equilibrium is effectively capped, with no benefits.
Note that this now means that exploiter agents are incentivized to enter at the original equilibrium price, because they “know” any other price will trigger a market collapse to that exact price, so E1 maximizes profit.
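The entry argument can be illustrated with a deliberately crude toy model. It assumes fixed total demand, an equal split of the market among all sellers, and incumbents that instantly match whatever price the entrant picks; none of these assumptions come from the sequence, and the names `entrant_profit` and `best_entry_price` are hypothetical.

```python
def entrant_profit(entry_price, unit_cost, total_demand, n_incumbents):
    """Per-period profit for an entrant, assuming every incumbent immediately
    matches the entry price and the market is split equally."""
    share = total_demand / (n_incumbents + 1)
    return (entry_price - unit_cost) * share


def best_entry_price(candidate_prices, unit_cost, total_demand, n_incumbents):
    """Pick the candidate price that maximizes the entrant's matched profit."""
    return max(
        candidate_prices,
        key=lambda p: entrant_profit(p, unit_cost, total_demand, n_incumbents),
    )


# Assumed numbers: E1 = 3.0 is the original price, the lower values are undercuts.
prices = [2.0, 2.5, 3.0]
print(best_entry_price(prices, unit_cost=1.0, total_demand=300, n_incumbents=2))
```

Because the entrant’s share does not depend on its price once incumbents match, undercutting only lowers its margin, so the highest candidate price (E1 here) is the profit-maximizing entry point. The result is only as strong as the instant-matching assumption.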
This suggests that, far from perfect competition destroying value, perfect competition may preserve value with the correct choice of “rational agents.” However, any non-rational agents, or agents optimizing for different values, immediately destroy this peaceful cooperation! As EY notes here, TDT conceals a regress when other agents’ strategies are not predictable, which means that markets that substantially diverge from perfect competition, with non-perfect agents and/or non-perfect competition, are not subject to our nice toy model.
Under this model, the reason the airline industry is so miserable is that bailouts are accepted as common practice. This means that agents can safely take on excess risk in order to undercut other agents, effectively removing the 0-producer-surplus price (because agents can disguise costs by shuffling them into risk) and making strategy unpredictable and not subject to our cooperative equilibria.
Let’s look at the middle-manager example brought up in the later part of the article. Any given middle manager, assuming all middle managers were playing fully optimized strategies, would not have a strong incentive to (WLOG) increase their hours at the office. However, the real world does not behave like this: as Zvi notes, some peers shift from “successful” to “competent,” and despite the assertion that middle management is an all-or-nothing game, I suspect that middle management is not totally homogeneous in terms of willingness to erode ever-more-valuable other time. This means that there are massive incentives to increase time at the office, in the hopes that peers are not willing to. The other dynamics noted by Zvi are all related to lack of equilibria, not the cause thereof.
This is a (very) longwinded way of saying that I do not think Zvi’s model is the only, the most complete, or the simplest way to model the dynamics of preserved value in the face of Moloch. I find several elements of the ITDT explanation quite appealing: it explains why humans often find traditional Econs so repulsive, as many of the least intuitive elements of traditional “rationality” are resolved by TDT. Additionally, I dislike the vague modeling of the world as fine because it doesn’t satisfy easy-to-find price information intuitively: I don’t find the effect strong enough to substantially preserve value intuitively. In the farmers’ market scenario specifically, I think the discrepancy between it being a relatively perfect competitive environment and having a ton of issues with competitiveness was glossed over too quickly; this type of disagreement seems to me as though it has the potential to have significant revelatory power. I think ITDT better explains the phenomena therein: farmers’ markets aren’t nearly as cutthroat as financial markets in using tools developed under a decision theory that fails the prisoner’s dilemma, meaning that prisoner’s dilemmas are more likely to follow ITDT-type strategies. If desired, or if others think it would offer clarity, I’d like to see either myself or someone else go through all of the scenarios discussed here under the above lens: I will do this if there is interest and the idea of this post doesn’t have obvious flaws.
However, I strongly support curation of this post: I think it poses a fascinating problem, and a useful framing thereof.
tl;dr: the world could also operate under informal TDT; this has fairly strong explanatory power for observed Moloch/Slack differentials, and this explanation has several advantages.
i found your epistemic status confusing. it reads like it’s about zvi’s post, but i assume it’s supposed to be about your review. (perhaps because you referenced your review as a post/article)
Oops, you’re correct.
Nice, much clearer now :)
I would be very interested in your proposed follow-up but don’t have enough game theory to say whether the idea has obvious flaws.