Based on this pet theory and post hoc rationalization about Bella, I might argue that the place where Bella went wrong was in becoming a vampire and accepting apparently permanent modifications to her mind despite neither being forced into it by a true emergency nor verifying that the post-modification state passes the “self-critiquing reversibility” test.
Possibly, but keep in mind she has evidence that this irreversible transition would make her better at improving. Not wanting to become superior because that might make you overconfident is a pretty self-defeating strategy; though constantly checking plans for signs of overconfidence is a good plan. (That is, if she thought about it beforehand and was more self-aware, she would understand journaling is valuable as more than a memory aid, and keep it up or find a substitute as a vampire. But she’d be able to journal / self-critique way better as a vampire than as a human.)
Not wanting to become superior because that might make you overconfident...
...is not what I’m talking about.
The self-critiquing-reversibility test is designed specifically to prevent apparent self improvements which are not actual self improvements and from which you cannot retreat. If the test is passed then it should give you more room to play and explore because you actually have a safety net in the form of a “bailout option”.
The test is designed to prevent you from, for example, getting addicted to a purported nootropic that turns out to be more like crystal meth than like caffeine. Avoiding “belief in the value of irrational belief” is another place where the heuristic might be applied.
For Bella, things vampires can’t do include turning off their desire for blood or changing their emotional connection to their mates. These are, in some sense, “permanent utility function tweaks” rather than simple “optimization power upgrades”.
If Harry had applied the test in the first handful of chapters of MoR, he would have asked McGonagall if it was possible for him to explore the wizarding world but then back out somehow if he decided it was better to be a muggle instead of a wizard after educating himself about the costs and benefits of both states. The best answer from McGonagall (though I don’t think she can actually do this, which may be relevant) is “Here, let me take Veritaserum… Now… Yes, easily, because memories can be erased with an Obliviation spell and returning to a naive state will be basically the same as never having learned about the wizarding world in the first place, but you’ll find that the cost-benefit analysis is unambiguously positive because of things like X and Y which appeal to you right now. The biggest downsides are P and Q and similar issues which are obviously negligible in the face of X and Y.”
keep in mind she has evidence that this irreversible transition would make her better at improving
Absolutely. Resilience and naive optimization are often in conflict.
The highest expected value strategy in investing is to put all your money in the single investment that itself has the highest expected value (assuming the opportunity is large enough that your whole contribution doesn’t push the project very far down its marginal utility curve so the last dollar invested will have lower return on investment than some other investment). Nonetheless an index fund can be a better strategy based on variance estimates and more or less sophisticated risk-of-ruin calculations combined with the value of “avoiding ruin”. Nearly all billionaires are massively “over-invested” in their own companies, and they frequently stop being billionaires for this very reason. The Fortune 500 has substantial turnover decade over decade, in part because a company has to sacrifice some resilience to get onto that list, and in the long run (since corporations are potentially immortal) a lack of resilience catches up with them.
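To make the contrast concrete, here is a minimal Monte Carlo sketch (the ventures, payoffs, and probabilities are invented for illustration, not drawn from the comment or from market data): going all-in on the single highest-EV bet beats a diversified “index” on arithmetic expected value, yet almost every sampled path of the all-in strategy ends in ruin.

```python
# Illustrative only: payoff distributions are invented, not market data.
import random
import statistics

random.seed(0)
YEARS, TRIALS, RUIN = 30, 10_000, 0.05  # "ruin" = ending below 5% of starting wealth

def final_wealth(yearly_growth):
    """Compound a starting wealth of 1.0 through YEARS draws of yearly_growth()."""
    wealth = 1.0
    for _ in range(YEARS):
        wealth *= yearly_growth()
    return wealth

# Hypothetical single best bet: 3.2x or 0.1x with equal odds (EV = 1.65x per year).
all_in = lambda: random.choice((3.2, 0.1))

# Hypothetical "index" of 20 slightly worse bets (3.0x or 0.1x, EV = 1.55x each),
# equal-weighted and rebalanced every year.
index = lambda: sum(random.choice((3.0, 0.1)) for _ in range(20)) / 20

for name, strategy in (("all-in", all_in), ("index", index)):
    finals = [final_wealth(strategy) for _ in range(TRIALS)]
    ruin_rate = sum(w < RUIN for w in finals) / TRIALS
    print(f"{name:>6}: median final wealth {statistics.median(finals):12.4g}, "
          f"P(ruin) {ruin_rate:.1%}")

# Analytically E[final wealth] is 1.65**30 for all-in vs 1.55**30 for the index,
# so all-in wins on expected value, yet nearly all of its probability mass is ruin.
```

The index strategy’s median outcome and ruin probability come out dramatically better even though its per-year EV is strictly lower, which is the resilience-versus-naive-optimization trade-off in miniature.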
This is what I was trying to get at with the link about causal density. One can infer from first principles that applying the epistemic-reversibility test too diligently will hurt you if you are in a “get big fast” regime where the only survivors are lucky risk takers. Or maybe it can hurt you for some other reason I don’t know about yet that will make more sense to me if I apply it some day and then get hurt in a novel way...
And, honestly speaking, for any given heuristic I consciously apply, I expect to gain some benefit, while generally expecting to get hurt sometimes. If I keep doing novel stuff with an eye towards rational self-improvement it seems inevitable that I’ll get hurt in a way I wasn’t expecting; however, it seems reasonable to suppose the damage will be limited because I’m on the lookout for it. In working in this area at all I’m either implicitly or explicitly guesstimating that there is an upside to “rationality in general” that beats the downside.
Rationally speaking, it would make sense to make the risks of active rationality cultivation explicit, subject the calculation to conscious analysis, and then abandon active rationality cultivation if the expected value is honestly negative. It is precisely the fact that rationality basically demands this kind of bailout analysis at some point that has generally helped me to feel safe(ish) when experimenting with this particular package of memes.
The highest expected value strategy in investing is to put all your money in the single investment that itself has the highest expected value (assuming the opportunity is large enough that your whole contribution doesn’t push the project very far down its marginal utility curve so the last dollar invested will have lower return on investment than some other investment).
I wanted to comment on this example: the benefits of index funds go beyond variance. Trading costs make them a superior long-term strategy to managed funds or researching your own stock picks (the highest expected value investment will change from moment to moment), and the fact that stock prices are not independent means a well-chosen larger subset of stocks will have higher expected value than a poorly-chosen smaller subset of stocks.
To reword that last sentence (and explain what I mean by well-chosen and poorly chosen): if I sort stocks by expected return over the next month and pick the top five, my expected value is worse than or the same as if I also model the effect stock prices have on other stocks, and then pick the set of five stocks whose expected return over the next month is highest, even though I have access to the same set of stocks.
That is to say, the fact that a single stock comes out with the highest EV is an artifact of how crudely that EV is estimated. You can improve the EV estimate without discarding that method of analysis, and without touching on utility concerns (where risk of ruin comes into play in a big way).
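One way to make the “well-chosen versus poorly-chosen subset” point concrete is a toy model (everything below is invented: the spillover structure, the coefficients, the horizon) in which each stock’s return is partly driven by another stock’s return the month before. Ranking stocks by their own historical averages ignores that structure; a forecast that conditions on it picks a measurably better top five from the same pool:

```python
# Toy model only: the cross-stock spillover and all parameters are made up.
import numpy as np

rng = np.random.default_rng(0)
n_stocks, n_months = 20, 5000
spillover = 0.5  # hypothetical: stock j is partly driven by stock j-1's last return

# Simulate returns: identical marginal mean (1%/month) for every stock.
returns = np.zeros((n_months, n_stocks))
returns[0] = rng.normal(0.01, 0.05, n_stocks)
for t in range(1, n_months):
    returns[t] = (0.01
                  + spillover * (np.roll(returns[t - 1], 1) - 0.01)
                  + rng.normal(0.0, 0.05, n_stocks))

naive_top5, informed_top5 = [], []
for t in range(1, n_months):
    # "poorly chosen": rank by each stock's own historical average return
    naive_forecast = returns[:t].mean(axis=0)
    # "well chosen": also use last month's cross-stock spillover (true model, for simplicity)
    informed_forecast = 0.01 + spillover * (np.roll(returns[t - 1], 1) - 0.01)
    naive_top5.append(returns[t][np.argsort(naive_forecast)[-5:]].mean())
    informed_top5.append(returns[t][np.argsort(informed_forecast)[-5:]].mean())

print("avg realized return of top 5, marginal ranking:        ", np.mean(naive_top5))
print("avg realized return of top 5, dependence-aware ranking: ", np.mean(informed_top5))
```

In this toy the dependence-aware ranking picks stocks whose expected return next month really is higher, so its realized top-five average comes out several times larger, even though both rankings choose from exactly the same set of stocks.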
I agree with you that irreversibility should raise giant red flags and suggest that an EU (or however you want to abbreviate expected utility) calculation is a better choice than an EV calculation, and plans which are reversible significantly decrease the risk of ruin. But I think Bella’s overall risk of ruin decreased with the transition to a vampire (and then massively increased with the transition to a revolutionary), and she had good reason to expect that would be the case.
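For the EV-versus-EU distinction, a tiny worked example (the options and payoffs are made up) shows how a risky, hard-to-undo plan can win the expected-value comparison while losing the expected-utility one, with log utility standing in for sensitivity to ruin:

```python
# Made-up options: A is a risky, irreversible gamble; B is a reversible sure thing.
import math

wealth = 100.0  # starting wealth, arbitrary units

options = {
    "A (risky, irreversible)": [(0.5, 5.0), (0.5, 0.1)],  # 50% 5x, 50% keep 10%
    "B (safe, reversible)":    [(1.0, 1.5)],               # sure 1.5x
}

for name, outcomes in options.items():
    ev = sum(p * m * wealth for p, m in outcomes)            # expected value
    eu = sum(p * math.log(m * wealth) for p, m in outcomes)  # expected log utility
    print(f"{name}: EV = {ev:6.1f}, E[log utility] = {eu:.3f}")

# A wins on EV (255 vs 150) but loses on expected log utility (~4.26 vs ~5.01),
# because the 0.1x branch sits close to ruin and log utility punishes it heavily.
```

Log utility is just a stand-in here; any utility function that dives steeply near ruin makes the same qualitative point.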
The optimal way to manage a transition like that is interesting to think about, though. Hmm.