Examples of Mitigating Assumption Risk
One of the subskills mentioned in Eliezer’s Security Mindset post is mitigating assumption risk: the risk of losing utility because some of your assumptions are wrong. There are two main ways to do this:
Gain more information about whether your assumptions hold
Make the assumption irrelevant (such as the hashing passwords example; see the password-hashing sketch below)
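As a minimal sketch of the password-hashing idea (the function names and parameters here are my own illustration, using only the Python standard library): rather than relying on the assumption that the password database never leaks, store only salted hashes, so a leak no longer reveals the passwords themselves.

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None):
    """Store a salted PBKDF2 hash instead of the password itself."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    return salt, digest

def check_password(password, salt, digest):
    """Verify by re-hashing; constant-time compare avoids timing leaks."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, digest)

# Even if an attacker reads the stored (salt, digest) pairs, the assumption
# "our database stays secret" no longer protects the passwords.
salt, digest = hash_password("correct horse battery staple")
assert check_password("correct horse battery staple", salt, digest)
assert not check_password("wrong guess", salt, digest)
```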
Here are a bunch more examples:
Repeating back what someone said in your own words, to check understanding
Adding a margin of safety when estimating how much load a bridge can bear
Using statistical models that make fewer assumptions, or have fatter tails (see the Student-t sketch below)
Exposing your work to attack in low-risk situations, such as comedians testing new material in small clubs, or Netflix’s Chaos Monkey
Emphasizing fast adaptation to unexpected circumstances over better forecasting
Putting spare capacity into the steps of your process that aren’t the bottleneck
Testing code frequently while refactoring, to check that functionality doesn’t unintentionally change (see the characterization-test sketch below)
Doing an analysis in different ways on different datasets, and only trusting the conclusions when they match (see the cross-check sketch below)
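For the fatter-tails point, one common move (a sketch assuming SciPy is available; the dataset is made up for illustration) is to fit a Student-t distribution instead of a normal, so extreme observations are treated as plausible rather than effectively impossible:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Illustrative data: mostly well-behaved observations with occasional large shocks.
data = np.concatenate([rng.normal(0, 1, 950), rng.normal(0, 6, 50)])

# A normal fit assumes thin tails; a Student-t fit lets the data choose tail weight.
mu, sigma = stats.norm.fit(data)
df, loc, scale = stats.t.fit(data)

threshold = 8.0
p_normal = stats.norm.sf(threshold, mu, sigma)
p_t = stats.t.sf(threshold, df, loc, scale)

# The normal model usually assigns far less probability to values this far out
# than the t model does, so decisions based on it quietly assume away rare shocks.
print(f"P(x > {threshold}) under normal:    {p_normal:.2e}")
print(f"P(x > {threshold}) under Student-t: {p_t:.2e}")
```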
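For testing while refactoring, a characterization-test sketch (the `slugify` function and its behaviour are invented for illustration): pin down the current behaviour with quick tests and re-run them after every small change, so the assumption "my refactor preserved behaviour" is checked continuously instead of trusted.

```python
import re
import unittest

def slugify(title):
    """Current behaviour we want to preserve while refactoring."""
    slug = title.strip().lower()
    slug = re.sub(r"[^a-z0-9]+", "-", slug)
    return slug.strip("-")

class TestSlugify(unittest.TestCase):
    # Characterization tests: re-run these after every small refactoring step.
    def test_basic(self):
        self.assertEqual(slugify("Hello, World!"), "hello-world")

    def test_whitespace_and_symbols(self):
        self.assertEqual(slugify("  Assumption Risk: 101 "), "assumption-risk-101")

    def test_already_clean(self):
        self.assertEqual(slugify("already-clean"), "already-clean")

if __name__ == "__main__":
    unittest.main()
```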
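And for the last item, a cross-check sketch (assuming SciPy; the datasets and the effect are synthetic): estimate the same relationship with two methods that rest on different assumptions, on two separate samples, and only lean on the conclusion when all of the answers agree.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def make_dataset(n):
    # Synthetic example: y depends monotonically on x, plus noise.
    x = rng.uniform(0, 10, n)
    y = 2.0 * x + rng.normal(0, 3, n)
    return x, y

datasets = [make_dataset(200), make_dataset(200)]

conclusions = []
for x, y in datasets:
    # Pearson assumes a linear relationship; Spearman only assumes monotonicity.
    _, p_pearson = stats.pearsonr(x, y)
    _, p_spearman = stats.spearmanr(x, y)
    conclusions.append(p_pearson < 0.01 and p_spearman < 0.01)

if all(conclusions):
    print("Both methods agree on both datasets: trust the association.")
else:
    print("Methods or datasets disagree: some assumption is probably off.")
```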