If the time at which anyone activates a uFAI is known, SI should activate their current FAI best effort (CFBE) one day before that.
If the time at which anyone activates a GAI of unknown friendliness is known, SI should compare the probability distributions over friendliness for the two AIs, and activate their CFBE one day earlier only if the CFBE has more probability mass on the “friendly” side.
If the time at which anyone activates a uFAI is unknown, SI should activate their CFBE when the probability that they’ll improve the CFBE in the next day is lower than the probability that someone will activate a uFAI in the next day.
If the time at which anyone activates a GAI of unknown friendliness is unknown, SI should activate their CFBE when the probability that CFBE=uFAI is less than the probability that anyone else will activate a GAI of unknown friendliness, multiplied by the probability that the other GAI will be unfriendly.
...I think. I do tend to miss the obvious when trying to think systematically. I was visualizing Gaussian pdfs without any particular justification, along with a 1-day decision cycle and a monotonically improving CFBE. This is also only a first-order approximation: it doesn’t take into account any correlations between the decisions of SI and other GAI researchers.
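To make the four rules above easier to check, here is a rough Python sketch of them as daily yes/no tests. The function and parameter names are all made up for illustration, the probabilities are assumed to be estimates supplied from somewhere else, and it keeps the same simplifications (1-day decision cycle, no correlations between SI and other researchers).

```python
def rule_known_ufai(days_until_ufai: int) -> bool:
    """Rule 1: a uFAI's activation date is known.
    Activate the CFBE one day before it (treating "already late" the
    same way is an assumption of this sketch)."""
    return days_until_ufai <= 1


def rule_known_unknown_gai(days_until_gai: int,
                           cfbe_friendly_mass: float,
                           other_friendly_mass: float) -> bool:
    """Rule 2: a GAI of unknown friendliness has a known activation date.
    Activate one day earlier only if the CFBE has more probability mass
    on the "friendly" side than the other GAI does."""
    return days_until_gai <= 1 and cfbe_friendly_mass > other_friendly_mass


def rule_unknown_time_ufai(p_improve_cfbe: float,
                           p_ufai_activated: float) -> bool:
    """Rule 3: uFAI activation date unknown.
    Activate when improving the CFBE in the next day is less likely
    than someone activating a uFAI in the next day."""
    return p_improve_cfbe < p_ufai_activated


def rule_unknown_time_unknown_gai(p_cfbe_unfriendly: float,
                                  p_other_activates: float,
                                  p_other_unfriendly: float) -> bool:
    """Rule 4: a GAI of unknown friendliness, activation date unknown.
    Activate when P(CFBE = uFAI) is less than
    P(someone else activates) * P(their GAI is unfriendly)."""
    return p_cfbe_unfriendly < p_other_activates * p_other_unfriendly


if __name__ == "__main__":
    # Made-up numbers, purely to show how the rules would be called.
    print(rule_known_ufai(days_until_ufai=1))
    print(rule_known_unknown_gai(1, cfbe_friendly_mass=0.6, other_friendly_mass=0.4))
    print(rule_unknown_time_ufai(p_improve_cfbe=0.01, p_ufai_activated=0.05))
    print(rule_unknown_time_unknown_gai(0.3, p_other_activates=0.5, p_other_unfriendly=0.8))
```

Each function is just the inequality from the corresponding rule, so any disagreement with the rules themselves should show up as a disagreement with one specific comparison.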