Thomas Kwa comments on MIRI announces new “Death With Dignity” strategy

Thomas Kwa 7 May 2022 19:13 UTC
7 points
Suppose that, to solve alignment, the quality of our alignment research effort has to exceed some threshold. If the distribution of possible output quality is logistic, and research moves the mean of that distribution, then I think we gain a constant amount of log-odds of success per unit the mean moves, regardless of where we think the threshold is.
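
To spell out why, under the assumptions above: if quality is logistic with mean μ and scale s, the chance of clearing a threshold T is 1 / (1 + exp((T - μ) / s)), so the log-odds of success are (μ - T) / s. That is linear in μ, so each unit added to the mean adds 1/s log-odds whatever T is. Here is a minimal numerical sketch of that; the function names, the scale of 2.0, and the thresholds are my own illustrative choices, not anything from the original post.

```python
import math

def success_log_odds(mean, threshold, scale=1.0):
    """Log-odds that a Logistic(mean, scale) draw exceeds the threshold."""
    # Survival function of the logistic distribution: P(quality > threshold).
    p = 1.0 / (1.0 + math.exp((threshold - mean) / scale))
    return math.log(p / (1.0 - p))

def gain_per_unit(threshold, mean=0.0, scale=1.0):
    """Log-odds gained by moving the mean up by one unit of quality."""
    before = success_log_odds(mean, threshold, scale)
    after = success_log_odds(mean + 1.0, threshold, scale)
    return after - before

# Hypothetical thresholds, from easy to very hard, with an assumed scale of 2.0.
gains = [gain_per_unit(t, scale=2.0) for t in (-3.0, 0.0, 5.0, 10.0)]
print(gains)
```

Every entry comes out at about 1/scale, so the gain does not depend on where the threshold sits. This leans entirely on the logistic assumption; with a normal distribution, for instance, the gain in log-odds would vary with the threshold.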