Reposting (after a slight rewrite) from the telegram group:
This might be a nitpick, but to my (maybe misguided) understanding, alignment is only a specific subfield of AI safety research, which basically boils down to “how do I specify a set of rules/utility functions/designs that avoids meta- or mesa-optimization with dramatic unforeseen consequences?” (This is at least how I understood MIRI’s focus pre-2020.)
For instance, as I understand it, interpretability research is not directly alignment research. Instead, it is part of the broader field of “AI safety research” (which includes alignment, interpretability, transparency, corrigibility, …).
With that being said, I do think your arguments for renaming “AI safety research” to Artificial Intention Research still hold, and I would very much be in favor of it. It is more self-explanatory, it is catchier, and it does not require doom assumptions to be worth investigating, which I think matters a lot in public communication.