Ok, I’ve given this some thought, and I’d call it:
“Corrigible Reasoning”
using the definition of corrigible as “capable of being corrected, rectified, or reformed”. (And of course AIs that don’t meet this criterion are “Incorrigible”)
Thank you very much! It seems worth distinguishing the concept invention from the name brainstorming, in a case like this one, but I now agree that Rob Miles invented the word itself.
The technical term corrigibility, coined by Robert Miles, was introduced to the AGI safety/alignment community in the 2015 MIRI/FHI paper titled Corrigibility.
E.g. I’d suggest that, to avoid confusion, this kind of language should be something like “The technical term corrigibility, a name suggested by Robert Miles to denote concepts previously discussed at MIRI, was introduced...” &c.
I’m 94% confident it came from a Facebook thread where you blegged for help naming the concept and Rob suggested it. I’ll have a look now to find it and report back.
Edit: having a hard time finding it, though note that Paul repeats the claim at the top of his post on corrigibility in 2017.
Here it is: https://www.facebook.com/yudkowsky/posts/10152443714699228?comment_id=10152445126604228
Rob Miles (May 2014):
You’re welcome. Yeah, “invented the concept” and “named the concept” are different (and both important!).
Thanks a lot, all! I just edited the post above to change the language as suggested.
FWIW, Paul’s post on corrigibility here was my primary source for the info that Robert Miles named the technical term. Nice to see the original suggestion as made on Facebook too.
Note that the way Paul phrases it in that post is much clearer and more accurate:
> “I believe this concept was introduced in the context of AI by Eliezer and named by Robert Miles”