Richard Hollerith. 15 miles north of San Francisco. hruvulum@gmail.com
My probability that AI research will end all human life is .92. It went up drastically when Eliezer started going public with his pessimistic assessment in April 2022. Until then, my confidence in MIRI (and my knowledge that MIRI has enough funding to employ many researchers) kept my probability down to about .4. (I am glad I found out about Eliezer’s assessment.)
Currently I am willing to meet with almost anyone on the subject of AI extinction risk.
Last updated 26 Sep 2023.
You do realize that by “alignment”, the OP (John) is not talking about techniques that prevent an AI that is less generally capable than a capable person from insulting the user or expressing racist sentiments?
We seek a methodology for constructing an AI that either ensures the AI turns out not to be able to easily outsmart us or, if it does turn out to be able to easily outsmart us, ensures (or at least makes it likely) that it won’t kill us all or do some other terrible thing. (The former is not researched much compared to the latter, but I felt the need to include it for completeness.)
As things stand, it is not even clear whether you and the OP (John) are talking about the same thing (because “alignment” has come to have a broad meaning).
If you want to continue the conversation, it would help to know whether you see a pressing need for a methodology of the type I describe above. (Many AI researchers do not: they think that outcomes like human extinction are quite unlikely or at least easy to avoid.)