Something along the lines of: value drift is inevitable, and utility functions are unstable under recursive self-improvement.
That doesn’t seem like the only circumstance in which FAI is not possible. If moral nihilism is true, then FAI is impossible even if value drift is not inevitable. In that circumstance, shouldn’t we try to make any AI we decide to build “friendly” to present-day humanity, even if it wouldn’t be friendly to Aristotle or Plato or Confucius? Based on the hidden-complexity-of-wishes analysis, consistency with our current norms is still plenty hard.
My concerns are more that it will not be possible to adequately define “human”, especially as transhuman tech develops, and that there might not be a good enough way to define what’s good for people.
As I understand it, the modest goal of building an FAI is that of giving an AGI a push in the “right” direction, what EY refers to as the initial dynamics. After that, all bets are off.