Human value is fragile as well as complex, so if you create an AGI with a roughly-human-like value system, then this may not be good enough, and it is likely to rapidly diverge into something with little or no respect for human values.
… that doesn’t seem quite right. The main problem with values being fragile isn’t that a “roughly-human-like value system” might diverge rapidly; it’s that properly implementing a “roughly-human-like value system” is actually quite hard, and most AGI programmers seem to underestimate its complexity and go for “hacky” solutions, which I find somewhat scary.
Ben seems aware of this, and later goes on to say:
This is related to the point Eliezer Yudkowsky makes that “value is complex”—actually, human value is not only complex, it’s nebulous and fuzzy and ever-shifting, and humans largely grok it by implicit procedural, empathic and episodic knowledge rather than explicit declarative or linguistic knowledge.
… which seems to be one of the reasons to pay extra attention to it (and this also seems to be a reason given by Eliezer, whereas Ben almost presents it as a counterpoint to Eliezer).
Human evaluation of human values in specific instances is everything that Ben says it is (complex, nebulous, fuzzy, ever-shifting, and grokked by implicit rather than explicit knowledge).
On the other hand, the evaluation of points in the Mandelbrot set by a deterministically moving entity that is susceptible to color illusions is even more complex, nebulous, fuzzy, and ever-shifting, to the extent that it probably can’t be grokked at all. Yet it is generated from two very simple formulae (the second being the deterministic movement of the entity).
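To make the “two very simple formulae” point concrete, here is a minimal Python sketch of the first of them, the Mandelbrot membership test: a single quadratic recurrence plus an escape check. The deterministically moving, colour-illusion-prone entity is not modelled; the function name and the crude ASCII rendering are illustrative choices only, not anything taken from the discussion itself.

```python
# Sketch of the Mandelbrot membership test: iterate z -> z**2 + c from z = 0
# and check whether the orbit escapes. This is the entire "generator" whose
# output looks complex, nebulous, and fuzzy when viewed point by point.

def in_mandelbrot(c: complex, max_iter: int = 100) -> bool:
    """Return True if c appears to lie in the Mandelbrot set.

    c is in the set iff the orbit of 0 under z -> z**2 + c stays bounded;
    once |z| > 2 the orbit provably diverges, so that is the escape test.
    """
    z = 0j
    for _ in range(max_iter):
        z = z * z + c          # the single quadratic recurrence
        if abs(z) > 2.0:       # escaped: definitely not in the set
            return False
    return True                # did not escape within max_iter steps


# Crude ASCII rendering, just to show the intricate boundary emerging
# from the few lines of arithmetic above.
if __name__ == "__main__":
    for im in range(20, -21, -2):
        row = ""
        for re in range(-40, 21):
            row += "#" if in_mandelbrot(complex(re / 20, im / 20)) else "."
        print(row)
```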
Eliezer has provided absolutely NO rational arguments (much less proof) that the core of Friendliness is complex at all. Further, the fact that ethical mandates within the obviously complex real world (particularly when viewed through the biased eyes of fallible beings) are comprehensible at all would seem to indicate that maybe there are just a small number of simple laws underlying them (or maybe only one; see my comment on Ben’s post, cross-posted at http://becominggaia.wordpress.com/2010/10/30/ben-goertzel-the-singularity-institutes-scary-idea/ for easy access).
My take on the optimisation target of all self-organising systems:
http://originoflife.net/gods_utility_function/
Eliezer Yudkowsky explains why he doesn’t like such things:
http://lesswrong.com/lw/lq/fake_utility_functions/